Overview of MODX Cloud Infrastructure – MODX Cloud Support

Purposefully architected to get the most from PHP apps with unmatched Speed, Scalability, Maintenance, Disaster Recovery and Compliance

MODX Cloud is a managed, secure web hosting platform. We perform constant server maintenance, including reboot-less kernel updates and frequent security patching. This maintenance is required to keep your sites safe and performing at their peak. Customers are responsible for keeping the applications they run on our platform up to date.

We prioritize the reliability, performance, and availability of your hosted services. This article explains our use of redundant, hot-swappable components in our servers, our Recovery Time Objective (RTO) and Recovery Point Objective (RPO) policy, and our multi-data center backup and recovery strategy.

Website Traffic

Multi-tenant Platforms

MODX’s multi-tenant platforms are built on top of large bare metal servers with SSDs configured together in RAID. We architect them to handle more daily traffic than most sites will ever see in a month. These highly tuned, always-maintained NGINX web servers are perfect for any PHP app, including WordPress.

MODX Revolution benefits from additional content staging and workflow tools in the MODX Cloud Dashboard. Many users have reported hundreds of simultaneous site visitors during busy holiday seasons on our multi-tenant platforms.

Private Servers

MODX Cloud private servers are the same as our public platform, only on VMs and dedicated to a single customer. They can be tuned for specific websites and are designed to comfortably handle 1000 concurrent visitors, as measured in Google Real-Time traffic reports. Well-architected (i.e., well-cached) sites on single nodes with custom caching have sustained 5000 concurrent visitors in the same reports. Private servers support custom configurations and additional software, like OpenSearch.

If you need further scaling and maximum uptime, you will need a High-Availability (HA) Cluster. A three-node HA cluster should scale traffic linearly, handling 3000+ concurrent visitors depending on how dynamic the page content is, which covers almost any future traffic requirements and growth.
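
As a rough illustration of the linear-scaling estimate above, the following Python sketch computes the estimated capacity of a cluster and the number of nodes a target concurrency would need. The per-node figure of 1000 concurrent visitors comes from the private server description above; the function names are hypothetical, and real capacity depends on how dynamic and how well cached the pages are.

```python
import math

# Assumption: each node comfortably handles about 1000 concurrent visitors,
# the figure quoted above for a tuned private server.
VISITORS_PER_NODE = 1000


def estimated_capacity(nodes: int, per_node: int = VISITORS_PER_NODE) -> int:
    """Estimate concurrent visitors for a cluster, assuming linear scaling."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes * per_node


def nodes_needed(target_visitors: int, per_node: int = VISITORS_PER_NODE) -> int:
    """Return the smallest node count whose linear estimate covers the target."""
    if target_visitors < 0:
        raise ValueError("target_visitors must not be negative")
    return max(1, math.ceil(target_visitors / per_node))


if __name__ == "__main__":
    print(estimated_capacity(3))   # three-node cluster under the linear assumption
    print(nodes_needed(2500))      # nodes for a hypothetical 2500-visitor target
```

This is only a planning aid. It ignores caching behavior and failover headroom, both of which affect how many nodes a real deployment needs.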

Redundant, Hot-Swappable Components

Our servers are equipped with redundant, hot-swappable components. Here's what this means and why it's beneficial:

Redundancy: We use multiple components for critical systems. If one component fails, another can immediately take over, ensuring continuous operation.
Hot-Swappable: These components can be replaced while the server runs without shutting down or restarting the system.

Benefits:

Increased Uptime: If a component fails, the redundant component takes over instantly, minimizing or eliminating downtime.
Easier Maintenance: We can replace faulty components without interrupting your service.
Improved Reliability: Multiple components reduce the risk of system failure due to a single point of failure.
Seamless Upgrades: We can upgrade components without scheduling downtime.

Examples of redundant, hot-swappable components in our infrastructure include:

Power supplies
Solid-state drives (SSDs) in RAID 10 configurations
Cooling fans
Network interfaces
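
To make the RAID 10 redundancy concrete, here is a minimal Python sketch of the standard arithmetic for a RAID 10 array built from two-drive mirrors that are striped together. The drive counts and sizes are illustrative assumptions, not a description of the actual MODX Cloud hardware, and the function name is hypothetical.

```python
def raid10_summary(drive_count: int, drive_size_tb: float) -> dict:
    """Summarize a RAID 10 array made of two-drive mirrors striped together."""
    if drive_count < 4 or drive_count % 2 != 0:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    mirror_pairs = drive_count // 2
    return {
        # Each mirrored pair contributes the capacity of a single drive.
        "usable_tb": mirror_pairs * drive_size_tb,
        # Any single drive can fail without data loss.
        "guaranteed_failures_survived": 1,
        # Best case: one failed drive in every pair, never both in one pair.
        "best_case_failures_survived": mirror_pairs,
    }


if __name__ == "__main__":
    # Hypothetical array: eight 2 TB drives.
    print(raid10_summary(8, 2.0))
```

The array survives any single drive failure. It survives more only when the failed drives sit in different mirrored pairs, which is why failed drives are hot-swapped promptly.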

Host-side Security Measures

MODX is responsible for and implements the following security measures to protect your hosted environment:

Regular Patching: We apply security patches to all managed servers based on Ubuntu’s severity classifications:
• Critical patches: Within 24 hours of release
• High patches: Within 7 days of release
• Medium patches: Within 30 days of release
• Low and Negligible patches: During routine maintenance cycles
For more information on our patching policy, see Understanding Security Patching in MODX’s Managed Ubuntu (Linux) Hosting.
Firewall Protection: Robust firewall rules filter incoming traffic and protect against common attack vectors.
DDoS Protection: Our infrastructure includes measures to mitigate Distributed Denial of Service (DDoS) attacks.
Secure SSH Access: We use key-based authentication for SSH access. Users with root access can only connect when authenticated to our Tailscale VPN, providing an extra layer of security.
Closed Ports: The only ports open in MODX Cloud are those related to serving and building websites: 22, 80, and 443. For additional information on how SSH port 22 is protected, see our KB article on SSH over Port 22: Closed or Open.
Regular Security Audits: We conduct periodic security audits of our infrastructure to identify and address potential vulnerabilities.
Client Isolation: Each client runs in its own isolated environment, which prevents issues in one client's environment from affecting other clients. These environments also limit the available shell commands to those needed for common website maintenance and site-building tasks, and they do not allow escalated sudo privileges.
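
The patching windows listed under Regular Patching above can be expressed as a simple lookup. The Python sketch below is illustrative only: the function name and the dates are hypothetical, and Low and Negligible patches return no deadline because they are handled during routine maintenance cycles rather than within a fixed window.

```python
from datetime import datetime, timedelta
from typing import Optional

# Patch windows by severity, as listed in the policy above.
# None means "no fixed window; applied during routine maintenance".
PATCH_WINDOWS = {
    "critical": timedelta(hours=24),
    "high": timedelta(days=7),
    "medium": timedelta(days=30),
    "low": None,
    "negligible": None,
}


def patch_deadline(severity: str, released: datetime) -> Optional[datetime]:
    """Return the latest time a patch should be applied, or None if unscheduled."""
    key = severity.strip().lower()
    if key not in PATCH_WINDOWS:
        raise ValueError(f"unknown severity: {severity!r}")
    window = PATCH_WINDOWS[key]
    return None if window is None else released + window


if __name__ == "__main__":
    # Hypothetical release time for illustration.
    released = datetime(2024, 1, 1, 12, 0)
    for level in ("Critical", "High", "Medium", "Low"):
        print(level, patch_deadline(level, released))
```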

Some customers may require additional security measures or hardening, including closing or changing ports, custom configurations, and custom software. For these use cases, we offer private servers—virtual machine instances with starting configurations identical to our large multi-tenant bare metal platforms. Private servers are only accessible to a single account owner and can support custom software and site-specific tuning. Learn more about Private Servers for MODX Cloud customers.

As a reminder, customers are responsible for keeping their applications running on our platform up to date and should take additional security precautions for important websites.

RTO and RPO Policy

RTO (Recovery Time Objective) and RPO (Recovery Point Objective) are key concepts in our disaster recovery strategy.

Because our infrastructure uses redundant, hot-swappable components, most failures can be resolved well within our RTO target. Only a catastrophic data center or hardware loss (for example, from a fire) would be likely to bring the full RTO and RPO into play:

RTO: This is the maximum acceptable time for restoring our systems after a disaster or failure. We aim to restore service as soon as possible; even a complete platform rebuild should take no longer than 12 hours.
RPO: This is the maximum acceptable amount of data loss measured in time. Our RPO for server configurations is 20 minutes. For individual websites, the RPO is the most recent backup, should a catastrophic RAID failure occur.

What this means for you:

In the event of a disaster resulting in a complete loss of a data center or server, we aim to have our systems back online within 12 hours, though we expect much faster times.
At most, you might lose data created or changed since the most recent nightly backup in such an event.
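
The two objectives can be illustrated with a short Python sketch. Given the time of the last site backup and the time of a failure, it computes the data-loss window (the gap the RPO describes) and the latest restore time implied by the 12-hour rebuild limit. The function name and timestamps are hypothetical examples, not real incident data.

```python
from datetime import datetime, timedelta
from typing import Tuple

# From the policy above: a complete platform rebuild should take no longer than 12 hours.
RTO_LIMIT = timedelta(hours=12)


def recovery_window(last_backup: datetime, failure: datetime) -> Tuple[timedelta, datetime]:
    """Return (data-loss window, latest expected restore time) for a failure."""
    if failure < last_backup:
        raise ValueError("failure time must not precede the last backup")
    data_loss = failure - last_backup   # changes made in this span are not in the backup
    restore_by = failure + RTO_LIMIT    # worst-case restore time under the RTO
    return data_loss, restore_by


if __name__ == "__main__":
    # Hypothetical nightly backup at 02:00 and a failure at 17:30 the same day.
    loss, restore_by = recovery_window(
        datetime(2024, 3, 1, 2, 0), datetime(2024, 3, 1, 17, 30)
    )
    print("Data-loss window:", loss)
    print("Restore no later than:", restore_by)
```

Sites that change often can narrow the data-loss window by taking backups more frequently than nightly.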

Multi-Data Center Backup and Recovery Strategy

To ensure the highest level of data protection and service availability, we implement a robust multi-data center strategy:

Geographically Distributed Backups: We store backups in object storage in redundant data centers located in different geographical regions. This approach protects your data from localized disasters affecting a single data center.
Cross-Data Center Recovery: In the event of a catastrophic data center outage, we can restore services in alternative data centers. This means your services can be brought back online even if an entire data center becomes unavailable.
Failover Testing: We regularly conduct failover tests to ensure we can quickly and effectively switch operations to alternative data centers when necessary.

How we achieve our RTO and RPO objectives:

Regular Backups: We perform frequent backups to ensure minimal data loss.
Hot-Swappable Systems: Our hot-swappable infrastructure allows failed hardware to be replaced quickly on the data center floor.
Multiple Data Centers: We can quickly re-provision and restore sites across data centers.
Comprehensive Disaster Recovery Plan: We have a detailed plan in place for various scenarios, including full data center outages.
Regular Testing: We routinely test our recovery procedures, including cross-data center recoveries, to ensure we can meet our RTO and RPO.

SOC 2 and Other Compliance Reports

We have partnered with IBM to operate our data centers for all customer platforms. As a result, we are able to offer compliance reports upon request. To request one, please open a ticket that names the report you need and includes the requester's name, title, email, and phone number, along with the reason for the request, such as “vendor evaluation.”

Continuous Improvement

We regularly review and update our infrastructure and policies to provide the best possible service. This includes:

Evaluating new technologies for improved redundancy and performance
Refining our disaster recovery procedures
Adjusting our RTO and RPO based on evolving business needs and technological capabilities
Expanding our network of data centers to enhance geographical distribution and resilience

By leveraging redundant, hot-swappable components, maintaining a strict RTO/RPO policy, and implementing a multi-data center backup and recovery strategy, we strive to provide you with a highly available, reliable hosting environment that can withstand even severe disruptions. This comprehensive approach ensures that your data and services are protected against various potential issues, from individual component failures to large-scale data center outages.

If you have any questions about our infrastructure, disaster recovery capabilities, or policies, please don't hesitate to contact our support team.