28 Sep Scaling PHP CMS in AWS
There are three major challenges when scaling a traditional CMS like Prestashop / Drupal:
– Centralized session
– File storage synchronization
– Isolating Backend traffic to a single instance
Let’s dive into the details of each of them:
Centralized Session
This is the easiest one to take care of. We will use Redis as the session backend; in AWS, the service is called ElastiCache:
- Create a new Redis node in AWS
- To start, a micro-size instance is sufficient; just scale up later when traffic has grown
- Get the Endpoint name which should be something like:
- Put the endpoint name into config.local.php and use it as the Redis session backend
- We will use this Redis node as the cache backend as well, so edit the cache backend in the CMS config.php too
Some other CMSs rely on session.save_handler and session.save_path in php.ini; editing them in php.ini accordingly is enough.
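For a CMS that reads its session settings from php.ini, the two directives could look like the output of the sketch below. It is only a minimal illustration, assuming the phpredis extension is installed (it is what registers the `redis` save handler). The endpoint host is a made-up placeholder, and `render_session_ini` is a hypothetical helper, not part of any CMS.

```python
def render_session_ini(endpoint_host: str, port: int = 6379) -> str:
    """Return php.ini lines that point PHP sessions at a Redis endpoint.

    Assumes the phpredis extension, which provides the "redis" handler.
    """
    return "\n".join([
        "session.save_handler = redis",
        f'session.save_path = "tcp://{endpoint_host}:{port}"',
    ])


if __name__ == "__main__":
    # Placeholder host; use the endpoint shown for your own Redis node.
    print(render_session_ini("my-redis.example.internal"))
```

Every web instance must carry the same two lines, otherwise a visitor's session would still live on only one server.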
File Storage Synchronization
We use Lsyncd for this: all new files uploaded on the Backend instance are pushed to the Frontend instances via Lsyncd.
In order for the Backend server to know the IP addresses of the Frontend instances, we use this tool:
The tool automatically grabs the instances' IPs from the Elastic Load Balancer and writes them into lsyncd.conf.
However, the tool does not support Application Load Balancer (ALB) or ELBv2. We have updated monitor.php in the project to support it; feel free to contact us if you need the script. 🙂
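To make the "update lsyncd.conf" step concrete, here is a rough sketch of how a config could be generated from a list of Frontend IPs. This is not the code of the tool mentioned above, and it is not our monitor.php. It only assumes the Lsyncd 2.x `default.rsyncssh` layout, and the document root path, the IPs, and the `render_lsyncd_conf` name are all hypothetical.

```python
def render_lsyncd_conf(frontend_ips, source="/var/www/html",
                       targetdir="/var/www/html"):
    """Build lsyncd.conf text with one rsync-over-ssh block per Frontend IP."""
    blocks = []
    # De-duplicate and sort so the generated file is deterministic.
    for ip in sorted(set(frontend_ips)):
        blocks.append(
            "sync {\n"
            "    default.rsyncssh,\n"
            f'    source    = "{source}",\n'
            f'    host      = "{ip}",\n'
            f'    targetdir = "{targetdir}",\n'
            "}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Example IPs only; in practice they come from the load balancer.
    print(render_lsyncd_conf(["10.0.1.12", "10.0.1.11"]))
```

After the file is rewritten, Lsyncd has to be restarted to pick up the new targets, which is also when the initial rsync to each target runs.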
Isolate Backend Traffic
Since the Backend load is constant and does not require scaling, and to make the master file sync by Lsyncd easier to manage, we want to make sure all file-level changes, e.g. a newly uploaded product image, are made only on the Backend instance.
Before ALB was launched, we needed an Nginx proxy to segregate the traffic, but now we just need to create an ALB with a simple rule:
- If the request URL contains admin.php*
- then the request is forwarded to the Backend instance
- else the request is forwarded to the Frontend instances
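The rule above can be simulated locally to check which URLs would end up on which side. The sketch below only mimics the decision with shell-style wildcards; it does not call any AWS API, and the `pick_target` name and the `*admin.php*` pattern are our own assumptions for illustration.

```python
from fnmatch import fnmatchcase

BACKEND = "backend"
FRONTEND = "frontend"


def pick_target(path: str, admin_pattern: str = "*admin.php*") -> str:
    """Return which target group a request path would be forwarded to."""
    # fnmatchcase keeps the comparison case-sensitive on every platform.
    return BACKEND if fnmatchcase(path, admin_pattern) else FRONTEND


if __name__ == "__main__":
    for path in ("/admin.php", "/admin.php/orders", "/index.php", "/"):
        print(path, "->", pick_target(path))
```

Running a few real admin and shop URLs through a check like this before creating the rule helps catch admin pages that do not go through admin.php.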
Pros / Cons of Lsyncd
Lsyncd is the tool we use to make sure all files across all web servers (back or front end) are in sync, so users will not get an image-not-found 404 error.
The pro is simple: each web server has its own local file system, so all IOPS are local and not shared, avoiding the bottleneck of shared storage such as EFS / NFS.
The con comes from the local file system as well. Lsyncd runs an initial rsync with each slave instance on startup, so if we spawn 10 slave instances at once, the master server will be too busy running the initial syncs. As a result, we can only scale up 1-2 instances at a time and step up further later.
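Scaling in small steps can be planned ahead. The sketch below computes the sequence of desired instance counts when growing by at most 2 at a time. `plan_scale_out` is a hypothetical helper and the step size is a parameter. It only produces the numbers, so waiting for each initial sync to finish before applying the next step is still up to the operator.

```python
def plan_scale_out(current: int, target: int, step: int = 2) -> list:
    """Return the successive desired capacities from current up to target."""
    if step < 1:
        raise ValueError("step must be at least 1")
    plan = []
    capacity = current
    while capacity < target:
        # Never overshoot the target on the last step.
        capacity = min(capacity + step, target)
        plan.append(capacity)
    return plan


if __name__ == "__main__":
    # Growing from 2 to 12 Frontend instances, 2 at a time.
    print(plan_scale_out(2, 12))
```

Going from 2 to 12 instances this way takes five steps instead of one, which is the price of keeping the master server responsive during the initial syncs.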
To simplify the design, we can actually just use EFS / NFS and export it out. This should make the deployment much simpler, but less scalable.
We will discuss other challenges such as ElastiCache, RDS, ALB, SSL, and the three steps in breaking down a traditional PHP CMS into horizontally scalable components in upcoming posts.