Cloud computing is known for its flexibility: it can scale at any time.
I will scale down when traffic is low to save cost, with just a button.
I will scale up when traffic is high to handle the spike, with just a button.
But in a real-world implementation, is it really just a button?
There are two types of scaling: vertical and horizontal.
Of course, there is also a fresh new way, serverless cloud, but we will get to that later, because application support for it is still lacking, at least in the local market.
Vertical Scaling Myths:
Myth 01: The infamous slider. I want more CPU / RAM, so I drag it to the right, and my cloud server automatically and immediately has more of it.
Truth 01: A quick reboot is usually needed to get more CPU / RAM. If you are on a local-storage cloud (non-EBS, non-iSCSI, etc.) and the hypervisor itself has run out of resources for you to scale into, the provider will have to move your cloud disk to another hypervisor, so the server could be down for minutes to hours while it scales up vertically.
Myth 02: I can also drag to the right for more disk space, and the extra space automatically shows up in my Linux / Windows cloud server. I can scale it back down later once I have cleaned up some space.
Truth 02: Just like CPU / RAM, there will be downtime, and it will almost certainly be longer, because any operation at the storage level needs more processing time. Depending on your provider, you might even need to run resize2fs (or your filesystem's equivalent) or expand the disk manually from the OS. Most importantly, scaling down is NOT possible: you will always need to manually migrate the data to a new, smaller disk later. The slider can only be dragged to the right, NOT to the left!
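Since shrinking means migrating to a new disk, it helps to check first whether the data would actually fit. The following is a minimal sketch, not a provider tool: the function name `fits_on_smaller_disk` and the 20% free-space headroom are assumptions made for illustration. Only `shutil.disk_usage` is a real standard-library call.

```python
import shutil

GIB = 1024 ** 3


def fits_on_smaller_disk(used_bytes: int, target_bytes: int,
                         headroom_percent: int = 20) -> bool:
    """Return True if the used data fits on the target disk while
    leaving `headroom_percent` of the target disk free.

    The 20% default is an illustrative assumption, not a provider rule.
    """
    if not 0 <= headroom_percent < 100:
        raise ValueError("headroom_percent must be in [0, 100)")
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return used_bytes * 100 <= target_bytes * (100 - headroom_percent)


if __name__ == "__main__":
    usage = shutil.disk_usage("/")  # (total, used, free) in bytes
    target = 50 * GIB               # hypothetical smaller disk size
    print("used GiB:", round(usage.used / GIB, 1))
    print("fits on 50 GiB disk:", fits_on_smaller_disk(usage.used, target))
```

If the check fails, cleaning up more data before provisioning the smaller disk is cheaper than discovering the problem halfway through the migration.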
Horizontal Scaling Myths:
Myth 01: Just FTP the source code to my web01 server and clone it to web02, web03, … and so on, and it will scale out that way.
Truth 01: There is a lot to consider in ‘decoupling’ the application to make it horizontally scalable. This often involves weeks or even months of cooperation between the developers and the cloud solution architect, with very extensive user testing to make sure all goes well. It is often the biggest obstacle when converting a traditional single-server application to a cloud-friendly, horizontally scaling architecture.
Myth 02: Just clone my database server into db01, db02, db03, … and it can then handle three times the load.
Truth 02: The database tier has always been the part that is hardest to scale horizontally. Your app can’t write a transaction to db01 without replicating it to db02 and db03, and the replication itself requires maintenance and monitoring for data integrity. Load balancing and read/write segregation at the app level will most likely involve code changes as well.
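To give a feel for the kind of code change read/write segregation implies, here is a minimal sketch. It assumes db01 is the single writable primary and db02 / db03 are read replicas; the `ReadWriteRouter` class is hypothetical and only returns a server name rather than opening a real connection.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and spread reads across replicas.

    Hypothetical helper for illustration; a real application would
    return database connections, not host names.
    """

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        # Round-robin over replicas; None means "no replicas available".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql: str) -> str:
        words = sql.split()
        verb = words[0].upper() if words else ""
        # Only plain SELECTs go to replicas; everything else hits the primary.
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.primary


if __name__ == "__main__":
    router = ReadWriteRouter("db01", ["db02", "db03"])
    for query in ("SELECT * FROM orders",
                  "INSERT INTO orders VALUES (1)",
                  "SELECT * FROM users"):
        print(router.route(query), "<-", query)
```

Even this toy version shows where the difficulty lies: replicas can lag behind the primary, so a read issued right after a write, or any read inside a transaction, may still need to go to db01. Deciding that correctly is application logic, not a cloud setting.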
Myth 03: Yeah, right, load balancing. I just clone the web servers and plug them into the load balancer, and it will work.
Truth 03: Most web applications today are stateful, using cookies for user login sessions. If the user logs in on the first server, how do you carry that session over to the rest of the web servers? If a user uploads an image to the first web server, how do you copy it over to the next server?
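One common way to answer the session question is to keep no session state on the web server at all: the cookie carries the user ID plus a signature, and every server that holds the same secret can verify it. The sketch below illustrates that idea under stated assumptions. `SECRET_KEY`, `sign_session` and `verify_session` are hypothetical names, and a real deployment would also add an expiry time and load the secret from configuration.

```python
import hashlib
import hmac

# Hypothetical shared secret; every web server must hold the same value.
SECRET_KEY = b"change-me"


def _signature(user_id: str, key: bytes) -> str:
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_session(user_id: str, key: bytes = SECRET_KEY) -> str:
    """Build a cookie value of the form '<user_id>.<hex signature>'."""
    return f"{user_id}.{_signature(user_id, key)}"


def verify_session(cookie: str, key: bytes = SECRET_KEY):
    """Return the user ID if the signature is valid, otherwise None."""
    user_id, sep, sig = cookie.rpartition(".")
    if not sep:
        return None
    expected = _signature(user_id, key)
    # Constant-time comparison on bytes to avoid timing leaks.
    if hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return user_id
    return None


if __name__ == "__main__":
    cookie = sign_session("alice")       # issued by web01 at login
    print(verify_session(cookie))        # checked later by web02 or web03
```

The other usual option is a shared session store that all web servers query. Uploaded files need the same treatment: they belong on shared or object storage rather than on one server's local disk.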
For simple vertical scaling, make sure you:
- Scale pre-emptively: always provision more resources and keep your server load below 50%. You still might not be able to handle a sudden 2x traffic spike, but at least spikes below that won’t bring your site down
- Scale at your preferred hour: scaling pre-emptively lets you choose the time, and you don’t want to do it during your daily peak traffic period
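The 50% rule is simple arithmetic, sketched below under the simplifying assumption that load grows linearly with traffic (real systems often degrade earlier). The function names are made up for illustration.

```python
def max_spike_factor(current_load: float) -> float:
    """Largest traffic multiplier that just reaches 100% load.

    `current_load` is a fraction between 0 and 1, e.g. 0.5 for 50%.
    Assumes load scales linearly with traffic.
    """
    if not 0 < current_load <= 1:
        raise ValueError("current_load must be in (0, 1]")
    return 1.0 / current_load


def load_after_spike(current_load: float, spike_factor: float) -> float:
    """Projected load fraction after traffic is multiplied by `spike_factor`."""
    return current_load * spike_factor


if __name__ == "__main__":
    print(max_spike_factor(0.5))       # a 2x spike exactly saturates the server
    print(load_after_spike(0.5, 1.5))  # a 1.5x spike leaves some room
```

At 50% load a 2x spike lands exactly on 100%, which is why that spike is borderline while anything smaller still fits.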
For horizontal scaling, you will, in almost all cases, need to consult a solution architect for:
- Session handling
- Static or media file replication
- Object storage offloading
- Cloud-init scripts
- Database tier design
- Load balancer configuration
- Source code management
- Auto and pre-emptive scaling (more on this in the next article)
A good horizontally scalable architecture often looks like this:
Come talk to us if you have any questions about the cloud; we will be glad to hear from you. 🙂