Scaling on the Cloud – How and When ? Part 2

We discussed the HOW of scaling in Part 1; now let’s discuss the WHEN.

1. Preemptive Scaling

In other words, manual scaling. You scale ahead of an expected increase in traffic, such as:

  • A crazy product promotion with a huge discount on your e-commerce site
  • Marketing spending to draw huge traffic, such as LINE events, Facebook campaigns, etc.
  • Your gut telling you your site is going viral tomorrow

What good is this manual approach? Well, there are a few benefits:

  • Load testing: run load tests on the manually scaled infra, then show your marketing and management teams proof that the infra is ready
  • Predictable spending: you scale it yourself, so you know how much the spike is going to cost you on the infra
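
To make the spending point concrete, here is a rough back-of-the-envelope sketch. All the numbers (peak requests per second, per-server capacity, headroom, hourly rate) are made-up assumptions for illustration; plug in figures from your own load tests and your provider's pricing.

```python
import math

def servers_needed(peak_rps, rps_per_server, headroom=0.2):
    """Servers required to serve peak_rps with a safety margin (headroom)."""
    return math.ceil(peak_rps * (1 + headroom) / rps_per_server)

def spike_cost(extra_servers, hours, hourly_rate):
    """Extra infra cost of keeping the additional servers up for the event."""
    return extra_servers * hours * hourly_rate

# Hypothetical figures: 4 servers today, a campaign expected to peak at
# 3000 requests/second, each server handling about 250 requests/second.
baseline = 4
needed = servers_needed(peak_rps=3000, rps_per_server=250)
extra = max(0, needed - baseline)
cost = spike_cost(extra, hours=48, hourly_rate=0.10)
print(f"need {needed} servers, {extra} extra, about ${cost:.2f} for the event")
```

Because you decide the server count and the duration up front, the cost of the spike is known before the campaign starts.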

If you are using the simple vertical scaling method, you will have to shut down the cloud server,
issue the resize command, then wait for an unknown amount of time.

Hence this has to be done during off-peak hours, and you will be praying for the downtime to be short.

But if you have a horizontal-scaling-ready infra:

  1. Fire up a new cloud server from the ready-made image
  2. Add the server to your load balancer

There’s no downtime since you are just adding new servers; the base production infra stays up and running the whole time.
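
The two steps can be sketched with a toy in-memory model. This is not a real cloud API: `LoadBalancer`, `launch_from_image`, and the server names are hypothetical stand-ins for whatever your provider offers. The point is only that existing servers keep receiving traffic while the new one joins the pool.

```python
class LoadBalancer:
    """Toy round-robin balancer (hypothetical, for illustration only)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = 0

    def add_server(self, name):
        # Adding to the pool does not touch the servers already in it.
        self.servers.append(name)

    def route(self):
        # Pick the next server in round-robin order.
        server = self.servers[self._next % len(self.servers)]
        self._next += 1
        return server

def launch_from_image(image, index):
    # Stand-in for the provider call that boots a server from a prebuilt image.
    return f"{image}-{index}"

lb = LoadBalancer(["web-1", "web-2"])
before = [lb.route() for _ in range(4)]   # traffic served by the base infra

new_server = launch_from_image("web-image", 3)  # step 1: fire up a new server
lb.add_server(new_server)                       # step 2: add it to the balancer

after = [lb.route() for _ in range(6)]    # traffic now spread over 3 servers
print(before)
print(after)
```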


2. Auto Scaling

Ahhhh, the ultimate, magical auto-scaling.

Auto-scaling is one of the main reasons applications move to the cloud:

  • it scales up when the load is going up,
  • it scales down when the load is going down.

However, implementing auto-scaling blindly can cause a lot of downtime.
The scaling policy needs to be fine-tuned, re-tuned, and maintained as the application changes:

  • Defining ‘high load’: how high must the load be before it counts as high? Should 100% CPU for 1 second count?
  • How often should the policy be executed? Will executing it every minute spawn too many servers before the load is balanced out?
  • When to scale down, and how slowly? Too quick and it will cause a spike in load or even bring the site down; too slow and it will cost more in infra spending
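
These three questions map onto three knobs that most scaling policies expose in some form: a threshold, a "sustained for how long" window, and a cooldown after each action. Here is a minimal simulation sketch; the names (`ScalingPolicy`, `simulate`) and all default values are assumptions for illustration, not any provider's actual settings.

```python
from dataclasses import dataclass

@dataclass
class ScalingPolicy:
    high_cpu: float = 75.0       # percent; what counts as 'high load' (assumed)
    low_cpu: float = 30.0        # percent; what counts as 'low load' (assumed)
    sustained_samples: int = 3   # consecutive samples required before acting
    cooldown_samples: int = 2    # samples to sit out after any scaling action
    min_servers: int = 2
    max_servers: int = 10

def simulate(policy, cpu_samples, servers):
    """Return the server count after each CPU sample is evaluated."""
    history = []
    cooldown = 0
    for i in range(len(cpu_samples)):
        if cooldown > 0:
            # Still letting the previous action take effect; do nothing.
            cooldown -= 1
        else:
            start = max(0, i - policy.sustained_samples + 1)
            window = cpu_samples[start:i + 1]
            full = len(window) == policy.sustained_samples
            if full and min(window) >= policy.high_cpu and servers < policy.max_servers:
                servers += 1                      # load stayed high: scale up
                cooldown = policy.cooldown_samples
            elif full and max(window) <= policy.low_cpu and servers > policy.min_servers:
                servers -= 1                      # load stayed low: scale down
                cooldown = policy.cooldown_samples
        history.append(servers)
    return history

# One brief 100% spike, then sustained high load, then sustained low load.
samples = [100, 20, 20] + [90] * 9 + [20] * 6
history = simulate(ScalingPolicy(), samples, servers=2)
print(history)
```

In this sketch the single 100% sample at the start triggers nothing, because the load has to stay high for three samples in a row. The cooldown keeps the policy from adding a server on every evaluation, and scale-down removes one server at a time rather than dropping straight back to the minimum.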

Just like what we learnt from basic computer science, there’s no perfect how or when;
it often depends on the business requirements, company culture, and policies of the client.

For enterprises that need to play safe: Horizontal scaling + Preemptive + Strict Auto-scaling policy

For startups that are looking for a quick start: Vertical scaling + Preemptive

There are a lot more cases. Having trouble deciding? Come talk to our cloud solution architects. 🙂