Designing for scale
Traditionally, designing for scale meant carefully sizing your infrastructure for peak usage and then adding a safety factor to handle variability in load. When you reached a certain threshold on CPU, memory, disk (capacity and throughput), or network bandwidth, you would repeat the exercise to handle the increased load and initiate a lengthy procurement and provisioning process. Depending on the application, this could mean scaling up (vertical scaling) with bigger machines or scaling out (horizontal scaling) with more machines. Once deployed, the new capacity was fixed and ran continuously, whether or not it was fully utilized.
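As a rough illustration of this sizing exercise, the arithmetic can be sketched in a few lines of Python. The function name, the request rates, and the 25% headroom are hypothetical values chosen for the example, not figures from any real deployment.

```python
import math


def nodes_for_peak(peak_load: float, per_node_capacity: float, headroom: float) -> int:
    """Return the fixed node count needed to serve peak load plus a safety factor.

    peak_load and per_node_capacity use the same unit (for example, requests
    per second); headroom is a fraction such as 0.25 for 25% extra capacity.
    All values here are illustrative assumptions.
    """
    required = peak_load * (1 + headroom)
    return math.ceil(required / per_node_capacity)


# Hypothetical example: 9,000 req/s at peak, 1,000 req/s per node, 25% headroom.
fixed_nodes = nodes_for_peak(9000, 1000, 0.25)
print(fixed_nodes)
```

With these assumed numbers the deployment is sized for 11,250 requests per second, and that many nodes keep running even when actual load is far below the peak.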
In cloud applications, it is easy to scale both vertically and horizontally. Additionally, with horizontal scaling, the number of nodes can be increased and decreased automatically to improve resource utilization and manage costs better.
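A minimal sketch of how such automatic scaling might decide on a node count is shown below. It assumes a simple proportional rule that resizes the cluster so average CPU utilization moves toward a target, clamped between a minimum and a maximum. The function name, the 60% target, and the node limits are hypothetical; real cloud autoscalers add cooldown periods, multiple metrics, and other safeguards.

```python
def desired_nodes(current_nodes: int, cpu_percent: int,
                  target_percent: int = 60,
                  min_nodes: int = 2, max_nodes: int = 20) -> int:
    """Return the node count that would bring average CPU close to the target.

    Uses the proportional rule: desired = ceil(current * observed / target),
    computed with integer arithmetic, then clamped to [min_nodes, max_nodes].
    All defaults are illustrative assumptions.
    """
    scaled = current_nodes * cpu_percent
    # Integer ceiling division avoids floating-point rounding surprises.
    desired = (scaled + target_percent - 1) // target_percent
    return max(min_nodes, min(max_nodes, desired))


print(desired_nodes(4, 90))    # load is high: scale out
print(desired_nodes(6, 20))    # load is low: scale in, but not below min_nodes
print(desired_nodes(10, 150))  # extreme load: capped at max_nodes
```

Running the same rule periodically lets capacity follow the load in both directions, so unused nodes are released rather than paid for.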
Typically, cloud applications are...