In order to fully benefit from the advantages of modern cloud platforms, we need to consider their characteristic properties when developing applications that should run on these platforms.
One of the main design goals of cloud applications is scalability. On the one hand, this means growing your application's resources as needed in order to efficiently serve all your users. On the other hand, it also means shrinking your resources back to an appropriate level when you do not need them anymore. This allows you to run your application in a cost-efficient manner without having to constantly overprovision for peak workloads.
In order to achieve this, typical cloud deployments often use small virtual machine instances that host an application and scale by adding (or removing) more of these instances. This method of scaling is called horizontal scaling or scale out—as opposed to vertical scaling or scale up, where you would not increase the number of instances, but provision more resources to your existing instances. Horizontal scaling is often preferred to vertical scaling for several reasons. First, horizontal scaling promises unlimited linear scalability. On the other hand, vertical scaling has its limits due to the fact that the number of resources that you can add to an existing server cannot grow infinitely. Secondly, horizontal scaling is often more cost-efficient since you can use cheap commodity hardware (or, in cloud environments, smaller instance types), whereas larger servers often grow exponentially more expensive.
All major cloud providers offer the ability to perform horizontal scaling automatically, depending on your application's current resource utilization. This feature is called auto-scaling. Unfortunately, you do not get horizontal scalability for free. In order to be able to scale out, your application needs to adhere to some very important design goals that often need to be considered from the start, as follows:
- Statelessness: Each instance of a cloud application should not have any kind of internal state (meaning that any kind of data is saved for later use, either in-memory or on the filesystem). In a scale-out scenario, subsequent requests might be served by another instance of the application and, for this reason, must not rely on any kind of state being present from previous requests. In order to achieve this, it is usually necessary to externalize any kind of persistent storage, such as databases and filesystems. Both database services and file storage are often offered as managed services by the cloud provider that you use in your application.
- Ease of deployment: When scaling out, you will need to deploy new instances of your application quickly. Creating a new instance should not require any kind of manual setup, but should be automated as much as possible (ideally completely).
- Resiliency: In a cloud environment, especially when using auto-scaling, instances may be shut down at a moment's notice. Also, most cloud providers do not guarantee an extremely high availability on individual instances (and suggest scaling out instead, optionally across multiple availability zones). For this reason, termination and sudden death (either intentionally, in case of auto-scaling, or unintentionally, in case of failure) is something that we always need to expect in a cloud environment, and the application must handle it accordingly.
Achieving these design goals is not always easy. Cloud providers often support you in this task by offering managed services (for example, highly scalable database services of distributed file storage) that otherwise you would have to worry about yourself. Concerning your actual application, there is the twelve-factor app methodology (which we will cover in more detail in a later section), which describes a set of rules for building scalable and resilient applications.