Horizontal pod autoscaling
Kubernetes can watch your pods and scale them when CPU utilization or some other metric crosses a threshold. The autoscaling resource specifies the details (the target percentage of CPU, the minimum and maximum number of replicas), and the corresponding autoscale controller checks the metric periodically and adjusts the number of replicas, if needed.
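To make the adjustment step concrete, here is a minimal sketch of the replica calculation as the Kubernetes documentation describes it: the desired count is the current count scaled by the ratio of the observed metric to the target metric, rounded up, and no change is made when that ratio is close to 1. The function name, parameter names, and default bounds below are assumptions made for illustration; this is not the controller's actual code.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Illustrative sketch of the HPA replica calculation (hypothetical helper)."""
    ratio = current_metric / target_metric

    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        # Scale in proportion to how far the metric is from the target,
        # rounding up so the target is not exceeded after scaling.
        desired = math.ceil(current_replicas * current_metric / target_metric)

    # Never go outside the bounds set on the autoscaling resource.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 3 replicas averaging 90% CPU against a 50% target.
    print(desired_replicas(3, 90, 50))
```

For example, with 3 replicas averaging 90% CPU against a 50% target, the sketch asks for ceil(3 * 90 / 50) = 6 replicas, while a reading of 52% falls inside the assumed 10% tolerance and leaves the count unchanged.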
The following diagram illustrates the different players and their relationships:
Figure 8.1: HPA interacting with pods
As you can see, the horizontal pod autoscaler (HPA) doesn't create or destroy pods directly. Instead, it relies on the replication controller or deployment resources. This is a smart design: because the autoscaler only changes the replica count on those resources, you never have to deal with situations where autoscaling conflicts with a replication controller or deployment that is trying to scale the number of pods, unaware of the autoscaler's efforts.
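The delegation described above can be illustrated with a toy model. This is not Kubernetes code and uses no Kubernetes API; the `Deployment` and `Autoscaler` classes and their methods are hypothetical stand-ins that show only the division of responsibility: the autoscaler edits a desired replica count, and the owning resource is the only party that creates or removes pods.

```python
class Deployment:
    """Toy stand-in for a deployment or replication controller."""

    def __init__(self, replicas):
        self.replicas = replicas  # desired number of pods
        self.pods = []            # pods that currently exist
        self.reconcile()

    def reconcile(self):
        # Only this side creates or removes pods, so there is a single
        # source of truth for how many pods should exist.
        while len(self.pods) < self.replicas:
            self.pods.append(f"pod-{len(self.pods)}")
        del self.pods[self.replicas:]


class Autoscaler:
    """Toy stand-in for the HPA: it never touches pods directly."""

    def __init__(self, target, min_replicas, max_replicas):
        self.target = target
        self.min_replicas = min_replicas
        self.max_replicas = max_replicas

    def scale_to(self, desired):
        # Clamp the request, then update the desired count on the target.
        clamped = max(self.min_replicas, min(self.max_replicas, desired))
        self.target.replicas = clamped


if __name__ == "__main__":
    deployment = Deployment(replicas=3)
    autoscaler = Autoscaler(deployment, min_replicas=1, max_replicas=5)
    autoscaler.scale_to(8)   # the request is clamped to 5
    deployment.reconcile()   # the deployment, not the autoscaler, adds pods
    print(len(deployment.pods))
```

Because the pod list changes only inside `reconcile`, the two components cannot fight over the number of pods, which is the point the design makes.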
The autoscaler automatically does what we had to do ourselves before. Without the autoscaler, if we had a replication controller with replicas set to...