Scaling applications with Horizontal Pod Autoscaler
HPA is a Kubernetes component that scales pods (through a Deployment or ReplicaSet) automatically based on metrics, rather than through manual scaling commands. The metrics are collected by the K8s Metrics Server, so this must be deployed in your cluster. The following diagram illustrates the general flow.
Figure 18.7 – High-level HPA flow
In the preceding diagram, we can see the following:
- HPA reads metrics from the metrics server.
- A control loop triggers HPA to read the metrics every 15 seconds (the default sync period).
- HPA compares these metrics against the target values in the autoscaling configuration and scales the deployment up or down if needed.
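To make the assessment step more concrete, the following is a minimal Python sketch of the replica calculation described in the Kubernetes documentation: the desired replica count is the current count multiplied by the ratio of the current metric value to the target value, rounded up. The function name, parameter names, and the example values are illustrative assumptions, not part of any Kubernetes API; the 0.1 tolerance reflects the documented default, and the real controller also handles cases (such as missing metrics and pods that are not yet ready) that this sketch ignores.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation (illustrative only)."""
    # Ratio of the observed metric to the target, e.g. 90% CPU vs a 50% target.
    ratio = current_metric / target_metric

    # If the ratio is close enough to 1.0, HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    # Scale proportionally and round up to a whole number of pods.
    desired = math.ceil(current_replicas * ratio)

    # Keep the result within the configured minimum and maximum replicas.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Hypothetical values: 2 pods averaging 90% CPU against a 50% target.
    print(desired_replicas(2, 90, 50))
```

With these hypothetical values, the ratio is 1.8, so the sketch asks for 4 replicas; if the average later drops to 20% across 4 pods, the same calculation scales back down to 2.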
Now that we’ve looked at the concepts behind HPA, let’s configure and test it.
Installing HPA in your EKS cluster
As we’ve discussed, HPA is a feature of K8s, so no installation is necessary; however, it does depend...