Harnessing the Horizontal Pod Autoscaler
As we've seen, Deployments and ReplicaSets let you specify the total number of replicas that should be available at any given time. However, neither of these resources scales automatically; the replica count must be changed manually.
Horizontal Pod Autoscalers (HPAs) provide this functionality. An HPA acts as a higher-level controller that changes the replica count of a Deployment or ReplicaSet based on metrics such as CPU and memory usage.
By default, an HPA autoscales based on CPU utilization, but custom metrics can extend it to scale on other signals.
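To make the scaling decision concrete, here is a minimal Python sketch of the replica calculation described in the Kubernetes documentation: the desired count is the current count multiplied by the ratio of the observed metric to the target, rounded up, and then kept within the configured bounds. The function name and the 50% CPU target are illustrative assumptions, not values from this chapter; the bounds of 2 and 5 mirror the minReplicas and maxReplicas in the manifest that follows. The real controller also accounts for missing metrics, Pods that are not yet ready, and stabilization windows, none of which are modeled here.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=2, max_replicas=5, tolerance=0.1):
    """Simplified sketch of the HPA replica calculation (not the real controller).

    Utilization values are percentages, e.g. 90 for 90% average CPU.
    The 0.1 tolerance matches the documented default; other defaults
    are illustrative.
    """
    ratio = current_utilization / target_utilization

    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: leave the replica count unchanged.
        desired = current_replicas
    else:
        # Scale proportionally to how far the metric is from the target.
        desired = math.ceil(current_replicas * ratio)

    # Never go below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Hypothetical target of 50% CPU utilization.
    print(desired_replicas(2, 90, 50))   # load well above target: scale up
    print(desired_replicas(4, 20, 50))   # load well below target: scale down
    print(desired_replicas(3, 52, 50))   # within tolerance: no change
```

Note that the bounds always win: however high the load climbs, this sketch never returns more than max_replicas, which is why choosing a sensible maximum matters as much as choosing the target.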
The YAML file for an HPA looks like this:
hpa.yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  maxReplicas: 5
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp-deployment
  ...