Autoscaling in Kubernetes
Kubernetes allows you to automatically scale your workloads to adapt to changing demand on your applications. Scaling decisions are driven by data gathered from the Kubernetes Metrics Server. In this book, we will cover two types of scaling action: one that changes the number of running pods in a Deployment, and another that changes the number of running nodes in a cluster. Both are examples of horizontal scaling. Let's briefly build an intuition for what the horizontal scaling of pods and the horizontal scaling of nodes would entail:
- Pods: Assuming that you filled out the `resources:` section of the `podTemplate` when creating a Deployment in Kubernetes, each container within that pod will have `requests` and `limits` fields, as designated by the corresponding `cpu` and `memory` fields. When the resources needed to process a workload exceed what you have allocated, then by adding additional replicas...