Pod resource requests and limits
Before we dive into autoscaling in Kubernetes, we need to explain a bit more about how you can control the usage of CPU and memory (known as compute resources) by Pod containers. Controlling the use of compute resources is important because it lets you enforce resource governance: this allows better planning of cluster capacity and, most importantly, prevents a single container from consuming all compute resources and stopping other Pods from serving requests.
When you create a Pod, you can specify how much of each compute resource its containers require and what the limits are in terms of permitted consumption. The Kubernetes resource model also distinguishes between two classes of resources: compressible and incompressible. In short, a compressible resource can be easily throttled without severe consequences. A perfect example of such a resource is the CPU...
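To make the idea of requests and limits more concrete, here is a minimal sketch of a Pod manifest that sets both for a single container. The Pod name, container name, image, and all of the values are assumptions chosen purely for illustration; they do not describe any particular workload.

```yaml
# Hypothetical example: the names, image, and values are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
    - name: web
      image: nginx:1.25
      resources:
        # Requests: the amount of each compute resource the container requires.
        requests:
          cpu: 250m        # 250 millicores, that is, a quarter of one CPU core
          memory: 128Mi
        # Limits: the maximum the container is permitted to consume.
        limits:
          cpu: 500m        # half of one CPU core
          memory: 256Mi
```

In this sketch, the container asks for a quarter of a CPU core and 128 MiB of memory, and it is not permitted to consume more than half a core and 256 MiB.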