Autoscaling Pods vertically using a VerticalPodAutoscaler
In the previous section, we managed requests and limits for compute resources manually. Setting these values correctly requires educated guessing, observing metrics, and running benchmarks to adjust them. Setting requests too high wastes compute resources, whereas setting them too low may result in Pods being packed too densely and suffering performance issues. Also, in some cases, the only way to scale a Pod workload is to scale it vertically by increasing the amount of compute resources it can consume. For bare-metal machines, this would mean upgrading the CPU and adding more physical RAM; for containers, it is as simple as granting them higher compute resource quotas. This works, of course, only up to the capacity of a single Node. You cannot scale vertically beyond that unless you add more powerful Nodes to the cluster.
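As a reminder of what that manual tuning looks like, the snippet below sets requests and limits on a single container; the image name and the concrete CPU and memory values are only illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        # The scheduler reserves this much CPU and memory for the container.
        requests:
          cpu: 250m
          memory: 256Mi
        # The container is throttled (CPU) or killed (memory) beyond these values.
        limits:
          cpu: 500m
          memory: 512Mi

Every time observed usage drifts away from these numbers, someone has to revisit and re-apply the manifest by hand.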
To help resolve these issues, you can use a VPA...
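As a sketch of what this looks like, the manifest below defines a VerticalPodAutoscaler for a hypothetical Deployment named my-app; it assumes the VPA custom resource definitions and controllers from the Kubernetes autoscaler project are installed in the cluster:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  # Point the VPA at the workload whose Pods it should manage.
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  # "Auto" lets the VPA apply its recommendations by evicting and
  # recreating Pods; "Off" only produces recommendations.
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        # Keep the recommended requests within sensible bounds.
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: "1"
          memory: 1Gi

After applying this manifest with kubectl apply -f, you can inspect the current recommendations with kubectl describe vpa my-app-vpa.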