Mechanisms for autoscaling
Four primary indicators can signal that a running application is at risk of missing its uptime or responsiveness SLAs: CPU load, I/O load (often seen as disk pressure in Kubernetes), request and network load (often seen as network pressure in Kubernetes), and memory load. Understanding these indicators, and how each of your application components affects them, prepares you to adjust configuration settings so that the supporting infrastructure can scale.
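As a rough illustration of watching these four indicators together, the following minimal sketch compares one reading of each against a limit and reports which ones are exceeded. The `LoadSample` type, the `indicators_over_threshold` helper, and the threshold values are all hypothetical and chosen only for this example. They are not taken from Kubernetes, Azure, or any monitoring library.

```python
from dataclasses import dataclass


@dataclass
class LoadSample:
    """One reading of the four indicators, each a fraction of capacity (0.0 to 1.0)."""
    cpu: float
    io: float
    network: float
    memory: float


# Hypothetical limits for illustration only; real values depend on your SLAs.
THRESHOLDS = {"cpu": 0.75, "io": 0.80, "network": 0.80, "memory": 0.85}


def indicators_over_threshold(sample, thresholds=THRESHOLDS):
    """Return the names of the indicators whose reading exceeds its limit."""
    return [
        name
        for name, limit in thresholds.items()
        if getattr(sample, name) > limit
    ]


sample = LoadSample(cpu=0.91, io=0.40, network=0.35, memory=0.88)
print(indicators_over_threshold(sample))  # prints ['cpu', 'memory']
```

In practice the readings would come from your platform's metrics rather than hard-coded values, and a non-empty result would be one input to a scaling decision rather than the decision itself.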
Compute and CPU load
CPU utilization varies widely across physical machines, virtual machines, and hosted cloud services. Cloud services typically allow more fine-grained adjustments. For example, with Azure App Service, the amount of compute initially available to an App Service is tied directly to the App Service Plan...