Autoscaling is all about adapting your system to demand. This can mean adding more replicas to a deployment, expanding the capacity of existing nodes, or adding new nodes. While the need to scale your cluster up or down is not a failure, handling it follows the same pattern as self-healing: you can consider a cluster that is misaligned with demand as unhealthy. If the cluster is underprovisioned, then requests are not handled or wait too long, which can lead to timeouts or just poor performance. If the cluster is overprovisioned, then you're paying for resources you don't need. In both cases, the cluster is unhealthy even if the pods and services themselves are up and running.
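To make the idea of a cluster being "misaligned with demand" concrete, here is a minimal Python sketch. It is an illustration only, not how Kubernetes implements autoscaling: the `classify` and `scaling_action` functions, the `Provisioning` enum, and the 30% and 80% utilization thresholds are all hypothetical names and values chosen for the example.

```python
from enum import Enum


class Provisioning(Enum):
    UNDER = "underprovisioned"
    OVER = "overprovisioned"
    ALIGNED = "aligned"


def classify(utilization: float, low: float = 0.3, high: float = 0.8) -> Provisioning:
    """Classify average utilization (a fraction, e.g. 0.65 for 65%).

    The low/high thresholds are assumptions for illustration; real
    systems tune them per workload.
    """
    if utilization > high:
        # Demand exceeds comfortable capacity: requests may queue or time out.
        return Provisioning.UNDER
    if utilization < low:
        # Capacity far exceeds demand: paying for idle resources.
        return Provisioning.OVER
    return Provisioning.ALIGNED


def scaling_action(state: Provisioning) -> str:
    """Map the detected state to a corrective action (detect, then act)."""
    actions = {
        Provisioning.UNDER: "scale up",
        Provisioning.OVER: "scale down",
        Provisioning.ALIGNED: "no action",
    }
    return actions[state]


if __name__ == "__main__":
    for sample in (0.95, 0.10, 0.55):
        state = classify(sample)
        print(f"utilization={sample:.2f}: {state.value}, {scaling_action(state)}")
```

Note that both the underprovisioned and the overprovisioned states trigger an action here, even though nothing has actually failed. That is the sense in which a misaligned cluster is treated as unhealthy.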
Just like with self-healing, you first need to detect that your cluster needs to scale, and only then can you take the correct action. There are several ways to scale...