When an application reaches its capacity, the most intuitive fix is to give it more resources. Overprovisioning is also something we want to avoid, however, because any excess resources could be put to use by other applications. For most applications, scaling out (adding more instances) is preferred over scaling up (giving a single instance more resources), because the hardware of a single machine limits how far scaling up can go. In Kubernetes, from a service owner's point of view, scaling in or out can be as easy as increasing or decreasing the number of pods in a deployment, and Kubernetes has built-in support for performing such operations automatically: the Horizontal Pod Autoscaler (HPA).
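To make the idea of automatic scaling in and out more concrete, here is a minimal Python sketch of the replica calculation described in the Kubernetes documentation for HPA: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up. The function name `desired_replicas` and its parameters are hypothetical and chosen for illustration only. The real controller also accounts for pod readiness, missing metrics, and stabilization windows, none of which are modeled here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Sketch of the HPA scaling rule (illustrative, not the real controller).

    desired = ceil(current_replicas * current_metric / target_metric)

    All parameter names are assumptions made for this example. The
    tolerance of 0.1 mirrors the documented default, under which small
    deviations from the target do not trigger a scaling action.
    """
    ratio = current_metric / target_metric

    # Within the tolerance band around 1.0, keep the current replica count.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)

    # Never go below the configured minimum or above the maximum.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 pods averaging 90% CPU against a 60% target: scale out.
    print(desired_replicas(4, 90, 60))
    # 4 pods averaging 30% CPU against a 60% target: scale in.
    print(desired_replicas(4, 30, 60))
```

In this sketch, a load 50% above the target grows the deployment proportionally, while a load at half the target shrinks it, which is the same increase or decrease of pods a service owner would otherwise perform by hand.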
Depending on the infrastructure you're using, you can scale the capacity of the cluster in many different ways. There's an add-on...