Scalability is the capacity of a system to handle increasingly large workloads. When we build a system or program, we want it to be scalable so that it doesn't fall over when the number of requests grows. Scaling can be done in one of two ways:
- Scaling up: upgrading the hardware of your existing workers, such as moving from CPUs to GPUs.
- Scaling out: Distributing the workload among many workers. Spark is a common framework for doing this.
Scaling up can be as easy as moving your model to a larger cloud instance. In this section, we'll focus on how to distribute TensorFlow to scale out our applications.
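As a small illustration of what this looks like in code, here is a minimal sketch using TensorFlow's `tf.distribute.MultiWorkerMirroredStrategy`. The toy model and random data are placeholders chosen for this example; in a real multi-machine setup, each worker would also need a `TF_CONFIG` environment variable describing the cluster.

```python
import numpy as np
import tensorflow as tf

# MultiWorkerMirroredStrategy replicates the model on every worker in the
# cluster and keeps the copies in sync via all-reduce. In a real deployment,
# each worker sets a TF_CONFIG environment variable listing the cluster's
# hosts; with no TF_CONFIG it falls back to running as a single worker.
strategy = tf.distribute.MultiWorkerMirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

# Variables must be created inside the strategy's scope so they are
# mirrored across all replicas.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

# Placeholder data standing in for a real dataset; each global batch is
# split across the replicas automatically during training.
x_train = np.random.rand(1024, 10).astype("float32")
y_train = np.random.rand(1024, 1).astype("float32")
model.fit(x_train, y_train, epochs=2, batch_size=64)
```

The key design point is that model and variable creation happen inside `strategy.scope()`, while the training loop itself stays unchanged; the strategy handles splitting batches and synchronizing gradients across workers.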