One of the key problems in machine learning (ML) is understanding how to scale and parallelize training across multiple machines. Whether you are training deep learning models, which place heavy demands on hardware, or simply launching instances to serve predictions, selecting the appropriate hardware configuration is essential, both for cost and for runtime performance.
In this chapter, we will cover the following topics:
- Choosing your instance types
- Distributed deep learning