Best practices for building performant ML workloads
Because ML workloads are compute- and time-intensive, it is important to choose the most performant resources for each phase of the workload. Computation, memory, and network bandwidth requirements differ from one phase of the ML process to the next. Beyond the performance of the infrastructure, the performance of the model itself, as measured by metrics such as accuracy, is also important. The following sections discuss best practices for selecting the most performant resources when building ML workloads on SageMaker.
Rightsizing ML resources
SageMaker supports a variety of ML instance types with varying combinations of CPU, GPU, FPGA, memory, storage, and networking capacity. Each instance type, in turn, comes in multiple instance sizes. So, you have a range of options to choose from to...