Running compiled models on Amazon’s Trainium and Inferentia custom hardware
So far in this book, most of the accelerators we have evaluated have been GPUs designed and built by NVIDIA. As we learned earlier, NVIDIA's excellent software enables the lion's share of deep learning frameworks to run well on its GPUs, which is often a primary deciding factor in choosing them. We also learned that these GPUs are available on AWS, notably through our machine learning service, Amazon SageMaker.
However, as you have no doubt realized by this point, the price tag of those GPUs can be high! Even though AWS offers generous enterprise discount programs, such as reserved instances that can save up to 75% (6), you would still benefit from learning about alternatives. Basic economics tells us that when supply increases, such as through alternative accelerators, while demand stays constant, prices drop! This is exactly what we're thrilled to introduce in this section: AWS's purpose-built machine learning accelerators, Trainium and Inferentia.
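To make the idea of a "compiled model" concrete before we dive in, here is a minimal sketch of compiling a PyTorch model ahead of time for Inferentia2 or Trainium using the AWS Neuron SDK. It assumes the torch-neuronx package is installed on a Neuron-enabled instance (for example, Inf2 or Trn1); the exact APIs and compiler options can vary by Neuron SDK version.

```python
import torch
import torch_neuronx
from torchvision import models

# Load a pretrained model in evaluation mode (any traceable PyTorch model works).
model = models.resnet50(weights="IMAGENET1K_V1")
model.eval()

# Example input with the shape the compiled model will expect at inference time.
example_input = torch.rand(1, 3, 224, 224)

# Compile (trace) the model ahead of time for the NeuronCores on Inferentia2/Trainium.
# torch_neuronx.trace invokes the Neuron compiler and returns a TorchScript module.
neuron_model = torch_neuronx.trace(model, example_input)

# Save the compiled artifact; it can be reloaded later with torch.jit.load on a Neuron instance.
neuron_model.save("resnet50_neuron.pt")

# Run inference with the compiled model on the Neuron device.
output = neuron_model(example_input)
print(output.shape)
```

The key idea is that, unlike the just-in-time execution model typical of GPUs, Neuron devices run models that have been compiled ahead of time into a hardware-specific graph, which is why this section focuses on "compiled models."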