Compiling models with Amazon SageMaker Neo
Embedded software developers have long known how to write highly optimized code that runs fast and uses hardware resources frugally. In theory, the same techniques could be applied to optimize machine learning predictions. In practice, this is a daunting task given the complexity of machine learning libraries and models.
This is the problem that Amazon SageMaker Neo aims to solve.
Understanding Amazon SageMaker Neo
Amazon SageMaker Neo has two components: a model compiler that optimizes models for the underlying hardware, and a small runtime named Deep Learning Runtime (DLR), used to load optimized models and run predictions (https://aws.amazon.com/sagemaker/neo).
Amazon SageMaker Neo can compile models trained with the following:
- Two built-in algorithms: XGBoost and Image Classification.
- Built-in frameworks: TensorFlow, PyTorch, and Apache MXNet, as well as models in ONNX format. Many operators are supported, and...