Compiling models with Amazon SageMaker Neo
Embedded software developers have long known how to write highly optimized code that both runs fast and uses hardware resources frugally. In theory, the same techniques could be applied to optimize machine learning predictions. In practice, this is a daunting task given the complexity of machine learning libraries and models.
This is the problem that Amazon SageMaker Neo aims to solve.
Understanding Amazon SageMaker Neo
Amazon SageMaker Neo has two components: a model compiler that optimizes models for the underlying hardware, and a small runtime named the Deep Learning Runtime (DLR), used to load optimized models and run predictions (https://aws.amazon.com/sagemaker/neo).
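To make the two components concrete, here is a minimal sketch of the request parameters for a Neo compilation job, as one might pass them to the SageMaker `CreateCompilationJob` API. The job name, bucket paths, role ARN, and input shape are all placeholder assumptions, not values from this text; the actual submission and on-device DLR steps are indicated in comments.

```python
# Sketch of a Neo compilation job request for a trained TensorFlow model.
# All names, ARNs, and S3 paths below are hypothetical placeholders.
compilation_job = {
    "CompilationJobName": "my-model-neo",  # placeholder job name
    "RoleArn": "arn:aws:iam::123456789012:role/my-sagemaker-role",  # placeholder
    "InputConfig": {
        "S3Uri": "s3://my-bucket/model/model.tar.gz",     # trained model artifact
        "DataInputConfig": '{"data": [1, 3, 224, 224]}',  # input name and shape
        "Framework": "TENSORFLOW",
    },
    "OutputConfig": {
        "S3OutputLocation": "s3://my-bucket/neo-output/",  # compiled artifact goes here
        "TargetDevice": "ml_c5",  # hardware target the compiler optimizes for
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 900},
}

# Submitting the job would look like:
#   import boto3
#   boto3.client("sagemaker").create_compilation_job(**compilation_job)
#
# On the target device, the DLR runtime then loads the compiled model:
#   from dlr import DLRModel
#   model = DLRModel("/path/to/compiled/model", "cpu")
#   prediction = model.run(input_array)

print(sorted(compilation_job.keys()))
```

The compiler side and the runtime side are deliberately decoupled: compilation happens once in the cloud, while the lightweight DLR package is all that needs to ship to the target device.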
Amazon SageMaker Neo can compile models trained with:
- Two built-in algorithms: XGBoost and Image Classification.
- Built-in frameworks: TensorFlow/Keras, PyTorch, and Apache MXNet/Gluon, as well as models in ONNX format. Many operators are supported, and you can find the full list at...