Chapter 15: Model Interoperability, Hardware Optimization, and Integrations
In the previous chapter, we learned how to deploy machine learning models for batch or real-time scoring, what endpoints are and how to deploy them, and finally, how to monitor our deployed solutions. In this chapter, we will dive deeper into additional deployment scenarios for ML inference, other hardware infrastructure we can utilize, and how we can integrate our models and endpoints with other Azure services.
In the first section, we will have a look at how to achieve model interoperability by converting ML models into a standardized model format and running them in an inference-optimized scoring framework. Open Neural Network Exchange (ONNX) is a standardized format for efficiently serializing and storing ML models as acyclic computational graphs of standard operators. We will learn what the ONNX framework is, how we can convert ML models from popular ML frameworks to ONNX, and...
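As a small preview of this workflow, the following sketch converts a scikit-learn model to ONNX and scores it with ONNX Runtime. It assumes the skl2onnx and onnxruntime packages are installed; the model choice and the model.onnx file name are illustrative, not prescribed by this chapter:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType
import onnxruntime as ort

# Train a simple scikit-learn model to convert (illustrative choice).
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=500).fit(X, y)

# Convert to ONNX; initial_types declares the input signature
# (batches of float32 vectors with 4 features).
onnx_model = convert_sklearn(
    model, initial_types=[("input", FloatTensorType([None, 4]))]
)
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

# Score the converted model with the ONNX Runtime inference engine.
session = ort.InferenceSession(
    "model.onnx", providers=["CPUExecutionProvider"]
)
input_name = session.get_inputs()[0].name
preds = session.run(None, {input_name: X[:3].astype(np.float32)})
print(preds[0])  # predicted class labels for the first three rows
```

Once serialized this way, the same model.onnx file can be scored from any runtime that understands the ONNX operator set, independent of the framework it was trained in.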