Using existing tools to serve models
Many tools for serving models are available today. Some popular ones include the following:
- BentoML (https://www.bentoml.com/)
- MLflow (https://mlflow.org/)
- KServe (https://kserve.github.io/website/0.8/)
- Seldon Core (https://www.seldon.io/solutions/open-source-projects/core)
- Cortex (https://www.cortex.dev/)
- TensorFlow Serving (https://www.tensorflow.org/tfx/guide/serving)
- TorchServe (https://pytorch.org/serve/index.html)
- Ray Serve (https://www.ray.io/ray-serve)
- Multi Model Server (MMS) (https://github.com/awslabs/multi-model-server)
- ForestFlow (https://github.com/ForestFlow/ForestFlow)
- DeepDetect (https://www.deepdetect.com/overview/introduction)
- App-serving tools such as Core ML and TensorFlow.js
Many other tools are being developed and made available to users. We will discuss a few of these tools in detail, along with examples, in the last part of this book.
Sometimes, developers simply expose a model through a basic REST API built with Flask or FastAPI, particularly when the model is simple and does not need frequent updates. This lets software engineers follow the familiar web development serving life cycle instead of the more complex ML model serving life cycle.
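As a rough sketch of this approach, the following shows a model wrapped in a minimal Flask endpoint. The model here is a placeholder function that sums the input features; in a real service, you would load a trained model (for example, with joblib or pickle) at startup and call its prediction method instead. The route name and payload shape are illustrative choices, not a standard.

```python
# A minimal sketch of serving a simple "model" behind a REST API with Flask.
# The predict() function is a stand-in for a real trained model.
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict(features):
    # Placeholder model logic: return the sum of the input features.
    # In practice, replace this with a call to a loaded model,
    # e.g. model.predict([features]).
    return sum(features)


@app.route("/predict", methods=["POST"])
def predict_endpoint():
    # Expect a JSON body such as {"features": [1.0, 2.0, 3.0]}.
    payload = request.get_json(force=True)
    prediction = predict(payload["features"])
    return jsonify({"prediction": prediction})


if __name__ == "__main__":
    # Serve on all interfaces; in production you would use a WSGI
    # server such as gunicorn rather than the built-in dev server.
    app.run(host="0.0.0.0", port=8000)
```

Because this is an ordinary web service, it can be deployed, monitored, and scaled with the same tooling a team already uses for web applications, which is precisely the appeal for simple, rarely updated models.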
These tools aim to reduce the challenges involved in model serving and make resilient serving easier to achieve. However, the sheer number of available tools can also make it confusing to choose the one best suited to a given project.
We have now discussed the advantages and challenges of model serving and introduced some of the currently available model-serving tools.