One of the most common subsets of machine learning applications follows the build once, use many times paradigm. Applications of this type involve what is called the inference phase, in which developers focus on running the model to serve user needs. This typically means taking input from the user and processing it to return the appropriate output. The following diagram describes a typical high-level machine learning application workflow:
From the preceding diagram, we can see how the inference process fits into the overall picture. Applications that follow the build once, use many times paradigm have two distinct phases: training (or model building) and inference. The two phases are coupled through a shared model artifact. Depending on the specific use case, the details...