Deployment strategies – what do we do with these outputs?
Once you’re happy with the models you’ve chosen (including their performance and error rate) and you’ve got enough infrastructure to support your product and your chosen AI model’s use case, you’re ready for the last step of the process: deploying this code into production. Keeping up with a deployment strategy that works for your product and organization will be part of the continuous maintenance we outlined in the previous section. You’ll need to think about things such as how often you’ll need to retrain your models and refresh your training data to prevent model decay and data drift. You’ll also need a system for continuously monitoring your model’s performance. This process will be really specific to your product and business, particularly because these periods of retraining can require some downtime for your system.
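To make the monitoring idea concrete, here is a minimal sketch of a rolling error check that flags when a model might be due for retraining. It assumes a regression-style model whose predictions can later be compared with actual values; the class name, the window size, and the error threshold are all hypothetical and would need to be chosen for your own product.

```python
from collections import deque


class RollingErrorMonitor:
    """Tracks mean absolute error over the most recent predictions.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self, window_size: int, max_error: float) -> None:
        # deque with maxlen drops the oldest error once the window is full.
        self.errors = deque(maxlen=window_size)
        self.max_error = max_error

    def record(self, predicted: float, actual: float) -> None:
        """Store the absolute error for one prediction once the truth is known."""
        self.errors.append(abs(predicted - actual))

    def rolling_error(self) -> float:
        """Mean absolute error over the current window (0.0 if empty)."""
        if not self.errors:
            return 0.0
        return sum(self.errors) / len(self.errors)

    def needs_retraining(self) -> bool:
        """Flag retraining only once the window is full and the error is too high."""
        window_full = len(self.errors) == self.errors.maxlen
        return window_full and self.rolling_error() > self.max_error
```

In practice, you would call record each time a real outcome arrives and check needs_retraining on a schedule. Waiting for a full window is a deliberate choice here so that a handful of noisy predictions doesn’t trigger a retraining cycle.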
Deployment is going to be a dynamic process because, for the most part, your models are trying to make predictions about real-world data, so depending on what’s going on in the world of your data, you might have to give deployment more or less of your attention. For instance, when we were working for an ML property-tech company, we were updating, retraining, and redeploying our models almost daily because we worked with real estate data that was heavily skewed by rapid, pandemic-driven changes in migration and housing prices. If those models had been left unchecked, and if there hadn’t been engineers and business leaders watching this product both internally and on the client’s end, we might not have caught some of the egregious liberties the models were taking on the basis of unrepresentative data.
There are also a number of well-known deployment strategies you should be aware of. We will discuss them in the following subsections.
Shadow deployment strategy
In this deployment strategy (often referred to as shadow mode), you deploy a new model with new features alongside the model that already exists, so that the new model runs only as a shadow of the model that’s currently in production. The new model handles all the same requests as the existing model, but its results are never shown to users. This strategy allows you to see whether the shadow model performs better on the same real-world data without interrupting the model that’s actually live in production. Once it’s confirmed that the new model performs better and has no issues running, it becomes the primary model, fully deployed in production, and the original model is retired.
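As a rough illustration of shadow mode, the sketch below sends every request to both models, returns only the live model’s answer, and logs the shadow model’s answer for later comparison. The ShadowRouter class and the mean_abs_errors helper are hypothetical names, and we assume each model is simply a function from a number to a number; a real system would log to durable storage and usually call the shadow model asynchronously.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Model = Callable[[float], float]  # assumption: a model maps one feature to one prediction


class ShadowRouter:
    """Serves the live model while silently recording the shadow model's output."""

    def __init__(self, live_model: Model, shadow_model: Model) -> None:
        self.live_model = live_model
        self.shadow_model = shadow_model
        # Each entry is (live_output, shadow_output or None if the shadow failed).
        self.log: List[Tuple[float, Optional[float]]] = []

    def predict(self, features: float) -> float:
        live_output = self.live_model(features)
        try:
            shadow_output: Optional[float] = self.shadow_model(features)
        except Exception:
            # A failing shadow model must never affect the user-facing response.
            shadow_output = None
        self.log.append((live_output, shadow_output))
        return live_output  # only the live result is ever shown


def mean_abs_errors(
    log: Sequence[Tuple[float, Optional[float]]], actuals: Sequence[float]
) -> Tuple[float, float]:
    """Return (live_mae, shadow_mae) over requests where the shadow model answered."""
    rows = [(lv, sh, act) for (lv, sh), act in zip(log, actuals) if sh is not None]
    if not rows:
        raise ValueError("no shadow predictions to compare")
    live_mae = sum(abs(lv - act) for lv, _, act in rows) / len(rows)
    shadow_mae = sum(abs(sh - act) for _, sh, act in rows) / len(rows)
    return live_mae, shadow_mae
```

Once real outcomes are available, comparing the two error values from mean_abs_errors is one simple way to decide whether the shadow model has earned promotion.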
A/B testing model deployment strategy
With this strategy, we run two slightly different models with different features concurrently to get a sense of how each one works in the live environment. The two models are set up at the same time, and their performance is measured against a goal such as conversion. This is effectively an experiment: you start with a hypothesis about which model will perform better, and then you test that hypothesis against the results to see whether you were right. The differences between your models do, however, have to be slight because if the features of the two vary too much, you won’t be able to tell which change is creating the most success for you.
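To show what the experiment might look like in code, here is a small sketch with two parts: a deterministic way to assign each user to model A or model B, and a two-proportion z-statistic for comparing conversion rates. The function names and the experiment label are hypothetical, the 50/50 split is an assumption, and the z-test is just one common way to test the hypothesis, not necessarily the one your team will use.

```python
import hashlib
import math


def assign_variant(user_id: str, experiment: str = "model-ab-test") -> str:
    """Deterministically assign a user to variant "A" or "B" (assumed 50/50 split).

    Hashing the user ID keeps each user on the same model across requests.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"


def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """z-statistic for the difference in conversion rate between B and A.

    A positive value means B converted at a higher rate than A.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both models convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("cannot compare: pooled conversion rate is 0 or 1")
    return (rate_b - rate_a) / std_err
```

As a rule of thumb, an absolute z-value above roughly 1.96 corresponds to significance at the 5% level for a two-sided test, but the threshold you use should be agreed on before the experiment starts.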
Canary deployment strategy
Here, we see a more gradual approach to deployment where you create subsets of users that will experience your new model deployment. The number of users exposed to your new model increases gradually over time. This gives you a buffer between groups of users to understand how they’re reacting to and interacting with the new model. Essentially, you’re using varying groups of your own users as testers before you release to the next batch, so you can catch bugs more gradually as well. It’s a slow but rewarding process if you have the patience and courage.
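The gradual rollout described above can be sketched as follows. Each user is hashed into one of 100 buckets, and the new model is served only to buckets below the current rollout percentage. The CanaryRollout class, the stage percentages, and the error-rate threshold are all hypothetical values chosen for illustration.

```python
import hashlib
from typing import Sequence


def user_bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket from 0 to 99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


class CanaryRollout:
    """Gradually widens exposure to a new model, rolling back on high error rates."""

    def __init__(
        self,
        stages: Sequence[int] = (1, 5, 25, 50, 100),  # assumed rollout percentages
        max_error_rate: float = 0.02,                 # assumed tolerance per stage
    ) -> None:
        self.stages = tuple(stages)
        self.max_error_rate = max_error_rate
        self.stage_index = 0
        self.rolled_back = False

    @property
    def percentage(self) -> int:
        """Share of users currently served by the new model."""
        return 0 if self.rolled_back else self.stages[self.stage_index]

    def uses_new_model(self, user_id: str) -> bool:
        # Users already in the canary stay in it as the percentage grows.
        return user_bucket(user_id) < self.percentage

    def advance(self, observed_error_rate: float) -> int:
        """Move to the next stage if the current one looks healthy, else roll back."""
        if self.rolled_back:
            return 0
        if observed_error_rate > self.max_error_rate:
            self.rolled_back = True
        elif self.stage_index < len(self.stages) - 1:
            self.stage_index += 1
        return self.percentage
```

Calling advance after each buffer period, with the error rate observed for the current group, is what gives you the chance to catch problems before the next batch of users is exposed.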
There are more strategies to choose from, but keep in mind that your selection will depend on the nature of your product, what’s most important to your customers and users, your budget, your metrics and performance monitoring, your technical capacity and knowledge, and the timeline you have. Beyond your deployment, you’re going to have to help your business understand how often it should be doing code refactoring and branching as well.
Now that we’ve discussed the different deployment strategies, let’s see what it takes to succeed in AI.