Adopting the DevOps mindset
DevOps is a team mindset that breaks down the silos between developers and system operators to shorten a product's development life cycle. Developers constantly change a product to introduce new features and modify existing behaviors. System operators, on the other hand, need to keep the production systems stable, up, and running. In the past, these two groups were isolated, and developers would throw each new piece of software over the wall to the operations team, who would try to deploy it in production. As you can imagine, things didn't always work that well, causing friction between the two groups. A fundamental DevOps practice is that a team needs to be autonomous and should contain all the required disciplines, both developers and operators.
When it comes to data science, some people refer to the practice as MLOps, but the fundamental ideas remain the same. A team should be self-sufficient, capable of developing all the required components of the overall solution, from the data engineering parts that bring in the data and the training of the models, all the way to operationalizing the model in production. These teams usually work in an agile manner, embracing an iterative approach that seeks constant improvement based on feedback, as seen in Figure 1.7:
The MLOps team operates on its backlog and performs the iterative steps you saw in the Working on a data science project section. Once the model is ready, the system administrators, who are part of the team, know what needs to be done to take the model into production. The model is monitored closely, and if a defect or performance degradation is observed, a backlog item is created for the MLOps team to address in its next sprint.
To shorten the development and deployment life cycle of new features in production, automation needs to be embraced. The goal of a DevOps team is to minimize the number of human interventions in the deployment process and automate as many repeatable tasks as possible.
Figure 1.8 shows the most frequently used components while developing real-time models using the MLOps mindset:
Let's analyze those components:
- ARM templates allow you to automate the deployment of Azure resources. This enables the team to spin up and tear down development, testing, or even production environments in no time. These artifacts are stored within Azure DevOps in a Git version-control repository, and the deployment of multiple environments is automated using Azure DevOps pipelines; a minimal deployment sketch in Python follows this list. You are going to read about ARM templates in Chapter 2, Deploying Azure Machine Learning Workspace Resources.
- Using Azure Data Factory, the data science team orchestrates the pulling and cleansing of the data from the source systems. The data is copied into a data lake that is accessible from the AzureML workspace. Azure Data Factory uses ARM templates to define its orchestration pipelines; these templates are stored within the Git repository to track changes and to allow deployment to multiple environments.
- Within the AzureML workspace, data scientists work on their code. Initially, they start in Jupyter notebooks. Notebooks are a great way to prototype ideas, as you will see in Chapter 7, The AzureML Python SDK. As the project progresses, the code is exported from the notebooks and organized into Python scripts. All those code artifacts are version-controlled in Git, using the terminal and commands such as the ones seen in Figure 1.9:
- When a model is trained, if it performs better than the model currently in production, it is registered within AzureML, and an event is emitted; a registration sketch also follows this list. This event is captured by the AzureML DevOps plugin, which triggers the automatic deployment of the model to the test environment. The model is tested within that environment, and if all tests pass and no errors have been logged in Application Insights, which is monitoring the deployment, the artifacts can be automatically deployed to the next environment, all the way to production.
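To give you a feel for how an environment could be deployed from a version-controlled ARM template, here is a minimal sketch using the azure-mgmt-resource Python package. The subscription ID, resource group name, template path, and parameter values are hypothetical placeholders; in practice, a step like this would typically run as a task inside an Azure DevOps pipeline.

```python
# A minimal sketch of deploying a Git-tracked ARM template with the
# azure-mgmt-resource package. The subscription ID, resource group,
# template path, and parameters are hypothetical placeholders.
import json

from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient
from azure.mgmt.resource.resources.models import (
    Deployment,
    DeploymentMode,
    DeploymentProperties,
)

credential = DefaultAzureCredential()
client = ResourceManagementClient(credential, "<subscription-id>")

# Load the ARM template that is version-controlled in the Git repository.
with open("templates/azureml-workspace.json") as f:
    template = json.load(f)

deployment = Deployment(
    properties=DeploymentProperties(
        mode=DeploymentMode.INCREMENTAL,
        template=template,
        parameters={"environment": {"value": "test"}},
    )
)

# Kick off the deployment and wait for it to complete.
poller = client.deployments.begin_create_or_update(
    "rg-mlops-test", "deploy-azureml-workspace", deployment
)
print(poller.result().properties.provisioning_state)
```

Because the same template drives every environment, only the parameter values change between development, testing, and production.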
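The conditional registration step from the last bullet could look like the following sketch, which uses the AzureML Python SDK covered in Chapter 7. The model name, file paths, metric, and threshold value are hypothetical placeholders, and the sketch assumes a model with an auc tag has already been registered.

```python
# A sketch of registering a newly trained model only when it beats the
# currently registered one. Names, paths, and metric values are
# hypothetical placeholders.
from azureml.core import Workspace
from azureml.core.model import Model

ws = Workspace.from_config()  # reads config.json for the AzureML workspace

new_auc = 0.91  # metric of the freshly trained model (placeholder value)

# Fetch the latest registered version and compare the stored metric.
current = Model(ws, name="churn-classifier")
if new_auc > float(current.tags.get("auc", 0)):
    Model.register(
        workspace=ws,
        model_path="outputs/model.pkl",
        model_name="churn-classifier",
        tags={"auc": str(new_auc)},
    )
```

Registering the model is what emits the event that the downstream deployment automation reacts to, so this simple comparison acts as the quality gate for the whole release process.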
The ability to ensure both code and model quality plays a crucial role in this automation process. In Python, you can use various tools, such as Flake8, Bandit, and Black, to ensure code quality, check for common security issues, and consistently format your code base. You can also use the pytest framework to write your functional tests, in which you test the model results against a golden dataset. With pytest, you can even perform integration testing to verify that the end-to-end system works as expected.
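The following is a minimal pytest sketch of such a functional test: it scores a golden dataset and asserts a minimum accuracy. The file names, the pickle-based model loading, and the 0.85 threshold are hypothetical placeholders.

```python
# A minimal pytest sketch: score a golden dataset and enforce an accuracy
# bar. File names, model format, and the threshold are placeholders.
import pickle

import pandas as pd
import pytest


@pytest.fixture(scope="module")
def model():
    # Load the model artifact produced by the training pipeline.
    with open("outputs/model.pkl", "rb") as f:
        return pickle.load(f)


def test_model_meets_accuracy_bar(model):
    golden = pd.read_csv("tests/golden_dataset.csv")
    features = golden.drop(columns=["label"])
    predictions = model.predict(features)
    accuracy = (predictions == golden["label"]).mean()
    assert accuracy >= 0.85, f"Accuracy {accuracy:.2f} fell below the 0.85 bar"
```

Running tests like this one in the pipeline means a model that regresses on the golden dataset never reaches the next environment.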
Adopting DevOps is a never-ending journey. The team will become better every time it repeats the process. The trick is to build trust in the end-to-end development and deployment process so that everyone is confident making changes and deploying them to production. When the process fails, understand why it failed and learn from your mistakes. Create the mechanisms that will prevent future failures and move on.