2. Versioning
By now, we understand that the whole ML system changes if the code, model, or data changes. Thus, it is critical to track and version these three elements individually. But what strategies can we adopt to track the code, model, and data separately?
- The code is tracked with Git, which lets us create a new commit (a snapshot of the code) for every change added to the codebase. Git-based platforms also usually let us publish releases, which typically bundle multiple features and bug fixes. While commits are identified by unique hashes that are not human-interpretable, releases follow a more readable convention based on major, minor, and patch versions. For example, in a release with version “v1.2.3,” 1 is the major version, 2 is the minor version, and 3 is the patch version. Popular Git-based platforms include GitHub and GitLab.
- To version the model, we use a model registry to store, share, and version all the models used within the system. It usually...
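
To make the major, minor, and patch convention for code releases concrete, here is a minimal Python sketch that parses a release tag such as "v1.2.3" and bumps one of its parts. The names `Version`, `parse_version`, `bump`, and `format_version` are hypothetical helpers written for this illustration; they are not part of Git, GitHub, or GitLab. The sketch also assumes that bumping a part resets the lower parts to zero, which is a common convention rather than something the platforms enforce.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    """A release version split into its three numeric parts."""
    major: int
    minor: int
    patch: int


# Accepts "1.2.3" or "v1.2.3"; anything else is rejected.
_PATTERN = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_version(tag: str) -> Version:
    """Turn a tag like 'v1.2.3' into Version(major=1, minor=2, patch=3)."""
    match = _PATTERN.match(tag.strip())
    if match is None:
        raise ValueError(f"not a major.minor.patch version: {tag!r}")
    return Version(*(int(part) for part in match.groups()))


def bump(version: Version, part: str) -> Version:
    """Increase one part and reset the lower parts to zero (assumed convention)."""
    if part == "major":
        return Version(version.major + 1, 0, 0)
    if part == "minor":
        return Version(version.major, version.minor + 1, 0)
    if part == "patch":
        return Version(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown part: {part!r}")


def format_version(version: Version) -> str:
    """Render a Version back into a 'vMAJOR.MINOR.PATCH' tag."""
    return f"v{version.major}.{version.minor}.{version.patch}"


if __name__ == "__main__":
    current = parse_version("v1.2.3")
    print(format_version(bump(current, "minor")))
```

Because `Version` is a tuple of integers, two versions compare part by part, so "v1.10.0" sorts after "v1.9.9", which a plain string comparison would get wrong.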