Machine learning training pipeline
A machine learning pipeline is the end-to-end process of training one or more machine learning models and then deploying them to a live environment. It may involve stages such as data collection, model training, validation, deployment, monitoring, and iterative improvement, with a focus on scalability, efficiency, and robustness.
The steps involved in offline training are shown in Figure A.1. Note that some of these steps may be unnecessary, depending on the problem at hand.
Figure A.1 – High-level steps in a machine learning training pipeline
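As a rough illustration of how the offline stages fit together, the following is a minimal sketch that uses only the Python standard library. Everything in it is an assumption made for illustration: the function names (`gather_data`, `stratified_split`, `train_model`, `validate`), the synthetic one-dimensional dataset with a 10% minority class, and the toy "model" (a threshold placed midway between the two class means). It is not the pipeline shown in Figure A.1, only a small stand-in for the gather, split, train, and validate stages.

```python
import random
from collections import Counter


def gather_data(n=1000, minority_fraction=0.1, seed=0):
    """Stand-in for data collection: synthetic 1-D points; label 1 is the rare class."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = 1 if rng.random() < minority_fraction else 0
        # Hypothetical data: minority points centred at 2.0, majority at 0.0.
        x = rng.gauss(2.0 if label else 0.0, 1.0)
        data.append((x, label))
    return data


def stratified_split(data, test_fraction=0.2, seed=0):
    """Split class by class so both sets keep roughly the same class ratio."""
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted({y for _, y in data}):
        rows = [row for row in data if row[1] == label]
        rng.shuffle(rows)
        cut = int(len(rows) * test_fraction)
        test.extend(rows[:cut])
        train.extend(rows[cut:])
    return train, test


def train_model(train):
    """Toy 'training': put a threshold midway between the two class means."""
    means = {}
    for label in (0, 1):
        xs = [x for x, y in train if y == label]
        means[label] = sum(xs) / len(xs)
    return (means[0] + means[1]) / 2


def validate(threshold, test):
    """Return minority-class recall, which overall accuracy can hide."""
    positives = [x for x, y in test if y == 1]
    hits = sum(1 for x in positives if x > threshold)
    return hits / len(positives)


data = gather_data()
train, test = stratified_split(data)
threshold = train_model(train)
recall = validate(threshold, test)
print(Counter(y for _, y in data), round(recall, 2))
```

The split is done per class so that the rare class appears in both the training and test sets, and validation reports recall on the minority class rather than overall accuracy. In a real pipeline, each function would be replaced by the corresponding production step.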
The following is the sequence of steps involved in building a model that can handle data imbalance:
- Gather data: The first step involves gathering the necessary data for training the machine learning model. This data can be sourced from various places such as databases, files, APIs, or through web scraping. Immediately after gathering, it’s often...