There are three key parts to most machine learning classifiers, which are as follows:
- The training pipeline
- The neural network setup and training outputs
- The usage pipeline
The training pipeline obtains data, stages it, cleanses it, homogenizes it, and puts it in a format acceptable to the neural network. Do not be surprised if the training pipeline takes 80% to 85% of your effort initially—this is the reality of most machine learning work. Generally, the more realistic the training data, the more time spent on the training pipeline. In enterprise settings, the training pipeline might be an ongoing effort being enhanced perpetually. This is especially true as datasets get larger.
The second part, the neural network setup, and training, can be quick for routine problems and can be a research-grade effort for harder problems. You may find yourself making small...