The data cycle
Data is a key component of model building and the learning process. It needs to be collected, cleaned, converted, and then fed to the model for learning. The overall data life cycle is shown as follows:
One of the critical requirements for modeling is having good, balanced data. This leads to more accurate models and better use of the available algorithms. A data scientist's time is mostly spent on cleansing the data before building the model.
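As a rough illustration of the cleaning and conversion steps, and of what checking for balanced data can look like, here is a minimal sketch in plain Python. The records, the `LABELS` mapping, and the helper names `clean_and_convert` and `class_balance` are all hypothetical and made up for this example. Real projects would typically use a data library and a more careful strategy than simply dropping bad rows.

```python
from collections import Counter

# Hypothetical raw records as collected: (feature_a, feature_b, label), all strings.
RAW_RECORDS = [
    ("5.1", "3.5", "yes"),
    ("4.9", "", "no"),       # missing feature value
    ("6.2", "2.9", "no"),
    ("5.9", "3.0", "yes"),
    ("n/a", "3.1", "yes"),   # unparseable feature value
    ("6.7", "3.1", "no"),
]

# Assumed mapping from text labels to the integers a model would consume.
LABELS = {"no": 0, "yes": 1}


def clean_and_convert(records):
    """Drop records that cannot be parsed and convert the rest to numbers."""
    features, labels = [], []
    for a, b, label in records:
        try:
            row = [float(a), float(b)]  # conversion step: text to numbers
        except ValueError:
            continue  # cleaning step: skip missing or malformed values
        if label not in LABELS:
            continue  # cleaning step: skip unknown labels
        features.append(row)
        labels.append(LABELS[label])
    return features, labels


def class_balance(labels):
    """Return the fraction of samples that belong to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


features, labels = clean_and_convert(RAW_RECORDS)
balance = class_balance(labels)
print(len(features), "usable records; class balance:", balance)
```

A strongly skewed result from `class_balance` would be a signal to collect more data for the under-represented class or to rebalance before training.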
We have seen that the model is trained and tested before it is deployed. For testing, the results are captured as evaluation metrics, which help us decide whether to use a particular model or change it.
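The train-then-test flow can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the toy one-feature dataset, the helper names, and the very simple threshold "model" (the midpoint between the two class means), which stands in for whatever algorithm is actually being used. Accuracy is used as the single evaluation metric.

```python
import random


def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and hold out a fraction for testing."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_threshold(train):
    """'Train' a toy model: the midpoint between the two class means."""
    zeros = [x for x, y in train if y == 0]
    ones = [x for x, y in train if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def predict(threshold, x):
    """Predict class 1 when the feature exceeds the learned threshold."""
    return 1 if x > threshold else 0


def accuracy(threshold, data):
    """Evaluation metric: fraction of samples predicted correctly."""
    correct = sum(predict(threshold, x) == y for x, y in data)
    return correct / len(data)


# Hypothetical toy data: class 0 has small values, class 1 has large values.
SAMPLES = [(float(i), 0) for i in range(1, 9)] + [(float(i), 1) for i in range(21, 29)]

train, test = train_test_split(SAMPLES)
threshold = fit_threshold(train)
test_accuracy = accuracy(threshold, test)
print(f"threshold={threshold:.2f}, test accuracy={test_accuracy:.2f}")
```

The metric is computed only on the held-out samples, so it estimates how the model behaves on data it has not seen. A low value would be the cue to revisit the data or change the model.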
We will see the evaluation metrics next.