Data exploration
In this stage, your data has been selected and you can explore its quantity, quality, sparsity, and format. For supervised learning with categorical output, you can count the number of data points in each class. You can also examine the distribution of each feature, the confidence in the output variables (if available), and other characteristics of the data that comes out of the data selection stage. This process helps you identify issues that need to be fixed in data wrangling, which is the next step in the life cycle, as well as opportunities to improve your data by revising your data selection process.
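The checks described above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed workflow: the `explore` helper, the toy records, and the field names (`age`, `income`, `label`) are all hypothetical, and missing values are assumed to be encoded as `None`. It uses only the standard library; in practice a dataframe library would typically do the same job.

```python
from collections import Counter
from statistics import mean, stdev


def explore(rows, label_key):
    """Summarize a list of dict records: size, class balance, sparsity, numeric stats."""
    n = len(rows)

    # Quantity per class: how many data points carry each label.
    class_counts = Counter(row.get(label_key) for row in rows)

    # Every feature name seen in any record, excluding the label.
    features = sorted({key for row in rows for key in row if key != label_key})

    missing_fraction = {}
    numeric_summary = {}
    for feature in features:
        present = [row[feature] for row in rows if row.get(feature) is not None]

        # Sparsity: share of records where this feature is absent or None.
        missing_fraction[feature] = 1 - len(present) / n if n else 0.0

        # Distribution: basic statistics, only for purely numeric features.
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
        )
        if is_numeric and len(present) >= 2:
            numeric_summary[feature] = {
                "min": min(present),
                "max": max(present),
                "mean": mean(present),
                "stdev": stdev(present),
            }

    return {
        "n_rows": n,
        "class_counts": dict(class_counts),
        "missing_fraction": missing_fraction,
        "numeric_summary": numeric_summary,
    }


# Hypothetical toy records; None marks a missing value.
rows = [
    {"age": 34, "income": 52000.0, "label": "yes"},
    {"age": 29, "income": None, "label": "no"},
    {"age": 41, "income": 61000.0, "label": "no"},
    {"age": None, "income": 48000.0, "label": "no"},
]

report = explore(rows, "label")
for section, value in report.items():
    print(section, value)
```

A report like this makes the two follow-up paths concrete: a skewed `class_counts` or a high `missing_fraction` may be something to fix in data wrangling, or a reason to revisit how the data was selected.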