Exploratory data analysis
You will often be handed a dataset that you know very little about. Throughout this book, we’ve shown ways to manually sift through data, but there are also tools that automate potentially tedious tasks and help you understand the data more quickly.
YData Profiling
YData Profiling bills itself as the “leading package for data profiling, that automates and standardizes the generation of detailed reports, complete with statistics and visualizations.” While we saw how to explore data manually back in the chapter on visualization, this package offers a quick start by automatically generating many useful statistics and visualizations in a single report.
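To make the idea concrete, here is a minimal sketch of how a report might be produced. It assumes the `ydata-profiling` package is installed; the toy `DataFrame`, its column names, and the `make_profile` helper are hypothetical stand-ins for illustration and are not part of the library or of the vehicles dataset.

```python
import pandas as pd


def make_profile(df: pd.DataFrame, title: str = "Profile report"):
    """Build a YData Profiling report for df (hypothetical helper)."""
    # Imported lazily so the rest of this script runs even when
    # ydata-profiling is not installed.
    from ydata_profiling import ProfileReport

    # minimal=True switches off the more expensive computations.
    return ProfileReport(df, title=title, minimal=True)


# Hypothetical toy data standing in for a dataset you know little about.
df = pd.DataFrame({
    "make": ["Toyota", "Ford", "Honda", "Ford"],
    "year": [2015, 2018, 2020, 2012],
    "city_mpg": [28.0, 19.5, 31.0, None],
})

# The kind of manual summary that a profile report automates and extends.
manual_summary = df.describe(include="all")
print(manual_summary)
```

With the package available, calling `make_profile(df).to_file("toy_profile.html")` should write a standalone HTML report (the file name is arbitrary) that covers roughly what `describe` shows here, plus per-column visualizations and missing-value information.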
To compare this to some of the work we did in those chapters, let’s take another look at the vehicles dataset. For now, we are just going to pick a small subset of columns to keep our YData Profiling report minimal; for large datasets, the performance...