Exploratory data analysis
As we have learned throughout this book, our first step should be to engage in some exploratory data analysis (EDA) to get familiar with our data. In the interest of brevity, this section will include a subset of the EDA that's available in each of the notebooks—be sure to check out the respective notebooks for the full version.
Tip
While we will use pandas code to perform our EDA, be sure to check out the pandas-profiling package (https://github.com/pandas-profiling/pandas-profiling), which can be used to quickly perform some initial EDA on the data via an interactive HTML report.
Let's start with our imports, which will be the same across the notebooks we will use in this chapter:
>>> %matplotlib inline
>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> import pandas as pd
>>> import seaborn as sns
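Before turning to the real datasets, it may help to see the kind of first-pass checks that EDA typically begins with. The following is only a minimal sketch: the DataFrame and its values are made up for illustration and are not taken from any of the chapter's datasets, and the notebooks go considerably further than this.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data for illustration only; not one of the chapter's datasets
df = pd.DataFrame({
    'alcohol': [9.4, 9.8, np.nan, 10.5, 11.2],
    'pH': [3.51, 3.20, 3.26, 3.16, 3.30],
    'quality': [5, 5, 6, 6, 7],
})

shape = df.shape                 # (number of rows, number of columns)
dtypes = df.dtypes               # data type of each column
missing = df.isnull().sum()      # count of missing values per column
summary = df.describe()          # summary statistics for the numeric columns

# distribution of a discrete column, ordered by its values
quality_counts = df['quality'].value_counts().sort_index()

print(shape)
print(dtypes)
print(missing)
print(summary)
print(quality_counts)
```

Checks like these tell us how much data we have, whether anything is missing, and how the values are distributed, which in turn guides the visualizations we build next.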
We will start our EDA with the wine quality data before moving on to...