Visualizing the flights dataset
Exploratory data analysis can be guided by visualizations, and pandas provides a convenient plotting interface for creating them quickly. One strategy when looking at a new dataset is to create some univariate plots. These include bar charts for categorical data (usually strings) and histograms, boxplots, or KDEs for continuous data (always numeric).
In this recipe, we do some basic exploratory data analysis on the flights dataset by creating univariate and multivariate plots with pandas.
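As a rough illustration of the univariate strategy described above, the following sketch draws a bar chart for a categorical column and a histogram for a continuous one. The `toy` DataFrame, its column names (`AIRLINE`, `DIST`), and its values are made up for illustration and are not taken from the flights file. The `Agg` backend is selected only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data standing in for a real dataset
toy = pd.DataFrame({
    "AIRLINE": ["AA", "DL", "AA", "UA", "DL", "AA"],
    "DIST": [590, 1452, 1235, 2565, 733, 1189],
})

fig, (ax_bar, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))

# Categorical column: count each category, then draw one bar per category
airline_counts = toy["AIRLINE"].value_counts()
airline_counts.plot(kind="bar", ax=ax_bar, title="Rows per airline")

# Continuous column: a histogram shows the shape of the distribution
toy["DIST"].plot(kind="hist", bins=5, ax=ax_hist, title="Distance")

fig.tight_layout()
```

Calling `value_counts` first matters for the bar chart: plotting the raw string column directly would fail, because pandas needs numeric heights for the bars.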
How to do it…
- Read in the flights dataset:
>>> import pandas as pd
>>> flights = pd.read_csv('data/flights.csv')
>>> flights
   MONTH  DAY  WEEKDAY  ...  ARR_DELAY  DIVERTED  CANCELLED
0      1    1        4  ...       65.0         0          0
1      1    1        4  ...      -13.0         0          0
2      1    1        4  ...       35.0         0          0
3      1    1        4  ...       -7.0         0 ...