Another important visual in exploratory data analysis is the box plot, also known as the box-and-whisker plot. It is built from the five-number summary: the minimum, first quartile, median, third quartile, and maximum. In a standard box plot, the box spans the first to the third quartile, a line inside the box marks the median, and the whiskers extend to the minimum and maximum.
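To make the five-number summary concrete, here is a minimal sketch that computes it for a small list of values. The helper name `five_number_summary` and the sample values are made up for illustration, and the sketch assumes the "inclusive" quartile convention of Python's standard `statistics` module; other tools may use slightly different quartile definitions and return slightly different numbers.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a sequence of numbers.

    Hypothetical helper for illustration. Quartiles use the 'inclusive'
    method, which treats the data as a complete population.
    """
    data = sorted(values)
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3
    q1, median, q3 = statistics.quantiles(data, n=4, method='inclusive')
    return data[0], q1, median, q3, data[-1]

# Made-up sample values, not taken from the dataset
example = [1, 2, 3, 4, 5, 6, 7, 8, 9]
minimum, q1, median, q3, maximum = five_number_summary(example)
print(minimum, q1, median, q3, maximum)
```

The width of the box, `q3 - q1`, is the interquartile range.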
Box plots are a very convenient way of comparing several distributions. Typically, the whiskers extend to the extreme points. Alternatively, you can cap them at 1.5 times the interquartile range (IQR) beyond the box, in which case points outside the whiskers are drawn individually as outliers. Let's check our CRIM and RM features:
In [60]: %matplotlib notebook
import matplotlib.pyplot as plt
from scipy import stats
samples = dataset.data
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
axs = [ax1, ax2]
list_features = ['CRIM', 'RM']
ax1...