Examining both distribution shape and outliers with violin plots
Violin plots combine features of histograms and boxplots in one plot. They show the IQR, median, and whiskers, as well as the frequency of observations across the full range of values. This is hard to picture without seeing an actual violin plot, so we will generate a few on the same data we used for boxplots in the previous recipe to make it easier to grasp how they work.
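Before turning to the NLS data, the idea can be sketched on made-up numbers. The snippet below is only an illustration: it assumes a synthetic, right-skewed sample (the `values` array is hypothetical and not part of the recipe's data), computes the quartiles that the inner box summarizes, and draws a single violin with Seaborn's `violinplot`, whose outline is a kernel density estimate of the values.

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Synthetic, right-skewed sample (an assumption for illustration; not the NLS data)
rng = np.random.default_rng(0)
values = rng.lognormal(mean=3.0, sigma=0.5, size=1000)

# Quartiles and IQR: the summary that the box inside the violin represents
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

fig, ax = plt.subplots()
# inner="box" draws a miniature boxplot inside the density outline
sns.violinplot(x=values, inner="box", ax=ax)
ax.set_title("Violin plot of synthetic skewed data")
ax.set_xlabel("value")
```

Calling `plt.show()` afterward displays the figure. The wider parts of the violin mark ranges where observations are concentrated, and the long, thin right tail reflects the skew in the sample.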
Getting ready
We will work with the NLS data. You need Matplotlib and Seaborn installed on your computer to run the code in this recipe.
How to do it…
We create violin plots to view both the spread and the shape of a distribution in the same graphic. We then create violin plots by group:
- Load pandas, matplotlib, and seaborn, and the NLS data:
  import pandas as pd
  import numpy as np
  import matplotlib.pyplot as plt
  import seaborn as sns
  nls97 = pd.read_csv("data/nls97f.csv", low_memory=False)
  nls97.set_index...