Comparing continuous values across categories
The previous sections examined a single column at a time. This section shows how to compare a continuous variable across the categories of another column. We will compare mileage numbers for four makes: Ford, Honda, Tesla, and BMW.
How to do it…
- Make a mask for the brands we want, then use a groupby operation to compute the mean and standard deviation of the city08 column for each group of cars:

  >>> mask = fueleco.make.isin(
  ...     ["Ford", "Honda", "Tesla", "BMW"]
  ... )
  >>> fueleco[mask].groupby("make").city08.agg(
  ...     ["mean", "std"]
  ... )
              mean       std
  make
  BMW    17.817377  7.372907
  Ford   16.853803  6.701029
  Honda  24.372973  9.154064
  Tesla  92.826087  5.538970
- Visualize the city08 values for each make with seaborn:

  >>> g = sns.catplot(
  ...     x="make", y=...