Showing summary statistics for a pandas series
There are a large number of pandas series methods for generating summary statistics. We can easily get the mean, median, maximum, or minimum values for a series with the mean, median, max, and min methods, respectively. The incredibly handy describe method will return all of these statistics, as well as several others. We can also get the series value at any percentile using quantile. These methods can be used across all values for a series, or just for selected values. This will be demonstrated in this recipe.
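As a quick illustration of the methods named above, here is a minimal sketch on a small, made-up series. The values and the name gpa are assumptions for illustration only and are not taken from the NLS data.

```python
import pandas as pd

# Hypothetical GPA values for illustration only; not from the NLS data.
gpa = pd.Series([2.0, 2.5, 3.0, 3.5, 4.0], name="gpa")

# Individual summary statistics across all values.
mean_gpa = gpa.mean()
median_gpa = gpa.median()
max_gpa = gpa.max()
min_gpa = gpa.min()

# describe returns count, mean, std, min, quartiles, and max in one Series.
summary = gpa.describe()

# quantile takes a fraction between 0 and 1 (0.25 is the 25th percentile).
q25 = gpa.quantile(0.25)

# The same methods work on selected values, here chosen with a boolean mask.
high = gpa[gpa > 3.0]
high_mean = high.mean()

print(summary)
print(mean_gpa, median_gpa, min_gpa, max_gpa, q25, high_mean)
```

A boolean mask is only one way to select values; any selection that returns a series can be followed by the same methods.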
Getting ready
We will continue working with the overall GPA column from the NLS.
How to do it...
Let's take a good look at the distribution of the overall GPA for the DataFrame and for the selected rows. To do this, follow these steps:
- Import pandas and numpy and load the NLS data:
  >>> import pandas as pd
  >>> import numpy as np
  >>> nls97 = pd.read_csv("data/nls97b.csv"...