Using more complicated aggregation functions with groupby
In the previous recipe, we created a groupby DataFrame object and used it to run summary statistics by groups. We use chaining in this recipe to create the groups, choose the aggregation variable(s), and select the aggregation function(s), all in one line. We also take advantage of the flexibility of the groupby object, which allows us to choose the aggregation columns and functions in a variety of ways.
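As a rough illustration of what this chaining can look like, the sketch below groups a small made-up DataFrame, selects columns, and applies functions in a single expression. The data and the column names (gender, weeksworked, wageincome) are assumptions chosen for illustration only; they are not the NLS data or its actual variable names.

```python
import pandas as pd

# Hypothetical stand-in data; the values and column names are illustrative
# assumptions, not the NLS data used in the recipe.
df = pd.DataFrame({
    "gender": ["Female", "Female", "Male", "Male", "Male"],
    "weeksworked": [40, 52, 30, 50, 46],
    "wageincome": [30000, 52000, 21000, 48000, 45000],
})

# Create the groups, choose one column, and apply one function in one chain.
mean_weeks = df.groupby("gender")["weeksworked"].mean()

# Choose several columns and apply several functions to each of them.
summary = df.groupby("gender")[["weeksworked", "wageincome"]].agg(["mean", "max"])

# Pass a dictionary to apply different functions to different columns.
mixed = df.groupby("gender").agg(
    {"weeksworked": "mean", "wageincome": ["min", "max"]}
)

print(mean_weeks)
print(summary)
print(mixed)
```

The three forms differ mainly in how the columns and functions are specified: a single column with a method call, a list of columns with a list of function names, or a dictionary that maps each column to its own function(s).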
Getting ready
We will work with the National Longitudinal Survey of Youth (NLS) data in this recipe.
Data note
The NLS, administered by the United States Bureau of Labor Statistics, are longitudinal surveys of individuals who were in high school in 1997 when the surveys started. Participants were surveyed each year through 2018. The surveys are available for public use at nlsinfo.org.
How to do it…
We do more complicated aggregations with groupby than we did in the previous recipe, taking advantage...