Creating features with aggregation primitives
Throughout this chapter, we’ve created features automatically by transforming existing variables into new features. For example, we extracted date and time parts from datetime variables, we counted the number of words, characters, and punctuation marks in a text, and we combined numerical features into new variables. To create these features, we worked with transform primitives.
Featuretools also supports aggregation primitives. These primitives take related observations as input and return a single value. For example, if we have a numerical variable, price, related to an invoice, an aggregation primitive would take all the price observations for a single invoice and return a single value, such as the mean price or the sum (that is, the total amount paid), for that particular invoice.
Tip
The Featuretools aggregation functionality is the equivalent of groupby in pandas, followed by pandas functions such as mean, sum, std, and...
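To make the comparison with pandas concrete, here is a minimal sketch of the groupby side of that equivalence. The DataFrame, its invoice and price columns, and their values are hypothetical and chosen only for illustration. The sketch uses pandas alone, so it shows the kind of result an aggregation primitive returns, not how Featuretools computes it.

```python
import pandas as pd

# Hypothetical toy data: each row is one item sold on an invoice.
df = pd.DataFrame(
    {
        "invoice": ["A1", "A1", "A1", "B2", "B2"],
        "price": [10.0, 20.0, 30.0, 5.0, 15.0],
    }
)

# Group the rows by invoice, then collapse all price observations
# of each invoice into single values: the mean, the sum, and the
# (sample) standard deviation.
agg = df.groupby("invoice")["price"].agg(["mean", "sum", "std"])

print(agg)
```

The result has one row per invoice and one column per aggregation, which is the shape we expect from aggregation primitives: many related observations in, a single value per group out.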