Feature engineering
Before we can analyze the data and model the clusters, we need to clean and structure it, restructuring some of the variables according to our plan of analysis. This step is commonly referred to as feature engineering.
In this section, we will perform the following steps to clean and structure some of the dataset features, with the goal of simplifying the existing variables and creating features that are easier to understand and that describe the data properly:
- Create an Age variable for a customer by using the Year_Birth feature, indicating the birth year of the respective person.
- Create a Living_With feature to simplify the marital status, to describe the living situation of couples.
- Create a Children feature to indicate the total number of children in a household (that is, kids and teenagers).
- Aggregate spending by product type to better capture consumer behaviors.
- Indicate parenthood...
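The first four steps above can be sketched in pandas as follows. This is a minimal illustration on toy rows, not the dataset itself. Only Year_Birth and the target names Age, Living_With, and Children come from the text; the columns Marital_Status, Kidhome, Teenhome, MntWines, and MntFruits, the status labels, the reference year, and the Spent column name are all assumptions made for the example. Summing the spending columns into one total is just one possible reading of the aggregation step.

```python
import pandas as pd

# Toy rows standing in for the real dataset. Apart from Year_Birth, every
# input column name below is an assumption made for illustration.
data = pd.DataFrame({
    "Year_Birth": [1957, 1984, 1990],
    "Marital_Status": ["Single", "Married", "Together"],
    "Kidhome": [0, 1, 0],
    "Teenhome": [1, 0, 0],
    "MntWines": [635, 11, 426],
    "MntFruits": [88, 1, 49],
})

# Assumed reference year for computing ages; adjust to the data's collection date.
REFERENCE_YEAR = 2021

# Assumed set of marital-status labels that mean "lives with a partner".
PARTNER_STATUSES = {"Married", "Together"}


def engineer_features(df, reference_year=REFERENCE_YEAR):
    """Return a copy of df with Age, Living_With, Children, and Spent added."""
    out = df.copy()

    # Age: derived from the birth year.
    out["Age"] = reference_year - out["Year_Birth"]

    # Living_With: collapse marital status into a two-level living situation.
    out["Living_With"] = out["Marital_Status"].map(
        lambda status: "Partner" if status in PARTNER_STATUSES else "Alone"
    )

    # Children: kids plus teenagers in the household.
    out["Children"] = out["Kidhome"] + out["Teenhome"]

    # Spent (hypothetical name): total across the per-product spending columns,
    # assumed here to share the "Mnt" prefix.
    spend_cols = [col for col in out.columns if col.startswith("Mnt")]
    out["Spent"] = out[spend_cols].sum(axis=1)

    return out


features = engineer_features(data)
print(features[["Age", "Living_With", "Children", "Spent"]])
```

Working on a copy leaves the raw columns untouched, so the derived features can be rechecked against their inputs before any original columns are dropped.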