Data Preparation and Feature Engineering
Once you have loaded and cleaned your data, you need to prepare it in a format suited to analysis. You also need to identify features that help you understand the data and yield meaningful insights. This often involves modifying existing features or deriving new features from them.
For example, in the previous exercise, we saw that the dataset contains a date column consisting of day, month, and year. We can use this information to determine which months of the year were most popular for the online retail store. To do this, we need to break the date column down into separate columns such as day, month, year, and so on.
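As a minimal sketch of this kind of date breakdown, the following uses pandas on a small made-up DataFrame. The column names (date, order_id) and the sample values are assumptions for illustration only; the actual dataset from the previous exercise may name and format its columns differently.

```python
import pandas as pd

# Hypothetical sample data: the column names and values are assumptions,
# not taken from the actual online retail dataset.
df = pd.DataFrame({
    "order_id": [1, 2, 3],
    "date": ["2010-12-01 08:26:00", "2011-01-15 14:05:00", "2011-01-20 09:30:00"],
})

# Parse the text column into datetime values so the parts can be extracted.
df["date"] = pd.to_datetime(df["date"])

# Derive new features from the existing date column.
df["year"] = df["date"].dt.year
df["month"] = df["date"].dt.month
df["day"] = df["date"].dt.day

# Count rows per month to see which months were busiest.
orders_per_month = df.groupby("month").size()
print(orders_per_month)
```

With the derived month column in place, a simple group-and-count like the one above is enough to compare activity across months.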
When preparing data for machine learning models, categorical features must be transformed into a numerical format so that the models can learn from them. However, since we are just going to be analyzing the data, we can...