Introduction
You are working in a marketing company that takes projects from various clients. Your team has been given a project where you have to predict the percentage of conversions for a Black Friday sale that the team is going to plan. The percentage of conversion as per the client refers to the number of people who actually buy products vis-à-vis the number of people who initially signed up for updates regarding the sale by visiting the website. Your first instinct is to go for a regression model for predicting the percentage conversion. However, you have millions of rows of data with hundreds of columns. In scenarios like these, it's very common to encounter issues of multi-collinearity where two or more features effectively convey the same information. This can then end up affecting the robustness of the model. This is where solutions such as Recursive Feature Selection (RFE) can be of help.
In the previous chapter, you learned how to prepare data for regression...