In this section, we will discuss how to build and evaluate regression models using machine learning algorithms in R. By the end of this section, we will have built a linear regression model that predicts CLV, specifically the expected 3-month customer value. We will use a handful of R packages, such as dplyr, reshape2, and caTools, to analyze, transform, and prepare the data for model building. Readers who would prefer to use Python instead of R for this exercise can refer to the previous section.
For this exercise, we will use a publicly available dataset from the UCI Machine Learning Repository, which can be found at http://archive.ics.uci.edu/ml/datasets/online+retail. You can follow this...