Methods for churn prediction
In the previous section, we described the business use case and prepared our Spark computing platform and our datasets. In this section, we select the analytical methods or predictive models (equations) for this churn prediction project; that is, we map our business use case to machine learning methods.
According to many years of research, customer satisfaction professionals believe that product and service features affect the quality of service, which affects customer satisfaction, which in turn affects customer churn. Therefore, we should incorporate this knowledge into our model design or equation specification.
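One possible way to write this chain down is as a set of linked equations. This is only a sketch: the symbols F_j (product and service features), Q (quality of service), S (customer satisfaction), the coefficients, and the linear form of the first two equations are all assumptions made for illustration, not a specification taken from the research mentioned above.

```latex
\begin{align}
Q &= \alpha_0 + \sum_{j} \alpha_j F_j + \varepsilon_Q
  && \text{(features drive service quality)} \\
S &= \gamma_0 + \gamma_1 Q + \varepsilon_S
  && \text{(quality drives satisfaction)} \\
\log \frac{P(\text{churn})}{1 - P(\text{churn})} &= \beta_0 + \beta_1 S
  && \text{(satisfaction drives churn)}
\end{align}
```

Under these assumptions, we would expect \beta_1 to be negative, since higher satisfaction should lower the odds of churn. In practice, quality and satisfaction may not be measured directly, in which case the features could enter the churn equation themselves.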
From an analytical perspective, there are many suitable models for modelling and predicting customer churn; among them, the most commonly used are logistic regression and decision trees. For this exercise, we will use both, and then use evaluation to determine which...
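To make the idea of fitting both models and letting evaluation decide more concrete, here is a minimal sketch in plain Python. It does not use Spark, and it is not the implementation used for this project. The data is synthetic, the single `satisfaction` feature is an assumption, the helper names (`make_toy_data`, `fit_logistic`, `fit_stump`, `accuracy`) are hypothetical, a depth-1 decision stump stands in for a full decision tree, and holdout accuracy stands in for whatever evaluation metric is eventually chosen.

```python
import math
import random


def make_toy_data(n, seed=0):
    """Synthetic rows of (satisfaction in [0, 1], churned 0/1). Illustrative only."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        satisfaction = rng.random()
        # Assumed relationship: lower satisfaction means higher churn probability.
        p_churn = 1.0 / (1.0 + math.exp(8.0 * (satisfaction - 0.5)))
        rows.append((satisfaction, 1 if rng.random() < p_churn else 0))
    return rows


def fit_logistic(rows, lr=1.0, epochs=1000):
    """Fit P(churn) = sigmoid(b0 + b1 * x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(rows)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in rows:
            err = 1.0 / (1.0 + math.exp(-(b0 + b1 * x))) - y
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    # Predict churn when the estimated probability exceeds 0.5.
    return lambda x: 1 if b0 + b1 * x > 0 else 0


def fit_stump(rows):
    """Fit a depth-1 decision tree: the single split with the fewest training errors."""
    best = None
    for t, _ in rows:
        for below_label in (0, 1):
            errors = sum(
                (below_label if x <= t else 1 - below_label) != y for x, y in rows
            )
            if best is None or errors < best[0]:
                best = (errors, t, below_label)
    _, t, below_label = best
    return lambda x: below_label if x <= t else 1 - below_label


def accuracy(model, rows):
    """Share of rows where the model's prediction matches the label."""
    return sum(model(x) == y for x, y in rows) / len(rows)


# Fit both models on one sample and compare them on a separate holdout sample.
train = make_toy_data(400, seed=1)
holdout = make_toy_data(200, seed=2)
scores = {
    "logistic regression": accuracy(fit_logistic(train), holdout),
    "decision stump": accuracy(fit_stump(train), holdout),
}
for name, score in scores.items():
    print(f"{name}: holdout accuracy = {score:.3f}")
```

The point of the sketch is the workflow rather than the numbers: both models are trained on the same data and judged on data they have not seen, so the comparison reflects predictive performance rather than fit to the training set.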