Spark for attrition prediction
In this section, we will start with a real use case and then describe how to prepare Apache Spark for this attrition prediction project.
The use case
NIY University is a private university that wants to improve its student retention using predictive modeling with Big Data. According to ACT's research (refer to http://www.act.org/research/policymakers/pdf/retain_2015.pdf), the average retention rate for American colleges was only about 68% in 2015, and it was even lower for two-year public colleges, at 54.7%, and for two-year private colleges, at 63.4%. That is, about 32% of students left school before graduation, and attrition was even higher for two-year public colleges, at 45.3%, and for two-year private colleges, at 36.6%. As student attrition is costly for both colleges and students, using Big Data to predict student attrition and design interventions to prevent it has a lot of value.
The university has a lot of information about student demographics and...