A business case for churn detection
Telecom companies lose more than 30% of customers annually as a result of customer churn in the US and Europe. The cost of acquiring a new customer is eight times than that of retaining an existing customer. This makes a strong business case for churn detection, a task which is ideal for Hadoop.
Analyzing telecom data with Hadoop to detect customer churn possess a unique set of challenges that stem from the massive datasets that need to be transformed and analyzed. The storage of this data is expensive due to its sheer volume, and the pre-processing of raw data before analysis is a time- and computing-intensive task. Hadoop offers low-cost storage for data processing, and it can efficiently deal with structured, semi-structured, and unstructured datasets, which makes Hadoop a useful technology for churn prediction.
In this chapter, we will use Hadoop MapReduce to analyze the data so that we can predict which customers are likely to churn. In order to do...