Call Data Record Analytics using Hive
Call Data Records (CDRs), also known as call detail records, are records that telecom operators keep to track the calls made by individuals. We can use Hive to analyze these records, for example to decide which customers should receive special offers.
Note
You can read more about CDR at https://en.wikipedia.org/wiki/Call_detail_record.
Getting ready
To perform this recipe, you need a running Hadoop cluster with a recent version of Hive installed on it. This recipe uses Hive 1.2.1.
How to do it...
To analyze CDR data, we first need to create a Hive table and load the data into it. Consider a dataset in the following format:
CALLER_PHONE_NO|RECEIVER_PHONE_NUMBER|START_TIME|END_TIME|CALL_TYPE
11111|22222|2015-01-12 01:20:00|2015-01-12 01:30:00|VOICE
11111|22222|2015-02-12 01:35:00|2015-02-12 01:35:30|VOICE
11111|22222|2015-02-12 02:20:00|2015-02-12 02:20:00|SMS
33333|44444|2015-01-12 01:20:00|2015-01-12 01:30:00|VOICE
11111|33333|2015-05...
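As a rough illustration of the table creation and load step, a pipe-delimited file like this one could be mapped to a Hive table along the following lines. This is a minimal sketch rather than the recipe's definitive DDL. The table name cdr_records and the local path /tmp/cdr_data.txt are hypothetical, and the sketch assumes the file contains the header row shown above as its first line.

```sql
-- Hypothetical table name; the columns mirror the header row of the sample file.
CREATE TABLE IF NOT EXISTS cdr_records (
  caller_phone_no       STRING,
  receiver_phone_number STRING,
  start_time            TIMESTAMP,  -- text values must look like yyyy-MM-dd HH:mm:ss
  end_time              TIMESTAMP,
  call_type             STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
-- Assumption: the first line of the file is the header; drop this property if it is not.
TBLPROPERTIES ('skip.header.line.count'='1');

-- Hypothetical local path; point it at wherever the CDR file actually lives.
LOAD DATA LOCAL INPATH '/tmp/cdr_data.txt' INTO TABLE cdr_records;

-- Quick sanity check that the rows were parsed into the expected columns.
SELECT * FROM cdr_records LIMIT 5;
```

The phone numbers are declared as STRING rather than a numeric type so that leading zeros or country-code prefixes, if present in real data, are preserved.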