Adding support for a new writable data type in Hadoop
In this recipe, we are going to learn how to introduce a new data type in MapReduce programs and then use it.
Getting ready
To perform this recipe, you should have a running Hadoop cluster as well as an IDE such as Eclipse.
How to do it...
Hadoop allows us to add new custom data types, which are made up of one or more primitive data types. In the previous recipe, you must have noticed that when you handled the log data structure, you had to remember the sequence in which each data component was placed. This can get very messy in complex programs. To avoid this, we will introduce a new data type that can be used efficiently as a WritableComparable.
To add a new data type, we need to implement the WritableComparable interface, which is provided by Hadoop. This interface requires three methods, readFields(DataInput in), write(DataOutput out), and compareTo(T o), which we will need to override with our own custom implementation.