Other programming abstractions
Hadoop can be extended with more than additional functionality: some tools provide entirely different paradigms for writing the code that processes your data within Hadoop.
Pig
We mentioned Pig (http://pig.apache.org) in Chapter 8, A Relational View on Data with Hive, and won't say much more about it here. Just remember that it is available and may be useful if you have processes or people for whom a data flow definition of a Hadoop job is more intuitive, or a better fit, than raw MapReduce code or HiveQL scripts. The major difference is that Pig is an imperative language (it defines how the process will be executed), while Hive is more declarative (it defines the desired results but not how they will be produced).
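The imperative versus declarative distinction can be made concrete with a small analogy. The sketch below is plain Python, not Pig Latin or HiveQL: the first function spells out each step of a data flow in order, much as a Pig script would, while the second states only the desired result as a SQL query and leaves the execution plan to the engine, as Hive does. The sample data, the function names, and the use of the standard library `sqlite3` module as a stand-in query engine are all assumptions made for illustration.

```python
import sqlite3
from collections import Counter

# Hypothetical sample data: (visitor, page) records.
VISITS = [
    ("alice", "/home"),
    ("bob", "/home"),
    ("alice", "/cart"),
    ("carol", "/home"),
    ("bob", "/cart"),
    ("alice", "/home"),
]


def data_flow_style(visits):
    """Imperative data flow: each statement is one named step, in order."""
    home_only = [v for v in visits if v[1] == "/home"]  # step 1: filter
    visitors = [visitor for visitor, _ in home_only]    # step 2: project
    counts = Counter(visitors)                          # step 3: group and count
    return sorted(counts.items())                       # step 4: order


def declarative_style(visits):
    """Declarative: describe the result; the engine decides how to get it."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE visits (visitor TEXT, page TEXT)")
        conn.executemany("INSERT INTO visits VALUES (?, ?)", visits)
        rows = conn.execute(
            "SELECT visitor, COUNT(*) FROM visits "
            "WHERE page = '/home' GROUP BY visitor ORDER BY visitor"
        ).fetchall()
    finally:
        conn.close()
    return rows


if __name__ == "__main__":
    print(data_flow_style(VISITS))
    print(declarative_style(VISITS))
```

Both functions return the same per-visitor counts. The difference lies in what the author has to specify: the sequence of transformations in the first case, and only the shape of the answer in the second.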
Cascading
Cascading is not an Apache project, but it is open source and available at http://www.cascading.org. While Hive and Pig effectively define different languages with which to express data processing, Cascading provides...