An overview of Pig
Historically, the Pig toolkit consisted of a compiler that generated MapReduce programs, bundled their dependencies, and executed them on Hadoop. Pig jobs are written in a language called Pig Latin and can be run either interactively or in batch mode. Furthermore, Pig Latin can be extended using User Defined Functions (UDFs) written in Java, Python, Ruby, Groovy, or JavaScript.
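To make the UDF mechanism more concrete, the following is a minimal sketch of what a Python UDF module might look like. The function name normalize, the schema string, and the suggested filename string_udfs.py are hypothetical and chosen only for illustration. The fallback decorator is also an assumption, added so that the module can be imported and tested outside Pig, where the pig_util helper module is not available.

```python
# Hypothetical UDF module, for example saved as string_udfs.py.
try:
    # pig_util is supplied by Pig when the script runs as a Python UDF.
    from pig_util import outputSchema
except ImportError:
    # Assumption for illustration: a stand-in decorator that only records
    # the schema string, so the module can be exercised without Pig.
    def outputSchema(schema):
        def decorator(func):
            func.outputSchema = schema
            return func
        return decorator


@outputSchema("normalized:chararray")
def normalize(value):
    """Trim whitespace and lower-case a string field; pass nulls through."""
    if value is None:
        return None
    return value.strip().lower()


if __name__ == "__main__":
    # Quick local check outside Pig.
    print(normalize("  Hello Pig  "))  # prints "hello pig"
```

A Pig Latin script would typically load such a module with a REGISTER statement (for example, REGISTER 'string_udfs.py' USING jython AS string_udfs;) and then call string_udfs.normalize on a field inside a FOREACH ... GENERATE expression.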
Pig use cases include the following:
Data processing
Ad hoc analytical queries
Rapid prototyping of algorithms
Extract, Transform, Load (ETL) pipelines
Following a trend we have seen in previous chapters, Pig is moving towards a general-purpose computing architecture. As of version 0.13, the ExecutionEngine interface (org.apache.pig.backend.executionengine) acts as a bridge between the frontend and the backend of Pig, allowing Pig Latin scripts to be compiled and executed on frameworks other than MapReduce. At the time of writing, version 0.13 ships with MRExecutionEngine (org.apache.pig.backend.hadoop...