Development and debugging aids
There are three important commands that can help develop, debug, and optimize Pig scripts.
The DESCRIBE command
The DESCRIBE command prints the schema of a relation. It is especially useful when you are new to Pig Latin and want to see how each operator transforms the data. For the groupByCountry relation in the previous script, which finds the population of each country, the output is as follows:
groupByCountry: {group: chararray,generateRecords: {(cc::cname: chararray,ccity::cityName: chararray,ccity::population: long)}}
The DESCRIBE output uses Pig's own schema syntax. In the preceding example, groupByCountry is a bag whose tuples each contain a group field of type chararray and a nested bag, generateRecords.
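As a rough illustration of where such a schema comes from, here is a minimal sketch of a script that ends in a DESCRIBE call. The relation names match those in the output above, but the input file names, the delimiter, the ccode join key, and the joinCountry alias are assumptions made for this example and may differ from the actual script.

```pig
-- Hypothetical inputs: file names, delimiter, and the ccode field are assumptions.
cc = LOAD 'country_codes.csv' USING PigStorage(',')
     AS (ccode:chararray, cname:chararray);
ccity = LOAD 'world_cities.csv' USING PigStorage(',')
        AS (ccode:chararray, cityName:chararray, population:long);

-- Join the two inputs on the assumed country code field.
joinCountry = JOIN cc BY ccode, ccity BY ccode;

-- Keep only the three fields that appear in the nested bag of the schema.
generateRecords = FOREACH joinCountry
                  GENERATE cc::cname, ccity::cityName, ccity::population;

-- Grouping produces a 'group' field plus a bag named after the input relation.
groupByCountry = GROUP generateRecords BY cname;

-- Print the schema of the grouped relation; no data is processed by this call.
DESCRIBE groupByCountry;
```

Calling DESCRIBE on the intermediate relations as well (for example, DESCRIBE joinCountry;) is a convenient way to see where prefixes such as cc:: and ccity:: are introduced.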
The EXPLAIN command
EXPLAIN, on a relation, shows how the Pig script will be executed. It is useful when trying to optimize Pig scripts or debug errors. It shows the logical, physical, and MapReduce plans of the relation. The following screenshot shows the MapReduce...