Writing a user-defined function in Pig
In this recipe, we will learn how to write a user-defined function (UDF) so that we can apply our own custom logic to the data we process in Pig.
Getting ready
To perform this recipe, you should have a running Hadoop cluster as well as the latest version of Pig installed on it. We will also need an IDE, such as Eclipse, to write the Java class.
How to do it...
In this recipe, we are going to write a user-defined function for the employee dataset we have been using in this chapter. Let's assume that we want to convert all the names present in our dataset into uppercase. To do this, we will write a UDF that takes a name and returns it in uppercase letters.
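Before involving Pig at all, the conversion itself can be sketched in plain Java. The following is only an illustration under stated assumptions: the class name UpperCaseSketch and the method toUpper are hypothetical and are not part of Pig or of this recipe. In an actual UDF, logic along these lines would sit inside the exec method of a class that extends EvalFunc, reading the name from the input tuple.

```java
import java.util.Locale;

// Hypothetical helper class used only for illustration; it is not part of Pig.
class UpperCaseSketch {

    // Returns the name in uppercase, or null when the input is null,
    // so that missing values in the dataset pass through unchanged.
    static String toUpper(String name) {
        if (name == null) {
            return null;
        }
        // Locale.ROOT keeps the result independent of the JVM's default locale.
        return name.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(toUpper("john"));     // prints JOHN
        System.out.println(toUpper("Mary Ann")); // prints MARY ANN
    }
}
```

Handling null explicitly is a design choice made for this sketch: records with a missing name then yield a null result instead of causing an exception partway through a job.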
Writing a UDF is very simple: we need to write a class that extends Pig's EvalFunc class. In order to have this and other Hadoop classes in our classpath, we first need to create a Maven project and add the following dependencies to the project's pom.xml:
<dependency...