Time for action – adding a new User Defined Function (UDF)
Let us show how to create and invoke some custom Java code via a new UDF.
Save the following code as
City.java
:package com.kycorsystems ; import java.util.regex.Matcher ; import java.util.regex.Pattern ; import org.apache.hadoop.hive.ql.exec.UDF ; import org.apache.hadoop.io.Text ; public class City extends UDF { private static Pattern pattern = Pattern.compile( "[a-zA-z]+?[\\. ]*[a-zA-z]+?[\\, ][^a-zA-Z]") ; public Text evaluate( final Text str) { Text result ; String location = str.toString().trim() ; Matcher matcher = pattern.matcher(location) ; if (matcher.find()) { result = new Text( location.substring(matcher.start(), matcher.end()-2)) ; } else { result = new Text("Unknown") ; } return result ; } }
Compile this file:
$ javac -cp hive/lib/hive-exec-0.8.1.jar:hadoop...