Using the built-in User-defined Aggregation Function (UDAF)
Hive provides a set of built-in functions for aggregating a dataset. Each function operates on a group of rows and returns a single summarized result for that group.
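As a rough illustration of how an aggregate function works over a group of rows, the following sketch assumes a hypothetical `sales` table with a `region` column and a numeric `amount` column. Neither the table nor the column names come from this recipe. The functions used are standard Hive built-ins.

```sql
-- Hypothetical table for illustration only: sales(region STRING, amount DOUBLE)
SELECT region,
       avg(amount)          AS avg_amount,           -- average of all amounts in the region
       avg(DISTINCT amount) AS avg_distinct_amount,  -- average of the unique amounts only
       collect_set(amount)  AS unique_amounts        -- unique amounts gathered into an array
FROM sales
GROUP BY region;
```

Each region produces one output row, because every aggregate is evaluated once per group defined by `GROUP BY`. Without a `GROUP BY` clause, the whole table is treated as a single group.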
How to do it…
The built-in functions can be used directly in a query. The following are some of the aggregate functions available in Hive:
| Function Name | Return Type | Description |
|---|---|---|
| `avg(col)` | DOUBLE | It is used to calculate the average of all values of a particular column. |
| `avg(DISTINCT col)` | DOUBLE | It is used to calculate the average of unique values of a particular column. |
| `collect_list(col)` | ARRAY | It will return a list of all values of a particular column in an array. |
| `collect_set(col)` | ARRAY | It will return a list of unique values of a particular column in an array. Duplicate values are eliminated. |
| `corr(col1, col2)` | DOUBLE | It is used to calculate the Pearson coefficient of correlation between two columns. |
| | BIGINT | It will return the total... |