Advanced aggregation – GROUPING SETS
Hive has offered the
GROUPING SETS
keywords to implement advanced multiple GROUP BY
operations against the same set of data. Actually, GROUPING SETS
is a shorthand way of connecting several GROUP BY
result sets with UNION ALL
. The GROUPING SETS
keyword completes all processes in one stage of jobs, which is more efficient than GROUP BY
and UNION ALL
having multiple stages. A blank set ()
in the GROUPING SETS
clause calculates the overall aggregation. The following are a few examples to show the equivalence of GROUPING SETS
. For better understanding, we can say that the outer level of GROUPING SETS
defines on what data UNION ALL
is to be implemented. The inner level defines on what data GROUP BY
is to be implemented in each UNION ALL
.
SELECT name, work_place[0] AS main_place, count(employee_id) AS emp_id_cnt FROM employee_id GROUP BY name, work_place[0] GROUPING SETS((name, work_place[0])); || SELECT name, work_place[0] AS main_place, count(employee_id...