Aggregating data with GROUP BY and HAVING
Aggregation is a concept with which you should already be familiar thanks to the discussion of Python using pandas in Chapter 3. Just like in Python, aggregation in SQL is about summarizing or grouping data in a way that makes it more useful, understandable, and manageable. GROUP BY
and HAVING
are two crucial components in SQL that help accomplish this.
The GROUP BY statement
Much like how grouping is performed in Python using pandas, the GROUP BY statement in SQL is used with aggregate functions (such as COUNT
, SUM
, AVG
, MAX
, and MIN
) to group the result set by one or more columns. Thus, using GROUP BY
should be familiar to you! The syntax is as follows:
SELECT column1, column2, columnN aggregate_function(columnX) FROM table GROUP BY columns(s);
Aggregate values are best managed by using aliases. An alias is simply a nickname for a calculated or aggregated field or temporary table. Simply use the term AS
, like so:
SELECT column1...