SQL GROUP BY

Collapse rows into groups and compute aggregate statistics per group.

What GROUP BY does

GROUP BY collapses all rows sharing the same value into a single summary row. You then apply an aggregate function (COUNT, SUM, AVG, etc.) to compute a value per group.

The five aggregate functions

Function	What it computes	Example
`COUNT(*)`	Number of rows	How many orders?
`SUM(col)`	Total of a numeric column	Total revenue
`AVG(col)`	Average of a numeric column	Average order value
`MIN(col)`	Smallest value	Cheapest product
`MAX(col)`	Largest value	Most expensive order

COUNT orders per customer

SQL

SELECT customer_id, COUNT(*) AS order_count FROM orders GROUP BY customer_id ORDER BY order_count DESC;

SUM and AVG

SQL — revenue by product

SELECT product, COUNT(*) AS sales_count, SUM(amount) AS total_revenue, AVG(amount) AS avg_order_value, MAX(amount) AS biggest_order FROM orders GROUP BY product ORDER BY total_revenue DESC;

Group by multiple columns

SQL — revenue by country AND product

SELECT c.country, o.product, SUM(o.amount) AS revenue FROM orders o JOIN customers c ON o.customer_id = c.id GROUP BY c.country, o.product ORDER BY c.country, revenue DESC;

The SELECT rule

Every column in SELECT must be either:

Listed in GROUP BY, OR
Wrapped in an aggregate function (COUNT, SUM, AVG…)

Breaking this rule is a syntax error in standard SQL. MySQL is lenient (wrongly so) — other databases reject it.

COUNT(*) vs COUNT(column)

SQL

SELECT COUNT(*) AS total_rows, -- counts every row incl. NULLs COUNT(phone) AS has_phone, -- skips NULLs COUNT(DISTINCT city) AS unique_cities -- distinct non-NULL values FROM customers;

Execution order reminder FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY. WHERE runs before grouping (filters individual rows). HAVING runs after grouping (filters groups).

← JOINS HAVING →