▶ Try SQL

SQL GROUP BY

Collapse rows into groups and compute aggregate statistics per group.

What GROUP BY does

GROUP BY collapses all rows sharing the same value into a single summary row. You then apply an aggregate function (COUNT, SUM, AVG, etc.) to compute a value per group.

The five aggregate functions

FunctionWhat it computesExample
COUNT(*)Number of rowsHow many orders?
SUM(col)Total of a numeric columnTotal revenue
AVG(col)Average of a numeric columnAverage order value
MIN(col)Smallest valueCheapest product
MAX(col)Largest valueMost expensive order

COUNT orders per customer

SQL
SELECT customer_id, COUNT(*) AS order_count FROM orders GROUP BY customer_id ORDER BY order_count DESC;

SUM and AVG

SQL — revenue by product
SELECT product, COUNT(*) AS sales_count, SUM(amount) AS total_revenue, AVG(amount) AS avg_order_value, MAX(amount) AS biggest_order FROM orders GROUP BY product ORDER BY total_revenue DESC;

Group by multiple columns

SQL — revenue by country AND product
SELECT c.country, o.product, SUM(o.amount) AS revenue FROM orders o JOIN customers c ON o.customer_id = c.id GROUP BY c.country, o.product ORDER BY c.country, revenue DESC;

The SELECT rule

Every column in SELECT must be either:
  1. Listed in GROUP BY, OR
  2. Wrapped in an aggregate function (COUNT, SUM, AVG…)
Breaking this rule is a syntax error in standard SQL. MySQL is lenient (wrongly so) — other databases reject it.

COUNT(*) vs COUNT(column)

SQL
SELECT COUNT(*) AS total_rows, -- counts every row incl. NULLs COUNT(phone) AS has_phone, -- skips NULLs COUNT(DISTINCT city) AS unique_cities -- distinct non-NULL values FROM customers;
Execution order reminder FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY. WHERE runs before grouping (filters individual rows). HAVING runs after grouping (filters groups).