How to Use GROUP BY and HAVING in SQL: The Tutorial That Finally Made It Click

After this tutorial, you’ll be able to look at any “summarize this data” request and quickly tell which clause handles it: GROUP BY, WHERE, HAVING, or some combination of the three. You’ll also be able to explain why a query that filters on an aggregate value fails with WHERE and succeeds with HAVING, which is one of the most common mistakes people make when they start writing GROUP BY queries.

Rather than working through GROUP BY as one abstract concept, this post ranks the five patterns you’ll actually use, from the one you’ll write constantly to the one you’ll reach for only occasionally. Ranking them this way mirrors how the learning should happen: master the first pattern completely before worrying about the fifth.

1. The Single-Column Aggregate — Your Foundation Pattern

This is the pattern every other one builds on, and it’s worth over-practicing before moving anywhere else.

The shape: SELECT a column you want to group by, plus an aggregate function like COUNT, SUM, or AVG applied to another column, FROM your table, GROUP BY that same first column. If you’re counting orders per customer, you’d select the customer column and COUNT of order ID, group by the customer column, and the result gives you one row per customer with their order count sitting right next to their name.
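Here is a minimal sketch of that shape. The SQL is wrapped in Python’s built-in sqlite3 module so it can be run as is; the orders table, its columns, and the sample rows are all made up for illustration, and the ORDER BY is there only to make the printed output predictable.

```python
import sqlite3

# Hypothetical sample data: one row per order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT)")
conn.executemany(
    "INSERT INTO orders (order_id, customer) VALUES (?, ?)",
    [(1, "Ada"), (2, "Ben"), (3, "Ada"), (4, "Cy"), (5, "Ada")],
)

# Group by the customer column and count the order IDs in each group.
query = """
    SELECT customer, COUNT(order_id) AS order_count
    FROM orders
    GROUP BY customer
    ORDER BY customer
"""
orders_per_customer = conn.execute(query).fetchall()
print(orders_per_customer)  # -> [('Ada', 3), ('Ben', 1), ('Cy', 1)]
```

Five order rows go in, three customer rows come out: one per distinct value of the grouped column.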

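The rule that trips up almost everyone starting out: every column in your SELECT list should either appear in the GROUP BY clause or be wrapped in an aggregate function. SQL needs to know, for every column you’re displaying, whether it’s the thing you’re grouping by or a value calculated across the group. Most databases reject a query that selects a raw column belonging to neither category. A few are more lenient: SQLite, and MySQL with ONLY_FULL_GROUP_BY turned off, run the query and fill that column with a value from an arbitrary row in each group, which is rarely what you want, while PostgreSQL allows the extra column only when it’s functionally dependent on the grouped columns (for example, when you group by the table’s primary key). Treat the rule as absolute and your queries will behave the same everywhere.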
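When you want an extra column next to the aggregate, there are two portable ways to satisfy the rule: add the column to GROUP BY, or wrap it in an aggregate. The sketch below shows both, again using Python’s sqlite3 module with an invented orders table and a hypothetical city column.

```python
import sqlite3

# Hypothetical sample data: each order records the customer's city.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO orders (order_id, customer, city) VALUES (?, ?, ?)",
    [(1, "Ada", "Oslo"), (2, "Ada", "Oslo"), (3, "Ben", "Rome")],
)

# Option A: list the extra column in GROUP BY as well.
grouped_by_both = conn.execute("""
    SELECT customer, city, COUNT(order_id) AS order_count
    FROM orders
    GROUP BY customer, city
    ORDER BY customer
""").fetchall()

# Option B: keep one grouping column and wrap the extra one in an aggregate.
wrapped_in_aggregate = conn.execute("""
    SELECT customer, MAX(city) AS city, COUNT(order_id) AS order_count
    FROM orders
    GROUP BY customer
    ORDER BY customer
""").fetchall()

print(grouped_by_both)       # -> [('Ada', 'Oslo', 2), ('Ben', 'Rome', 1)]
print(wrapped_in_aggregate)  # -> [('Ada', 'Oslo', 2), ('Ben', 'Rome', 1)]
```

The two results match here only because each customer has a single city in the sample data. If a customer had orders in two cities, option A would return two rows for that customer and option B would return one row with the alphabetically last city.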

This pattern alone answers a surprising share of business questions: total sales per region, order count per customer, average rating per product. Get comfortable here first.

2. WHERE Before Grouping — Filtering the Raw Rows

Once you can group and aggregate, the next pattern to master is filtering the individual rows before they ever get grouped together.

WHERE runs first, conceptually, cutting your table down to only the rows you care about, and only after that filtered set exists does GROUP BY do its work. If you want total sales per region but only counting transactions from this year, WHERE handles the year filter — you’re not filtering groups here, you’re filtering the raw transaction rows that eventually feed into those groups.

A concrete case: SELECT region, SUM of sale amount, FROM sales, WHERE sale date is in the current year, GROUP BY region. Notice WHERE sits before GROUP BY in the query, and that ordering isn’t cosmetic: it mirrors the logical order in which SQL evaluates the query, filtering rows first and aggregating what’s left second. (The database’s query planner may physically execute things differently, but the result always matches that logical order.)
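A runnable sketch of that query, using Python’s sqlite3 module. The sales table, its column names, and the rows are invented, and “the current year” is pinned to 2024 as an assumption so the output doesn’t depend on when you run it. Dates are stored as ISO strings, which compare correctly as text.

```python
import sqlite3

# Hypothetical sample data: two of the five sales fall outside 2024.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, sale_date TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales (region, sale_date, amount) VALUES (?, ?, ?)",
    [
        ("North", "2023-11-02", 100),
        ("North", "2024-02-10", 250),
        ("North", "2024-03-05", 50),
        ("South", "2024-01-15", 400),
        ("South", "2023-12-30", 999),
    ],
)

# WHERE drops the 2023 rows first; only the survivors are grouped and summed.
query = """
    SELECT region, SUM(amount) AS total_sales
    FROM sales
    WHERE sale_date >= ? AND sale_date < ?
    GROUP BY region
    ORDER BY region
"""
sales_this_year = conn.execute(query, ("2024-01-01", "2025-01-01")).fetchall()
print(sales_this_year)  # -> [('North', 300), ('South', 400)]
```

The 999 sale in South never reaches the SUM, because it was filtered out before any group existed.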

This pattern is ranked second rather than first because it’s a straightforward addition once pattern one is solid. The syntax barely changes; you’re just inserting a filtering condition upstream of the grouping logic that’s already comfortable.

3. HAVING After Grouping — The Clause Most People Misuse

Here’s where the real conceptual leap happens, and it’s the pattern behind a large share of the confused questions beginners ask about basic SQL.

HAVING filters groups after they’ve already been formed and aggregated — not individual rows, groups. If you want to see only the regions whose total sales exceed some threshold, you can’t use WHERE for that, because at the point WHERE runs, no aggregation has happened yet and there’s no “total sales per region” value in existence to filter on. That total only comes into being once GROUP BY has done its job, which means only HAVING, running after GROUP BY, has access to it.

The query: SELECT region, SUM of sale amount, FROM sales, GROUP BY region, HAVING SUM of sale amount greater than some number. Try swapping HAVING for WHERE in that query and most databases will throw an error, because WHERE simply doesn’t know what “SUM of sale amount” means at the row-filtering stage — that value doesn’t exist until after grouping.
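You can watch both outcomes in a small sketch. It uses Python’s sqlite3 module, an invented sales table, and a hypothetical threshold of 400; the exact error text differs between databases, so the code only reports that the WHERE version was rejected.

```python
import sqlite3

# Hypothetical sample data. Totals: North 500, South 100, East 750.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("North", 300), ("North", 200), ("South", 100), ("East", 50), ("East", 700)],
)

# HAVING filters the groups after SUM has produced one total per region.
having_query = """
    SELECT region, SUM(amount) AS total_sales
    FROM sales
    GROUP BY region
    HAVING SUM(amount) > 400
    ORDER BY region
"""
big_regions = conn.execute(having_query).fetchall()
print(big_regions)  # -> [('East', 750), ('North', 500)]

# The same condition in WHERE is rejected: no totals exist at the row stage.
where_query = """
    SELECT region, SUM(amount) AS total_sales
    FROM sales
    WHERE SUM(amount) > 400
    GROUP BY region
"""
try:
    conn.execute(where_query)
    where_failed = False
except sqlite3.Error as exc:
    where_failed = True
    print("WHERE version rejected:", exc)
```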

The clean way to keep this straight: WHERE filters candidates before they’re grouped; HAVING filters the groups themselves, after aggregation has already produced a value for each one. Once that distinction is fixed in your head, choosing between the two stops being a guessing game.

Nothing stops you from using WHERE and HAVING in the same query, incidentally — filter the raw rows first with WHERE, group what’s left, then filter those resulting groups with HAVING. It’s an extremely common combination, not an either-or choice.
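A sketch of the combination, with Python’s sqlite3 module and made-up data: the year (2024) and the threshold (500) are assumptions chosen so that each clause visibly removes something.

```python
import sqlite3

# Hypothetical sample data: North has one large sale from 2023.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, sale_date TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales (region, sale_date, amount) VALUES (?, ?, ?)",
    [
        ("North", "2023-06-01", 900),
        ("North", "2024-02-01", 200),
        ("North", "2024-05-01", 350),
        ("South", "2024-03-01", 100),
        ("South", "2024-04-01", 150),
        ("East", "2024-01-20", 600),
    ],
)

# WHERE removes the 2023 row, GROUP BY sums what is left,
# and HAVING then drops South, whose 2024 total is only 250.
query = """
    SELECT region, SUM(amount) AS total_sales
    FROM sales
    WHERE sale_date >= '2024-01-01' AND sale_date < '2025-01-01'
    GROUP BY region
    HAVING SUM(amount) > 500
    ORDER BY region
"""
strong_regions_2024 = conn.execute(query).fetchall()
print(strong_regions_2024)  # -> [('East', 600), ('North', 550)]
```

North’s total is 550, not 1450, because the 900 sale from 2023 was filtered out by WHERE before the groups were formed.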

4. Multi-Column GROUP BY — Breaking Data Down Further

Once single-column grouping and both filtering clauses feel natural, the next pattern extends GROUP BY to more than one column at a time.

Instead of total sales per region, imagine total sales per region per product category — a finer-grained breakdown than either column could give you alone. The syntax adds a second column to the GROUP BY clause: GROUP BY region, product category. SQL now treats each unique combination of region and category as its own group, producing one row per combination rather than one row per region overall.
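Sketched with Python’s sqlite3 module, the two-column version looks like this. The sales table, the category column name, and the rows are invented for the example.

```python
import sqlite3

# Hypothetical sample data: two regions, two product categories.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, category TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales (region, category, amount) VALUES (?, ?, ?)",
    [
        ("North", "Toys", 100),
        ("North", "Toys", 50),
        ("North", "Books", 30),
        ("South", "Toys", 70),
        ("South", "Books", 20),
        ("South", "Books", 40),
    ],
)

# Each distinct (region, category) pair becomes its own group.
query = """
    SELECT region, category, SUM(amount) AS total_sales
    FROM sales
    GROUP BY region, category
    ORDER BY region, category
"""
sales_by_region_and_category = conn.execute(query).fetchall()
for row in sales_by_region_and_category:
    print(row)
# -> ('North', 'Books', 30)
# -> ('North', 'Toys', 150)
# -> ('South', 'Books', 60)
# -> ('South', 'Toys', 70)
```

Two regions times two categories gives four groups, so the six input rows collapse into four output rows.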

This is ranked fourth rather than earlier because the conceptual work is already done by this point — you’re not learning a new idea, just applying the same grouping logic across more than one column simultaneously. The SELECT list rule from pattern one still applies without modification: every non-aggregated column you display needs to show up in that GROUP BY list, and now there are simply two of them instead of one.

Where this pattern earns its keep is any report with a natural hierarchy — sales by region and month, signups by source and week, defects by factory and shift. Anywhere “broken down by more than one thing at once” describes the request, this is the pattern that answers it.

5. HAVING Combined With ORDER BY and LIMIT — Surfacing What Matters

The fifth pattern is the least frequent of the five, but it’s the one that turns a plain aggregate query into an actual reporting tool people rely on.

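Once HAVING has trimmed your groups down to the ones that meet some threshold, ORDER BY and LIMIT let you sort and cap that result, showing, say, only the top five regions by total sales among those that already cleared your HAVING condition. The clauses are written in the order SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT. Logically, though, SELECT is evaluated later than its position suggests: the database reads FROM, applies WHERE, forms the groups, applies HAVING, computes the SELECT list, and only then sorts with ORDER BY and cuts with LIMIT, each step operating on the result of the one before it. One portability note: LIMIT works in MySQL, PostgreSQL, and SQLite, but SQL Server uses TOP or OFFSET ... FETCH, and the SQL standard spells it FETCH FIRST n ROWS ONLY.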

A realistic version: SELECT region, SUM of sale amount, FROM sales, WHERE sale date is in the current year, GROUP BY region, HAVING SUM of sale amount greater than some threshold, ORDER BY that summed amount descending, LIMIT 5. Read start to finish, that’s “filter this year’s transactions, group by region, keep only regions above the threshold, sort those survivors from highest to lowest, and show me just the top five.”
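Here is that full query as a runnable sketch, using Python’s sqlite3 module. The table and rows are invented, the year is pinned to 2024, the threshold of 400 is hypothetical, and the cap is LIMIT 2 rather than 5 only because the sample has just five regions.

```python
import sqlite3

# Hypothetical sample data. 2024 totals: North 800, South 200, East 900,
# West 450, Central 600. West also has a large sale from 2023.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, sale_date TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales (region, sale_date, amount) VALUES (?, ?, ?)",
    [
        ("North", "2024-02-01", 500),
        ("North", "2024-06-11", 300),
        ("South", "2024-03-09", 200),
        ("East", "2024-01-20", 900),
        ("West", "2024-05-02", 450),
        ("West", "2023-08-15", 5000),
        ("Central", "2024-04-18", 600),
    ],
)

# Filter rows, group, filter groups, sort the survivors, keep the top two.
query = """
    SELECT region, SUM(amount) AS total_sales
    FROM sales
    WHERE sale_date >= '2024-01-01' AND sale_date < '2025-01-01'
    GROUP BY region
    HAVING SUM(amount) > 400
    ORDER BY total_sales DESC
    LIMIT 2
"""
top_regions = conn.execute(query).fetchall()
print(top_regions)  # -> [('East', 900), ('North', 800)]
```

West does not win despite its 5000 sale, because that row is from 2023 and WHERE removed it before anything was summed.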

This is ranked last not because it’s difficult, but because it’s a composition of everything above it rather than a new idea in its own right. Once patterns one through four are solid, this one is just assembly.

Choosing the Right Pattern for Your Question

Your question sounds like…	Pattern to reach for
“Show me total/count/average per [category]”	Pattern 1: single-column GROUP BY
“…but only counting recent/specific rows”	Pattern 2: WHERE before GROUP BY
“…but only show groups above/below some total”	Pattern 3: HAVING after GROUP BY
“Break it down by two categories at once”	Pattern 4: multi-column GROUP BY
“Show me the top N groups meeting that condition”	Pattern 5: HAVING with ORDER BY and LIMIT

Most real reporting questions map cleanly onto one row of that table, and a fair number combine two or three rows into a single query. Once you can categorize a business request this way before writing a single line of SQL, the query itself tends to fall out almost automatically — the hard part was never the syntax, it was knowing which clause was built to answer which kind of question.
