SQL Subqueries: Beginner to Intermediate Guide

A lot of people learning SQL assume a subquery is just a more complicated way to write a JOIN — some kind of advanced technique you reach for once JOINs feel too basic. That’s backwards. A subquery solves a different problem entirely: it answers a question you need answered before your main query can even run, like “what’s the average order value across the whole company?” or “which customer IDs belong to a specific region?” JOINs combine tables side by side. Subqueries answer a smaller question first, then hand that answer to a bigger query.

Once that distinction is clear, subqueries stop feeling like a syntax puzzle and start feeling like a sequence of small, logical steps. That’s exactly how this guide is organized — one step at a time, starting from the simplest case and building toward the patterns that trip people up most often.

Step 1: Understand What a Subquery Actually Is

A subquery is a complete SELECT statement nested inside another SQL statement. Conceptually, it runs first and produces a result, and the outer query then uses that result as if it were a value, a list, or a small table.

Picture it as a query answering a question for another query. If your outer query needs to know “which orders were above the company’s average order value,” the subquery’s entire job is to calculate that average. The outer query never has to know how the average was calculated — it just uses the number the subquery hands back.

This is the mental model worth holding onto through everything that follows: a subquery isolates a smaller question, answers it independently, and passes that answer upward.

Step 2: Write a Subquery That Returns a Single Value

The simplest subquery returns exactly one value — one row, one column. These are called scalar subqueries, and they typically show up inside a WHERE clause, compared against a column using a standard operator like =, >, or <.

Take the “orders above the company average” example. The inner query is straightforward: SELECT AVG(order_amount) FROM orders. Wrap that inside a WHERE clause on the outer query — WHERE order_amount > (SELECT AVG(order_amount) FROM orders) — and now every row returned by the outer query has an order amount higher than the company-wide average.

Notice what didn’t happen here: no JOIN, no GROUP BY, no separate query run manually and pasted back in as a hardcoded number. The subquery calculates that average on the fly, every time the full query runs, so it stays accurate even as the underlying data changes.
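To make the scalar subquery concrete, here is a minimal sketch that runs the query above against an in-memory SQLite database through Python’s built-in sqlite3 module. The orders table layout and the four sample rows are invented for illustration; only the WHERE clause comes from the text.

```python
import sqlite3

# Hypothetical sample data: four orders whose average amount is 200.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10, 50.0), (2, 10, 150.0), (3, 20, 400.0), (4, 30, 200.0)],
)

# The inner SELECT returns one value (the average); the outer query compares against it.
above_average = conn.execute(
    """
    SELECT order_id, order_amount
    FROM orders
    WHERE order_amount > (SELECT AVG(order_amount) FROM orders)
    ORDER BY order_id
    """
).fetchall()

print(above_average)  # [(3, 400.0)]
```

Only order 3 comes back, because it is the only row strictly above the average of 200. Insert another row and rerun the query, and the threshold moves with the data.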

Step 3: Move to Subqueries That Return a List of Values

Not every question has a single-value answer. Sometimes you need to check a column against a whole list of values, and that’s where the IN operator pairs naturally with a subquery.

Suppose you want every order placed by customers located in California. If customer location lives in a separate customers table, the subquery can pull just the relevant customer IDs: SELECT customer_id FROM customers WHERE state = 'CA'. The outer query then checks orders against that list: SELECT * FROM orders WHERE customer_id IN (SELECT customer_id FROM customers WHERE state = 'CA').

This pattern — IN paired with a subquery returning a list — comes up constantly once you start looking for it. Anytime a business question sounds like “rows where some column matches any value from this other set,” a list-returning subquery is usually the tool being asked for.
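The California example can be tried end to end with a small sketch like the one below. It assumes Python’s sqlite3 module and two made-up tables; the column names follow the text, while the names and rows are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, state TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_amount REAL)"
)
# Hypothetical sample rows: customers 10 and 30 are in California, 20 is not.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(10, "Ana", "CA"), (20, "Ben", "NY"), (30, "Cho", "CA")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10, 50.0), (2, 20, 400.0), (3, 30, 200.0), (4, 20, 75.0)],
)

# The subquery produces the list (10, 30); IN keeps orders whose customer_id is in it.
ca_orders = conn.execute(
    """
    SELECT order_id, customer_id
    FROM orders
    WHERE customer_id IN (SELECT customer_id FROM customers WHERE state = 'CA')
    ORDER BY order_id
    """
).fetchall()

print(ca_orders)  # [(1, 10), (3, 30)]
```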

Step 4: Learn Where Subqueries Are Allowed to Live

Subqueries aren’t limited to WHERE clauses. They can appear in several places, and each placement serves a different purpose.

In the SELECT list, a subquery can add a calculated column to every row of your outer query’s result — for instance, showing each product’s price next to the average price across all products, calculated by a scalar subquery placed directly in the column list.

In the FROM clause, a subquery acts as a temporary table. You write a full SELECT statement, wrap it in parentheses, give it an alias, and the outer query treats it exactly like any other table — filtering it, joining it, or aggregating it further.

In the WHERE clause, as covered above, a subquery filters rows based on a value or a list it calculates.

Recognizing these three placements matters more than memorizing exact syntax for each. Once you can look at a subquery and immediately identify which of these three roles it’s playing, reading unfamiliar queries gets noticeably faster.
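The SELECT-list placement is the one not yet shown as a query, so here is a rough sketch of it. The products table and its three rows are hypothetical, and Python’s sqlite3 module is assumed only as a convenient way to run the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT, price REAL)")
# Hypothetical sample rows; the average price works out to 50.
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "Mug", 10.0), (2, "Lamp", 30.0), (3, "Desk", 110.0)],
)

# A scalar subquery in the column list: the same average appears next to every row.
price_vs_average = conn.execute(
    """
    SELECT name,
           price,
           (SELECT AVG(price) FROM products) AS avg_price
    FROM products
    ORDER BY price
    """
).fetchall()

for row in price_vs_average:
    print(row)
```

Each row carries its own price plus the shared average, which is the kind of side-by-side comparison a plain aggregate query cannot return without collapsing the rows.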

Step 5: Use a Subquery as a Temporary Table

Subqueries in the FROM clause deserve a closer look, since this is where beginners often stall.

Say you want each customer’s total spending, but only for customers who have spent over $500 total. You can’t filter on a SUM directly in a WHERE clause, because WHERE is applied to individual rows before GROUP BY aggregates them. One way around that is to build the aggregation as a subquery: SELECT customer_id, SUM(order_amount) AS total_spent FROM orders GROUP BY customer_id. Wrap that in parentheses, alias it as something like customer_totals, and now your outer query can filter on total_spent directly: SELECT * FROM (SELECT customer_id, SUM(order_amount) AS total_spent FROM orders GROUP BY customer_id) AS customer_totals WHERE total_spent > 500.

This two-stage approach (aggregate first inside the subquery, then filter on that aggregate in the outer query) gets around a rule that trips up a surprising number of intermediate SQL writers: WHERE filters rows before aggregation, while HAVING filters groups after it. Adding HAVING SUM(order_amount) > 500 to the grouped query returns the same rows in a single level. The subquery version instead gives you a clean, already-aggregated table that you can filter with ordinary WHERE logic, and also join or aggregate further.
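As a sketch of the temporary-table pattern, the snippet below runs the customer_totals query from this step and, for comparison, a single-level HAVING version of the same filter. Python’s sqlite3 module and the sample orders are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_amount REAL)"
)
# Hypothetical sample rows: totals are 550 (customer 10), 400 (20), and 600 (30).
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10, 300.0), (2, 10, 250.0), (3, 20, 400.0), (4, 30, 600.0)],
)

# Stage 1 (inner): aggregate per customer. Stage 2 (outer): filter with plain WHERE.
big_spenders = conn.execute(
    """
    SELECT customer_id, total_spent
    FROM (
        SELECT customer_id, SUM(order_amount) AS total_spent
        FROM orders
        GROUP BY customer_id
    ) AS customer_totals
    WHERE total_spent > 500
    ORDER BY customer_id
    """
).fetchall()

# Same filter written in one level with HAVING.
big_spenders_having = conn.execute(
    """
    SELECT customer_id, SUM(order_amount) AS total_spent
    FROM orders
    GROUP BY customer_id
    HAVING SUM(order_amount) > 500
    ORDER BY customer_id
    """
).fetchall()

print(big_spenders)  # [(10, 550.0), (30, 600.0)]
```

Both forms return the same rows here. The derived-table form is mostly useful when the aggregated result needs more work afterward, such as a join or a second aggregation.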

Step 6: Understand Correlated Subqueries

Every subquery covered so far has been independent. It runs once, produces a result, and that result gets used by the outer query. A correlated subquery works differently: it references a column from the outer query, so conceptually it is evaluated once per row of the outer query rather than just once overall.

A common use case: finding every order that’s above average for that specific customer, rather than above the company-wide average. The subquery becomes: SELECT AVG(order_amount) FROM orders o2 WHERE o2.customer_id = o1.customer_id — note the reference to o1, the outer query’s table alias. The full query: SELECT * FROM orders o1 WHERE order_amount > (SELECT AVG(order_amount) FROM orders o2 WHERE o2.customer_id = o1.customer_id).

Because that inner AVG calculation depends on which customer the outer row belongs to, the database conceptually recalculates it for every row being evaluated. Some optimizers can rewrite this into a join or reuse results per customer, but when they can’t, a correlated subquery is slower than an independent one on larger tables. It’s worth knowing this cost exists before reaching for a correlated subquery out of habit rather than necessity.
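Here is a small sketch of the per-customer comparison, again assuming Python’s sqlite3 module and invented sample rows. The aliases o1 and o2 match the ones used in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_amount REAL)"
)
# Hypothetical sample rows. Per-customer averages: 200 (customer 10), 60 (20), 500 (30).
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10, 100.0), (2, 10, 300.0), (3, 20, 50.0), (4, 20, 70.0), (5, 30, 500.0)],
)

# The inner query refers to o1.customer_id, so its result depends on the outer row.
above_own_average = conn.execute(
    """
    SELECT o1.order_id, o1.customer_id, o1.order_amount
    FROM orders o1
    WHERE o1.order_amount > (
        SELECT AVG(o2.order_amount)
        FROM orders o2
        WHERE o2.customer_id = o1.customer_id
    )
    ORDER BY o1.order_id
    """
).fetchall()

print(above_own_average)  # [(2, 10, 300.0), (4, 20, 70.0)]
```

Order 4 is only 70, well under the company-wide average, yet it qualifies because customer 20’s own average is 60. Order 5 is excluded because a customer’s only order can never be above its own average.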

Step 7: Recognize When EXISTS Is a Better Fit Than IN

EXISTS is a variation on the correlated subquery pattern, used specifically to check whether any matching row exists at all, without caring what value that row holds.

Suppose you want every customer who has placed at least one order. Written with EXISTS: SELECT * FROM customers c WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id). The subquery here isn’t returning a meaningful value — SELECT 1 is a common convention specifically because the actual selected value doesn’t matter. EXISTS only cares whether the subquery returns any row at all.

EXISTS can perform better than IN once the list a subquery would generate gets large, since the database can stop looking as soon as it finds one matching row instead of building out a complete list first. Many modern optimizers plan the two forms the same way, though, so treat this as something to measure rather than assume. As a rough rule: reach for IN when you’re checking against a fixed or clearly small list, and reach for EXISTS when you’re really just asking “does at least one related row exist,” particularly on larger tables.
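The EXISTS query from this step can be sketched as follows, assuming Python’s sqlite3 module and made-up customers and orders. Customer 10 has two orders on purpose, to show that EXISTS still returns that customer once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_amount REAL)"
)
# Hypothetical sample rows: Ben (20) has never ordered; Ana (10) has ordered twice.
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(10, "Ana"), (20, "Ben"), (30, "Cho")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10, 50.0), (2, 10, 75.0), (3, 30, 200.0)],
)

# EXISTS is true as soon as the inner query yields any row; the 1 is just a placeholder.
rows = conn.execute(
    """
    SELECT c.name
    FROM customers c
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id)
    ORDER BY c.name
    """
).fetchall()
customers_with_orders = [name for (name,) in rows]

print(customers_with_orders)  # ['Ana', 'Cho']
```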

Step 8: Compare Subqueries Against JOINs for the Same Task

A fair question at this point: if a JOIN can often produce the same result as a subquery, which one should you actually reach for?

JOINs tend to be the better choice when you need columns from both tables displayed together in the final result. Subqueries tend to be the better choice when you only need one table’s data in the output, and the other table is just there to filter or calculate something in the background. The California customer example from Step 3 is a good illustration — the output only needs order columns, so a subquery filtering by customer ID keeps the query focused, rather than joining in a whole customers table just to discard most of its columns afterward.

Neither approach is universally faster or objectively more correct. Query optimizers in modern databases often rewrite one into the other internally anyway. Choosing between them usually comes down to which version reads more clearly for the specific question being asked — and readability, for a query someone else will maintain later, counts for more than people tend to give it credit for.
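To compare the two styles directly, this sketch writes the California filter from Step 3 once as a subquery and once as a JOIN, then checks that they return the same rows. It assumes Python’s sqlite3 module, invented sample data, and a customers table where customer_id is unique, which is what keeps the JOIN from duplicating order rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, state TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_amount REAL)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(10, "Ana", "CA"), (20, "Ben", "NY"), (30, "Cho", "CA")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10, 50.0), (2, 20, 400.0), (3, 30, 200.0), (4, 10, 75.0)],
)

# Subquery version: customers is used only as a filter.
via_subquery = conn.execute(
    """
    SELECT order_id, customer_id, order_amount
    FROM orders
    WHERE customer_id IN (SELECT customer_id FROM customers WHERE state = 'CA')
    ORDER BY order_id
    """
).fetchall()

# JOIN version: customers is joined in, but only order columns are selected.
via_join = conn.execute(
    """
    SELECT o.order_id, o.customer_id, o.order_amount
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
    WHERE c.state = 'CA'
    ORDER BY o.order_id
    """
).fetchall()

print(via_subquery == via_join)  # True
```

If the output also needed the customer’s name next to each order, the JOIN version would only need c.name added to its column list, which is the kind of case where a JOIN reads more naturally.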

Step 9: Practice Reading a Query From the Inside Out

A habit worth building deliberately: whenever a query looks intimidating because of nested parentheses, start reading from the innermost subquery outward rather than top to bottom.

Find the deepest, most nested SELECT statement first. Work out exactly what that piece returns on its own — a single value, a list, or a small table. Then step outward one layer at a time, treating each subquery’s result as a known, fixed input for the layer surrounding it. By the time you reach the outermost SELECT, the whole query has been reduced to a handful of simple, already-understood pieces instead of one overwhelming block of nested logic.

This reading strategy scales well beyond simple two-level nesting. Queries with subqueries inside subqueries inside subqueries look far less alarming once inside-out reading becomes automatic rather than something you have to remind yourself to do.
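One way to practice inside-out reading is to run each layer on its own before running the whole query. The sketch below does that for a two-level nested query, building each layer from the one inside it. The query, the tables, and the rows are all hypothetical, and Python’s sqlite3 module is assumed as the runner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_amount REAL)"
)
# Hypothetical sample rows.
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(10, "Ana"), (20, "Ben"), (30, "Cho")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10, 300.0), (2, 10, 250.0), (3, 20, 400.0), (4, 30, 600.0)],
)

# Layer 1 (innermost): a small table of per-customer totals.
innermost = (
    "SELECT customer_id, SUM(order_amount) AS total_spent "
    "FROM orders GROUP BY customer_id"
)
# Layer 2: a list of customer IDs whose total is over 500.
middle = f"SELECT customer_id FROM ({innermost}) AS customer_totals WHERE total_spent > 500"
# Layer 3 (outermost): the names belonging to those IDs.
full_query = f"SELECT name FROM customers WHERE customer_id IN ({middle}) ORDER BY name"

# Run each layer separately, from the inside out.
step1 = sorted(conn.execute(innermost).fetchall())
step2 = sorted(conn.execute(middle).fetchall())
step3 = conn.execute(full_query).fetchall()

print(step1)  # [(10, 550.0), (20, 400.0), (30, 600.0)]
print(step2)  # [(10,), (30,)]
print(step3)  # [('Ana',), ('Cho',)]
```

Each printed result is the known, fixed input for the layer around it, so by the time the full query runs there is nothing left to guess at.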

Step 10: Build the Habit, Not Just the Syntax

Subqueries reward the same shift that eventually makes every other SQL concept click: replacing memorized syntax patterns with a clear picture of what problem the syntax is solving. A scalar subquery answers a single-value question. A list subquery answers a “matches any of these” question. A FROM-clause subquery builds a temporary table to filter or aggregate further. A correlated subquery answers a question that changes row by row. EXISTS answers a simple yes-or-no question about related data.

Here’s a quick reference for matching a question type to the right subquery pattern:

Question Type                                Subquery Pattern
Compare against a single calculated value    Scalar subquery in WHERE
Match against a list of values               Subquery with IN
Filter or aggregate a derived result set     Subquery in FROM
Answer changes per outer row                 Correlated subquery
Just check if a related row exists           EXISTS

Try picking one query you’ve already written with a JOIN and rewriting it as a subquery instead, just to see how the logic shifts. Which pattern from that table matches the last tricky question you had to answer in SQL?