How SQL Indexes Actually Speed Up Your Queries (And When They Slow You Down Instead)

For a long time, “just add an index” was advice I repeated to others without genuinely understanding the mechanism behind it, treating it as a fix that worked simply because every performance guide said it did. That worked fine until I added an index to a write-heavy table and watched insert performance get meaningfully worse instead of better, which is when I actually had to understand what an index was doing rather than just that it existed.

The Core Mental Model: A Sorted Structure, Not Magic

Without an index, finding rows matching a condition requires the database to check every single row in the table, one at a time, the same way finding one specific fact in a book with no index requires reading every page front to back until you find it.

An index is a separate, sorted structure — commonly a B-tree — that maps column values directly to their physical row locations. This lets the database navigate almost directly to matching rows, the same way a book’s index lets you jump straight to the right page instead of reading the entire book to find one fact.

A Concrete Example of Why This Matters at Scale

Consider a users table with one million rows, and a query filtering for a specific email address. Without an index on that column, the database may need to check up to one million rows in the worst case to find the match. With an index on that same column, the database can navigate the sorted structure in roughly twenty comparisons, since each step through a sorted structure of this kind roughly halves the remaining search space. The practical difference between checking up to a million rows and checking around twenty is exactly why this single change can turn a multi-second query into one returning instantly.

What Actually Gets Indexed and How to Create One

Creating a basic index looks like this: CREATE INDEX idx_users_email ON users(email). This creates a sorted structure specifically for the email column on the users table.

Indexes can also cover multiple columns together, called a composite index, and the order of columns within a composite index genuinely matters. An index on (customer_id, order_date) efficiently serves queries filtering by customer_id alone, or by customer_id and order_date together, but does not efficiently serve a query filtering by order_date alone, since the structure is sorted first by customer_id and only sorted by date within each customer_id grouping.

Why More Indexes Isn’t Automatically Better

This is the part that working-but-not-fully-understanding the concept hid from me for a while. Every index must be updated whenever the underlying data changes through an insert, update, or delete, since the sorted structure needs to stay accurate. This means each additional index adds genuine write overhead and additional storage, even on the rows and columns that particular index has nothing to do with. A table experiencing frequent writes, with many indexes added over time without ever being reviewed, can see meaningfully slower insert and update performance specifically because of this ongoing maintenance cost, exactly what happened on the table that originally prompted me to actually learn this mechanism properly.

When an Index Won’t Actually Help, Even Though It Exists

Applying a function to the indexed column in your WHERE clause — filtering on LOWER(email) rather than email directly, for example — generally cannot use a standard index built on the raw column values, since the index is sorted by the raw values, not by whatever the function would produce.

Using a leading wildcard in a LIKE pattern — searching for anything ending in a specific string, like ‘%gmail.com’ — generally cannot use a standard index efficiently, since the sorted structure has no meaningful starting point to jump to when a match could begin anywhere within the string. A trailing wildcard instead, like ‘john%’, generally can use the index normally, since the structure can jump directly to where values starting with “john” begin.

Indexing a column with very low selectivity — a boolean column where the overwhelming majority of rows share the same value, for example — often provides little practical benefit, since the index doesn’t meaningfully narrow down the search space. The query optimizer will frequently skip an existing index entirely in this situation and perform a full scan anyway, correctly, since that genuinely is faster when most rows match regardless.

How to Check Whether Your Index Is Actually Being Used

Rather than assuming an index is helping simply because it exists, the EXPLAIN command (or EXPLAIN ANALYZE, depending on your specific database) shows the actual execution plan your database chose for a given query, revealing directly whether it performed an index scan or fell back to a full table scan. This is the genuinely reliable way to confirm an index is contributing to a specific query’s performance, rather than guessing based on the index’s mere existence.

A Worked Example

Consider a query finding all orders placed by a specific customer within the last thirty days. Without an index on customer_id, this requires scanning the entire orders table. Adding an index on customer_id lets the optimizer jump directly to that customer’s rows. If this same kind of query frequently filters by both customer_id and order_date together, a single composite index on (customer_id, order_date) serves that specific combined pattern more efficiently than maintaining two separate single-column indexes would.

When to Reach for an Index, and When to Leave It Alone

If a column is frequently used in WHERE clauses, JOIN conditions, or ORDER BY clauses on a table with a meaningful number of rows, and that column has reasonably high selectivity — most values are fairly distinct rather than a small handful of repeated values — adding an index is very likely worth the write overhead it introduces. If a table is small, queried rarely, or the column in question has very low selectivity, an index is more likely to add ongoing maintenance cost without meaningfully improving how fast your actual queries run.

The Investment This Concept Deserves

Once “a sorted structure that lets the database jump directly to matching rows instead of scanning everything” replaced “add an index and things get faster” as my actual mental model, I could reason about specific cases — the low-selectivity column, the leading-wildcard search, the write-heavy table — rather than treating index creation as an unconditional best practice to apply everywhere without checking whether it genuinely fit the situation.

Are you trying to decide whether a specific slow query or a specific table would actually benefit from an index? Describe the query and the table’s rough size and write frequency, and I can help you work through whether an index is the right fix here.