Mid-level 8 min · March 05, 2026
SQL Query Optimisation

OFFSET Pagination: 50ms to 45s — Keyset Fix

O(offset) pagination: 50ms to 45s on 5M rows.

N
Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Drawn from code that ran under real load.

Follow
Production
production tested
May 24, 2026
last updated
1,554
articles · all by Naren
 ● Production Incident 🔎 Debug Guide ⚙ Triage Commands
Quick Answer
  • Optimisation priority: 1) correct indexes, 2) query rewrites, 3) schema changes — in that order
  • Run EXPLAIN ANALYZE first — never guess, always measure
  • SELECT * is lazy and harmful: fetches unused columns, breaks covering indexes, increases network overhead
  • N+1 query problem: fetching a list then looping to query each item individually — fix with a single JOIN
  • OFFSET pagination is slow at high offsets: use keyset pagination (WHERE id > last_seen_id) instead
  • Functions on indexed columns in WHERE clauses silently disable the index — rewrite as range conditions
✦ Definition~90s read
What is SQL Query Optimisation?

SQL query optimisation is the systematic process of reducing the time and resources a database consumes to return a result set. It's not about writing clever SQL — it's about understanding how the database engine actually executes your query, then restructuring the query or schema to exploit that engine's strengths.

Every production database eventually hits the same wall: queries that ran fine on test data with 10,000 rows suddenly time out in production with 50 million.

The core problem it solves is that naive SQL often forces the database to do far more work than necessary: scanning entire tables, building temporary data structures, or reading data from disk when it could be served from memory. In production systems, a single poorly-optimized query can spike CPU to 100%, lock rows for seconds, and cascade into application-wide timeouts — which is why this skill separates junior engineers from senior ones.

The optimization hierarchy starts with measurement, not intuition. Before touching any query, you must capture its execution plan (via EXPLAIN ANALYZE in PostgreSQL, SET STATISTICS IO ON in SQL Server, or EXPLAIN FORMAT=JSON in MySQL) and baseline metrics like latency, rows examined vs. returned, and buffer hits.

Common anti-patterns include non-sargable WHERE clauses (e.g., WHERE YEAR(created_at) = 2024 instead of WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'), missing or wrong indexes, and forcing join algorithms that don't match data sizes. The fix for OFFSET pagination — which degrades from 50ms to 45s as you page deeper — is a classic example: keyset pagination (using `WHERE id > ?

LIMIT ?`) eliminates the need to scan and discard earlier rows, turning a linear-time operation into constant-time.

Indexing strategy is the lever that makes or breaks query performance. A B-tree index on (status, created_at) can turn a 10M-row sequential scan into a few dozen index seeks, but only if the query predicates are left-prefixed and sargable. Join algorithms matter too: nested loop joins excel when one side is tiny (e.g., 100 rows), hash joins dominate when both sides are large and unsorted, and merge joins shine when data is already sorted by the join key.

The real skill is predicting which algorithm the optimizer will choose and coercing it when it guesses wrong — often by adjusting statistics, adding hints, or rewriting the join order. In practice, 80% of query performance problems are solved by proper indexing and sargable predicates; the remaining 20% require understanding execution plans at the operator level.

Plain-English First

Every production database eventually hits the same wall: queries that ran fine on test data with 10,000 rows suddenly time out in production with 50 million. Dashboards freeze. APIs return 504s. On-call engineers get paged at 2 AM. The culprit is almost always a query the database is executing inefficiently — either because it is doing unnecessary work, or because it is missing a shortcut (an index) to do the necessary work faster.

A slow query isn’t a bug — it’s a design failure. SQL query optimisation is the process of reshaping how your database retrieves data, not by rewriting logic, but by understanding execution plans, join algorithms, and indexing strategies. Without it, you’re just throwing hardware at a software problem, and eventually even your 128-core server will choke on a table scan at 3 AM.

What SQL Query Optimisation Actually Is

SQL query optimisation is the process of restructuring queries and schema access patterns to minimise the database's work per result. It's not about writing clever SQL — it's about understanding how the engine reads data: full table scans, index seeks, nested loops, sorts, and memory grants. The core mechanic is reducing the number of rows the database must touch before returning your answer.

At its heart, optimisation exploits three properties: selectivity (how many rows an index eliminates), access path (sequential vs. random I/O), and cardinality estimates (the planner's guess at row counts). A query that returns 10 rows from a 10M-row table can take 50ms with a covering index or 45 seconds with a full scan and sort. The difference is not syntax — it's whether the engine can seek directly to the rows or must read and discard millions.

Use optimisation when latency matters under load. In production, the worst queries are not the complex ones — they're the simple ones that accidentally scan large tables because of a missing index or an unfiltered sort. The goal is always: make the database do O(log n) work instead of O(n).

Indexes Are Not Free
Adding an index speeds reads but slows writes and consumes memory. Optimise for the hot path, not every possible query.
Production Insight
A paginated user list query with OFFSET 1000000 caused a 45-second response under peak load.
The symptom was a growing sort spill to disk as the offset increased, consuming all tempdb I/O.
Rule: never use OFFSET for deep pagination — use keyset (seek) pagination with a WHERE clause on the last seen key.
Key Takeaway
Optimisation is about reducing rows touched, not rewriting SQL.
Understand your index's selectivity and the planner's cardinality estimate.
Measure before and after — a 10x improvement is common; a 2x improvement is often noise.
OFFSET Pagination: 50ms to 45s — Keyset Fix THECODEFORGE.IO OFFSET Pagination: 50ms to 45s — Keyset Fix Diagram showing the performance degradation of OFFSET vs keyset pagination OFFSET Pagination Rows scanned increases with page number Performance Degradation From 50ms to 45s at high offsets Keyset Pagination Uses WHERE on indexed column Stable Performance Constant time regardless of page depth ⚠ OFFSET causes full table scan on each page Replace with keyset pagination using indexed sort column THECODEFORGE.IO
thecodeforge.io
OFFSET Pagination: 50ms to 45s — Keyset Fix
Sql Query Optimisation

The Optimisation Hierarchy — Always Measure First

Optimisation without measurement is guesswork. The correct workflow: run EXPLAIN ANALYZE on the slow query, identify the most expensive node, apply one targeted fix, run EXPLAIN ANALYZE again to verify the improvement. Never apply multiple changes simultaneously — you need to isolate which change had impact.

  1. Indexes — adding the right index is the highest-impact, lowest-effort fix. A single index can reduce a 30-second query to 2 milliseconds. Always exhaust indexing options before rewriting queries.
  2. Query rewrites — eliminate unnecessary work: replace SELECT * with column lists, eliminate correlated subqueries, rewrite N+1 patterns as single JOINs, switch OFFSET pagination to keyset pagination.
  3. Schema changes — denormalisation, partitioning, archiving old data. Higher effort, sometimes necessary for extreme scale, but only after indexing and query rewrites are exhausted.

The tool for all three levels is EXPLAIN ANALYZE — it drives every decision.

optimisation_workflow.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
-- STEP 1: Identify slow queries with pg_stat_statements
-- (requires pg_stat_statements extension)
SELECT
    query,
    calls,
    total_exec_time / calls AS avg_ms,
    rows / calls             AS avg_rows
FROM pg_stat_statements
ORDER BY avg_ms DESC
LIMIT 10;

-- STEP 2: Run EXPLAIN ANALYZE on the worst offender
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.order_id, c.name, o.total
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
WHERE o.status = 'pending'
  AND o.created_at > NOW() - INTERVAL '24 hours';

-- STEP 3: Apply targeted fix based on plan
-- If Seq Scan on orders(status, created_at):
CREATE INDEX idx_orders_status_created
    ON orders(status, created_at DESC)
    WHERE status = 'pending';  -- partial index: only pending rows

-- STEP 4: Verify improvement
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.order_id, c.name, o.total
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
WHERE o.status = 'pending'
  AND o.created_at > NOW() - INTERVAL '24 hours';
Output
-- Before index: Seq Scan, actual time=4820ms
-- After partial index: Index Scan using idx_orders_status_created
-- actual time=3.2ms
-- 1,500x improvement from one targeted index
The EXPLAIN-First Rule
  • pg_stat_statements identifies your slowest queries by avg execution time
  • EXPLAIN ANALYZE shows exactly what the database is doing
  • One change at a time — apply, measure, validate before the next change
  • An index that looks correct may not be used — EXPLAIN ANALYZE proves whether it is
Production Insight
pg_stat_statements is the most valuable extension in PostgreSQL — install it on every production database and query it weekly to find slow query regressions.
Apply optimisations one at a time — multiple simultaneous changes make it impossible to attribute improvement.
Always measure the before and after: note the actual time from EXPLAIN ANALYZE before and after each change.
Key Takeaway
Measure first with EXPLAIN ANALYZE — never optimise by intuition.
The hierarchy: indexes first, query rewrites second, schema changes last.
One change at a time — isolate impact before the next fix.

Common Anti-Patterns and How to Fix Them

Several query patterns consistently cause performance problems at scale. Recognising them by sight lets you fix them immediately.

SELECT * fetches every column from the table, including large text or BLOB columns you may not need. It prevents covering indexes from working (the engine must fetch the heap to get columns not in the index), increases network transfer between database and application, and makes query plans harder to cache effectively. Always list the specific columns you need.

Functions on indexed columns in WHERE silently disable the index. WHERE YEAR(created_at) = 2024 cannot use an index on created_at — the function transforms the value before comparison. The fix is always a range: WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'.

The N+1 problem is the most damaging ORM-related pattern: fetch a list of customers (1 query), then loop through and fetch each customer's orders separately (N queries). On a list of 100 customers, that's 101 queries. On a list of 1000, it's 1001. Fix with a single JOIN or eager loading.

anti_patterns_fixed.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
-- ANTI-PATTERN 1: SELECT *
-- BAD: fetches all columns including large blobs, breaks covering indexes
SELECT * FROM orders WHERE customer_id = 42;

-- GOOD: only the columns you need
SELECT order_id, total, status, created_at
FROM orders
WHERE customer_id = 42;

-- ANTI-PATTERN 2: Function on indexed column
-- BAD: YEAR() disables the index on created_at
SELECT * FROM orders WHERE YEAR(created_at) = 2024;

-- GOOD: range query uses the index
SELECT order_id, total FROM orders
WHERE created_at >= '2024-01-01'
  AND created_at <  '2025-01-01';

-- ANTI-PATTERN 3: N+1 queries
-- BAD: one query for customers, then one per customer for orders
-- SELECT * FROM customers;  -- returns 1000 customers
-- then for each customer:
-- SELECT * FROM orders WHERE customer_id = :id;  -- 1000 separate queries!

-- GOOD: single JOIN fetches everything in one query
SELECT
    c.customer_id, c.name,
    o.order_id, o.total, o.created_at
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.customer_id
WHERE c.customer_id IN (1, 2, 3, ..., 1000);  -- or use a subquery

-- ANTI-PATTERN 4: OFFSET pagination
-- BAD: OFFSET 50000 scans and discards 50000 rows
SELECT order_id, total FROM orders ORDER BY order_id LIMIT 20 OFFSET 50000;

-- GOOD: keyset pagination -- O(1) regardless of position
SELECT order_id, total FROM orders
WHERE order_id < :last_seen_id  -- last ID from previous page
ORDER BY order_id DESC
LIMIT 20;
Output
-- N+1 fix: 1001 queries -> 1 query
-- OFFSET at page 2500: 50000ms -> 2ms with keyset pagination
-- SELECT * -> named columns: query plan becomes cacheable, covering index usable
N+1 Queries Are an ORM Silent Failure
ORMs make N+1 queries easy to write accidentally — loading a list of objects and accessing a relationship inside a loop triggers a query per iteration. Enable query logging in development and count queries per request — any endpoint making >10 database queries for a single page load is a candidate for N+1 review. The fix is almost always eager loading (include/prefetch) or restructuring to a single JOIN.
Production Insight
SELECT * is the single easiest performance improvement in a legacy codebase — grep for it, replace with column lists, and watch query times drop immediately.
Offset pagination is one of the most common hidden performance time bombs — it looks fine in development (page 1) but degrades silently in production as data grows and users page deeper.
Enable pg_stat_statements and monitor the top 10 slowest queries weekly — most production databases have 3-5 queries that consume 80% of query time.
Key Takeaway
SELECT * wastes bandwidth, breaks covering indexes, and prevents query plan caching.
Functions on WHERE columns kill index usage — always use range conditions instead.
N+1 = 1001 queries instead of 1 — the most common ORM performance trap, fixed with a single JOIN.

Indexing Strategy for Query Optimisation

Most query optimisation problems are indexing problems. Before rewriting any query, verify that the right indexes exist and are being used.

The process: identify the slow query, run EXPLAIN ANALYZE, look for Seq Scan on the WHERE columns, create the appropriate index, run EXPLAIN ANALYZE again to verify it is used.

Partial indexes are underused but extremely effective: CREATE INDEX idx_pending_orders ON orders(created_at) WHERE status = 'pending'. This index only contains rows where status is 'pending' — it is much smaller than a full index on created_at, fits in cache more easily, and is highly selective for the specific query.

Composite index column order determines usability: put equality conditions first, range conditions last. An index on (status, created_at) supports WHERE status = 'pending' AND created_at > '2024-01-01' using both columns. An index on (created_at, status) would only use the created_at portion for a range query on created_at.

indexing_strategy.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
-- DIAGNOSIS: find tables with the most sequential scans
SELECT
    relname         AS table_name,
    seq_scan,
    seq_tup_read,
    idx_scan,
    seq_scan - idx_scan AS scan_gap
FROM pg_stat_user_tables
WHERE seq_scan > idx_scan
  AND n_live_tup > 10000
ORDER BY seq_tup_read DESC;

-- Standard composite index: equality first, range last
CREATE INDEX idx_orders_status_created
    ON orders(status, created_at DESC);

-- Partial index: only the rows you query (much smaller)
CREATE INDEX idx_active_sessions
    ON sessions(user_id, last_active)
    WHERE is_active = true;  -- only active sessions indexed

-- Covering index: all SELECT columns in the index
CREATE INDEX idx_order_summary
    ON orders(customer_id, status)
    INCLUDE (order_id, total, created_at);  -- SELECT columns in leaf nodes

-- Index for sort: ORDER BY without a filesort
CREATE INDEX idx_orders_created_desc
    ON orders(created_at DESC);
-- Now: SELECT order_id FROM orders ORDER BY created_at DESC LIMIT 20
-- uses index scan in sorted order -- no Sort node in EXPLAIN

-- Verify index usage after creation
SELECT indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE relname = 'orders'
ORDER BY idx_scan DESC;
Output
-- pg_stat_user_tables: tables with high seq scan rate
table_name | seq_scan | idx_scan | scan_gap
order_events | 184020 | 1200 | 182820 <- needs index audit
-- After covering index:
IndexName | idx_scan
idx_order_summary | 45200 <- being used heavily
Use pg_stat_user_tables to Find Tables That Need Indexes
The pg_stat_user_tables view tracks seq_scan and idx_scan counts since the last statistics reset. Tables where seq_scan >> idx_scan on large tables are candidates for index review. Sort by seq_tup_read DESC to find the tables generating the most sequential read work.
Production Insight
The pg_stat_user_tables seq_scan/idx_scan ratio is the best passive indicator for tables that need indexing — check it weekly.
Partial indexes often outperform full indexes for queries with a common equality filter — they are smaller, fit in cache better, and are more selective.
After adding an index, wait 1-2 hours of production traffic then check idx_scan in pg_stat_user_indexes — if it is zero, the index is not being used and should be investigated or dropped.
Key Takeaway
Index the columns your WHERE clauses filter on — composite index order matters (equality first, range last).
Partial indexes for common filter values are smaller and faster than full-column indexes.
Verify every new index is being used with pg_stat_user_indexes — an unused index is pure write overhead.

Join Algorithms — Nested Loop, Hash Join, Merge Join

When you write a JOIN between two tables, the query planner chooses an algorithm to combine rows. Knowing how each algorithm works helps you interpret EXPLAIN output and understand why some joins are orders of magnitude slower than others.

Nested Loop Join: For each row in the outer table, scan the inner table to find matches. If the inner side has an index, it becomes a fast index lookup — ideal when one side is small. Without an index, it degenerates into a full scan per outer row, leading to O(n * m) cost. EXPLAIN shows 'Nested Loop' and the loops count reveals how many times the inner side was scanned.

Hash Join: Build a hash table from the smaller table, then scan the larger table and probe the hash. Works best for equi-joins on large sets when one side fits in memory. The planner chooses this when tables are large and an index is not expected to help. EXPLAIN will show 'Hash Join' with a 'Hash' child node.

Merge Join: Sort both tables on the join key (or exploit existing index order) then merge them in a single pass. Preferable when both sides are already sorted (e.g., by index), or when the join result needs to be sorted. Without pre-sorted data, the sort cost can dominate. EXPLAIN shows 'Merge Join' with optional 'Sort' children.

The planner picks the algorithm based on estimated row counts, available indexes, and system configuration. When you see an unexpected choice, investigate statistics freshness or consider forcing a different join with join_collapse_limit (PostgreSQL) or optimizer hints.

join_algorithms.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
-- Example: small orders table (10 rows) joined to large customers (1M rows)
-- Planner likely chooses Nested Loop with index on customers.customer_id
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.order_id, c.name
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
WHERE o.status = 'pending';
-- Plan: Nested Loop
--   -> Seq Scan on orders (cost=... rows=10)
--   -> Index Scan using customers_pkey on customers (cost=... rows=1)
-- loops: each outer row triggers one index lookup = 10 lookups total.

-- Example: large orders (5M) joined to large customers (2M) with no index on fk
-- Planner switches to Hash Join to avoid Nested Loop blowup
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.order_id, c.name
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
WHERE o.created_at > '2024-01-01';
-- Plan: Hash Join
--   -> Seq Scan on orders (filter: created_at)  -- build inner table
--   -> Hash
--         -> Seq Scan on customers (full scan to build hash)
-- Then probe hash for each matching order row.

-- When both sides are large and already sorted by the join key:
-- Merge Join avoids building a hash and handles range joins well.
-- Ensure an index exists on the join keys to skip explicit sort.
Output
-- Nested Loop with 10 outer rows: 10 index lookups, fast (<1ms)
-- Nested Loop with 100k outer rows and no inner index: 100k * full scans = disaster (minutes)
-- Hash Join: builds hash (100ms) + probe (10ms) = good for equi-joins
-- Merge Join: no hash building, but sort overhead if not indexed
Production Insight
When a production join suddenly slows down after data growth, check if the planner switched from a Hash Join to a Nested Loop (or vice versa). A Nested Loop with high loops count and no index on the inner side is the number one join performance killer. Fix by adding an index on the foreign key column.
Key Takeaway
Nested Loop is efficient for small outer sets with inner index; Hash Join scales to large equi-joins; Merge Join shines when data is pre-sorted. Know which algorithm your EXPLAIN shows.

Sargability — Search Argumentable Conditions

Sargability (Search ARGument ABLE) is the property that determines whether a WHERE condition can use an index. If the column appears inside a function or is wrapped in an expression, the index is unusable. The database must evaluate the function on every row, forcing a full table scan.

  1. Functions on columns: WHERE YEAR(created_at) = 2024 → fix: WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'
  2. Arithmetic on columns: WHERE price * 1.1 > 100 → fix: WHERE price > 100 / 1.1
  3. Implicit type conversion: WHERE varchar_col = 12345 → fix: WHERE varchar_col = '12345' (use explicit literal type)
  4. LIKE patterns with leading wildcard: WHERE name LIKE '%john%' → cannot be made sargable; consider full-text search.
  5. Subqueries in WHERE NOT IN: WHERE id NOT IN (SELECT ...) → often rewritten as NOT EXISTS for better performance.

The checklist: before writing a WHERE clause

sargable_conditions.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
-- BAD (not sargable): function on column
SELECT * FROM orders WHERE EXTRACT(YEAR FROM created_at) = 2024;

-- GOOD (sargable): range condition
SELECT * FROM orders WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';

-- BAD: arithmetic on column
SELECT * FROM products WHERE price * 1.1 > 50;

-- GOOD: move arithmetic to constant side
SELECT * FROM products WHERE price > 50 / 1.1;

-- BAD: implicit type conversion (varchar column compared to integer)
SELECT * FROM users WHERE user_id = 12345;  -- user_id is VARCHAR

-- GOOD: match the column type
SELECT * FROM users WHERE user_id = '12345';

-- BAD: leading wildcard (never sargable)
SELECT * FROM customers WHERE name LIKE '%smith%';

-- Alternative: full-text search with GIN index
-- CREATE INDEX idx_customers_name_trgm ON customers USING gin(name gin_trgm_ops);

-- BAD: NOT IN with subquery (often becomes anti-join)
SELECT * FROM orders WHERE customer_id NOT IN (SELECT id FROM suspended_customers);

-- GOOD: NOT EXISTS (often better plan)
SELECT * FROM orders WHERE NOT EXISTS (SELECT 1 FROM suspended_customers WHERE id = orders.customer_id);
Output
-- Before fix: YEAR(created_at)=2024 uses Seq Scan (50M rows scanned, 20s)
-- After fix: created_at range uses Index Scan (ms)
-- NOT IN -> NOT EXISTS: plan changes from HashAggregate to Anti Join, much faster
Sargability Checklist
  • No functions on the indexed column (YEAR, DATE_TRUNC, UPPER, etc.)
  • No arithmetic on the indexed column (price * 1.1, total + tax)
  • Literal types exactly match column types (avoid implicit conversion)
  • LIKE patterns start without a wildcard (name LIKE 'prefix%')
  • Use range conditions instead of BETWEEN? BETWEEN is sargable on dates
  • Prefer NOT EXISTS over NOT IN when subquery may have NULLs
Production Insight
Sargability violations are the silent killer of index usage. A single TO_CHAR(created_at, 'YYYY-MM') in a dashboard query can bring a 50M-row table to its knees. Code review every new query for sargability — it takes 10 seconds to spot and saves hours of debugging.
Key Takeaway
A WHERE clause is sargable if the indexed column appears alone on one side of the operator, without functions, arithmetic, or type conversion.

Selectivity vs Cardinality — How the Planner Chooses

The query planner lives and dies by its estimates of row counts. Two concepts drive those estimates: cardinality and selectivity.

Cardinality is the number of distinct values in a column. A primary key has cardinality equal to the row count. A boolean column has cardinality 2. High cardinality (many distinct values) means an index scan is likely efficient — few rows match each value. Low cardinality (few distinct values) means an index scan is wasteful because many rows match each value and a sequential scan would be faster.

Selectivity is the fraction of rows that pass a filter. A predicate like status = 'pending' on a table where 90% of rows have status 'pending' has low selectivity (0.9). A predicate like id = 42 on a primary key has high selectivity (1/rowcount). The planner multiplies selectivity by total row count to estimate row output.

The planner uses these estimates to choose join algorithms and access methods: - High selectivity (small fraction of rows): index scan is preferred. - Low selectivity (large fraction of rows): sequential scan is preferred even if an index exists. - For joins, the planner uses estimated cardinalities to choose build vs probe side in hash joins, and whether to sort for merge joins.

When statistics are stale, selectivity estimates are wrong, leading to bad plans. Always run ANALYZE after bulk inserts or after noticing a regression.

selectivity_cardinality.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
-- Check cardinality and selectivity for a column
SELECT
    n_distinct AS distinct_values,
    CASE
        WHEN n_distinct > 0 THEN round(reltuples / n_distinct)::bigint
        ELSE NULL
    END AS avg_rows_per_value,
    reltuples AS total_rows,
    avg_width
FROM pg_class c
JOIN pg_attribute a ON a.attrelid = c.oid
JOIN pg_stats s ON s.tablename = c.relname AND s.attname = a.attname
WHERE c.relname = 'orders' AND a.attname = 'status';

-- Estimate selectivity: fraction of rows matching a value
-- For status = 'pending', if n_distinct = 3 and reltuples = 5,000,000
-- selectivity = 1/3 ≈ 0.33, estimated rows = 1.67M
-- The planner will choose Seq Scan because 33% of table is too many for index.

-- Verify statistics are current:
SELECT relname, n_live_tup, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'orders';
Output
-- orders.status: n_distinct = 3, reltuples = 5M, avg_rows_per_value = 1.67M
-- Since 1.67M >> index threshold, planner uses Seq Scan for status filters
-- Contrast orders.order_id: n_distinct = 5M, avg_rows_per_value = 1
-- index scan is always chosen for equality on order_id
Cardinality Drives Planner Choices
  • High cardinality (e.g., primary key): few rows per value → index scan is cheap and effective
  • Low cardinality (e.g., boolean): many rows per value → index scan may be slower than sequential scan
  • The planner decides based on estimated rows matching, not on whether an index exists
  • Stale statistics cause wrong cardinality estimates → broken query plans
Production Insight
When a query on a low-cardinality column (like status or is_active) suddenly slows down, check if a bad plan is being cached (PostgreSQL plans are not cached per query text; but parameterized queries can be re-planned). More often, a data skew issue: 99% of rows have status='completed' and 1% have 'pending', but the planner thinks it's evenly distributed. Use custom statistics (CREATE STATISTICS) to give the planner histograms on correlated columns.
Key Takeaway
Selectivity (fraction of rows matching a predicate) determines whether the planner uses an index or sequential scan. Cardinality (distinct values) drives the selectivity estimate. Keep statistics fresh.
Planner Decision Flow Based on Selectivity
Yes: High selectivityNo: Low selectivityQuery with WHERE filterEstimate selectivityselectivity < 0.05?Use Index ScanUse Seq ScanJoin?Hash Join if large both sides,Merge Join if sorted,else Hash Join

Avoid `SELECT *` Like It's a Production Pager at 3 AM

SELECT * is the duct tape of SQL. It works until it doesn't. When you blast every column back to the app, you're burning I/O, memory, and network bandwidth on data nobody asked for.

The real cost? Index-only scans die. The moment you add a column that's not in a covering index, the planner falls back to clustered index scans or heap lookups. Your carefully crafted composite index becomes a waste of disk space.

The fix is surgical: name the columns you actually need. If the query changes, update the list. Yes, it's more typing. Delete your ORM magic and write the damn column list.

Bonus: Your code becomes self-documenting. Future you won't have to grep the application layer to figure out what the query actually returns.

SelectStarPostmortem.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
// io.thecodeforge — database tutorial

-- Before: 17 columns, 3 MB of data, 480 ms
SELECT *
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date >= '2024-01-01';

-- After: 5 columns, 240 KB of data, 62 ms
SELECT o.order_id, o.total, o.status,
       c.customer_name, c.email
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date >= '2024-01-01';
Output
-- Before: 480 ms, 1,200,000 rows, 3 MB transferred
-- After: 62 ms, 1,200,000 rows, 240 KB transferred
-- Speedup: 7.7x, bandwidth savings: 92%
Production Trap:
ORMs like Entity Framework and Hibernate love SELECT *. One dev adds a BLOB column to the table, and suddenly your 'working' endpoint takes 30 seconds. Always inspect the generated SQL in production.
Key Takeaway
Name every column explicitly. If the list is long, that's a smell — maybe your table is a god object.

Kill Correlated Subqueries Before They Kill Your SLA

Correlated subqueries look elegant in a textbook. In production, they're a row-by-row operations disguised as SQL. The database executes the inner query for every single row in the outer result set. For a table with 100K rows, that's 100K executions.

The fix is almost always a JOIN or a window function. JOINs let the optimizer build a hash or merge plan. Window functions compute aggregates over ordered partitions in a single pass. Both scale logarithmically or linearly, not quadratically.

If you must use a subquery, verify it gets flattened into a JOIN by checking the execution plan. If you see "DEPENDENT SUBQUERY" in EXPLAIN output, you've got a problem.

Pro tip: EXISTS often outperforms IN for correlated subqueries because it short-circuits on the first match. But nothing beats rewriting it as a proper JOIN.

SubquerySlayer.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
// io.thecodeforge — database tutorial

-- Painful: Correlated subquery, 4.2 seconds
SELECT o.order_id, o.total
FROM orders o
WHERE o.total > (
    SELECT AVG(o2.total)
    FROM orders o2
    WHERE o2.customer_id = o.customer_id
);

-- Lean: Window function, 0.3 seconds
WITH customer_avg AS (
    SELECT order_id, total,
           AVG(total) OVER (PARTITION BY customer_id) AS avg_total
    FROM orders
)
SELECT order_id, total
FROM customer_avg
WHERE total > avg_total;
Output
-- Before: 4200 ms, nested loop with dependent subquery
-- After: 300 ms, single sequential scan + window aggregate
-- Speedup: 14x
Senior Shortcut:
Any time you write a subquery that references a column from the outer query, stop. Rewrite it as a JOIN or window function. Your DBA will buy you a beer.
Key Takeaway
Correlated subqueries are anti-patterns. Use window functions or JOINs instead. Check EXPLAIN for 'DEPENDENT SUBQUERY' and treat it as a bug.

EXISTS Over IN — The Short-Circuit That Saves Milliseconds

IN with a subquery looks clean. But the database materializes the entire subquery result set into a temporary structure before checking each row. For a subquery returning 50K rows, that's 50K values in memory before the first row comparison.

EXISTS is lazy. It returns TRUE the instant it finds a single matching row. No materialization. No temp table. The planner can also use semi-join strategies, pushing predicates down into the subquery before execution.

The performance gap widens with NULL handling. IN behaves weirdly with NULLs — an empty set combined with NULL returns FALSE, not NULL. EXISTS avoids this entire class of bugs.

Exception: If the inner query has a DISTINCT or aggregate, IN might match your intent better. But even then, consider rewriting as a JOIN with DISTINCT once and test both paths.

ExistsOverIn.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
// io.thecodeforge — database tutorial

-- Slow: IN materializes 50K IDs
SELECT customer_name, email
FROM customers
WHERE customer_id IN (
    SELECT customer_id
    FROM orders
    WHERE order_total > 1000
);

-- Fast: EXISTS short-circuits
SELECT customer_name, email
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
    AND o.order_total > 1000
);
Output
-- IN: 1240 ms, Materialize 53,412 IDs, Hash Join
-- EXISTS: 180 ms, Nested Loop Semi-Join, stops at first match per row
-- Speedup: 6.9x
Quick Rule:
Use EXISTS for correlated checks (does a row exist?). Use IN only for static, small lists like WHERE status IN ('active', 'pending'). Never mix them in the same query.
Key Takeaway
EXISTS outperforms IN for subqueries because it short-circuits on the first match. Default to EXISTS unless you need to handle NULLs explicitly.
● Production incidentPOST-MORTEMseverity: high

An OFFSET Pagination Query Degraded from 50ms to 45 Seconds as the Table Grew

Symptom
Customer support reported that the orders history page was timing out for long-time customers with many orders. New customers saw no issue. The problem grew linearly worse over time.
Assumption
OFFSET is a standard SQL feature for pagination. The developer assumed performance was consistent across all page numbers.
Root cause
LIMIT 20 OFFSET 10000 requires the database to scan and discard 10,000 rows before returning 20. At OFFSET 10,000 on a 5-million-row table, the query performs a significant portion of a full table scan on every page load. This is O(offset) complexity — each page deeper into the result set is slower.
Fix
Rewrote to keyset pagination: WHERE order_id < :last_seen_id ORDER BY order_id DESC LIMIT 20. The query uses the primary key index and returns any page in consistent sub-millisecond time regardless of how deep into the result set.
Key lesson
  • OFFSET pagination has O(offset) complexity — it gets slower as users page deeper
  • Keyset pagination (cursor-based) is O(1) regardless of position in the result set
  • Test pagination performance at page 1,000, not just page 1
Production debug guideThe systematic approach — measure first, fix second4 entries
Symptom · 01
Slow query — need to identify root cause
Fix
Run EXPLAIN ANALYZE on the exact query. Start from the innermost node. Look for: Seq Scan on large tables (missing index), loops=N > 1000 (N+1 or Nested Loop), Sort without index backing, estimated rows far from actual rows (stale statistics).
Symptom · 02
EXPLAIN shows correct plan but query is still slow
Fix
Add BUFFERS to EXPLAIN: EXPLAIN (ANALYZE, BUFFERS). High blks_hit / (blks_hit + blks_read) ratio = data mostly in cache (fast). Low ratio = reading from disk (slow). If disk-bound, consider: more RAM for shared_buffers, partitioning, or archiving old data.
Symptom · 03
Query was fast last week but slow now with no code changes
Fix
Data volume change or statistics staleness. Check: SELECT n_live_tup, last_autoanalyze FROM pg_stat_user_tables WHERE relname = 'table'. Run ANALYZE if stale. Check if row count grew dramatically — a plan optimised for 100K rows may be wrong for 1M rows.
Symptom · 04
Application has N+1 queries — one query per item in a list
Fix
Identify the pattern in application logs: the same query appearing N times with different ID values. Rewrite as a single JOIN or IN clause. For ORMs, use eager loading: include(), joinedload(), or prefetch_related() depending on the framework.
★ Query Optimisation Quick WinsThe highest-impact changes in order of effort
Slow query on a large table
Immediate action
EXPLAIN ANALYZE — identify the most expensive node
Commands
EXPLAIN ANALYZE SELECT ...your slow query...;
SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'your_table';
Fix now
If Seq Scan on WHERE column: CREATE INDEX idx_table_col ON table(col);
Pagination query slowing down at high page numbers+
Immediate action
Switch from OFFSET to keyset pagination
Commands
-- BEFORE: LIMIT 20 OFFSET 10000 (slow, scans 10000 rows)
-- AFTER: WHERE id < :last_seen_id ORDER BY id DESC LIMIT 20
Fix now
Index on the keyset column is required: CREATE INDEX idx_orders_id ON orders(id);
Anti-PatternProblemImpactFix
SELECT *Fetches all columns including unused and large onesHigh network transfer, breaks covering indexesList specific columns in SELECT
Function on WHERE columnDisables the index on that columnFull table scan even with indexRewrite as range: WHERE col >= x AND col < y
N+1 queriesOne query per item in a loop100-1000x more queries than necessarySingle JOIN or ORM eager loading
OFFSET paginationScans and discards offset rowsO(offset) complexity — degrades with page depthKeyset pagination: WHERE id < last_seen
Correlated subquery in WHERERe-executes inner query per outer rowO(n×m) complexityReplace with JOIN to pre-aggregated CTE
Missing index on FK columnFull scan on child table for every parent DELETELocks and slow deletionsCREATE INDEX on every foreign key column

Key takeaways

1
Measure with EXPLAIN ANALYZE before every optimisation
never guess at the cause of a slow query.
2
Indexes first
a single well-chosen index is typically a 100-1000x improvement over any query rewrite.
3
SELECT *, N+1 queries, and OFFSET pagination are the three most common query anti-patterns
fix these first in any legacy codebase.
4
pg_stat_statements is the best production monitoring tool for SQL performance
install it and query it weekly.

Common mistakes to avoid

3 patterns
×

Adding indexes speculatively without running EXPLAIN ANALYZE first

Symptom
Indexes are added based on guesses about which columns are slow — some are used, some are not; write performance degrades but the specific slow query may not improve
Fix
Always run EXPLAIN ANALYZE on the specific slow query before adding an index. The plan tells you exactly which columns need indexing. Unused indexes identified by idx_scan = 0 in pg_stat_user_indexes should be dropped.
×

Using OFFSET pagination on high-volume tables

Symptom
Page 1 of search results loads in 50ms, page 500 takes 45 seconds — the problem grows worse as data accumulates and users page deeper
Fix
Switch to keyset pagination: WHERE sort_col < :last_value ORDER BY sort_col DESC LIMIT n. Requires an index on the sort column and storing the last seen value from the previous page. This is O(1) regardless of position.
×

Ignoring N+1 queries because each individual query is fast

Symptom
Each query takes 2ms but 1,000 queries per page load = 2,000ms response time; the problem compounds with user volume
Fix
Enable query logging and count queries per request. Any endpoint making >10 queries for a single page load deserves review. Rewrite as a single JOIN with all needed data. In ORMs, use eager loading to prefetch relationships.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR
Walk me through how you would diagnose and fix a slow SQL query in produ...
Q02SENIOR
What is the N+1 query problem and how do you detect and fix it?
Q03SENIOR
Why is OFFSET pagination slow at large offsets, and what is the alternat...
Q01 of 03SENIOR

Walk me through how you would diagnose and fix a slow SQL query in production.

ANSWER
My first step is always measurement, not guessing. I run EXPLAIN ANALYZE on the exact query from production — including the actual parameter values if possible. I read the plan from the innermost node outward, looking for the most expensive node by actual time. The most common findings: a Seq Scan on a large table (usually a missing index or a function on an indexed column), a Nested Loop with loops=N in the thousands (often stale statistics causing the planner to underestimate table size), or a Sort node without index backing. I apply one fix at a time — typically adding an index first since it is the highest-impact lowest-effort change. I run EXPLAIN ANALYZE again to verify the plan improved and the estimated-vs-actual rows are now aligned. If the plan looks correct but the query is still slow, I add BUFFERS to check cache hit ratios, which reveals whether the problem is disk I/O.
FAQ · 3 QUESTIONS

Frequently Asked Questions

01
What is pg_stat_statements and how do I use it?
02
How do I find queries that are doing too many sequential scans?
03
Is query optimisation different between PostgreSQL and MySQL?
N
Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Drawn from code that ran under real load.

Follow
Verified
production tested
May 24, 2026
last updated
1,554
articles · all by Naren
🔥

That's SQL Advanced. Mark it forged?

8 min read · try the examples if you haven't

Previous
SQL CTEs — Common Table Expressions
8 / 16 · SQL Advanced
Next
SQL EXPLAIN and Execution Plans