OFFSET Pagination: 50ms to 45s — Keyset Fix
O(offset) pagination: 50ms to 45s on 5M rows.
- Optimisation priority: 1) correct indexes, 2) query rewrites, 3) schema changes — in that order
- Run EXPLAIN ANALYZE first — never guess, always measure
- SELECT * is lazy and harmful: fetches unused columns, breaks covering indexes, increases network overhead
- N+1 query problem: fetching a list then looping to query each item individually — fix with a single JOIN
- OFFSET pagination is slow at high offsets: use keyset pagination (WHERE id > last_seen_id) instead
- Functions on indexed columns in WHERE clauses silently disable the index — rewrite as range conditions
Every production database eventually hits the same wall: queries that ran fine on test data with 10,000 rows suddenly time out in production with 50 million. Dashboards freeze. APIs return 504s. On-call engineers get paged at 2 AM. The culprit is almost always a query the database is executing inefficiently — either because it is doing unnecessary work, or because it is missing a shortcut (an index) to do the necessary work faster.
Every production database eventually hits the same wall: queries that ran fine on test data with 10,000 rows suddenly time out in production with 50 million. Dashboards freeze. APIs return 504s. On-call engineers get paged at 2 AM. The culprit is almost always a query the database is executing inefficiently.
SQL query optimisation is a disciplined process, not guesswork. The correct workflow: measure with EXPLAIN ANALYZE, identify the most expensive node, apply the targeted fix, measure again. This guide walks through the most common optimisation opportunities in order of impact — starting with the fixes that cost the least effort and deliver the most improvement.
The Optimisation Hierarchy — Always Measure First
Optimisation without measurement is guesswork. The correct workflow: run EXPLAIN ANALYZE on the slow query, identify the most expensive node, apply one targeted fix, run EXPLAIN ANALYZE again to verify the improvement. Never apply multiple changes simultaneously — you need to isolate which change had impact.
The optimisation hierarchy in order of impact and effort:
- Indexes — adding the right index is the highest-impact, lowest-effort fix. A single index can reduce a 30-second query to 2 milliseconds. Always exhaust indexing options before rewriting queries.
- Query rewrites — eliminate unnecessary work: replace SELECT * with column lists, eliminate correlated subqueries, rewrite N+1 patterns as single JOINs, switch OFFSET pagination to keyset pagination.
- Schema changes — denormalisation, partitioning, archiving old data. Higher effort, sometimes necessary for extreme scale, but only after indexing and query rewrites are exhausted.
The tool for all three levels is EXPLAIN ANALYZE — it drives every decision.
- pg_stat_statements identifies your slowest queries by avg execution time
- EXPLAIN ANALYZE shows exactly what the database is doing
- One change at a time — apply, measure, validate before the next change
- An index that looks correct may not be used — EXPLAIN ANALYZE proves whether it is
Common Anti-Patterns and How to Fix Them
Several query patterns consistently cause performance problems at scale. Recognising them by sight lets you fix them immediately.
SELECT * fetches every column from the table, including large text or BLOB columns you may not need. It prevents covering indexes from working (the engine must fetch the heap to get columns not in the index), increases network transfer between database and application, and makes query plans harder to cache effectively. Always list the specific columns you need.
Functions on indexed columns in WHERE silently disable the index. WHERE YEAR(created_at) = 2024 cannot use an index on created_at — the function transforms the value before comparison. The fix is always a range: WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'.
The N+1 problem is the most damaging ORM-related pattern: fetch a list of customers (1 query), then loop through and fetch each customer's orders separately (N queries). On a list of 100 customers, that's 101 queries. On a list of 1000, it's 1001. Fix with a single JOIN or eager loading.
Indexing Strategy for Query Optimisation
Most query optimisation problems are indexing problems. Before rewriting any query, verify that the right indexes exist and are being used.
The process: identify the slow query, run EXPLAIN ANALYZE, look for Seq Scan on the WHERE columns, create the appropriate index, run EXPLAIN ANALYZE again to verify it is used.
Partial indexes are underused but extremely effective: CREATE INDEX idx_pending_orders ON orders(created_at) WHERE status = 'pending'. This index only contains rows where status is 'pending' — it is much smaller than a full index on created_at, fits in cache more easily, and is highly selective for the specific query.
Composite index column order determines usability: put equality conditions first, range conditions last. An index on (status, created_at) supports WHERE status = 'pending' AND created_at > '2024-01-01' using both columns. An index on (created_at, status) would only use the created_at portion for a range query on created_at.
An OFFSET Pagination Query Degraded from 50ms to 45 Seconds as the Table Grew
- OFFSET pagination has O(offset) complexity — it gets slower as users page deeper
- Keyset pagination (cursor-based) is O(1) regardless of position in the result set
- Test pagination performance at page 1,000, not just page 1
include(), joinedload(), or prefetch_related() depending on the framework.Key takeaways
Common mistakes to avoid
3 patternsAdding indexes speculatively without running EXPLAIN ANALYZE first
Using OFFSET pagination on high-volume tables
Ignoring N+1 queries because each individual query is fast
Interview Questions on This Topic
Walk me through how you would diagnose and fix a slow SQL query in production.
Frequently Asked Questions
That's SQL Advanced. Mark it forged?
3 min read · try the examples if you haven't