Mid-level 3 min · March 05, 2026

OFFSET Pagination: 50ms to 45s — Keyset Fix

O(offset) pagination: 50ms to 45s on 5M rows.

N
Naren · Founder
Plain-English first. Then code. Then the interview question.
About
 ● Production Incident 🔎 Debug Guide
Quick Answer
  • Optimisation priority: 1) correct indexes, 2) query rewrites, 3) schema changes — in that order
  • Run EXPLAIN ANALYZE first — never guess, always measure
  • SELECT * is lazy and harmful: fetches unused columns, breaks covering indexes, increases network overhead
  • N+1 query problem: fetching a list then looping to query each item individually — fix with a single JOIN
  • OFFSET pagination is slow at high offsets: use keyset pagination (WHERE id > last_seen_id) instead
  • Functions on indexed columns in WHERE clauses silently disable the index — rewrite as range conditions
Plain-English First

Every production database eventually hits the same wall: queries that ran fine on test data with 10,000 rows suddenly time out in production with 50 million. Dashboards freeze. APIs return 504s. On-call engineers get paged at 2 AM. The culprit is almost always a query the database is executing inefficiently — either because it is doing unnecessary work, or because it is missing a shortcut (an index) to do the necessary work faster.

Every production database eventually hits the same wall: queries that ran fine on test data with 10,000 rows suddenly time out in production with 50 million. Dashboards freeze. APIs return 504s. On-call engineers get paged at 2 AM. The culprit is almost always a query the database is executing inefficiently.

SQL query optimisation is a disciplined process, not guesswork. The correct workflow: measure with EXPLAIN ANALYZE, identify the most expensive node, apply the targeted fix, measure again. This guide walks through the most common optimisation opportunities in order of impact — starting with the fixes that cost the least effort and deliver the most improvement.

The Optimisation Hierarchy — Always Measure First

Optimisation without measurement is guesswork. The correct workflow: run EXPLAIN ANALYZE on the slow query, identify the most expensive node, apply one targeted fix, run EXPLAIN ANALYZE again to verify the improvement. Never apply multiple changes simultaneously — you need to isolate which change had impact.

  1. Indexes — adding the right index is the highest-impact, lowest-effort fix. A single index can reduce a 30-second query to 2 milliseconds. Always exhaust indexing options before rewriting queries.
  2. Query rewrites — eliminate unnecessary work: replace SELECT * with column lists, eliminate correlated subqueries, rewrite N+1 patterns as single JOINs, switch OFFSET pagination to keyset pagination.
  3. Schema changes — denormalisation, partitioning, archiving old data. Higher effort, sometimes necessary for extreme scale, but only after indexing and query rewrites are exhausted.

The tool for all three levels is EXPLAIN ANALYZE — it drives every decision.

optimisation_workflow.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
-- STEP 1: Identify slow queries with pg_stat_statements
-- (requires pg_stat_statements extension)
SELECT
    query,
    calls,
    total_exec_time / calls AS avg_ms,
    rows / calls             AS avg_rows
FROM pg_stat_statements
ORDER BY avg_ms DESC
LIMIT 10;

-- STEP 2: Run EXPLAIN ANALYZE on the worst offender
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.order_id, c.name, o.total
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
WHERE o.status = 'pending'
  AND o.created_at > NOW() - INTERVAL '24 hours';

-- STEP 3: Apply targeted fix based on plan
-- If Seq Scan on orders(status, created_at):
CREATE INDEX idx_orders_status_created
    ON orders(status, created_at DESC)
    WHERE status = 'pending';  -- partial index: only pending rows

-- STEP 4: Verify improvement
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.order_id, c.name, o.total
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
WHERE o.status = 'pending'
  AND o.created_at > NOW() - INTERVAL '24 hours';
Output
-- Before index: Seq Scan, actual time=4820ms
-- After partial index: Index Scan using idx_orders_status_created
-- actual time=3.2ms
-- 1,500x improvement from one targeted index
The EXPLAIN-First Rule
  • pg_stat_statements identifies your slowest queries by avg execution time
  • EXPLAIN ANALYZE shows exactly what the database is doing
  • One change at a time — apply, measure, validate before the next change
  • An index that looks correct may not be used — EXPLAIN ANALYZE proves whether it is
Production Insight
pg_stat_statements is the most valuable extension in PostgreSQL — install it on every production database and query it weekly to find slow query regressions.
Apply optimisations one at a time — multiple simultaneous changes make it impossible to attribute improvement.
Always measure the before and after: note the actual time from EXPLAIN ANALYZE before and after each change.
Key Takeaway
Measure first with EXPLAIN ANALYZE — never optimise by intuition.
The hierarchy: indexes first, query rewrites second, schema changes last.
One change at a time — isolate impact before the next fix.

Common Anti-Patterns and How to Fix Them

Several query patterns consistently cause performance problems at scale. Recognising them by sight lets you fix them immediately.

SELECT * fetches every column from the table, including large text or BLOB columns you may not need. It prevents covering indexes from working (the engine must fetch the heap to get columns not in the index), increases network transfer between database and application, and makes query plans harder to cache effectively. Always list the specific columns you need.

Functions on indexed columns in WHERE silently disable the index. WHERE YEAR(created_at) = 2024 cannot use an index on created_at — the function transforms the value before comparison. The fix is always a range: WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'.

The N+1 problem is the most damaging ORM-related pattern: fetch a list of customers (1 query), then loop through and fetch each customer's orders separately (N queries). On a list of 100 customers, that's 101 queries. On a list of 1000, it's 1001. Fix with a single JOIN or eager loading.

anti_patterns_fixed.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
-- ANTI-PATTERN 1: SELECT *
-- BAD: fetches all columns including large blobs, breaks covering indexes
SELECT * FROM orders WHERE customer_id = 42;

-- GOOD: only the columns you need
SELECT order_id, total, status, created_at
FROM orders
WHERE customer_id = 42;

-- ANTI-PATTERN 2: Function on indexed column
-- BAD: YEAR() disables the index on created_at
SELECT * FROM orders WHERE YEAR(created_at) = 2024;

-- GOOD: range query uses the index
SELECT order_id, total FROM orders
WHERE created_at >= '2024-01-01'
  AND created_at <  '2025-01-01';

-- ANTI-PATTERN 3: N+1 queries
-- BAD: one query for customers, then one per customer for orders
-- SELECT * FROM customers;  -- returns 1000 customers
-- then for each customer:
-- SELECT * FROM orders WHERE customer_id = :id;  -- 1000 separate queries!

-- GOOD: single JOIN fetches everything in one query
SELECT
    c.customer_id, c.name,
    o.order_id, o.total, o.created_at
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.customer_id
WHERE c.customer_id IN (1, 2, 3, ..., 1000);  -- or use a subquery

-- ANTI-PATTERN 4: OFFSET pagination
-- BAD: OFFSET 50000 scans and discards 50000 rows
SELECT order_id, total FROM orders ORDER BY order_id LIMIT 20 OFFSET 50000;

-- GOOD: keyset pagination -- O(1) regardless of position
SELECT order_id, total FROM orders
WHERE order_id < :last_seen_id  -- last ID from previous page
ORDER BY order_id DESC
LIMIT 20;
Output
-- N+1 fix: 1001 queries -> 1 query
-- OFFSET at page 2500: 50000ms -> 2ms with keyset pagination
-- SELECT * -> named columns: query plan becomes cacheable, covering index usable
N+1 Queries Are an ORM Silent Failure
ORMs make N+1 queries easy to write accidentally — loading a list of objects and accessing a relationship inside a loop triggers a query per iteration. Enable query logging in development and count queries per request — any endpoint making >10 database queries for a single page load is a candidate for N+1 review. The fix is almost always eager loading (include/prefetch) or restructuring to a single JOIN.
Production Insight
SELECT * is the single easiest performance improvement in a legacy codebase — grep for it, replace with column lists, and watch query times drop immediately.
Offset pagination is one of the most common hidden performance time bombs — it looks fine in development (page 1) but degrades silently in production as data grows and users page deeper.
Enable pg_stat_statements and monitor the top 10 slowest queries weekly — most production databases have 3-5 queries that consume 80% of query time.
Key Takeaway
SELECT * wastes bandwidth, breaks covering indexes, and prevents query plan caching.
Functions on WHERE columns kill index usage — always use range conditions instead.
N+1 = 1001 queries instead of 1 — the most common ORM performance trap, fixed with a single JOIN.

Indexing Strategy for Query Optimisation

Most query optimisation problems are indexing problems. Before rewriting any query, verify that the right indexes exist and are being used.

The process: identify the slow query, run EXPLAIN ANALYZE, look for Seq Scan on the WHERE columns, create the appropriate index, run EXPLAIN ANALYZE again to verify it is used.

Partial indexes are underused but extremely effective: CREATE INDEX idx_pending_orders ON orders(created_at) WHERE status = 'pending'. This index only contains rows where status is 'pending' — it is much smaller than a full index on created_at, fits in cache more easily, and is highly selective for the specific query.

Composite index column order determines usability: put equality conditions first, range conditions last. An index on (status, created_at) supports WHERE status = 'pending' AND created_at > '2024-01-01' using both columns. An index on (created_at, status) would only use the created_at portion for a range query on created_at.

indexing_strategy.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
-- DIAGNOSIS: find tables with the most sequential scans
SELECT
    relname         AS table_name,
    seq_scan,
    seq_tup_read,
    idx_scan,
    seq_scan - idx_scan AS scan_gap
FROM pg_stat_user_tables
WHERE seq_scan > idx_scan
  AND n_live_tup > 10000
ORDER BY seq_tup_read DESC;

-- Standard composite index: equality first, range last
CREATE INDEX idx_orders_status_created
    ON orders(status, created_at DESC);

-- Partial index: only the rows you query (much smaller)
CREATE INDEX idx_active_sessions
    ON sessions(user_id, last_active)
    WHERE is_active = true;  -- only active sessions indexed

-- Covering index: all SELECT columns in the index
CREATE INDEX idx_order_summary
    ON orders(customer_id, status)
    INCLUDE (order_id, total, created_at);  -- SELECT columns in leaf nodes

-- Index for sort: ORDER BY without a filesort
CREATE INDEX idx_orders_created_desc
    ON orders(created_at DESC);
-- Now: SELECT order_id FROM orders ORDER BY created_at DESC LIMIT 20
-- uses index scan in sorted order -- no Sort node in EXPLAIN

-- Verify index usage after creation
SELECT indexrelname, idx_scan, idx_tup_read
FROM pg_stat_user_indexes
WHERE relname = 'orders'
ORDER BY idx_scan DESC;
Output
-- pg_stat_user_tables: tables with high seq scan rate
table_name | seq_scan | idx_scan | scan_gap
order_events | 184020 | 1200 | 182820 <- needs index audit
-- After covering index:
IndexName | idx_scan
idx_order_summary | 45200 <- being used heavily
Use pg_stat_user_tables to Find Tables That Need Indexes
The pg_stat_user_tables view tracks seq_scan and idx_scan counts since the last statistics reset. Tables where seq_scan >> idx_scan on large tables are candidates for index review. Sort by seq_tup_read DESC to find the tables generating the most sequential read work.
Production Insight
The pg_stat_user_tables seq_scan/idx_scan ratio is the best passive indicator for tables that need indexing — check it weekly.
Partial indexes often outperform full indexes for queries with a common equality filter — they are smaller, fit in cache better, and are more selective.
After adding an index, wait 1-2 hours of production traffic then check idx_scan in pg_stat_user_indexes — if it is zero, the index is not being used and should be investigated or dropped.
Key Takeaway
Index the columns your WHERE clauses filter on — composite index order matters (equality first, range last).
Partial indexes for common filter values are smaller and faster than full-column indexes.
Verify every new index is being used with pg_stat_user_indexes — an unused index is pure write overhead.
● Production incidentPOST-MORTEMseverity: high

An OFFSET Pagination Query Degraded from 50ms to 45 Seconds as the Table Grew

Symptom
Customer support reported that the orders history page was timing out for long-time customers with many orders. New customers saw no issue. The problem grew linearly worse over time.
Assumption
OFFSET is a standard SQL feature for pagination. The developer assumed performance was consistent across all page numbers.
Root cause
LIMIT 20 OFFSET 10000 requires the database to scan and discard 10,000 rows before returning 20. At OFFSET 10,000 on a 5-million-row table, the query performs a significant portion of a full table scan on every page load. This is O(offset) complexity — each page deeper into the result set is slower.
Fix
Rewrote to keyset pagination: WHERE order_id < :last_seen_id ORDER BY order_id DESC LIMIT 20. The query uses the primary key index and returns any page in consistent sub-millisecond time regardless of how deep into the result set.
Key lesson
  • OFFSET pagination has O(offset) complexity — it gets slower as users page deeper
  • Keyset pagination (cursor-based) is O(1) regardless of position in the result set
  • Test pagination performance at page 1,000, not just page 1
Production debug guideThe systematic approach — measure first, fix second4 entries
Symptom · 01
Slow query — need to identify root cause
Fix
Run EXPLAIN ANALYZE on the exact query. Start from the innermost node. Look for: Seq Scan on large tables (missing index), loops=N > 1000 (N+1 or Nested Loop), Sort without index backing, estimated rows far from actual rows (stale statistics).
Symptom · 02
EXPLAIN shows correct plan but query is still slow
Fix
Add BUFFERS to EXPLAIN: EXPLAIN (ANALYZE, BUFFERS). High blks_hit / (blks_hit + blks_read) ratio = data mostly in cache (fast). Low ratio = reading from disk (slow). If disk-bound, consider: more RAM for shared_buffers, partitioning, or archiving old data.
Symptom · 03
Query was fast last week but slow now with no code changes
Fix
Data volume change or statistics staleness. Check: SELECT n_live_tup, last_autoanalyze FROM pg_stat_user_tables WHERE relname = 'table'. Run ANALYZE if stale. Check if row count grew dramatically — a plan optimised for 100K rows may be wrong for 1M rows.
Symptom · 04
Application has N+1 queries — one query per item in a list
Fix
Identify the pattern in application logs: the same query appearing N times with different ID values. Rewrite as a single JOIN or IN clause. For ORMs, use eager loading: include(), joinedload(), or prefetch_related() depending on the framework.
★ Query Optimisation Quick WinsThe highest-impact changes in order of effort
Slow query on a large table
Immediate action
EXPLAIN ANALYZE — identify the most expensive node
Commands
EXPLAIN ANALYZE SELECT ...your slow query...;
SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'your_table';
Fix now
If Seq Scan on WHERE column: CREATE INDEX idx_table_col ON table(col);
Pagination query slowing down at high page numbers+
Immediate action
Switch from OFFSET to keyset pagination
Commands
-- BEFORE: LIMIT 20 OFFSET 10000 (slow, scans 10000 rows)
-- AFTER: WHERE id < :last_seen_id ORDER BY id DESC LIMIT 20
Fix now
Index on the keyset column is required: CREATE INDEX idx_orders_id ON orders(id);
Anti-PatternProblemImpactFix
SELECT *Fetches all columns including unused and large onesHigh network transfer, breaks covering indexesList specific columns in SELECT
Function on WHERE columnDisables the index on that columnFull table scan even with indexRewrite as range: WHERE col >= x AND col < y
N+1 queriesOne query per item in a loop100-1000x more queries than necessarySingle JOIN or ORM eager loading
OFFSET paginationScans and discards offset rowsO(offset) complexity — degrades with page depthKeyset pagination: WHERE id < last_seen
Correlated subquery in WHERERe-executes inner query per outer rowO(n×m) complexityReplace with JOIN to pre-aggregated CTE
Missing index on FK columnFull scan on child table for every parent DELETELocks and slow deletionsCREATE INDEX on every foreign key column

Key takeaways

1
Measure with EXPLAIN ANALYZE before every optimisation
never guess at the cause of a slow query.
2
Indexes first
a single well-chosen index is typically a 100-1000x improvement over any query rewrite.
3
SELECT *, N+1 queries, and OFFSET pagination are the three most common query anti-patterns
fix these first in any legacy codebase.
4
pg_stat_statements is the best production monitoring tool for SQL performance
install it and query it weekly.

Common mistakes to avoid

3 patterns
×

Adding indexes speculatively without running EXPLAIN ANALYZE first

Symptom
Indexes are added based on guesses about which columns are slow — some are used, some are not; write performance degrades but the specific slow query may not improve
Fix
Always run EXPLAIN ANALYZE on the specific slow query before adding an index. The plan tells you exactly which columns need indexing. Unused indexes identified by idx_scan = 0 in pg_stat_user_indexes should be dropped.
×

Using OFFSET pagination on high-volume tables

Symptom
Page 1 of search results loads in 50ms, page 500 takes 45 seconds — the problem grows worse as data accumulates and users page deeper
Fix
Switch to keyset pagination: WHERE sort_col < :last_value ORDER BY sort_col DESC LIMIT n. Requires an index on the sort column and storing the last seen value from the previous page. This is O(1) regardless of position.
×

Ignoring N+1 queries because each individual query is fast

Symptom
Each query takes 2ms but 1,000 queries per page load = 2,000ms response time; the problem compounds with user volume
Fix
Enable query logging and count queries per request. Any endpoint making >10 queries for a single page load deserves review. Rewrite as a single JOIN with all needed data. In ORMs, use eager loading to prefetch relationships.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR
Walk me through how you would diagnose and fix a slow SQL query in produ...
Q02SENIOR
What is the N+1 query problem and how do you detect and fix it?
Q03SENIOR
Why is OFFSET pagination slow at large offsets, and what is the alternat...
Q01 of 03SENIOR

Walk me through how you would diagnose and fix a slow SQL query in production.

ANSWER
My first step is always measurement, not guessing. I run EXPLAIN ANALYZE on the exact query from production — including the actual parameter values if possible. I read the plan from the innermost node outward, looking for the most expensive node by actual time. The most common findings: a Seq Scan on a large table (usually a missing index or a function on an indexed column), a Nested Loop with loops=N in the thousands (often stale statistics causing the planner to underestimate table size), or a Sort node without index backing. I apply one fix at a time — typically adding an index first since it is the highest-impact lowest-effort change. I run EXPLAIN ANALYZE again to verify the plan improved and the estimated-vs-actual rows are now aligned. If the plan looks correct but the query is still slow, I add BUFFERS to check cache hit ratios, which reveals whether the problem is disk I/O.
FAQ · 3 QUESTIONS

Frequently Asked Questions

01
What is pg_stat_statements and how do I use it?
02
How do I find queries that are doing too many sequential scans?
03
Is query optimisation different between PostgreSQL and MySQL?
🔥

That's SQL Advanced. Mark it forged?

3 min read · try the examples if you haven't

Previous
SQL CTEs — Common Table Expressions
8 / 16 · SQL Advanced
Next
SQL EXPLAIN and Execution Plans