SQL Indexes — Avoid 90-Minute Foreign Key Lock
An unindexed foreign key on an 8M-row child table caused a 90-minute hold during a 50,000-row deletion.
20+ years shipping high-throughput database systems. Lessons pulled from things that broke in production.
- An index is a sorted B-tree mapping column values to row locations — turns O(n) full table scans into O(log n) lookups
- Clustered index = data physically stored in B-tree leaf nodes (one per table, usually the PK)
- Non-clustered index = separate structure with pointers to heap rows (multiple per table)
- Leftmost prefix rule: index on (A, B) helps queries on A or (A,B) — never on B alone
- Covering index = INCLUDE clause adds columns to leaf nodes — zero heap access, fastest possible read
- Biggest trap: function calls on indexed columns (WHERE YEAR(created_at) = 2024) silently disable the index
Think of a SQL database like a massive 1,000-page textbook. If you want to find the chapter on 'Photosynthesis,' you don't start at page one and read every sentence until you find it (that's a Full Table Scan). Instead, you flip to the Index at the back, find the word 'Photosynthesis,' see it's on page 412, and jump straight there. An SQL index is that exact 'map'—it's a smaller, sorted list that tells the database exactly where the data lives so it doesn't have to search the whole 'book.'
Indexes are the single most impactful optimization available in relational databases. A query that takes 30 seconds on a table with 10 million rows can return in 2 milliseconds after adding the right index. The difference is the gap between scanning every row on disk and jumping directly to the matching ones via a balanced tree structure.
However, indexes aren't free. The cost is write overhead and disk space. Every index you add slows down INSERT, UPDATE, and DELETE operations because the database must maintain the sorted order of the index in real-time. The 'Forge' philosophy is about precision: adding the exact indexes your critical queries need without bloating the table with redundant metadata.
Why SQL Indexes Are Not Optional
An SQL index is a separate data structure — typically a B-tree — that maps column values to row locations, enabling the database to find rows without scanning the entire table. Without an index, every query performs a sequential scan (O(n)), which becomes catastrophic beyond a few thousand rows. With a B-tree index, point lookups drop to O(log n), and range scans become efficient by traversing leaf nodes in order.
Indexes work by maintaining a sorted copy of the indexed column(s) along with pointers to the heap or clustered index. This imposes write overhead: each INSERT, UPDATE, or DELETE must update every relevant index, so adding indexes blindly can degrade write throughput by 2–5x. The database query planner chooses an index only when it estimates the index will reduce cost — typically when the query filters or sorts on leading columns of the index. Composite indexes follow a leftmost prefix rule: a query on (a, b, c) can use an index on (a, b, c) or (a, b) but not on (b, c) alone.
Use indexes on columns that appear in WHERE, JOIN, ORDER BY, or GROUP BY clauses, especially when those columns have high cardinality (many distinct values). In production systems, a missing index is the most common cause of sudden query degradation — a single unindexed foreign key can turn a 10ms JOIN into a 90-minute lock storm under load. Indexes are not free, but the cost of not having them is far higher.
How a B-Tree Index Works
Most relational databases (PostgreSQL, MySQL, SQL Server) use a B-tree (Balanced Tree) structure for indexing. A B-tree keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time.
When you query a column, the database starts at the root node and follows pointers down to the leaf nodes. Each leaf node contains the indexed value plus a 'Record Identifier' (RID) or a Primary Key value that points to the actual row data in the 'Heap' or Clustered Index.
B-Tree Index Structure: Root, Branch, and Leaf Nodes
A B-tree index is a multi-level tree consisting of three types of pages (nodes): root, internal (branch), and leaf. The root node is the single entry point – it contains a small set of keys that divide the data into ranges. When you search for a value, the engine reads the root, follows a pointer to the appropriate branch node, which further narrows the range, and finally reaches the leaf node containing the actual index entry. The number of levels (the height of the tree) depends on the number of entries and the page size. For a B-tree of order k (each node can hold up to k keys), the height grows as log base k of (number of rows), so even with millions of rows the tree is rarely more than 3–5 levels deep.
Leaf nodes are the most important: they store the indexed column value along with a pointer to the actual row (either a page identifier and offset, or the clustered index key). In a non-clustered index, leaf nodes store only the indexed columns and the row locator. In a clustered index, leaf nodes contain all table columns (the data itself). Branch nodes contain routing information only – keys and pointers to lower-level nodes.
A B-tree remains balanced by design – all leaf nodes are at the same depth. When a node overflows, it splits into two nodes, and a new key is promoted upward. This automatic rebalancing ensures consistent O(log n) performance, but the split operations also contribute to write overhead and potential bloat over time.
Visualising the hierarchy makes debugging easier. When EXPLAIN ANALYZE reports 'Index Scan using idx_orders_customer_id' but with a high number of 'Heap Fetches', the issue often lies in leaf node page density (fragmentation) rather than the tree height.
pageinspect in PostgreSQL to examine the actual number of internal levels and detect skewed node splits. If a specific range of keys causes more splits, consider using a fill factor lower than 90% to reserve space for frequent inserts.Composite Indexes and the Leftmost Prefix Rule
A composite index (multi-column index) is sorted primarily by the first column, then the second, and so on. This leads to the Leftmost Prefix Rule: an index on $(A, B)$ can be used for queries filtering by $(A)$ or $(A, B)$, but it is completely useless for a query filtering only by $(B)$.
Choosing the right column order is vital. Generally, you put the column with the highest cardinality (most unique values) first, or the column most frequently used in WHERE clauses.
- First column = primary sort key — queries must include it to use the composite index
- Second column = tie-breaker — only useful after the first column filters
- Equality conditions first, range conditions last — maximises usability
- INCLUDE columns add data to leaves without changing sort order — for covering indexes only
Specialized Indexes: GIN and GiST for JSON and Text Search
While B-tree indexes excel at equality and range queries on simple scalar columns, they cannot efficiently handle complex data types like JSONB, arrays, tsvectors, or geometric shapes. For such cases, PostgreSQL provides two generalized index methods: GIN (Generalized Inverted Index) and GiST (Generalized Search Tree).
GIN indexes are designed for composite types: JSONB, arrays, range types, and full-text search (tsvector). A GIN index stores a mapping from each possible key (e.g., each JSONB key/value pair, each array element, each lexeme in a tsvector) to a list of row locations. Queries using the @>, ?, ?|, ?& operators on JSONB become fast index scans instead of table-wide scans. GIN indexes are relatively expensive to build and maintain (each insert may touch many index entries), and they consume more storage than a B-tree on the same column. However, for read-heavy workloads that query JSONB with containment or existence operators, GIN is the only viable option.
GiST indexes are more versatile: they support range, point, polygon, and full-text search (though GIN is usually faster for text). GiST is also used for nearest-neighbor ordering and overlapping containment queries. When you need to find all rows that contain a point within a polygon or rows with a tsvector that matches a tsquery, a GiST index can accelerate those operations.
Picking the right one: For JSONB, prefer GIN with jsonb_path_ops operator class for smaller index and faster lookups on @> queries. For full-text search, GIN on a tsvector column is the standard. For geometric or range types, GiST is the natural choice. Both index types support concurrent creation just like B-trees.
When Indexes Hurt: Write Amplification and Bloat
Every time you INSERT a row into a table with 5 indexes, the database must perform 6 writes (1 to the table, 5 to the indexes). This is known as Write Amplification. Furthermore, indexes can become 'fragmented' over time as rows are deleted and updated, leading to bloated files and degraded performance.
Index Maintenance: Fill Factor, Bloat, and Reindexing
Even well-chosen indexes degrade over time. Three key maintenance parameters and operations keep indexes healthy: fill factor, periodic reindexing, and bloat monitoring.
Fill Factor controls how much free space is left on each B-tree page when the index is built or rebuilt. The default in PostgreSQL is 90% for B-trees (10% free space). This space is reserved for future updates to existing rows that might cause the indexed value to change, or for inserts that fall into the same page range. Without that free space, every insert or update that targets a full page forces a page split — dividing the page into two half-full pages and promoting a key upward. Page splits increase index height slowly and cause fragmentation. Setting a fill factor of 70% for a table with frequent updates to indexed columns reduces splits but increases index size and storage cost.
Bloat occurs when page splits, dead tuples (after UPDATE/DELETE), and incomplete vacuum leave the index containing many empty or nearly empty pages. A bloated index is larger than necessary, uses more memory in shared buffers, and increases I/O for scans. You can measure bloat with the pgstattuple extension or by comparing the index size to the number of indexed values.
Reindexing rebuilds the entire index from scratch, creating a fresh copy with the correct fill factor and no bloat. In PostgreSQL, REINDEX INDEX or REINDEX TABLE locks the table. For production, use REINDEX INDEX CONCURRENTLY (available since PostgreSQL 12) to rebuild without blocking writes. Alternatively, use pg_repack or pg_squeeze for online reorganization. The frequency of reindexing depends on write volume — a table that sees 50% of its indexed columns updated monthly may benefit from quarterly reindexing.
A visual guide to fill factor and page splits: the diagram below shows two pages before and after an insert triggers a split with different fill factors.
Creating an Index: Three Patterns That Won't Bite You Later
You don't just CREATE INDEX and walk away. The wrong index is worse than no index — it burns CPU on writes and fools your optimizer into bad plans. Three patterns cover 90% of real cases. Single-column indexes for high-cardinality filters like customer IDs or timestamps. Multi-column (composite) indexes when you filter on multiple columns together — but respect the leftmost prefix rule or the index is dead weight. Unique indexes for enforcement, not performance. They prevent duplicates but add a uniqueness check on every write. Never index a boolean column with 50% true/50% false — the database will ignore it anyway. Always check selectivity before creating. If a column has fewer than 100 distinct values across a million rows, a full scan is cheaper.
Confirming, Removing, and Altering Indexes Without Downtime
You need eyes on your indexes before you drop anything. Use SHOW INDEXES or system catalog views to see size, scan count, and last usage. In PostgreSQL, pg_stat_user_indexes tells you which indexes have zero scans — that's your drop list. Removing an index frees disk and speeds writes. But dropping the wrong one crateres query performance. Always check if any foreign key or unique constraint depends on it. Altering an index — renaming, rebuilding, or changing fill factor — often requires an exclusive lock. On large tables, do it in a maintenance window or use CONCURRENTLY variants. Renaming indexes follows a naming convention: idx_tablename_columnname. It saves hours of confusion during incident response. Never name an index 'idx_final_v3' — you will forget what it does by next quarter.
An Unindexed Foreign Key Caused a 90-Minute Table Lock During a Nightly Deletion Job
- Foreign key columns always need an index — they are implicitly queried on every parent-table DELETE and UPDATE
- Use CREATE INDEX CONCURRENTLY in production to avoid blocking reads and writes during index creation
- Run SELECT indexrelname, idx_scan FROM pg_stat_user_indexes weekly — idx_scan = 0 after 30 days means the index is adding write overhead for zero benefit
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 1234;SELECT indexrelname, idx_scan, idx_tup_read FROM pg_stat_user_indexes WHERE relname = 'orders';Key takeaways
EXPLAIN) to ensure your indexes are actually being utilized by the optimizer.Common mistakes to avoid
5 patternsIndexing low-cardinality columns (status, gender, is_active)
Applying functions to indexed columns in WHERE clauses
Getting column order wrong in composite indexes — range column first
Not indexing foreign key columns
Accumulating unused indexes on high-write tables
Interview Questions on This Topic
Explain the Leftmost Prefix rule. Can an index on (A, B, C) be used for a query on (A, C)?
Frequently Asked Questions
20+ years shipping high-throughput database systems. Lessons pulled from things that broke in production.
That's SQL Advanced. Mark it forged?
8 min read · try the examples if you haven't