SQL Indexes — Avoid 90-Minute Foreign Key Lock
An unindexed foreign key on an 8M-row child table caused a 90-minute hold during a 50,000-row deletion.
- An index is a sorted B-tree mapping column values to row locations — turns O(n) full table scans into O(log n) lookups
- Clustered index = data physically stored in B-tree leaf nodes (one per table, usually the PK)
- Non-clustered index = separate structure with pointers to heap rows (multiple per table)
- Leftmost prefix rule: index on (A, B) helps queries on A or (A,B) — never on B alone
- Covering index = INCLUDE clause adds columns to leaf nodes — zero heap access, fastest possible read
- Biggest trap: function calls on indexed columns (WHERE YEAR(created_at) = 2024) silently disable the index
Think of a SQL database like a massive 1,000-page textbook. If you want to find the chapter on 'Photosynthesis,' you don't start at page one and read every sentence until you find it (that's a Full Table Scan). Instead, you flip to the Index at the back, find the word 'Photosynthesis,' see it's on page 412, and jump straight there. An SQL index is that exact 'map'—it's a smaller, sorted list that tells the database exactly where the data lives so it doesn't have to search the whole 'book.'
Indexes are the single most impactful optimization available in relational databases. A query that takes 30 seconds on a table with 10 million rows can return in 2 milliseconds after adding the right index. The difference is the gap between scanning every row on disk and jumping directly to the matching ones via a balanced tree structure.
However, indexes aren't free. The cost is write overhead and disk space. Every index you add slows down INSERT, UPDATE, and DELETE operations because the database must maintain the sorted order of the index in real-time. The 'Forge' philosophy is about precision: adding the exact indexes your critical queries need without bloating the table with redundant metadata.
How a B-Tree Index Works
Most relational databases (PostgreSQL, MySQL, SQL Server) use a B-tree (Balanced Tree) structure for indexing. A B-tree keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time.
When you query a column, the database starts at the root node and follows pointers down to the leaf nodes. Each leaf node contains the indexed value plus a 'Record Identifier' (RID) or a Primary Key value that points to the actual row data in the 'Heap' or Clustered Index.
Composite Indexes and the Leftmost Prefix Rule
A composite index (multi-column index) is sorted primarily by the first column, then the second, and so on. This leads to the Leftmost Prefix Rule: an index on $(A, B)$ can be used for queries filtering by $(A)$ or $(A, B)$, but it is completely useless for a query filtering only by $(B)$.
Choosing the right column order is vital. Generally, you put the column with the highest cardinality (most unique values) first, or the column most frequently used in WHERE clauses.
- First column = primary sort key — queries must include it to use the composite index
- Second column = tie-breaker — only useful after the first column filters
- Equality conditions first, range conditions last — maximises usability
- INCLUDE columns add data to leaves without changing sort order — for covering indexes only
When Indexes Hurt: Write Amplification and Bloat
Every time you INSERT a row into a table with 5 indexes, the database must perform 6 writes (1 to the table, 5 to the indexes). This is known as Write Amplification. Furthermore, indexes can become 'fragmented' over time as rows are deleted and updated, leading to bloated files and degraded performance.
An Unindexed Foreign Key Caused a 90-Minute Table Lock During a Nightly Deletion Job
- Foreign key columns always need an index — they are implicitly queried on every parent-table DELETE and UPDATE
- Use CREATE INDEX CONCURRENTLY in production to avoid blocking reads and writes during index creation
- Run SELECT indexrelname, idx_scan FROM pg_stat_user_indexes weekly — idx_scan = 0 after 30 days means the index is adding write overhead for zero benefit
Key takeaways
EXPLAIN) to ensure your indexes are actually being utilized by the optimizer.Common mistakes to avoid
5 patternsIndexing low-cardinality columns (status, gender, is_active)
Applying functions to indexed columns in WHERE clauses
Getting column order wrong in composite indexes — range column first
Not indexing foreign key columns
Accumulating unused indexes on high-write tables
Interview Questions on This Topic
Explain the Leftmost Prefix rule. Can an index on (A, B, C) be used for a query on (A, C)?
Frequently Asked Questions
That's SQL Advanced. Mark it forged?
3 min read · try the examples if you haven't