Mid-level 6 min · March 06, 2026

DBMS Interview Questions — Lost Update Race Conditions

Two transfers read the same balance under Read Committed; second commit overwrote the first.

N
Naren · Founder
Plain-English first. Then code. Then the interview question.
About
 ● Production Incident 🔎 Debug Guide
Quick Answer
  • DBMS interview questions test your grasp of how data is stored, retrieved, and protected — not just SQL syntax.
  • Core topics: ACID, normalization, indexing, joins, transactions, and concurrency control.
  • Focus on tradeoffs: normalization vs denormalization, index read vs write cost, isolation vs performance.
  • Production insight: Misunderstanding isolation levels causes data anomalies like dirty reads and lost updates.
  • Biggest mistake: Memorizing definitions without being able to apply them to real scenarios.
Plain-English First

Think of a database management system like a giant, super-organized library. The DBMS is the librarian — it decides where every book (data) is stored, who's allowed to read it, how to find it in seconds even among millions of books, and what happens if two people try to check out the same book at the same time. When interviewers ask DBMS questions, they're really asking: do you understand how this librarian makes decisions, and why those decisions matter?

Every serious backend or full-stack role has at least a handful of DBMS questions tucked into the interview. Why? Because how data is stored, retrieved, and protected is the backbone of almost every application ever built. A poorly understood database concept in production doesn't just cause slow queries — it causes data loss, race conditions, and outages at 3am. Interviewers use DBMS questions to quickly separate engineers who 'use' databases from engineers who 'understand' them.

The problem most candidates have isn't that they don't know SQL. It's that they've memorized definitions without ever connecting them to real scenarios. They can recite 'ACID stands for Atomicity, Consistency, Isolation, Durability' but freeze when asked 'What actually happens to your data if the server crashes mid-transaction?' That gap between definition and intuition is exactly what interviewers are probing for.

By the end of this article, you'll be able to answer intermediate DBMS interview questions with confident, scenario-grounded explanations. You'll understand the WHY behind normalization, indexing, transactions, and joins — and you'll have concrete examples to pull from the moment an interviewer says 'Tell me about...'

ACID Properties — What They Actually Guarantee (and When They Fail)

ACID is the most asked-about DBMS concept in interviews, and the most superficially answered. Let's fix that.

Atomicity means a transaction is all-or-nothing. If you're transferring $500 from Account A to Account B, that's two writes. Atomicity guarantees that if the second write fails, the first is rolled back — you never end up with money disappearing into thin air.

Consistency means the database moves from one valid state to another. If your schema says an order must have a valid customer_id, a transaction that violates that constraint is rejected entirely.

Isolation is the subtle one. It controls what a transaction can see about other in-flight transactions. SQL standard defines four isolation levels — Read Uncommitted, Read Committed, Repeatable Read, and Serializable — each trading performance for safety. Most databases default to Read Committed.

Durability means once a transaction is committed, it survives crashes. The database writes to a transaction log (WAL — Write-Ahead Log) before confirming success, so even if the server dies immediately after your commit, the data is recoverable on restart.

The key interview insight: ACID is a contract, not a guarantee that nothing goes wrong. It's a guarantee about what the system will do to handle things going wrong.

acid_transaction_demo.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
-- Scenario: Bank transfer between two accounts
-- This demonstrates Atomicity — both updates happen or neither does

-- Setup: create accounts table with a balance constraint
CREATE TABLE bank_accounts (
    account_id   INT PRIMARY KEY,
    holder_name  VARCHAR(100) NOT NULL,
    balance      DECIMAL(10, 2) NOT NULL CHECK (balance >= 0)  -- Consistency constraint
);

INSERT INTO bank_accounts VALUES (1, 'Alice Johnson', 1000.00);
INSERT INTO bank_accounts VALUES (2, 'Bob Martinez',  500.00);

-- Begin a transaction (Atomicity wrapper)
BEGIN TRANSACTION;

    -- Step 1: Deduct from Alice
    UPDATE bank_accounts
    SET    balance = balance - 500.00
    WHERE  account_id = 1;

    -- Step 2: Credit Bob
    UPDATE bank_accounts
    SET    balance = balance + 500.00
    WHERE  account_id = 2;

    -- If EITHER update fails (e.g. Alice's balance would go negative,
    -- violating CHECK constraint), the whole block rolls back.
    -- Neither account changes. That's Atomicity + Consistency working together.

COMMIT;  -- Durability: this is now written to the WAL and survives a crash

-- Verify the final state
SELECT account_id, holder_name, balance FROM bank_accounts;

-- Now demonstrate a ROLLBACK scenario
BEGIN TRANSACTION;

    UPDATE bank_accounts
    SET    balance = balance - 2000.00  -- Alice only has 500 now; CHECK will fail
    WHERE  account_id = 1;

ROLLBACK;  -- We manually roll back, or the DB engine does it on constraint violation

-- Alice's balance is unchanged — Atomicity protected us
SELECT account_id, holder_name, balance FROM bank_accounts;
Output
-- After COMMIT:
account_id | holder_name | balance
-----------+---------------+---------
1 | Alice Johnson | 500.00
2 | Bob Martinez | 1000.00
-- After ROLLBACK (failed deduction):
account_id | holder_name | balance
-----------+---------------+---------
1 | Alice Johnson | 500.00
2 | Bob Martinez | 1000.00
Watch Out: Isolation ≠ 'Transactions Don't Interfere'
Most databases default to Read Committed isolation, which prevents dirty reads but still allows non-repeatable reads. If your transaction reads a row twice and another transaction commits a change in between, you'll get different values. If your logic depends on a value staying stable across two reads in the same transaction, you need Repeatable Read or Serializable — at the cost of more locking and lower throughput.
Production Insight
In production, the most common ACID failure is not a crash but a lost update under Read Committed isolation.
Your transaction reads a row, another transaction updates it, and your subsequent writes overwrite that change.
Rule: If your business logic depends on a value staying stable across reads, use SELECT ... FOR UPDATE or move to Repeatable Read.
Key Takeaway
ACID properties are guarantees about failure handling, not guarantees of correctness under concurrency.
The isolation level you choose determines what concurrency anomalies you accept.
Always match your isolation level to your business requirement — default settings are rarely right.

Normalization vs Denormalization — Choosing the Right Tradeoff

Normalization is the process of organizing your schema to eliminate redundant data and protect against update anomalies. Denormalization is the deliberate reversal of that — adding redundancy back in to speed up reads. The interview question isn't 'what is normalization?' — it's 'when do you normalize and when do you denormalize?'

Normalization follows 'Normal Forms'. First Normal Form (1NF) eliminates repeating groups — every cell holds one atomic value. Second Normal Form (2NF) removes partial dependencies — every non-key column depends on the whole primary key, not just part of it. Third Normal Form (3NF) removes transitive dependencies — non-key columns shouldn't depend on other non-key columns.

Here's the real-world intuition: if you store a customer's city and zip code in the orders table, you'll have to update both every time a customer moves — and you risk them going out of sync. 3NF says: put zip code in a customers table and reference it by customer_id. Now city and zip are always consistent.

Denormalization makes sense in read-heavy analytical systems (data warehouses, reporting dashboards) where joins across 10 normalized tables are too slow. You pre-join the data at write time so reads are instant. The tradeoff: writes become slower and you need application logic to keep redundant data in sync.

The rule of thumb: normalize by default in OLTP systems (transactions), denormalize deliberately in OLAP systems (analytics).

normalization_example.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
-- =============================================
-- BAD: Unnormalized orders table (violates 3NF)
-- =============================================
-- Problem: customer_city depends on customer_id, NOT on order_id.
-- If a customer moves, you update 100 rows instead of 1.
CREATE TABLE orders_unnormalized (
    order_id       INT PRIMARY KEY,
    customer_id    INT,
    customer_name  VARCHAR(100),   -- redundant — should live in customers table
    customer_city  VARCHAR(100),   -- transitive dependency: city -> customer, not order
    product_name   VARCHAR(100),
    order_total    DECIMAL(10, 2)
);

-- =============================================
-- GOOD: Normalized to 3NF — two separate tables
-- =============================================
CREATE TABLE customers (
    customer_id   INT PRIMARY KEY,
    customer_name VARCHAR(100) NOT NULL,
    customer_city VARCHAR(100) NOT NULL
    -- city lives here — update once, reflects everywhere
);

CREATE TABLE orders (
    order_id      INT PRIMARY KEY,
    customer_id   INT NOT NULL REFERENCES customers(customer_id),  -- FK enforces integrity
    product_name  VARCHAR(100) NOT NULL,
    order_total   DECIMAL(10, 2) NOT NULL
);

-- Insert sample data
INSERT INTO customers VALUES (1, 'Alice Johnson', 'New York');
INSERT INTO orders   VALUES (101, 1, 'Laptop', 1299.99);
INSERT INTO orders   VALUES (102, 1, 'Mouse',    29.99);

-- Join to reconstruct the full picture — this is the cost of normalization
-- (the join), but the benefit is zero data duplication
SELECT
    o.order_id,
    c.customer_name,
    c.customer_city,
    o.product_name,
    o.order_total
FROM   orders    o
JOIN   customers c ON o.customer_id = c.customer_id;

-- If Alice moves to Boston — ONE update, zero inconsistency
UPDATE customers SET customer_city = 'Boston' WHERE customer_id = 1;
Output
order_id | customer_name | customer_city | product_name | order_total
---------+---------------+---------------+--------------+------------
101 | Alice Johnson | New York | Laptop | 1299.99
102 | Alice Johnson | New York | Mouse | 29.99
Interview Gold: Frame It as a Tradeoff
When an interviewer asks about normalization, don't just recite the normal forms. Say: 'I normalize transactional schemas by default to avoid update anomalies, and I selectively denormalize in read-heavy or analytical contexts where join performance becomes a bottleneck — like pre-aggregating daily sales totals into a summary table.' That answer signals you think in tradeoffs, not rules.
Production Insight
Denormalizing without profiling can make a fast query slower by bloating row size and increasing I/O per scan.
Always measure the actual join cost before adding redundancy.
Rule: Profile first, denormalize second.
Key Takeaway
Normalize for write-heavy OLTP to avoid anomalies; denormalize for read-heavy OLAP after measuring bottlenecks.
The 3NF rule prevents transitive dependencies — city depends on customer, not order.
Use foreign keys to enforce referential integrity — without them, you'll get orphaned records.

Indexing Deep Dive — Why Indexes Speed Up Reads but Hurt Writes

An index is a separate data structure — usually a B-Tree — that the database builds and maintains alongside your table. It stores a sorted copy of one or more columns, with pointers back to the actual rows. Without an index, a query like WHERE email = 'alice@example.com' does a full table scan — reading every single row. With an index on email, the database does a binary search on the B-Tree and jumps directly to the matching row.

So why not index every column? Because every index you create is a structure that must be updated on every INSERT, UPDATE, and DELETE. If your orders table has 8 indexes, inserting one order requires 8 separate B-Tree updates. In a write-heavy system, that overhead adds up fast and can make inserts painfully slow.

There are two important index types you'll encounter in interviews. A clustered index physically reorders the table data to match the index order — there can only be one per table (SQL Server/MySQL InnoDB use the primary key as the clustered index by default). A non-clustered index is a separate structure with pointers to rows — you can have many, and they don't affect the storage order of the table.

Composite indexes follow the 'left-prefix rule': an index on (last_name, first_name) speeds up queries filtering on last_name alone, or both columns — but NOT on first_name alone. The database can only use the leftmost columns in the index definition.

indexing_strategy_demo.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
-- Setup: large employees table to demonstrate index impact
CREATE TABLE employees (
    employee_id   INT PRIMARY KEY,          -- clustered index by default in most DBs
    department_id INT NOT NULL,
    last_name     VARCHAR(100) NOT NULL,
    first_name    VARCHAR(100) NOT NULL,
    email         VARCHAR(200) UNIQUE,      -- UNIQUE automatically creates an index
    hire_date     DATE NOT NULL,
    salary        DECIMAL(10, 2)
);

-- Without any extra indexes, this query does a full table scan (slow at scale)
-- EXPLAIN shows 'Seq Scan' or 'Full Table Scan' — red flag in production
EXPLAIN SELECT * FROM employees WHERE last_name = 'Johnson';

-- Create a non-clustered index on last_name
CREATE INDEX idx_employees_last_name
    ON employees (last_name);  -- B-Tree built; now searches on last_name are O(log n)

-- After the index: EXPLAIN shows 'Index Scan' — jumps directly to matching rows
EXPLAIN SELECT * FROM employees WHERE last_name = 'Johnson';

-- Composite index: useful when you frequently filter by department AND hire_date
CREATE INDEX idx_employees_dept_hire
    ON employees (department_id, hire_date);

-- This query USES the composite index (left-prefix rule satisfied)
SELECT * FROM employees
WHERE  department_id = 3
AND    hire_date > '2022-01-01';           -- Both columns in index, in order

-- This query CANNOT use the composite index efficiently
-- (skips department_id, starts at second column — violates left-prefix rule)
SELECT * FROM employees
WHERE  hire_date > '2022-01-01';           -- Falls back to full scan

-- Check index usage and fragmentation (PostgreSQL syntax)
SELECT
    indexname,
    idx_scan,       -- how many times this index was used
    idx_tup_read    -- rows read through this index
FROM   pg_stat_user_indexes
WHERE  tablename = 'employees';
Output
-- EXPLAIN before index:
Seq Scan on employees (cost=0.00..2841.00 rows=45 width=80)
Filter: ((last_name)::text = 'Johnson')
-- EXPLAIN after index:
Index Scan using idx_employees_last_name on employees (cost=0.29..8.31 rows=45 width=80)
Index Cond: ((last_name)::text = 'Johnson')
-- pg_stat_user_indexes output:
indexname | idx_scan | idx_tup_read
----------------------------+----------+-------------
idx_employees_last_name | 284 | 12780
idx_employees_dept_hire | 97 | 4365
pk_employees | 5021 | 5021
Pro Tip: Use EXPLAIN Before Adding Indexes Blindly
Always run EXPLAIN (or EXPLAIN ANALYZE in PostgreSQL) on a slow query before creating an index. The query plan shows exactly which step is expensive — sometimes the bottleneck is a missing join condition or a function wrapping a column (like WHERE YEAR(hire_date) = 2023, which prevents index use entirely). Fix the query first, then index if needed.
Production Insight
A missing index on a foreign key column is the #1 cause of cascading full table scans in production joins.
Always index foreign keys by default — the database won't warn you.
Rule: When adding a FK constraint, add an index on the FK column immediately.
Key Takeaway
Indexes are B-Trees that speed reads at the cost of slower writes.
The left-prefix rule governs composite indexes — always put the most selective column first.
Use EXPLAIN before creating an index — sometimes the query itself is the problem.

Joins, Keys, and the Difference Between WHERE and HAVING

Joins are the mechanism for combining rows from two or more tables based on a related column. The confusion isn't usually about syntax — it's about understanding which join to use and what happens to unmatched rows.

INNER JOIN returns only rows where the join condition is satisfied in both tables. If a customer has no orders, they don't appear in the result. LEFT JOIN returns all rows from the left table and matching rows from the right — unmatched right-side columns are NULL. This is critical for questions like 'find all customers who have never placed an order' (WHERE order_id IS NULL after a LEFT JOIN).

Cross joins and self-joins come up less often in interviews but signal depth. A self-join joins a table to itself — useful for hierarchical data like an employees table where manager_id references another employee_id in the same table.

The WHERE vs HAVING distinction is a classic interview trap. WHERE filters rows before aggregation happens. HAVING filters groups after aggregation. If you put an aggregate function like COUNT() in a WHERE clause, you'll get a syntax error — because WHERE runs before GROUP BY has created any groups to count. HAVING runs after GROUP BY, so aggregates are fair game.

Primary keys uniquely identify a row and cannot be NULL. Foreign keys enforce referential integrity — a foreign key column in one table must match an existing primary key in another table (or be NULL). Without this constraint, you can end up with orphaned records: orders pointing to customer_ids that no longer exist.

joins_and_groupby_demo.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
-- Setup: customers and orders tables
CREATE TABLE customers (
    customer_id   INT PRIMARY KEY,
    customer_name VARCHAR(100) NOT NULL
);

CREATE TABLE orders (
    order_id    INT PRIMARY KEY,
    customer_id INT REFERENCES customers(customer_id),  -- foreign key
    amount      DECIMAL(10, 2) NOT NULL,
    order_year  INT NOT NULL
);

INSERT INTO customers VALUES (1, 'Alice Johnson'), (2, 'Bob Martinez'), (3, 'Carol Wu');
INSERT INTO orders VALUES
    (101, 1, 250.00, 2023),
    (102, 1, 450.00, 2023),
    (103, 2, 125.00, 2022),
    (104, 2, 800.00, 2023);
-- Carol Wu (customer_id=3) has NO orders — useful for LEFT JOIN demo

-- =============================================
-- INNER JOIN: only customers WITH orders
-- =============================================
SELECT c.customer_name, o.order_id, o.amount
FROM   customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;
-- Carol Wu is NOT in this result

-- =============================================
-- LEFT JOIN: ALL customers, even those without orders
-- =============================================
SELECT c.customer_name, o.order_id, o.amount
FROM   customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;
-- Carol Wu appears with NULL for order_id and amount

-- =============================================
-- Find customers who have NEVER placed an order
-- =============================================
SELECT c.customer_name
FROM   customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
WHERE  o.order_id IS NULL;  -- NULL means no matching order row was found

-- =============================================
-- WHERE vs HAVING — the classic interview trap
-- =============================================

-- WHERE filters individual rows BEFORE grouping
SELECT   customer_id, SUM(amount) AS total_spent
FROM     orders
WHERE    order_year = 2023           -- filter rows: only 2023 orders enter the group
GROUP BY customer_id;

-- HAVING filters groups AFTER aggregation
SELECT   customer_id, SUM(amount) AS total_spent
FROM     orders
GROUP BY customer_id
HAVING   SUM(amount) > 500;          -- filter groups: only big spenders

-- WRONG: This throws an error — you can't use aggregate in WHERE
-- SELECT customer_id FROM orders WHERE SUM(amount) > 500 GROUP BY customer_id;
-- ERROR: aggregate functions are not allowed in WHERE
Output
-- INNER JOIN result:
customer_name | order_id | amount
--------------+----------+--------
Alice Johnson | 101 | 250.00
Alice Johnson | 102 | 450.00
Bob Martinez | 103 | 125.00
Bob Martinez | 104 | 800.00
-- LEFT JOIN result:
customer_name | order_id | amount
--------------+----------+--------
Alice Johnson | 101 | 250.00
Alice Johnson | 102 | 450.00
Bob Martinez | 103 | 125.00
Bob Martinez | 104 | 800.00
Carol Wu | NULL | NULL
-- Customers with NO orders:
customer_name
--------------
Carol Wu
-- WHERE 2023 filter result:
customer_id | total_spent
------------+------------
1 | 700.00
2 | 800.00
-- HAVING > 500 result:
customer_id | total_spent
------------+------------
2 | 925.00
Interview Gold: The LEFT JOIN + IS NULL Pattern
The pattern LEFT JOIN ... WHERE right_table.id IS NULL is a classic 'anti-join' — it finds rows in the left table that have no match in the right. Interviewers love this because it separates people who understand what a LEFT JOIN actually produces (NULLs on the right side for unmatched rows) from those who've just memorized a diagram. An alternative is NOT EXISTS, which is often more readable and equally performant on modern query optimizers.
Production Insight
The LEFT JOIN anti-join pattern (WHERE right.id IS NULL) is silently slow on large tables if the right table lacks an index on the join column.
Without an index, PostgreSQL does a Nested Loop with a full scan per row.
Rule: Always index the join column in the right table for LEFT JOIN anti-joins.
Key Takeaway
WHERE filters rows before GROUP BY; HAVING filters groups after.
LEFT JOIN + IS NULL finds unmatched rows; NOT EXISTS is often equivalent.
Primary keys enforce uniqueness; foreign keys enforce referential integrity — never skip them.

Transactions, Isolation Levels, and Concurrency Control

A transaction groups multiple SQL statements into a single logical unit. When you COMMIT, all changes become permanent. When you ROLLBACK, all changes are undone. But what happens when two transactions run at the same time? That's where isolation levels and concurrency control kick in.

Most modern databases use Multi-Version Concurrency Control (MVCC). Instead of locking every row a transaction reads, MVCC keeps older versions of rows. Each transaction sees a snapshot of the data as of its start time. Readers never block writers, and writers never block readers. That's the magic. But MVCC isn't free — it increases storage and makes the transaction log grow.

The SQL standard defines four isolation levels
  • Read Uncommitted: No protection — can see uncommitted changes (dirty reads). Rarely used.
  • Read Committed: Prevents dirty reads. Default in PostgreSQL, Oracle, SQL Server. Allows non-repeatable reads and phantom reads.
  • Repeatable Read: Prevents non-repeatable reads. Still allows phantom reads (new rows inserted by others). Default in MySQL InnoDB.
  • Serializable: Highest level. Prevents all anomalies. Achieved through either locking or snapshot isolation (PostgreSQL uses Serializable Snapshot Isolation).

A deadlock occurs when two transactions each hold a lock the other needs. The database automatically kills one transaction to break the cycle. The fix is always application-side: access tables in the same order, keep transactions short, and avoid user interaction inside transactions.

isolation_and_deadlock_demo.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
-- =============================================
-- Example: Lost Update under Read Committed
-- =============================================
CREATE TABLE account_balance (
    account_id INT PRIMARY KEY,
    balance    DECIMAL(10, 2) NOT NULL CHECK (balance >= 0)
);
INSERT INTO account_balance VALUES (1, 1000.00);

-- Transaction A
BEGIN TRANSACTION;
SELECT balance FROM account_balance WHERE account_id = 1;  -- reads 1000
-- ... some processing ...
UPDATE account_balance SET balance = balance - 200 WHERE account_id = 1;

-- Transaction B (concurrent)
BEGIN TRANSACTION;
SELECT balance FROM account_balance WHERE account_id = 1;  -- also reads 1000
UPDATE account_balance SET balance = balance - 300 WHERE account_id = 1;

-- If Transaction B commits after A, the final balance is 700 (only B's change)
-- Transaction A's deduction is lost. This is a lost update.

-- Fix: Use SELECT ... FOR UPDATE to lock the row
BEGIN TRANSACTION;
SELECT balance FROM account_balance WHERE account_id = 1 FOR UPDATE;  -- acquires row lock
-- Transaction B will block here until A commits
UPDATE account_balance SET balance = balance - 200 WHERE account_id = 1;
COMMIT;

-- =============================================
-- Deadlock scenario (avoid this)
-- =============================================
CREATE TABLE table_a (id INT PRIMARY KEY, val INT);
CREATE TABLE table_b (id INT PRIMARY KEY, val INT);
INSERT INTO table_a VALUES (1, 10);
INSERT INTO table_b VALUES (1, 20);

-- Transaction 1 (started first)
BEGIN;
UPDATE table_a SET val = val + 1 WHERE id = 1;
-- now try to update table_b
UPDATE table_b SET val = val + 1 WHERE id = 1;  -- blocked if Transaction 2 holds lock on table_b

-- Transaction 2 (concurrent)
BEGIN;
UPDATE table_b SET val = val + 1 WHERE id = 1;
-- now try to update table_a
UPDATE table_a SET val = val + 1 WHERE id = 1;  -- DEADLOCK!

-- One of them will be killed by the database; the other succeeds.
-- Prevention: always update table_a before table_b in both transactions.
Output
-- After lost update scenario (without FOR UPDATE):
account_id | balance
-----------+--------
1 | 700.00 -- lost $200
-- After using FOR UPDATE:
account_id | balance
-----------+--------
1 | 800.00 -- correct
-- Deadlock detection log (PostgreSQL):
ERROR: deadlock detected
DETAIL: Process 123 waits for ShareLock on transaction 456; blocked by process 456.
HINT: See server log for query details.
Implicit Locks Can Still Cause Problems
Even with MVCC, UPDATE and DELETE statements acquire write locks on affected rows. If two transactions update the same row concurrently, one will wait for the other. Long-running transactions holding locks for minutes cause lock escalation and can cascade into a system-wide stall. Keep transactions short and commit early.
Production Insight
Default Read Committed isolation allows non-repeatable reads and phantom reads.
If your application reads a sum of account balances twice in a transaction and gets different results, you have a phantom read.
Rule: Use Serializable only when absolutely necessary — it kills throughput.
Key Takeaway
MVCC enables non-blocking reads by keeping older row versions.
Isolation levels trade safety for performance — choose the weakest level that guarantees correctness.
Deadlocks are avoided by locking tables in the same order across all transactions.
● Production incidentPOST-MORTEMseverity: high

Lost Update Due to Read Committed Isolation

Symptom
Account balance ended up higher than expected after two simultaneous transfers; reconciliation failed.
Assumption
Atomicity would prevent partial transactions, so it was safe to run without explicit locking.
Root cause
Both transactions read the balance under Read Committed isolation before the other committed. Each calculated a new balance based on the same initial value. The second commit overwrote the first, effectively losing the first update.
Fix
Changed the transfer logic to use SELECT ... FOR UPDATE (pessimistic locking) or switched to Serializable isolation level for the transfer transaction.
Key lesson
  • Read Committed does not protect against lost updates when two transactions read and then write the same row.
  • If your business logic depends on a value staying stable between a read and a write, you must lock the row or use a higher isolation level.
  • Always test concurrency scenarios with high load — they often reveal hidden race conditions.
Production debug guideSystematic approach to find and fix slow queries in production.3 entries
Symptom · 01
Query that used to be fast suddenly becomes slow
Fix
Check if an index was dropped or if statistics are stale. Run ANALYZE and review recent DDL changes.
Symptom · 02
Deadlock error messages in application logs
Fix
Identify all transactions involved, extract their lock ordering, and ensure all code paths acquire locks on resources in the same global order.
Symptom · 03
Full table scan on a large table despite an existing index
Fix
Check if the index is on the column wrapped in a function (e.g., WHERE YEAR(date)=2023). Rewrite as range condition or create a function-based index.
★ DBMS Troubleshooting Quick ReferenceCommon production database problems and the three commands that fix them.
Slow query on a large table
Immediate action
Run EXPLAIN ANALYZE to see the actual plan and row estimates.
Commands
EXPLAIN (ANALYZE, BUFFERS) SELECT ...
Check for missing indexes: SELECT schemaname, tablename, seq_scan FROM pg_stat_user_tables WHERE seq_scan > 1000;
Fix now
Add an index on the filtering columns, or rewrite the query to avoid full scan.
Transaction deadlock detected+
Immediate action
Identify the blocking transaction using pg_locks (PostgreSQL) or SHOW PROCESSLIST (MySQL).
Commands
SELECT pid, state, query, wait_event FROM pg_stat_activity WHERE wait_event IS NOT NULL;
Kill the blocking transaction: SELECT pg_terminate_backend(pid);
Fix now
Redesign application code to acquire locks in the same order across all transactions.
Missing rows after UPSERT+
Immediate action
Check for duplicate key violations or constraint errors in logs.
Commands
SHOW server_errors; (PostgreSQL) or check error log file.
Test the UPSERT manually with a small dataset to see if ON CONFLICT is correctly specified.
Fix now
Add a UNIQUE constraint and ensure the ON CONFLICT clause targets the correct constraint.
AspectOLTP (Operational)OLAP (Analytical)
Primary use caseDay-to-day transactions (insert/update/delete)Reporting and aggregations (complex reads)
Schema styleHighly normalized (3NF) to avoid anomaliesDenormalized (star/snowflake schema) for fast reads
Query typeMany small, fast queries on few rowsFew large queries scanning millions of rows
Indexing focusIndex on FK columns and WHERE predicatesColumnar storage; bitmap indexes on low-cardinality columns
Transaction needStrong ACID guarantees requiredOften relaxed; eventual consistency acceptable
ExamplesBank transfers, e-commerce orders, user loginsSales dashboards, trend analysis, data warehouses
Optimization targetWrite throughput, low latencyRead throughput, query parallelism

Key takeaways

1
ACID isn't magic
it's a contract about failure behavior. Atomicity prevents partial writes, Consistency enforces schema rules, Isolation controls visibility between concurrent transactions, and Durability means committed data survives crashes via Write-Ahead Logging.
2
Normalize by default for OLTP (one place to update, zero inconsistency), denormalize deliberately for OLAP (pre-joined data for instant reads). Never denormalize without profiling the actual query bottleneck first.
3
Every index is a write-time cost for a read-time benefit. The left-prefix rule governs composite indexes
a (a, b, c) index helps queries filtering on a, a+b, or a+b+c, but not b or c alone. Always verify with EXPLAIN before creating or dropping indexes.
4
WHERE filters rows before aggregation; HAVING filters groups after aggregation. The LEFT JOIN + WHERE right.id IS NULL anti-join pattern is one of the most practical SQL patterns in production, and it's a near-universal interview question.
5
MVCC is the dominant concurrency model
it lets readers see consistent snapshots without blocking writers. Always match your isolation level to the business requirement, and maintain consistent lock ordering to prevent deadlocks.

Common mistakes to avoid

4 patterns
×

Wrapping indexed column in a function in WHERE clause

Symptom
Index on hire_date is completely ignored; query performs a full table scan, causing slow response times in production.
Fix
Rewrite the condition as a range: WHERE hire_date >= '2023-01-01' AND hire_date < '2024-01-01' instead of WHERE YEAR(hire_date) = 2023.
×

Using DELETE to clear a large table instead of TRUNCATE

Symptom
Transaction log fills up, query hangs, and the operation takes orders of magnitude longer than expected.
Fix
Use TRUNCATE TABLE to remove all rows in a minimally logged, fast operation. TRUNCATE is not fully rollbackable in all DBMS — verify before use.
×

Ignoring NULL behavior in aggregate functions like COUNT and AVG

Symptom
COUNT(column) returns fewer rows than COUNT(*) when the column has NULLs; AVG(column) computes average only over non-NULL rows, skewing results.
Fix
Use COUNT(*) for total rows, or COALESCE(column, 0) before aggregation to handle NULLs explicitly.
×

Acquiring locks in inconsistent order across transactions

Symptom
Deadlock errors appear randomly under load, causing application retries and user-facing errors.
Fix
Enforce a global lock ordering: always update tables in the same sequence (e.g., always update accounts before orders).
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR
Write a query to find the top 3 salaries in each department. If two empl...
Q02SENIOR
Design a 'Like' button system for 100M users. Would you use a normalized...
Q03SENIOR
Explain the difference between B-Tree and Hash indexes. Why is a B-Tree ...
Q04SENIOR
A table has a composite index on (country_id, status, created_at). Which...
Q05JUNIOR
Explain the difference between locking and MVCC in concurrency control.
Q06SENIOR
What is a deadlock and how do you prevent it in a database application?
Q01 of 06SENIOR

Write a query to find the top 3 salaries in each department. If two employees have the same salary, they should both be included in the ranking.

ANSWER
Use DENSE_RANK() window function to assign ranks without gaps: SELECT department_id, employee_id, salary FROM ( SELECT *, DENSE_RANK() OVER (PARTITION BY department_id ORDER BY salary DESC) AS rank FROM employees ) ranked WHERE rank <= 3; DENSE_RANK ensures ties get the same rank and the next rank is not skipped. ROW_NUMBER would arbitrarily pick one of the tied rows. RANK would skip a number after a tie, potentially giving fewer than 3 results.
FAQ · 4 QUESTIONS

Frequently Asked Questions

01
What is the difference between a primary key and a unique key in SQL?
02
How do you resolve a deadlock in a production database?
03
When should I use a stored procedure versus writing SQL in application code?
04
What is the difference between a clustered and a non-clustered index?
🔥

That's Database Interview. Mark it forged?

6 min read · try the examples if you haven't

Previous
Top SQL Interview Questions
2 / 4 · Database Interview
Next
NoSQL Interview Questions