Senior 9 min · March 05, 2026

SQL Transactions — Missing ROLLBACK Causes Empty Orders

12-15 empty orders daily from swallowed exceptions — fix missing ROLLBACK, debug deadlocks, pool exhaustion, table bloat from real production incidents..

N
Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Everything here is grounded in real deployments.

Follow
Production
production tested
May 24, 2026
last updated
1,554
articles · all by Naren
 ● Production Incident 🔎 Debug Guide
Quick Answer
  • A transaction wraps multiple SQL statements into an all-or-nothing unit — BEGIN → statements → COMMIT or ROLLBACK
  • Atomicity: enforced by the Write-Ahead Log (WAL) — crash recovery replays or undoes incomplete transactions
  • Durability: COMMIT waits for fsync to disk — disabling fsync breaks durability entirely
  • Isolation levels: READ COMMITTED (PG default) → REPEATABLE READ (MySQL default) → SERIALIZABLE — lower = more concurrency, more anomaly risk
  • SAVEPOINT: partial rollback within a transaction — keeps prior work alive after a recoverable error
  • Biggest mistake: wrapping entire request handlers in one transaction — holds locks for hundreds of ms, exhausts connection pools under load
✦ Definition~90s read
What is SQL Transactions and ACID?

SQL transactions are the mechanism that ensures database operations either fully complete or fully fail, preventing partial updates that corrupt data. The core idea is atomicity: a transaction groups multiple reads and writes into a single logical unit.

Imagine you're at an ATM withdrawing $100.

If any step fails, the entire unit must be rolled back, leaving the database in its original state. Missing a ROLLBACK after a failure is a classic bug that creates phantom orders — rows that appear in the database but were never fully processed, because the system committed partial work.

This is why every transaction must explicitly handle errors and issue ROLLBACK, or rely on the database driver to do it automatically.

The real hero behind atomicity and durability is the write-ahead log (WAL). Before any data page is modified, the transaction's changes are first written to the WAL. If the server crashes mid-transaction, the WAL is replayed on restart to undo uncommitted changes (ROLLBACK) or redo committed ones (COMMIT).

This is how databases like PostgreSQL, MySQL with InnoDB, and SQLite guarantee that a committed transaction survives a power loss. Without the WAL, a missing ROLLBACK would leave the database in an inconsistent state that's unrecoverable.

ACID is not a single feature but a pipeline: a transaction begins, acquires locks or snapshots, writes to the WAL, modifies in-memory pages, and then either commits (flushing the WAL to disk) or rolls back (discarding the WAL entries). Isolation levels — Read Uncommitted, Read Committed, Repeatable Read, Serializable — control how much of this pipeline is visible to concurrent transactions.

Higher isolation prevents anomalies like dirty reads and phantom rows but increases lock contention and reduces throughput. For most production systems, Read Committed is the default (PostgreSQL, Oracle, SQL Server) because it balances correctness and performance.

Serializable is rarely used outside financial systems due to its cost.

When NOT to use transactions: for bulk data loads or reporting queries that don't need atomicity, wrapping them in transactions adds unnecessary overhead. Also, avoid long-running transactions — they hold locks and bloat the WAL, causing performance degradation and replication lag.

Use explicit transactions only when you need to ensure that a set of writes is all-or-nothing, and always pair them with proper error handling and ROLLBACK logic.

Plain-English First

Imagine you're at an ATM withdrawing $100. The machine needs to do two things at once: subtract $100 from your account AND spit out the cash. If the power cuts out between those two steps, you can't end up with no cash AND no money in your account — that would be a disaster. A database transaction is the ATM's promise: either BOTH things happen, or NEITHER does. ACID is just the rulebook that makes that promise bulletproof.

A SELECT that returns the wrong customer's invoice. A funds transfer that vanishes into thin air. That's what you get without understanding ACID. SQL transactions are the engine's contract with you—atomic commits, durable logs, and isolation guarantees that prevent phantom orders and corrupted state. Ignore them, and your production database becomes a minefield of lost data, deadlocked sessions, and unkillable long transactions that bring everything to a crawl.

Why Missing ROLLBACK Creates Phantom Orders

SQL transactions group multiple operations into a single atomic unit. The core mechanic: either all operations commit permanently, or none do — the database rolls back to the prior consistent state. This is the 'A' in ACID (Atomicity). Without explicit ROLLBACK on failure, partial writes persist, corrupting data integrity.

Transactions enforce isolation (the 'I') so concurrent operations don't see intermediate states. In practice, this means two simultaneous order inserts won't interleave: one transaction's uncommitted rows are invisible to others. Durability ('D') ensures committed data survives crashes — typically via write-ahead logging. Consistency ('C') guarantees that constraints (e.g., foreign keys, unique indexes) hold before and after.

Use transactions for any multi-step write: transferring funds, placing orders, updating inventory. Real systems fail when a script inserts order headers but skips ROLLBACK on line-item failure — leaving orphaned headers that break reporting and billing. Always wrap multi-statement writes in BEGIN/COMMIT/ROLLBACK blocks, and test the failure path.

Implicit Commit Traps
Many SQL clients auto-commit each statement. A missing ROLLBACK after a failed INSERT inside a manual transaction leaves partial data — not a full rollback.
Production Insight
E-commerce order pipeline: a batch job inserts order headers then line items. A line-item constraint violation (e.g., negative quantity) raises an exception, but the script catches it and continues — the header commits, the items don't. Result: orphan orders in reporting, billing sends invoices for empty orders.
Symptom: support tickets for 'empty orders' with no line items but a valid total of $0.00.
Rule: always wrap multi-table writes in a transaction with explicit ROLLBACK on any error — never catch and swallow exceptions inside a transaction.
Key Takeaway
Atomicity is not optional — partial commits corrupt data silently.
Isolation prevents dirty reads but not all anomalies; use SERIALIZABLE for critical financial ops.
Always test the failure path: missing ROLLBACK is the #1 cause of data corruption in production.
SQL Transaction ACID Pipeline and ROLLBACK Dangers THECODEFORGE.IO SQL Transaction ACID Pipeline and ROLLBACK Dangers Flow from transaction start through WAL to commit/rollback with isolation levels BEGIN TRANSACTION Entry point: starts atomic unit of work Write-Ahead Log (WAL) Logs changes before data flush for durability Isolation Level Applied Controls visibility of concurrent changes COMMIT or ROLLBACK Finalize or abort all changes atomically Missing ROLLBACK Leaves partial writes, phantom orders appear ⚠ Missing ROLLBACK after error creates phantom orders Always use TRY-CATCH with explicit ROLLBACK on failure THECODEFORGE.IO
thecodeforge.io
SQL Transaction ACID Pipeline and ROLLBACK Dangers
Sql Transactions Acid

Atomicity and Durability — The Write-Ahead Log Is the Real Hero

Atomicity means all-or-nothing. Either every statement in a transaction lands in the database, or none of them do. But how does a database engine actually enforce this when the server can die at any microsecond?

The answer is the Write-Ahead Log (WAL). Before any data page is modified in memory, the engine writes an intent record to the WAL — an append-only file on disk. If the server crashes mid-transaction, the WAL lets the engine replay or undo operations on restart. This is called crash recovery, and it's what makes Atomicity a real guarantee rather than a hopeful suggestion.

Durability is the flip side: once you get a COMMIT acknowledgement, the data is safe even if the server immediately crashes. That acknowledgement is only sent after the WAL record is flushed to disk (fsync). This is why COMMIT can feel slightly slow — it's waiting on a real disk write, not just a memory operation. Disabling fsync breaks Durability entirely and is a production disaster waiting to happen.

bank_transfer_atomic.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
-- Transfer $500 from Alice to Bob — both debit and credit must succeed together
BEGIN;

UPDATE accounts
   SET balance = balance - 500
 WHERE account_id = 'alice_001'
   AND balance >= 500;

DO $$
BEGIN
    IF NOT FOUND THEN
        RAISE EXCEPTION 'Insufficient funds for alice_001';
    END IF;
END;
$$;

UPDATE accounts
   SET balance = balance + 500
 WHERE account_id = 'bob_002';

INSERT INTO transfer_audit (from_account, to_account, amount, transferred_at)
VALUES ('alice_001', 'bob_002', 500.00, NOW());

-- COMMIT waits for WAL fsync — this is what makes it durable
COMMIT;
Output
-- Success: UPDATE 1 / UPDATE 1 / INSERT 0 1 / COMMIT
-- Insufficient funds: ERROR raised → automatic ROLLBACK
-- Server crash mid-transaction → WAL recovery undoes partial writes on restart
Never Disable fsync in Production
PostgreSQL's fsync=off and MySQL's innodb_flush_log_at_trx_commit=0 feel like easy performance wins but break Durability. A server crash can corrupt your database in ways that aren't immediately obvious — transactions appear committed but their WAL records were never written to disk. This has caused real data loss at production companies. Use a battery-backed RAID controller or NVMe instead.
Production Insight
COMMIT latency is dominated by WAL fsync — typically 1-10ms on NVMe, 10-50ms on spinning disk.
Batching multiple small transactions into fewer larger ones reduces fsync overhead for bulk imports.
For the internals of what COMMIT actually does step-by-step, see the Database Locking Mechanisms and MVCC articles.
Key Takeaway
The WAL is the real mechanism behind both Atomicity and Durability — not magic, just careful append-only disk writes.
COMMIT is intentionally slow because it waits for a real fsync — disabling this is trading correctness for performance.
Crash recovery replays the WAL from the last checkpoint — your database survives power failures because of this.

ACID Pipeline — How a Transaction Flows Through the Engine

Understanding ACID in production means knowing the exact sequence of steps the database executes from BEGIN to COMMIT. The pipeline consists of five phases:

  1. Transaction Start: The database assigns a transaction ID (XID) and records the start in shared memory structures. For PostgreSQL, the snapshot data for MVCC is initialized.
  2. Statement Execution: Each SQL statement is parsed, planned, and executed. Data pages are read from the buffer pool (or disk if not cached). Modifications are applied to pages in memory, and WAL records are written to the WAL buffer in shared memory.
  3. Pre-commit: On COMMIT, the database writes a commit record to the WAL buffer. This record contains the transaction ID and a list of all data pages modified.
  4. WAL Flush (fsync): The WAL buffer is flushed to disk. This is the blocking step — the database waits for the OS to confirm the write reached stable storage. Only after this does the transaction become durable.
  5. Post-commit: The transaction's XID is marked as committed in the commit log (pg_clog or system table). Locks held by the transaction are released. The COMMIT acknowledgement is sent to the client. The actual data pages may still be in memory — they will be written to disk later by the background writer.

The pipeline is the same across PostgreSQL, MySQL InnoDB, and Oracle, with minor differences in WAL format and commit log storage.

Production Insight
The blocking step in production is the WAL fsync during COMMIT. If your COMMITs are slow, profile fsync latency with tools like iostat or pg_test_fsync. Batch multiple writes in one transaction to amortize fsync overhead. Never disable fsync — the performance gain is a data loss gamble.
Key Takeaway
The ACID pipeline has five phases — the WAL fsync is the bottleneck that guarantees durability. Data pages are written lazily; the WAL is what survives crashes.
ACID Pipeline Flow (PostgreSQL)
YesNoClient issues BEGINAssign XID, init snapshotExecuteUPDATE/INSERT/DELETEWrite WAL records to bufferClient issues COMMITWrite commit record to WALbufferfsync: flush WAL buffer to diskfsync successful?Mark XID committed in pg_clogRelease locksSend COMMIT OK to clientBackground writer writes datapages laterDatabase crash / corruptionrecoveryOn restart: WAL replay orrollback

Isolation Levels — Where ACID Gets Complicated and Performance Gets Real

Isolation is the trickiest ACID property because it's not binary — it's a spectrum. 'Full' isolation (Serializable) means transactions behave as if they ran one after another, but it's expensive. Most databases default to something weaker, and that's where bugs hide.

Read Uncommitted — a transaction can read rows another transaction modified but hasn't committed yet (dirty reads). Almost never appropriate.

Read Committed — you only see committed data, but the same SELECT inside one transaction can return different rows if another transaction commits between them (non-repeatable reads). PostgreSQL and SQL Server's default.

Repeatable Read — the same SELECT always returns the same rows within a transaction, but new rows inserted by other transactions may appear (phantom reads). MySQL InnoDB's default.

Serializable — full isolation. No dirty reads, non-repeatable reads, or phantom reads. PostgreSQL uses Serializable Snapshot Isolation (SSI); MySQL InnoDB uses gap locks.

The key insight: lower isolation = more concurrency = more anomaly risk. Choose deliberately.

isolation_levels_demo.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
-- Demonstrating non-repeatable read at READ COMMITTED
-- and how REPEATABLE READ prevents it
-- Run Session A and Session B in two separate connections

-- SESSION A
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;

-- First read: sees stock_count = 10
SELECT product_name, stock_count FROM product_inventory WHERE product_id = 1;

-- *** Switch to Session B and commit its update ***

-- Second read:
-- READ COMMITTED: returns 3 (Session B's commit is visible)
-- REPEATABLE READ: returns 10 (snapshot taken at transaction start)
SELECT product_name, stock_count FROM product_inventory WHERE product_id = 1;
COMMIT;

-- SESSION B (runs while Session A is open)
BEGIN;
UPDATE product_inventory SET stock_count = stock_count - 7 WHERE product_id = 1;
COMMIT;

-- Check your current isolation level
SHOW transaction_isolation;  -- PostgreSQL
SELECT @@transaction_isolation;  -- MySQL
Output
-- Session A first read: Mechanical Keyboard | 10
-- Session B commits (stock is now 3)
-- READ COMMITTED second read: 3 (non-repeatable read)
-- REPEATABLE READ second read: 10 (stable snapshot)
MVCC vs Locking
PostgreSQL and Oracle implement isolation using MVCC — readers never block writers because old row versions are stored for concurrent transactions. MySQL InnoDB uses MVCC for reads plus gap locks for Repeatable Read. SQL Server offers both MVCC (snapshot isolation) and locking modes. Knowing this explains why the same isolation level can have different performance characteristics across databases.
Production Insight
PostgreSQL's default READ COMMITTED means any two SELECTs in the same transaction can return different data if another transaction commits between them.
For financial reports or any query that must be internally consistent, explicitly open with REPEATABLE READ.
Serializable is the right choice for double-booking prevention (seat reservations, inventory allocation) but requires retry logic for serialization failures.
Key Takeaway
Isolation level is a deliberate choice, not a default-and-forget setting.
READ COMMITTED allows non-repeatable reads — the same query in the same transaction can return different rows.
Pick REPEATABLE READ for financial reports; SERIALIZABLE for double-booking prevention with retry logic.

Isolation Levels Comparison Matrix (ANSI Standards)

The SQL standard defines four isolation levels based on which concurrency phenomena they prevent. The matrix below maps each level to the three classic anomalies plus a fourth that some standards include:

LevelDirty ReadNon-Repeatable ReadPhantom ReadSerialization Anomaly
READ UNCOMMITTEDPossiblePossiblePossiblePossible
READ COMMITTEDPreventedPossiblePossiblePossible
REPEATABLE READPreventedPreventedPossiblePossible
SERIALIZABLEPreventedPreventedPreventedPrevented

Dirty Read: Reading a row that has been modified by another transaction that has not yet committed. If that transaction later rolls back, the reader has seen uncommitted (and now invalid) data.

Non-Repeatable Read: The same query returns different rows within the same transaction because another transaction committed an update or delete in between.

Phantom Read: A transaction re-executes a query with a WHERE condition and finds new rows that were inserted by another committed transaction. The set of rows changes (like a phantom appearing).

Serialization Anomaly: The result of concurrent transactions is not equivalent to any serial (one-after-another) execution. Serializable prevents all anomalies.

In practice, most databases default to READ COMMITTED (PostgreSQL, SQL Server) or REPEATABLE READ (MySQL InnoDB). Know your database's default and choose intentionally.

Production Insight
When designing a system for high concurrency, start with READ COMMITTED but test for anomalies. If your business logic assumes consistent reads (e.g., reading a user's balance then updating it), set the isolation level to REPEATABLE READ for that transaction. For critical business rules like double-booking prevention, use SERIALIZABLE with retry logic.
Key Takeaway
The ANSI matrix is the standard tool for understanding isolation — always check which anomalies your isolation level permits and whether your application can tolerate them.

Concurrency Phenomena Quick-Reference Table

Beyond dirty reads, non-repeatable reads, and phantom reads, there are other phenomena that affect correctness under concurrent transactions. This table covers the full set a working developer needs to recognize:

PhenomenonDescriptionPrevented ByCommon Example
Dirty ReadRead uncommitted changes from another transactionREAD COMMITTED+A report includes a row that later gets rolled back
Non-Repeatable ReadSame row read twice gives different values within one transactionREPEATABLE READ+Fetching a product price, then computing discount — price changed in between
Phantom ReadA query returns different sets of rows in the same transactionSERIALIZABLE, or REPEATABLE READ with gap locksInventory query: 'SELECT * WHERE type=book' returns 10 books, then later 11
Lost UpdateTwo transactions read and then overwrite the same row, losing one updateRow locking (SELECT FOR UPDATE) or SERIALIZABLETwo users decrement stock: both read stock=5, both write stock=4, losing one sale
Write SkewTwo transactions read overlapping data sets and make conflicting writes without directly updating the same rowSERIALIZABLETwo doctors on-call: both read no other doctor on call, both go off call, leaving no coverage
Read SkewAn inconsistent snapshot where a transaction sees a mix of pre- and post-commit data from another transactionREPEATABLE READ+ in MVCC, or snapshot isolationRead a user's address and then their order — address changed by another transaction in between

Understanding these phenomena is critical when debugging production issues like phantom stock counts or double-booking. Each isolation level and concurrency control mechanism targets a subset of these.

Production Insight
Lost updates are the most common concurrency bug in production — always use SELECT ... FOR UPDATE when reading a value that you will later update. Write skew is harder to detect; SERIALIZABLE isolation is the safest defense but requires retry logic.
Key Takeaway
Concurrency phenomena go beyond the classic three — lost updates and write skew are equally dangerous in production. Use explicit row locking for read-then-write patterns.

Consistency, Savepoints and Long Transaction Dangers in Production

Consistency is the ACID property that's most misunderstood. It doesn't mean 'the data is correct' in a business logic sense — that's your application's job. What it means is that every transaction takes the database from one valid state to another, where 'valid' is defined by schema constraints: foreign keys, CHECK constraints, UNIQUE indexes, and NOT NULL columns.

SAVEPOINTs extend this by letting you create partial rollback points inside a long transaction. Instead of aborting everything on a recoverable error, you can roll back to a named savepoint and retry just that sub-operation. This is invaluable for bulk import scripts and multi-step wizards.

But long-running transactions are expensive regardless of what they're doing. In PostgreSQL, an open transaction holds back autovacuum from reclaiming dead rows, causing table bloat. In MySQL InnoDB, long transactions hold undo log space. The rule: keep transactions as short as possible. Do computation before opening BEGIN, not inside it.

savepoints_and_constraints.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
-- Bulk import with per-row error recovery using SAVEPOINTs
BEGIN;

SAVEPOINT before_emp_1;
INSERT INTO employees (full_name, dept_id, salary)
VALUES ('Priya Sharma', 1, 95000.00);  -- valid, dept_id 1 exists

SAVEPOINT before_emp_2;
INSERT INTO employees (full_name, dept_id, salary)
VALUES ('Carlos Mendez', 999, 80000.00);  -- FK violation: dept_id 999 missing
-- Caught at application layer, roll back only this insert
ROLLBACK TO SAVEPOINT before_emp_2;
INSERT INTO import_error_log (attempted_name, reason, logged_at)
VALUES ('Carlos Mendez', 'dept_id 999 not found', NOW());

SAVEPOINT before_emp_3;
INSERT INTO employees (full_name, dept_id, salary)
VALUES ('Aisha Okonkwo', 2, 72000.00);  -- valid

SAVEPOINT before_emp_4;
INSERT INTO employees (full_name, dept_id, salary)
VALUES ('Tom Briggs', 1, -5000.00);  -- CHECK violation: salary > 0
ROLLBACK TO SAVEPOINT before_emp_4;
INSERT INTO import_error_log (attempted_name, reason, logged_at)
VALUES ('Tom Briggs', 'salary must be positive', NOW());

COMMIT;  -- Priya and Aisha inserted; Carlos and Tom logged as errors
Output
-- employees: Priya Sharma (Engineering, 95000) | Aisha Okonkwo (Marketing, 72000)
-- import_error_log: Carlos Mendez (FK error) | Tom Briggs (CHECK error)
Long Transactions Kill PostgreSQL Performance
Run: SELECT pid, now() - pg_stat_activity.query_start AS duration, query, state FROM pg_stat_activity WHERE state != 'idle' ORDER BY duration DESC to spot long-running transactions. Any transaction open for more than a few seconds on a write-heavy table blocks autovacuum and accumulates dead tuple bloat. Set idle_in_transaction_session_timeout = '30s' in postgresql.conf as a safety net.
Production Insight
Consistency in ACID means constraint enforcement — the database rejects FK violations, CHECK failures, and UNIQUE conflicts at COMMIT time.
SAVEPOINTs are the right tool for bulk imports: process hundreds of rows in one transaction, recover per-row errors without aborting everything.
Long transactions in PostgreSQL prevent VACUUM from reclaiming dead rows — table bloat is a direct consequence of leaving transactions open.
Key Takeaway
Consistency = constraint enforcement (FK, CHECK, UNIQUE, NOT NULL) — the database is the last line of defense.
SAVEPOINTs allow partial rollback within a transaction — invaluable for bulk operations with recoverable errors per row.
Keep transactions short — open BEGIN as late as possible, COMMIT as early as possible.

SAVEPOINTs for Complex Business Logic — Nested Error Recovery Patterns

When business logic requires multiple steps that each have their own recovery path, SAVEPOINTs enable a nested error handling pattern that preserves the entire transaction's progress. This is especially useful in scenarios like:

  • Batch processing with per-record error handling: Process all records in one transaction; if one record fails (uniqueness, FK violation), log the error and continue with the next record. Without SAVEPOINTs, the entire batch would roll back.
  • Multi-step financial workflows: A payment capture, then a loyalty points update, then an inventory deduction. If the points update fails (e.g., connection timeout to a remote service), you want to roll back only the points step and retry, not the entire payment.
  • Compound validation: Validate a set of business rules one by one, rolling back each invalid step without losing earlier valid work. At the end, commit only if all sub-validations passed.

``java try { connection.setAutoCommit(false); // Step 1: create order Savepoint sp1 = connection.setSavepoint("order_created"); insertOrder(); // Step 2: add line items Savepoint sp2 = connection.setSavepoint("items_added"); insertLineItems(); // If line items fail, rollback to sp2 and log } catch (SQLException e) { connection.rollback(sp2); // rollback only items logError(e); // retry step 2 or continue } finally { connection.releaseSavepoint(sp2); } ``

Note: SAVEPOINTs have a memory cost — each savepoint retains a snapshot of the transaction state. For large batch sizes, release savepoints as soon as their work is finalized.

Production Insight
Use SAVEPOINTs for per-row error recovery in bulk operations, but be aware that each SAVEPOINT holds state in the transaction's memory. For imports of 10,000+ rows, consider releasing savepoints after every N rows (e.g., batch of 100) to limit memory growth. Also, SAVEPOINT names must be unique within a transaction — use an incrementing counter or UUID.
Key Takeaway
SAVEPOINTs enable nested error recovery within a single transaction — use them for multi-step business logic where partial failures can be rescued without rolling back all work.

Atomicity Failure: The Partial Write That Burns You at 2 AM

Every dev thinks they understand atomicity until a power loss hits mid-insert and they're staring at orphaned invoice headers with no line items. Atomicity isn't a feature — it's a contract. The database promises that your transaction either fully commits or fully rolls back. No half-state. Ever. The mechanism? Write-ahead logs and undo segments. Before the engine touches a data page, it logs the 'before' image. If the transaction aborts — whether from a CHECK constraint violation or a dropped network packet — the engine replays those images to restore the original state. Here's what most junior devs miss: atomicity costs you. Every write generates log traffic. Every rollback requires reading those logs back. Watch your log buffer sizes and disk I/O latency. If you're pushing 10k row updates in a single transaction, you're betting the hardware won't blink. And in production, hardware blinks.

AtomicityFailure.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
// io.thecodeforge — database tutorial

-- Partial insert that violates FK mid-transaction
BEGIN TRANSACTION;

INSERT INTO invoice_headers (id, customer_id, total)
VALUES (5001, 2047, 1500.00);

-- This insert fails: customer 2047 doesn't exist as supplier
INSERT INTO invoice_line_items (invoice_id, sku, qty, price)
VALUES (5001, 'WIDGET-A', 10, 150.00);

-- ERROR: insert or update on table violates foreign key constraint

-- Transaction rolls back automatically on error
-- Both inserts are discarded. Zero partial writes.
Output
ERROR: insert or update on table "invoice_line_items" violates foreign key constraint "fk_supplier"
DETAIL: Key (supplier_id)=(2047) is not present in table "suppliers".
ROLLBACK
Production Trap: Implicit Transactions
Some ORMs wrap every single INSERT in its own transaction. This kills atomicity for multi-row operations. Always batch related writes in explicit BEGIN/COMMIT blocks.
Key Takeaway
Atomicity means all-or-nothing, not 'try my best'. If you see partial data after a crash, your engine's WAL is misconfigured or you bypassed the transaction boundary.

Durability: The Lie of fsync and What Actually Survives a Power Pull

Durability sounds simple: once COMMIT returns, data stays. That's marketing. In reality, durability depends on the write-ahead log flushing to persistent storage before the commit acknowledgment reaches your client. PostgreSQL does this by default — fsync on every commit. MySQL with InnoDB flushes the redo log. But here's the catch: hardware lies. Disk controllers cache writes in volatile RAM. RAID cards lie about flushing. Cloud SSDs throttle under load and report success before bits hit silicon. Saw a production incident where a RAID controller battery failed silently. Commits returned OK. Server crashed. Lost four minutes of transaction data. The fix? Disable disk write cache, enable write-back barriers, and for critical systems, use synchronous replication to at least two nodes. Test your durability by pulling the plug on a staging server. Read the survivor's guide: synchronous_commit = on in Postgres, innodb_flush_log_at_trx_commit = 1 in MySQL.

DurabilityCheck.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
// io.thecodeforge — database tutorial

-- How to verify your durability settings in PostgreSQL
SELECT name, setting, unit
FROM pg_settings
WHERE name IN (
    'fsync',
    'synchronous_commit',
    'wal_sync_method',
    'full_page_writes'
);

-- Expected output for guaranteed durability:
-- fsync                 | on
-- synchronous_commit    | on
-- wal_sync_method       | open_datasync | fdatasync
-- full_page_writes      | on
Output
name | setting | unit
---------------------+---------+------
fsync | on |
synchronous_commit | on |
wal_sync_method | fdatasync |
full_page_writes | on |
Senior Shortcut: The fsync Tax
Each fsync costs 1-5ms. If your commit rate exceeds 200/sec, you need faster storage or batch commits. Durability is a tradeoff: every millisecond of fsync latency adds direct pressure to your transaction throughput.
Key Takeaway
Durability isn't COMMIT returning — it's COMMIT returning after the WAL is on disk. If you can't pull the power cord and recover every committed transaction, you don't have durability.
● Production incidentPOST-MORTEMseverity: high

Silent Partial Orders — A Missing ROLLBACK in the Exception Handler

Symptom
CustomerService reported 12-15 'empty orders' per day. Each had a valid order_id, a charged payment, but zero rows in the order_items table. No exceptions appeared in application logs.
Assumption
The developer assumed that if the INSERT into order_items failed, the exception handler would prevent the COMMIT from running. The code caught the exception, logged a warning, and allowed execution to continue to the commit call.
Root cause
The transaction flow was: INSERT into orders (succeeded) → INSERT into order_items (raised exception) → exception caught and swallowed with a log.warn() → COMMIT executed. The COMMIT committed the order header without the items. The exception handler never called rollback().
Fix
Added explicit rollback() in the catch block before re-throwing. Added a post-commit assertion: if order_items count for the new order_id is zero, raise an alert. Added an integration test that injects an order_items failure and asserts the order header is also rolled back.
Key lesson
  • A caught exception does not prevent COMMIT — you must explicitly call rollback() before re-throwing or returning
  • Never swallow database exceptions with only a log.warn() inside a transaction — always either rollback or abort
  • Add post-commit assertions for critical multi-table writes — catch partial states that the exception handler missed
Production debug guideDiagnosing deadlocks, long transactions, and isolation anomalies5 entries
Symptom · 01
Deadlock detected errors on a high-traffic table
Fix
Run: SELECT pid, wait_event_type, wait_event, query FROM pg_stat_activity WHERE wait_event_type = 'Lock'. Check which transactions are blocking each other. Fix: ensure all code paths acquire locks in the same table order. Deadlocks happen when Transaction A locks table X then Y, while Transaction B locks Y then X.
Symptom · 02
Connection pool exhaustion — requests queuing behind long transactions
Fix
Run: SELECT pid, now() - query_start AS duration, state, query FROM pg_stat_activity WHERE state != 'idle' ORDER BY duration DESC. Any transaction open for more than a few seconds on a write-heavy table is a problem. Add idle_in_transaction_session_timeout = '30s' to postgresql.conf.
Symptom · 03
Table bloat growing despite regular deletes
Fix
A long-running transaction is blocking autovacuum. Run: SELECT query, state, now() - xact_start AS open_for FROM pg_stat_activity WHERE xact_start IS NOT NULL ORDER BY xact_start. Terminate the oldest idle-in-transaction sessions with pg_terminate_backend(pid).
Symptom · 04
UPDATE seems to have committed but the change isn't visible
Fix
Check AUTOCOMMIT setting: SHOW autocommit (PostgreSQL) or SELECT @@autocommit (MySQL). If your driver has autocommit=off, statements need an explicit COMMIT. Also verify the connection isn't in a failed transaction state — run SELECT txid_current_if_assigned() to check.
Symptom · 05
Same SELECT returns different counts within one transaction
Fix
You are at READ COMMITTED isolation level (the PostgreSQL default). Each statement gets a fresh snapshot — another transaction committed between your two SELECTs. Switch to REPEATABLE READ for this transaction: BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ.
Isolation LevelDirty ReadNon-Repeatable ReadPhantom ReadDefault InUse Case
READ UNCOMMITTEDPossiblePossiblePossibleRarely appropriate — analytics on non-critical data only
READ COMMITTEDPreventedPossiblePossiblePostgreSQL, SQL ServerOLTP default — APIs where slight read variance is acceptable
REPEATABLE READPreventedPreventedPossible*MySQL InnoDBFinancial reports, multi-read consistency within one transaction
SERIALIZABLEPreventedPreventedPreventedDouble-booking prevention, regulatory compliance, inventory systems

Key takeaways

1
The WAL is the real mechanism behind Atomicity and Durability
COMMIT waits for fsync by design. Disabling fsync trades correctness for performance, a deal almost never worth making.
2
Isolation is a spectrum
READ COMMITTED (the PostgreSQL default) allows non-repeatable reads. Choose your isolation level deliberately based on what anomalies your business logic can tolerate.
3
SAVEPOINTs allow partial rollbacks inside long transactions
the right tool for bulk imports where per-row errors shouldn't abort the entire batch.
4
Consistency is enforced by constraints at the database layer
FK, CHECK, UNIQUE, and NOT NULL are your last line of defense against invalid data.

Common mistakes to avoid

3 patterns
×

Wrapping entire request handlers in a single long transaction

Symptom
Database connection pool exhaustion under load — p99 latency spikes as transactions queue behind each other holding row locks for hundreds of milliseconds
Fix
Open the transaction as late as possible and commit as early as possible. Move all validation, external API calls, and computation outside the BEGIN...COMMIT block. A transaction should contain only the actual DML statements and ideally complete in under 10ms.
×

Swallowing exceptions inside a transaction without rolling back

Symptom
Application logs show an error but the transaction commits a partial state — orders created without line items, transfers debited but not credited
Fix
Always call rollback() explicitly in the exception handler before re-throwing. Never let a caught exception allow execution to reach COMMIT. Add a post-commit assertion to catch partial states that slip through.
×

Assuming AUTOCOMMIT=OFF when it is actually ON

Symptom
In MySQL and many JDBC drivers, every statement is its own transaction by default. Developers write UPDATE statements expecting they can ROLLBACK later, only to find the changes are already permanently committed
Fix
Always call BEGIN (or START TRANSACTION) explicitly before multi-statement logic regardless of driver defaults. Check: SELECT @@autocommit (MySQL) or SHOW autocommit (PostgreSQL sessions).
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR
Explain what happens mechanically inside PostgreSQL between COMMIT and t...
Q02SENIOR
Your team is seeing intermittent deadlock errors on a high-traffic order...
Q03SENIOR
A colleague suggests switching from READ COMMITTED to READ UNCOMMITTED t...
Q01 of 03SENIOR

Explain what happens mechanically inside PostgreSQL between COMMIT and the application receiving a success response.

ANSWER
When COMMIT is issued: (1) PostgreSQL writes a commit record to the WAL (Write-Ahead Log) buffer in shared memory. (2) The WAL buffer is flushed to disk via fsync — this is the expensive blocking step that guarantees durability. (3) The transaction's XID is marked as committed in the pg_clog (commit log). (4) Only after the fsync completes does the database send the success response to the client. The data pages in memory may not have been written to disk yet — that happens lazily by the background writer and checkpointer. The WAL flush is what makes the commit durable: on crash recovery, PostgreSQL replays the WAL from the last checkpoint and applies any committed changes whose data pages weren't flushed yet.
FAQ · 3 QUESTIONS

Frequently Asked Questions

01
What is the difference between ROLLBACK and ROLLBACK TO SAVEPOINT?
02
Does every SQL database support ACID transactions?
03
If two transactions update the same row simultaneously, which one wins?
N
Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Everything here is grounded in real deployments.

Follow
Verified
production tested
May 24, 2026
last updated
1,554
articles · all by Naren
🔥

That's SQL Advanced. Mark it forged?

9 min read · try the examples if you haven't

Previous
SQL Triggers
5 / 16 · SQL Advanced
Next
SQL Window Functions