
ACID Properties in DBMS: The Foundations of Database Reliability

📍 Part of: DBMS → Topic 2 of 11
Master ACID properties (Atomicity, Consistency, Isolation, Durability) in DBMS.
⚙️ Intermediate — basic CS Fundamentals knowledge assumed
In this tutorial, you'll learn
  • Atomicity: all-or-nothing transactions implemented via WAL and undo logs. Without explicit BEGIN/COMMIT boundaries, auto-commit mode makes each statement its own transaction and a crash between statements leaves unrecoverable partial state.
  • Consistency: the database moves from one valid state to another by enforcing all constraints. Application-level validation should catch business rule violations before they reach the database; constraints are the safety net, not the primary check.
  • Isolation: a speed-versus-safety dial from READ COMMITTED (fast, allows non-repeatable reads) to SERIALIZABLE (safe, 20-40% slower). Default to READ COMMITTED and increase isolation only for specific transactions with demonstrated anomalies.
Quick Answer
  • ACID guarantees transactional reliability: Atomicity (all-or-nothing), Consistency (valid state transitions only), Isolation (concurrent transactions do not interfere), Durability (committed data survives crashes)
  • Atomicity is implemented via Write-Ahead Log and undo segments — partial failures roll back completely without leaving orphaned state
  • Isolation uses MVCC or locking to prevent dirty reads, lost updates, and phantom reads — the level you choose is a speed-versus-safety dial
  • Durability flushes WAL to disk via fsync before COMMIT returns — this is the most expensive operation in any database write path
  • Setting synchronous_commit=off in PostgreSQL boosts throughput 5-10x but risks losing up to 600ms of committed data on crash — never acceptable for financial data
  • The biggest production mistake: assuming higher isolation levels are always safer — SERIALIZABLE can cut throughput by 40% without preventing anything that READ COMMITTED with proper application logic would not handle
🚨 START HERE
ACID Debugging Cheat Sheet
Quick diagnostic commands for transactional issues in production PostgreSQL systems.
🟡 Transaction stuck in 'idle in transaction' state — holding locks, blocking other queries
Immediate Action: Identify and terminate long-running stalled transactions that are holding locks without doing work.
Commands
SELECT pid, now() - xact_start AS duration, state, query FROM pg_stat_activity WHERE state = 'idle in transaction' ORDER BY duration DESC;
SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle in transaction' AND now() - xact_start > interval '5 minutes';
Fix Now: Set idle_in_transaction_session_timeout = '30s' in postgresql.conf to automatically terminate stalled transactions. Monitor this metric continuously — a single idle-in-transaction session can hold locks that block the entire application tier.
🟡 Deadlock detected in application logs — transactions are mutually blocking each other
Immediate Action: Identify which queries are involved in the deadlock cycle and what locks they hold.
Commands
SELECT blocked.pid AS blocked_pid, blocked.query AS blocked_query, blocking.pid AS blocking_pid, blocking.query AS blocking_query FROM pg_catalog.pg_locks bl JOIN pg_catalog.pg_stat_activity blocked ON blocked.pid = bl.pid JOIN pg_catalog.pg_locks bbl ON bbl.locktype = bl.locktype AND bbl.relation = bl.relation AND bbl.pid != bl.pid JOIN pg_catalog.pg_stat_activity blocking ON blocking.pid = bbl.pid WHERE NOT bl.granted;
SELECT relname, n_dead_tup, last_autovacuum FROM pg_stat_user_tables WHERE n_dead_tup > 10000 ORDER BY n_dead_tup DESC;
Fix Now: Enforce consistent lock ordering in application code — always acquire locks in a deterministic order such as primary key ascending or table name alphabetical. If the deadlock involves a hot table with millions of dead tuples, run VACUUM ANALYZE on that table immediately — dead tuple bloat increases lock contention.
🟡 WAL disk usage growing uncontrollably — disk space alerts firing on the WAL partition
Immediate Action: Check whether inactive replication slots are retaining WAL that will never be consumed.
Commands
SELECT slot_name, active, pg_size_pretty( pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) ) AS retained_wal FROM pg_replication_slots ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC;
SELECT pg_size_pretty(sum(size)) AS total_wal_size FROM pg_ls_waldir();
Fix Now: Drop any replication slot that shows active = false and is retaining gigabytes of WAL: SELECT pg_drop_replication_slot('slot_name'). Set max_slot_wal_keep_size in postgresql.conf to cap how much WAL any single slot can retain. An inactive replication slot is a silent time bomb — it prevents WAL recycling indefinitely.
Production Incident
Double-Debit: $47,000 Lost to Missing Atomicity in a Payment Batch
A fintech startup ran a batch payment job without wrapping individual transfers in transactions. A mid-batch OOM kill left 230 accounts debited but zero payments processed — $47,000 vanished with no recovery path.
Symptom: 230 customer accounts showed unexplained negative balances Monday morning. The external payment processor confirmed no funds received for those accounts. Customer support was flooded with 1,200 tickets by 9 AM. The internal ledger showed debits with no matching payment records, and the discrepancy grew by the hour as more customers checked their accounts.
Assumption: The team assumed the batch job was atomic because it ran as a single Java process inside a single JVM. They treated process-level execution as equivalent to database-level atomicity. They did not wrap each payment transfer in a BEGIN/COMMIT transaction block because they believed the ORM's implicit auto-commit was sufficient.
Root cause: The batch job executed 10,000 UPDATE statements to debit customer accounts, then 10,000 INSERT statements to create payment records in the payment log table. Each SQL statement ran in auto-commit mode — every individual UPDATE and INSERT was committed immediately upon execution. When the JVM ran out of memory at statement 4,700 of the INSERT phase, all 10,000 debits had already been committed individually, but only 4,699 of the 10,000 payment records had been created. There was no transaction boundary encompassing both the debit and the payment log insert for any individual transfer. The ORM's auto-commit mode meant each statement was its own transaction — atomicity existed at the statement level but not at the business-logic level.
Fix:
1. Wrapped each payment transfer in an explicit BEGIN/COMMIT block: debit the customer account, insert the payment record, and update the internal ledger — all within a single database transaction. If any step fails, the entire transfer rolls back and the customer's balance is unchanged.
2. Added SAVEPOINT every 100 records so that a failure at record 4,700 rolls back only to the SAVEPOINT at record 4,600, and the batch can resume from that checkpoint rather than reprocessing all 10,000 records from scratch.
3. Added a reconciliation query that runs hourly in production: count of debits versus count of payment records per batch. Any mismatch greater than zero triggers a critical alert immediately.
4. Added a circuit breaker that halts the batch if more than 5 consecutive individual transfers fail, preventing cascading damage.
5. Configured JVM memory monitoring with alerts at 80% heap usage, so OOM conditions are detected before they kill the process mid-batch.
Key Lesson
  • Never assume a batch process is atomic because it runs in a single process. Atomicity is a database-level property enforced by explicit transaction boundaries, not a JVM-level property enforced by a single process.
  • Use SAVEPOINT inside long-running batches to enable partial rollback and resume — reprocessing 10,000 records because record 4,701 failed is a waste of compute and customer patience.
  • Monitor for debit-to-payment count mismatches in production. A 1% drift between related tables is not a minor accounting issue — it is a critical data integrity alert that should page someone immediately.
  • OOM kills, SIGKILL signals, and hardware failures do not trigger transaction rollback if no transaction boundary exists. Auto-commit mode means every individual statement is its own transaction, and a failure between statements leaves the database in a state that no single rollback can fix.
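The hourly reconciliation check from the fix list fits in a few lines. Here is a sketch using Python's sqlite3; the debits and payments tables and the batch_id column are invented for illustration, and a real deployment would run the equivalent SQL against the production database on a schedule.

```python
import sqlite3

def reconcile(con: sqlite3.Connection, batch_id: int) -> int:
    """Return the debit-to-payment mismatch for one batch.
    Anything other than 0 is a data integrity incident, not an accounting quirk."""
    debits, = con.execute(
        "SELECT COUNT(*) FROM debits WHERE batch_id = ?", (batch_id,)).fetchone()
    payments, = con.execute(
        "SELECT COUNT(*) FROM payments WHERE batch_id = ?", (batch_id,)).fetchone()
    return debits - payments

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE debits (batch_id INTEGER, account_id INTEGER)")
con.execute("CREATE TABLE payments (batch_id INTEGER, account_id INTEGER)")
# Simulate a batch that crashed after 3 debits but only 2 payment records
con.executemany("INSERT INTO debits VALUES (1, ?)", [(1,), (2,), (3,)])
con.executemany("INSERT INTO payments VALUES (1, ?)", [(1,), (2,)])

print(reconcile(con, batch_id=1))  # 1 -> mismatch, trigger critical alert
```

Any nonzero result should page someone; waiting for customers to find the drift is what turned this incident into 1,200 tickets.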
Production Debug Guide
Diagnosing transactional failures in production PostgreSQL and MySQL systems. These symptoms appear silently and compound over time.
Duplicate charges appear after a timeout-retry cycle — customer is billed twice for the same order
Check whether the retry logic reuses the same transaction or creates a new one. The original transaction may have committed successfully but the response timed out before reaching the application. Implement idempotency keys on the payment table with a unique constraint. Verify the original transaction status before retrying: SELECT * FROM pg_stat_activity WHERE state = 'idle in transaction' to check for abandoned connections. Use INSERT ... ON CONFLICT (idempotency_key) DO NOTHING to make retries safe.
Two concurrent updates overwrite each other silently — the lost update anomaly
Check the current isolation level: SHOW transaction_isolation. If READ COMMITTED (the PostgreSQL default), concurrent transactions can both read the same row value, compute a new value independently, and the second commit silently overwrites the first. Fix with either pessimistic locking (SELECT ... FOR UPDATE to lock the row before reading) or optimistic locking (add a version column and include AND version = :expected in the UPDATE WHERE clause). Choose pessimistic for high-conflict workloads, optimistic for low-conflict.
Committed data disappears after a server crash — users report seeing a success confirmation but the data is gone
Check the synchronous_commit setting: SHOW synchronous_commit. If set to off, committed data exists only in the OS page cache until the wal_writer flushes it to disk. A crash within that window loses the data. Set synchronous_commit = on for any table that stores data a user has seen confirmed. Monitor WAL flush lag with SELECT pg_current_wal_lsn() and compare against the last flushed position.
Deadlocks occurring every few minutes under normal load — application logs show 'deadlock detected' errors
Enable deadlock logging: SET deadlock_timeout = '1s' (the default is already 1s in PostgreSQL). Query pg_stat_activity for blocked and blocking PIDs. The most common cause is inconsistent lock ordering: Transaction A locks row 1 then row 2, while Transaction B locks row 2 then row 1. Fix by establishing a canonical lock ordering in application code — always acquire locks in the same deterministic order (e.g., by primary key ascending or table name alphabetical).
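The canonical lock ordering fix can be sketched without a live database. In the hypothetical Python sketch below, RecordingCursor stands in for a DB-API cursor and only records which rows get locked; the table name and SELECT ... FOR UPDATE syntax are PostgreSQL-style. The point is that both transfer directions lock the rows in the same sorted order, so a wait cycle cannot form.

```python
class RecordingCursor:
    """Stand-in for a DB-API cursor; records which row each lock touches."""
    def __init__(self):
        self.locked = []
    def execute(self, sql, params):
        if "FOR UPDATE" in sql:
            self.locked.append(params[0])

def transfer(cur, src: int, dst: int) -> None:
    # Lock both rows in ascending primary-key order BEFORE updating,
    # regardless of which account is the source and which is the destination.
    for row_id in sorted((src, dst)):
        cur.execute("SELECT 1 FROM accounts WHERE id = ? FOR UPDATE", (row_id,))
    cur.execute("UPDATE accounts SET balance = balance - 1 WHERE id = ?", (src,))
    cur.execute("UPDATE accounts SET balance = balance + 1 WHERE id = ?", (dst,))

a, b = RecordingCursor(), RecordingCursor()
transfer(a, 7, 3)   # Transaction A: 7 -> 3
transfer(b, 3, 7)   # Transaction B: 3 -> 7
print(a.locked, b.locked)  # [3, 7] [3, 7]; identical order, no cycle possible
```

Both transactions wait on row 3 first, so the worst case is serialization behind a lock, never a deadlock abort.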
Queries return stale data despite recent writes — a read immediately after a write returns the old value
Determine whether the read is hitting a replica or the primary. Check replication lag: SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag. If lag is the cause, either route critical reads to the primary or wait for the replica to catch up before serving the response. If the read hits the primary, check for a long-running transaction holding an old MVCC snapshot: a transaction sees data as of its snapshot, not the latest commits. Dead tuples from un-vacuumed tables do not make committed data invisible, but the resulting bloat slows queries enough to widen read-after-write windows. Run SELECT relname, n_dead_tup FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 10 to identify bloated tables.

ACID is the set of properties that makes a database trustworthy. Without ACID, a power outage at the wrong millisecond could leave your data in a corrupted, half-updated state — a nightmare for financial records, medical systems, or any application where partial state is not just wrong but dangerous. Imagine a bank transfer: if the system fails after debiting your account but before crediting the recipient, that money vanishes. Not temporarily missing, not in a reconciliation queue — gone. ACID prevents this.

These four properties work in tandem, implemented through sophisticated internal mechanisms like transaction logs, locking protocols, and multi-version concurrency control. As a developer, understanding ACID is not about passing interviews. It is about understanding the performance trade-offs you make every time you choose an isolation level, decide between a traditional RDBMS and a relaxed NoSQL system, or configure how aggressively your database flushes data to disk.

A common misconception is that ACID is binary — either a database has it or it does not. In reality, every property has configurable levels that trade safety for throughput. PostgreSQL lets you trade durability for speed with synchronous_commit. MySQL lets you choose between InnoDB's row-level MVCC and MyISAM's table-level locking. The properties exist on a spectrum, and production tuning means knowing which dial to turn, how far to turn it, and what breaks if you go too far.

Atomicity — The All-or-Nothing Guarantee

Atomicity ensures that a transaction is treated as a single, indivisible unit. If any part of the transaction fails — a constraint violation, a timeout, a crash, an OOM kill — the entire operation is rolled back, leaving the database in the exact state it was in before the transaction started. There is no partial commit. There is no state where half the work is done and the other half is not.

The primary implementation mechanism is the Write-Ahead Log (WAL). Before any data pages are modified on disk, the intended changes are appended sequentially to the WAL. If a crash occurs mid-transaction, the database reads the WAL during recovery and identifies any transaction that started but never committed. Those transactions are rolled back — their changes are undone as if they never happened.

In MySQL/InnoDB, atomicity uses the undo log in addition to the redo log. When a transaction modifies a row, the original value is copied to the undo log before the change is applied to the data page. If ROLLBACK is issued — or if the server crashes before COMMIT — the database replays the undo log to restore every modified row to its pre-transaction state.

The critical production insight is this: atomicity only exists within explicit transaction boundaries. If you run 10,000 SQL statements in auto-commit mode, each statement is its own transaction. A crash at statement 4,700 means 4,699 statements are permanently committed and 5,301 never happened — with no mechanism to identify or recover the boundary. The $47,000 incident described above is exactly this failure mode.
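The gap between statement-level and business-level atomicity is easy to reproduce outside PostgreSQL. A minimal sketch using Python's sqlite3 (the accounts table and amounts are invented for illustration): the same batch with a simulated crash leaves ten partial debits in auto-commit mode and zero when wrapped in an explicit transaction.

```python
import sqlite3

def run_batch(explicit_txn: bool) -> int:
    """Debit 10 accounts, then 'crash' before any payment records are written.
    Returns how many debits survived the crash."""
    con = sqlite3.connect(":memory:", isolation_level=None)  # auto-commit mode
    con.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    con.executemany("INSERT INTO accounts VALUES (?, 100)", [(i,) for i in range(10)])
    try:
        if explicit_txn:
            con.execute("BEGIN")  # explicit transaction boundary
        for i in range(10):
            con.execute("UPDATE accounts SET balance = balance - 50 WHERE id = ?", (i,))
        raise RuntimeError("simulated OOM kill before the payment INSERT phase")
    except RuntimeError:
        con.rollback()  # no-op in auto-commit mode: each UPDATE already committed

    debited = con.execute(
        "SELECT COUNT(*) FROM accounts WHERE balance < 100").fetchone()[0]
    con.close()
    return debited

print(run_batch(explicit_txn=False))  # 10 debits persist, no payment records
print(run_batch(explicit_txn=True))   # 0 debits persist, clean rollback
```

The rollback call is identical in both branches; the only difference is whether a transaction boundary existed for it to act on.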

SAVEPOINT extends atomicity within a transaction. It creates a named checkpoint inside a running transaction. If a subsequent operation fails, you can ROLLBACK TO SAVEPOINT to undo only the work after the savepoint, while preserving everything before it. This is essential for long-running batches where reprocessing everything from scratch is not acceptable.

io/thecodeforge/db/atomic_order_flow.sql · SQL
-- ============================================================
-- Atomic E-commerce Order Flow
-- All three operations succeed together or none of them persist.
-- A crash between statement 2 and statement 3 rolls back everything.
-- ============================================================
BEGIN;

-- Step 1: Create the order header
INSERT INTO io_thecodeforge.orders (customer_id, status, total_amount)
VALUES (1024, 'PENDING_PAYMENT', 599.00);

-- Step 2: Reserve inventory — the heavy, contention-prone operation
-- The WHERE clause prevents negative stock without a CHECK constraint race
UPDATE io_thecodeforge.inventory
SET stock_count = stock_count - 1
WHERE product_id = 'SKU-99'
  AND stock_count > 0;

-- Verify the UPDATE actually affected a row (stock was available)
-- If 0 rows affected, the product is out of stock — abort the transaction
-- Application code checks the affected row count here

-- Step 3: Record the payment intent
INSERT INTO io_thecodeforge.payments (order_id, provider, amount, status)
VALUES (currval('orders_order_id_seq'), 'STRIPE', 599.00, 'INITIATED');

-- If ANY constraint violation occurs (FK, CHECK, UNIQUE),
-- the entire transaction is rolled back — no orphaned order,
-- no phantom inventory decrement, no unmatched payment record.
COMMIT;

-- ============================================================
-- SAVEPOINT example for batch processing
-- Enables partial rollback without losing the entire batch
-- ============================================================
BEGIN;

-- Records 1-100
-- ... 100 INSERT/UPDATE statements ...
SAVEPOINT batch_checkpoint_100;  -- checkpoint: records 1-100 are now protected

-- Records 101-200
-- ... 100 INSERT/UPDATE statements ...
SAVEPOINT batch_checkpoint_200;  -- checkpoint: records 1-200 are now protected

-- Records 201-300
-- ... 100 INSERT/UPDATE statements ...
-- Record 257 fails with a constraint violation

-- Roll back ONLY to the last checkpoint; records 1-200 survive
ROLLBACK TO SAVEPOINT batch_checkpoint_200;

-- Resume processing from record 201 with corrected data
-- ... retry logic here ...

COMMIT;  -- Everything that was not rolled back is now durable
▶ Output
INSERT 0 1
UPDATE 1
INSERT 0 1
COMMIT

SAVEPOINT batch_checkpoint_100
SAVEPOINT batch_checkpoint_200
ROLLBACK TO SAVEPOINT
COMMIT
Mental Model
Atomicity Is a Time Machine for Your Database
Atomicity lets the database rewind to the exact moment before a transaction started, as if the transaction never happened. The WAL is the recording; ROLLBACK is the rewind button.
  • WAL records the intent before the change is applied to data pages — crash recovery replays or undoes based on COMMIT status.
  • Undo log (InnoDB) or rollback segments store the original row values — ROLLBACK restores them atomically.
  • SAVEPOINT creates named checkpoints inside a transaction — partial rollback without losing all prior work.
  • Without explicit BEGIN/COMMIT boundaries, auto-commit mode means each statement is its own transaction — a crash between statements leaves an unrecoverable partial state.
  • OOM kills, SIGKILL, and hardware failures do not trigger application-level rollback logic. Only database-level transaction boundaries provide crash safety.
📊 Production Insight
The most common atomicity failure in production is batch jobs running in auto-commit mode. Each SQL statement commits immediately, and a crash mid-batch leaves an unrecoverable partial state.
A secondary failure mode is application-level retries that do not check whether the original transaction committed. The timeout fires, the application retries, and both the original and the retry commit — producing a duplicate charge.
Rule: every logical unit of work must be wrapped in explicit BEGIN/COMMIT. If it takes more than one SQL statement and they must succeed or fail together, it needs a transaction. Auto-commit is for interactive psql sessions, not for production application code.
🎯 Key Takeaway
Atomicity is implemented via WAL and undo logs — the database records intent before applying changes to data pages, enabling complete rollback on failure.
SAVEPOINT creates checkpoints inside a transaction for partial rollback without discarding all prior work.
The most dangerous production pattern: batch jobs in auto-commit mode that leave an unrecoverable partial state when the process is killed mid-execution.
When to Use SAVEPOINT vs Full ROLLBACK
IfSingle operation that must fully succeed or fully fail — an order placement, a fund transfer, a permission change
UseUse plain BEGIN/COMMIT — no SAVEPOINT needed. The transaction is small and indivisible.
IfMulti-step process where individual steps can fail without invalidating prior steps — an order with optional gift wrapping
UseUse SAVEPOINT before each optional step. ROLLBACK TO SAVEPOINT on failure, then continue with the remaining steps.
IfLong-running batch job processing thousands of records — nightly ETL, bulk import, mass update
UseUse SAVEPOINT every N records (100-1000 depending on record size). Enables resume from the last checkpoint on failure without reprocessing the entire batch.
IfNested business logic where an inner operation's failure should not discard the outer operation's work
UseUse nested SAVEPOINTs — each inner scope rolls back independently while the outer transaction continues.

Consistency — The Valid State Machine

Consistency ensures that a transaction brings the database from one valid state to another valid state, respecting every defined rule — constraints, triggers, cascades, and domain invariants. If a transaction would violate any constraint (foreign key, unique, check, not null), the database rejects the entire transaction. The database never persists an invalid state, even if the application code attempts to create one.

Consistency is the property that ties the other three together. Atomicity ensures no partial transaction leaks invalid intermediate state. Isolation ensures concurrent transactions do not create invalid combinations of values that no single transaction would produce. Durability ensures the valid state, once committed, persists through crashes. Without any one of the other three, consistency cannot be guaranteed.

There are two kinds of consistency that matter in practice. Schema-level consistency is enforced by database constraints: foreign keys prevent orphaned references, unique constraints prevent duplicates, check constraints prevent domain violations like negative stock counts. Application-level consistency is enforced by business logic: a transfer must debit and credit the same amount, an order must have at least one line item, a subscription cancellation must trigger a prorated refund. The database cannot enforce application-level invariants automatically — those are the application developer's responsibility, backed by database constraints as a safety net.

In production, constraint violations surface as SQL errors that applications must handle gracefully. A common anti-pattern is catching constraint violations and retrying blindly without diagnosing why the violation occurred. A unique constraint violation on an idempotency key is normal and expected — the correct response is to return the existing record, not to retry with a new key. A foreign key violation is a bug — it means the application is referencing data that does not exist, and retrying will produce the same failure.
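That branching rule is worth spelling out in code. In PostgreSQL the application would branch on SQLSTATE (23505 for unique_violation, 23503 for foreign_key_violation); the sketch below uses Python's sqlite3 so it runs standalone, with invented table and key names, and branches on the message instead.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # sqlite3 FK enforcement is off by default
con.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY)")
con.execute("""CREATE TABLE payments (
    idempotency_key TEXT UNIQUE,
    customer_id INTEGER REFERENCES customers(id))""")
con.execute("INSERT INTO customers VALUES (1)")
con.execute("INSERT INTO payments VALUES ('pay_abc123', 1)")

def record_payment(key: str, customer_id: int) -> str:
    try:
        con.execute("INSERT INTO payments VALUES (?, ?)", (key, customer_id))
        return "created"
    except sqlite3.IntegrityError as exc:
        if "UNIQUE" in str(exc):
            return "duplicate"  # expected: idempotent retry, return existing record
        raise                   # FK violation is a bug; retrying cannot fix it

print(record_payment("pay_abc123", 1))  # duplicate (safe idempotent retry)
```

The unique violation is handled as a normal outcome; the foreign key violation is re-raised so it surfaces as the bug it is.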

io/thecodeforge/db/consistency_enforcement.sql · SQL
-- ============================================================
-- Consistency Enforcement — Constraints as Safety Nets
-- The database rejects any transaction that would create invalid state.
-- ============================================================

-- 1. Foreign Key: an order cannot reference a non-existent customer
--    This prevents orphaned records that break JOIN integrity.
INSERT INTO io_thecodeforge.orders (customer_id, status, total_amount)
VALUES (99999, 'PENDING', 100.00);
-- ERROR:  insert or update on table "orders" violates foreign key constraint
--         "orders_customer_id_fkey"
-- DETAIL: Key (customer_id)=(99999) is not present in table "customers".

-- 2. Check Constraint: stock cannot go negative
--    This catches the bug where two concurrent decrements both pass
--    the application-level check but the net result goes below zero.
ALTER TABLE io_thecodeforge.inventory
ADD CONSTRAINT chk_stock_non_negative
CHECK (stock_count >= 0);

UPDATE io_thecodeforge.inventory
SET stock_count = stock_count - 50
WHERE product_id = 'SKU-99' AND stock_count = 3;
-- ERROR:  new row for relation "inventory" violates check constraint
--         "chk_stock_non_negative"
-- DETAIL: Failing row contains (SKU-99, -47).

-- 3. Unique Constraint with idempotency key: prevent duplicate payments
--    This is how you make payment retries safe — the database enforces
--    that each logical payment attempt can only succeed once.
CREATE UNIQUE INDEX IF NOT EXISTS idx_payments_idempotency
  ON io_thecodeforge.payments (idempotency_key);

-- First attempt — succeeds
INSERT INTO io_thecodeforge.payments
  (idempotency_key, order_id, provider, amount, status)
VALUES ('pay_abc123', 1024, 'STRIPE', 599.00, 'COMPLETED');

-- Retry with same idempotency key — safely ignored
INSERT INTO io_thecodeforge.payments
  (idempotency_key, order_id, provider, amount, status)
VALUES ('pay_abc123', 1024, 'STRIPE', 599.00, 'COMPLETED')
ON CONFLICT (idempotency_key) DO NOTHING;
-- INSERT 0 0 — no duplicate created, no error raised

-- To return the existing record on conflict:
INSERT INTO io_thecodeforge.payments
  (idempotency_key, order_id, provider, amount, status)
VALUES ('pay_abc123', 1024, 'STRIPE', 599.00, 'COMPLETED')
ON CONFLICT (idempotency_key)
DO UPDATE SET status = payments.status  -- no-op update; reference the table unqualified
RETURNING *;
▶ Output
ERROR: FK violation on customer_id=99999
ERROR: CHECK violation stock_count=-47
INSERT 0 1 (first payment)
INSERT 0 0 (idempotent retry — no duplicate)
⚠ Constraints Are Safety Nets, Not Primary Validation
Never rely solely on database constraints for business rule validation. Constraints are the last line of defense — they catch bugs that slipped past application-level validation. If your production logs are full of constraint violation errors, that means invalid data is reaching the database layer, which means your application validation is broken or missing. Validate business rules in application code first. Return clear error messages to users when validation fails. Use constraints to catch the cases where application validation has a bug — because it will, eventually. Log every constraint violation in production with full request context so you can diagnose why the application validation missed it.
📊 Production Insight
Constraint violations in production are silent throughput killers — each failed INSERT or UPDATE triggers an implicit ROLLBACK, wastes disk I/O on the WAL write, and returns an error that the application must handle.
Blind retry loops on constraint violations can cause deadlocks under high concurrency — each retry acquires locks that conflict with other retries.
Rule: validate preconditions in application code before opening the transaction. Use constraints as a safety net for bugs, not as the primary validation mechanism. Log every constraint violation with full request context.
🎯 Key Takeaway
Consistency is enforced by constraints (FK, Unique, Check, Not Null) — the database rejects any transaction that would create an invalid state.
Consistency depends on the other three ACID properties: Atomicity prevents partial state, Isolation prevents concurrent interference, Durability prevents loss of valid state.
Two kinds of consistency matter: schema-level (enforced by the database) and application-level (enforced by your code, backed by constraints as a safety net).

Isolation — Handling the Chaos of Concurrency

Isolation defines how and when the changes made by one transaction become visible to other concurrent transactions. Without isolation, the lost update anomaly occurs: two transactions read the same row, both compute a new value independently, and the second commit silently overwrites the first commit's changes. Neither transaction is aware that the other existed.

Databases solve concurrency problems using two fundamentally different approaches. Pessimistic locking (SELECT ... FOR UPDATE) acquires an exclusive lock on the row at read time — any other transaction that tries to read or write the same row blocks until the lock is released. This guarantees no lost updates but reduces concurrency. Optimistic concurrency (MVCC plus version columns) allows concurrent reads without blocking and detects conflicts at write time — if the row changed since you read it, your update fails and the application retries.

The SQL standard defines four isolation levels, from weakest to strongest:

READ UNCOMMITTED allows dirty reads — you can see uncommitted changes from other transactions. PostgreSQL silently promotes this to READ COMMITTED because dirty reads are almost never desirable.

READ COMMITTED is the PostgreSQL default. Each SQL statement within a transaction sees a fresh snapshot of committed data at the moment that statement begins. This prevents dirty reads but allows non-repeatable reads: if you read a row twice within the same transaction, you might get different values if another transaction committed a change between your two reads.

REPEATABLE READ takes a snapshot when the transaction starts and uses that snapshot for every statement within the transaction. This prevents both dirty reads and non-repeatable reads. In PostgreSQL, REPEATABLE READ also prevents phantom reads because MVCC snapshots are transaction-scoped. In MySQL/InnoDB, REPEATABLE READ uses gap locks to prevent phantom inserts, which makes it stricter than the SQL standard requires.

SERIALIZABLE provides full isolation — transactions execute as if they ran one at a time, sequentially. PostgreSQL implements this with Serializable Snapshot Isolation (SSI), which detects read-write dependencies and aborts transactions that would violate serializability. This is the safest level but reduces throughput by 20-40% under contention because conflicting transactions are aborted and must be retried.
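Because SSI aborts are normal operation rather than errors, every SERIALIZABLE transaction needs a retry loop around it. Here is a sketch of that wrapper in Python; SerializationFailure is a stand-in for the driver's real exception (psycopg exposes one mapped to SQLSTATE 40001), and the simulated transaction is invented for illustration.

```python
import time

class SerializationFailure(Exception):
    """Stand-in for the driver's serialization error (SQLSTATE 40001)."""

def run_serializable(txn_fn, max_attempts: int = 5):
    """Run txn_fn, retrying when the database aborts it for serializability.
    Backs off exponentially so retry storms do not amplify contention."""
    for attempt in range(max_attempts):
        try:
            return txn_fn()
        except SerializationFailure:
            time.sleep(0.01 * (2 ** attempt))
    raise RuntimeError("transaction could not be serialized after retries")

# Simulate a transaction that is aborted twice before succeeding
attempts = {"n": 0}
def flaky_txn():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise SerializationFailure()
    return "committed"

print(run_serializable(flaky_txn))  # committed (on the third attempt)
```

A real txn_fn would open a fresh transaction each attempt; replaying the same statements against a new snapshot is what makes the retry correct.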

io/thecodeforge/db/isolation_lost_update.sql · SQL
-- ============================================================
-- Preventing the Lost Update Anomaly
-- Two concurrent transactions try to update the same row.
-- Without isolation controls, the second commit silently
-- overwrites the first commit's changes.
-- ============================================================

-- Approach 1: Pessimistic Locking — SELECT FOR UPDATE
-- Acquires an exclusive row lock at read time.
-- Any other transaction that tries to read this row FOR UPDATE
-- will block until this transaction commits or rolls back.

-- Transaction A:
BEGIN;
SELECT stock_count FROM io_thecodeforge.inventory
WHERE product_id = 'SKU-99'
FOR UPDATE;  -- exclusive lock acquired on this row
-- Returns stock_count = 10

UPDATE io_thecodeforge.inventory
SET stock_count = stock_count - 1,
    updated_at = now()
WHERE product_id = 'SKU-99';
-- stock_count is now 9

COMMIT;  -- lock released, Transaction B can proceed

-- Transaction B (concurrent):
BEGIN;
SELECT stock_count FROM io_thecodeforge.inventory
WHERE product_id = 'SKU-99'
FOR UPDATE;  -- BLOCKS here until Transaction A commits
-- After A commits, returns stock_count = 9 (the updated value)

UPDATE io_thecodeforge.inventory
SET stock_count = stock_count - 1
WHERE product_id = 'SKU-99';
-- stock_count is now 8 — CORRECT

COMMIT;

-- ============================================================
-- Approach 2: Optimistic Locking — Version Column
-- No locks held during read. Conflict detected at write time.
-- If someone else changed the row, the UPDATE affects 0 rows
-- and the application retries.
-- ============================================================

-- Add a version column to the table
ALTER TABLE io_thecodeforge.inventory
ADD COLUMN IF NOT EXISTS version INTEGER DEFAULT 1;

-- Transaction A reads the row (no lock)
SELECT stock_count, version FROM io_thecodeforge.inventory
WHERE product_id = 'SKU-99';
-- Returns: stock_count=10, version=1

-- Transaction A writes with version check
UPDATE io_thecodeforge.inventory
SET stock_count = stock_count - 1,
    version = version + 1
WHERE product_id = 'SKU-99'
  AND version = 1;  -- only succeeds if no one else changed it
-- UPDATE 1 — success, version is now 2

-- Transaction B (concurrent) tries the same
UPDATE io_thecodeforge.inventory
SET stock_count = stock_count - 1,
    version = version + 1
WHERE product_id = 'SKU-99'
  AND version = 1;  -- FAILS: version is now 2, not 1
-- UPDATE 0 — zero rows affected, application detects and retries

-- ============================================================
-- Check current isolation level
-- ============================================================
SHOW transaction_isolation;
-- Default in PostgreSQL: read committed

-- Set isolation level for a specific transaction
BEGIN ISOLATION LEVEL REPEATABLE READ;
-- ... your queries here ...
COMMIT;
▶ Output
SELECT FOR UPDATE: 1 row locked
UPDATE 1
COMMIT

-- Transaction B blocks, then:
SELECT: stock_count=9
UPDATE 1
COMMIT

-- Optimistic locking:
UPDATE 1 (Transaction A succeeds)
UPDATE 0 (Transaction B detects conflict — application retries)

transaction_isolation: read committed
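The two schedules above can be reduced to a toy model outside the database. This is a minimal Python simulation (not part of the article's SQL — the "transactions" are just interleaved reads and writes) showing why the unlocked interleaving loses an update and the serialized one does not:

```python
# Toy simulation of the lost update anomaly: two "transactions" each
# read the stock count, compute a decrement, then write the result.
def unsafe_schedule(stock):
    a_read = stock          # Transaction A reads 10
    b_read = stock          # Transaction B reads 10 before A writes
    stock = a_read - 1      # A writes 9
    stock = b_read - 1      # B overwrites with 9 — A's decrement is lost
    return stock

def locked_schedule(stock):
    # With SELECT FOR UPDATE, B's read blocks until A commits,
    # so B sees A's write and the two decrements serialize.
    a_read = stock
    stock = a_read - 1      # A commits, lock released
    b_read = stock          # B now reads 9
    stock = b_read - 1      # B writes 8
    return stock

print(unsafe_schedule(10))  # 9 — one decrement silently lost
print(locked_schedule(10))  # 8 — correct
```

The only difference between the two functions is where B's read happens relative to A's write — which is exactly what the row lock controls.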
Mental Model
Isolation Is a Speed vs Safety Dial — Not a Binary Switch
Lower isolation levels let more transactions run concurrently but risk anomalies. Higher levels prevent anomalies but create bottlenecks. The right setting depends on what anomaly you are trying to prevent.
  • READ COMMITTED: each statement sees a fresh snapshot of committed data. Fast. Allows non-repeatable reads within a transaction.
  • REPEATABLE READ: the entire transaction sees one consistent snapshot from when it started. Prevents most anomalies in PostgreSQL. ~5-10% slower under contention.
  • SERIALIZABLE: transactions execute as if they ran sequentially. Safest. 20-40% slower under contention because conflicting transactions are aborted and retried.
  • MVCC (PostgreSQL): readers never block writers and writers never block readers — both work on different row versions simultaneously.
  • Pessimistic locking (SELECT FOR UPDATE): blocks other transactions explicitly. Use when write conflicts are frequent and the cost of retrying a failed optimistic write is higher than the cost of blocking.
📊 Production Insight
Most production applications default to READ COMMITTED and never change it. This is correct for roughly 90% of workloads — higher isolation levels add latency and increase abort rates without preventing anomalies that your application logic already handles.
The remaining 10% — financial reporting queries that must see a consistent snapshot across multiple tables, inventory decrements that cannot tolerate lost updates — should use REPEATABLE READ or SELECT FOR UPDATE on a per-transaction basis.
Rule: only increase isolation when you have identified a specific anomaly that the current level does not prevent. SERIALIZABLE should be a targeted fix for a demonstrated problem, not a global default applied as a safety blanket.
🎯 Key Takeaway
Isolation is a spectrum from READ COMMITTED (fast, allows some anomalies) to SERIALIZABLE (slow, prevents all anomalies). The right choice depends on which specific anomaly you need to prevent.
MVCC in PostgreSQL is optimistic — readers never block writers. Pessimistic locking (SELECT FOR UPDATE) is explicit blocking. Choose based on conflict frequency and retry cost.
Only increase isolation when you have a specific, demonstrated anomaly to prevent. Applying SERIALIZABLE globally costs 20-40% throughput for protection you probably do not need on most transactions.
Isolation Level Selection Guide
  • If: read-heavy analytics or reporting with no concurrent writes to the same rows.
    Use: READ COMMITTED — fastest reads, minimal locking overhead, sufficient for read-only workloads.
  • If: concurrent writes to the same row — inventory decrements, counter increments, balance updates.
    Use: READ COMMITTED with SELECT FOR UPDATE on the specific row — a pessimistic lock prevents the lost update anomaly without raising the global isolation level.
  • If: financial reporting where a query must see a consistent snapshot across multiple tables — total account balances, revenue reconciliation.
    Use: REPEATABLE READ — the entire transaction sees one snapshot, preventing non-repeatable reads that would produce incorrect totals.
  • If: transaction logic where phantom reads would cause incorrect results — checking for the existence of a record and inserting if absent.
    Use: SERIALIZABLE — prevents all anomalies including phantoms, but expect a 20-40% throughput reduction under contention. Retry aborted transactions.
  • If: high-throughput event processing where occasional conflicts are acceptable and can be retried.
    Use: READ COMMITTED with optimistic locking (version column) — no blocking, no lock contention, the application retries on conflict.
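The last entry in the guide — optimistic locking with application-side retries — is worth seeing end to end. The sketch below is a hedged Python model, not a real API: an in-memory dict stands in for the table, and `decrement_stock` is a hypothetical helper whose compare-and-swap check mirrors the `UPDATE ... WHERE version = :seen` pattern (in real SQL that check and write are a single atomic statement):

```python
inventory = {"SKU-99": {"stock": 10, "version": 1}}

def decrement_stock(product_id, retries=3):
    """Compare-and-swap decrement: mirrors UPDATE ... WHERE version = :seen."""
    for _ in range(retries):
        seen = dict(inventory[product_id])        # read, no lock held
        if inventory[product_id]["version"] == seen["version"]:
            # In SQL this check and write are one atomic UPDATE statement.
            inventory[product_id] = {
                "stock": seen["stock"] - 1,
                "version": seen["version"] + 1,
            }
            return True                           # UPDATE 1 — success
        # UPDATE 0 — someone else won the race; re-read and retry
    return False

decrement_stock("SKU-99")
decrement_stock("SKU-99")
print(inventory["SKU-99"])  # {'stock': 8, 'version': 3}
```

The retry loop is the part the database cannot do for you: an `UPDATE 0` result is a signal to the application, not an error, and the application must re-read and try again.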

Durability — Survival After the Crash

Durability guarantees that once a transaction is committed, it remains committed — even if the server loses power one millisecond later, even if the kernel panics, even if the disk controller lies about having flushed its cache. This is the property that lets you show a user 'Payment Confirmed' and know that the confirmation is permanent.

The mechanism is straightforward in principle and expensive in practice. Before the database returns COMMIT to the client, it ensures the WAL record for that transaction has been physically written to non-volatile storage — not just to the OS page cache (which is volatile RAM), but all the way down to the disk platters or flash cells. This operation is called fsync, and it is the single most expensive operation in any database write path.

Each fsync takes 1-5 milliseconds on enterprise SSDs and 5-15 milliseconds on spinning disks. With synchronous_commit=on (the PostgreSQL default), every COMMIT waits for fsync to complete before returning to the client. This limits throughput to roughly 200-1,000 commits per second per disk on SSD, because each commit must wait for the physical write to complete.
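That throughput ceiling follows directly from the fsync latency. A back-of-the-envelope check in Python — the latency figures are the article's cited ranges, not measurements:

```python
def max_commits_per_sec(fsync_ms):
    # With synchronous_commit=on, every COMMIT waits for one fsync,
    # so a single WAL stream sustains at most 1000 / fsync_ms commits/sec.
    return 1000 / fsync_ms

print(max_commits_per_sec(1))   # 1000.0 — fast enterprise SSD
print(max_commits_per_sec(5))   # 200.0  — slow SSD
print(max_commits_per_sec(15))  # ~66.7  — spinning disk
```

This is why group commit and fsync batching matter: amortizing one fsync across N transactions raises the ceiling by a factor of N.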

Setting synchronous_commit=off changes the bargain dramatically. The database writes the WAL record to the OS page cache and returns COMMIT immediately, without waiting for fsync. The wal_writer background process flushes the page cache to disk every wal_writer_delay milliseconds (default 200ms). This batches multiple transactions' WAL records into a single fsync, boosting throughput to 5,000-10,000 commits per second — a 5-10x improvement.

The cost of that improvement is a durability window. If the server crashes within the wal_writer_delay window after a commit, any transactions that committed during that window but were not yet flushed to disk are lost. The data was in volatile RAM (the OS page cache), and volatile RAM does not survive power loss. In practice, you can lose up to roughly 3x the wal_writer_delay value because the wal_writer may not have completed even one flush cycle before the crash.

On restart, the database replays the WAL from the last confirmed flush point. Any committed transaction whose WAL record was flushed is recovered. Any committed transaction whose WAL record was only in the page cache is gone — permanently.
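Recovery can be modelled as replaying only the flushed prefix of the log. Below is a deliberately tiny Python model — a list stands in for the WAL, and nothing here is a PostgreSQL API:

```python
# Each record: (transaction id, was the WAL record fsynced before the crash?)
wal = [
    (101, True),   # committed and flushed — survives
    (102, True),   # committed and flushed — survives
    (103, False),  # COMMIT returned under synchronous_commit=off,
                   # but the record was still in the volatile OS page cache
]

def recover(wal):
    # Replay redoes only records that reached non-volatile storage;
    # anything that existed only in the page cache is gone permanently.
    return [tx for tx, flushed in wal if flushed]

print(recover(wal))  # [101, 102] — transaction 103 is lost
```

Transaction 103's client received a COMMIT acknowledgement, which is exactly the danger: the confirmation outlived the data.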

io/thecodeforge/db/durability_config.sql · SQL
-- ============================================================
-- Durability Configuration — PostgreSQL
-- synchronous_commit controls whether COMMIT waits for WAL fsync
-- ============================================================

-- Check current durability setting
SHOW synchronous_commit;
-- Default: on (full durability — COMMIT waits for fsync)

-- Check current WAL position
SELECT pg_current_wal_lsn() AS current_wal_position;

-- Check WAL file currently being written
SELECT pg_walfile_name(pg_current_wal_lsn()) AS current_wal_file;

-- ============================================================
-- Performance vs Safety Trade-off
-- ============================================================

-- OPTION 1: Full durability (default) — for financial data
-- Every COMMIT waits for fsync. Throughput: ~200-1000 TPS on SSD.
SET synchronous_commit = on;
-- Use this for: payments, account balances, audit trails, medical records
-- Rule: if the user sees a success confirmation, the data must be durable.

-- OPTION 2: Relaxed durability — for analytics and session data
-- COMMIT returns before fsync. Throughput: ~5000-10000 TPS on SSD.
-- Risk: lose up to wal_writer_delay (default 200ms) of committed data on crash.
SET synchronous_commit = off;
-- Use this for: analytics events, session caches, non-critical logs
-- NEVER use for: payments, balances, anything a user has seen confirmed

-- OPTION 3: Per-transaction durability — the production sweet spot
-- Default to 'on', relax for specific non-critical writes
BEGIN;
SET LOCAL synchronous_commit = off;  -- only affects THIS transaction
INSERT INTO io_thecodeforge.analytics_events (event_type, payload)
VALUES ('page_view', '{"url": "/product/42"}');
COMMIT;  -- returns immediately, WAL flushed asynchronously

BEGIN;
-- synchronous_commit is back to 'on' for this transaction
INSERT INTO io_thecodeforge.payments (order_id, amount, status)
VALUES (1024, 599.00, 'COMPLETED');
COMMIT;  -- waits for fsync — data is durable before returning

-- ============================================================
-- Monitor WAL performance in production
-- ============================================================
SELECT
  stats_reset,
  wal_records,
  wal_bytes,
  wal_write,
  wal_sync,
  wal_write_time,
  wal_sync_time
FROM pg_stat_wal;

-- Monitor COMMIT latency (requires pg_stat_statements extension)
SELECT
  query,
  calls,
  round(mean_exec_time::numeric, 2) AS avg_ms,
  round(total_exec_time::numeric, 2) AS total_ms
FROM pg_stat_statements
WHERE query ILIKE '%COMMIT%'
ORDER BY total_exec_time DESC
LIMIT 5;
▶ Output
synchronous_commit: on
current_wal_position: 0/16B3F60
current_wal_file: 000000010000000000000001

SET LOCAL synchronous_commit = off
INSERT 0 1
COMMIT (immediate — no fsync wait)

INSERT 0 1
COMMIT (waited 2.3ms for fsync)
⚠ Never Disable Durability for Financial Data
Setting synchronous_commit=off is acceptable for analytics dashboards, session caches, page view counters, and non-critical logging. It is never acceptable for payment processing, account balances, audit trails, or any data where a user has seen a success confirmation. If a customer sees 'Payment Confirmed' and the transaction is lost because the server crashed within the 200ms durability window, you have a compliance violation, a trust violation, and a customer support incident — not just a technical problem. The per-transaction approach (SET LOCAL synchronous_commit = off) is the production sweet spot: default to full durability, and relax it explicitly for specific non-critical writes with a code comment explaining why.
📊 Production Insight
The most expensive operation in a database is fsync — and it is the price you pay for durability. Every millisecond of fsync latency directly limits your maximum commits per second.
Disabling synchronous_commit for genuinely non-critical data is a legitimate optimization that most production PostgreSQL deployments should consider. But the decision must be explicit, documented, and limited to specific transaction types — never applied as a global setting.
Rule: if the user sees a success message, the data behind that message must be durable. If it is a background analytics event that the user never sees, you can trade durability for throughput — but mark the decision explicitly in code with a comment explaining the trade-off.
🎯 Key Takeaway
Durability is implemented via WAL fsync — the database ensures the transaction log is physically on disk before returning COMMIT to the client.
synchronous_commit=off boosts throughput 5-10x by batching fsyncs, but risks losing up to ~600ms of committed data on crash.
The per-transaction approach (SET LOCAL synchronous_commit = off) is the production pattern: default to full durability, relax it explicitly for specific non-critical writes.
🗂 Isolation Levels — Complete Comparison
PostgreSQL behaviour at each SQL standard isolation level. Note that PostgreSQL silently promotes READ UNCOMMITTED to READ COMMITTED.
  • READ UNCOMMITTED — Dirty read: possible in theory, but PostgreSQL promotes it to READ COMMITTED. Non-repeatable read: possible. Phantom read: possible. Mechanism: MVCC snapshot per statement (identical to READ COMMITTED in PostgreSQL). Throughput impact: baseline — fastest possible.
  • READ COMMITTED (PostgreSQL default) — Dirty read: prevented — each statement sees only committed data. Non-repeatable read: possible — two reads of the same row in one transaction may return different values. Phantom read: possible — new rows inserted by other transactions become visible between statements. Mechanism: MVCC snapshot taken at the start of each statement, not the transaction. Throughput impact: minimal overhead — the sweet spot for 90% of workloads.
  • REPEATABLE READ — Dirty read: prevented. Non-repeatable read: prevented — the entire transaction sees one consistent snapshot. Phantom read: prevented in PostgreSQL (the MVCC snapshot is transaction-scoped); MySQL/InnoDB uses gap locks. Mechanism: MVCC snapshot taken at the start of the transaction and held for its entire duration. Throughput impact: ~5-10% slower under contention due to serialization failures that require retry.
  • SERIALIZABLE — Dirty read: prevented. Non-repeatable read: prevented. Phantom read: prevented. Mechanism: Serializable Snapshot Isolation (SSI) with predicate locking — detects read-write dependencies. Throughput impact: ~20-40% slower under contention — conflicting transactions are aborted and must be retried by the application.

🎯 Key Takeaways

  • Atomicity: all-or-nothing transactions implemented via WAL and undo logs. Without explicit BEGIN/COMMIT boundaries, auto-commit mode makes each statement its own transaction and a crash between statements leaves unrecoverable partial state.
  • Consistency: the database moves from one valid state to another by enforcing all constraints. Application-level validation should catch business rule violations before they reach the database; constraints are the safety net, not the primary check.
  • Isolation: a speed-versus-safety dial from READ COMMITTED (fast, allows non-repeatable reads) to SERIALIZABLE (safe, 20-40% slower). Default to READ COMMITTED and increase isolation only for specific transactions with demonstrated anomalies.
  • Durability: committed data survives crashes because WAL is flushed to disk via fsync before COMMIT returns. synchronous_commit=off trades durability for throughput — acceptable for analytics, never for financial data.
  • Every ACID property has a configurable dial. Turning durability down (synchronous_commit=off) or isolation down (READ COMMITTED) is a legitimate production optimization — but only when the business requirement explicitly permits it and the decision is documented in code.
  • CAP theorem governs distributed systems: during a network partition, choose between consistency and availability. ACID focuses on local integrity within a single database. BASE trades consistency for availability in distributed systems. The choice depends on the business requirement, not on technology preference.

⚠ Common Mistakes to Avoid

    Setting synchronous_commit=off globally or for tables that store financial data
    Symptom

    Users see Payment Confirmed but the transaction is lost if the server crashes within the wal_writer_delay window. The customer is charged externally by the payment gateway, but the database has no record of the charge. Reconciliation fails. Refund is manual.

    Fix

    Never disable synchronous_commit for any table that stores financial transactions, account balances, or audit records. Use synchronous_commit=off only for analytics events, session data, or non-critical logs — and only via SET LOCAL within specific transactions, not as a global configuration. Add a code comment at every SET LOCAL synchronous_commit = off line explaining why durability is not required for that specific write.

    Applying SERIALIZABLE isolation globally as a preventive safety measure
    Symptom

    Application throughput drops 30-40% under concurrent load with no corresponding improvement in data quality. Serialization failures rise because SERIALIZABLE's predicate locks detect read-write conflicts on range scans and force one of the conflicting transactions to abort. Application retry logic creates additional load.

    Fix

    Use READ COMMITTED as the global default. Apply SERIALIZABLE only to specific transactions that have demonstrated a phantom read or write skew bug in production. In Spring Boot, use @Transactional(isolation = Isolation.SERIALIZABLE) on the specific method, not on the class or application-wide configuration. Always implement retry logic for SERIALIZABLE transactions because serialization failures are expected, not exceptional.

    Relying on database constraints as the primary business validation mechanism
    Symptom

    Production logs show thousands of constraint violation errors with minimal debugging context. Each violation triggers an implicit ROLLBACK, wastes WAL I/O, and returns a generic database error code that the application must map to a user-facing message.

    Fix

    Validate business rules in application code before opening the transaction. Return clear, specific error messages to users when validation fails. Use constraints as a safety net that catches bugs in your application validation — not as the primary validation layer. Log every constraint violation in production with full request context so you can diagnose why the application validation missed it.

    Implementing retry logic without idempotency keys for payment transactions
    Symptom

    Timeout-retry cycles produce duplicate charges. The first attempt succeeded and committed, but the response timed out before reaching the application. The application retries, creates a second charge, and the customer is billed twice for the same order.

    Fix

    Generate a unique idempotency key per logical payment attempt — typically a UUID generated by the client before the first attempt. Store it in a column with a unique constraint. Use INSERT ... ON CONFLICT (idempotency_key) DO NOTHING to make retries safe. If the insert returns zero rows affected, query the existing record and return it to the caller. The customer is charged exactly once regardless of how many retries the application sends.
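The fix above can be sketched application-side. In this hedged Python model, a dict plays the role of the table column with its UNIQUE(idempotency_key) constraint, and `charge` is a hypothetical helper mirroring the INSERT ... ON CONFLICT DO NOTHING pattern:

```python
charges = {}  # idempotency_key -> charge row; stands in for a UNIQUE column

def charge(idempotency_key, order_id, amount):
    # Models INSERT ... ON CONFLICT (idempotency_key) DO NOTHING:
    # the first attempt inserts; any retry finds the existing row
    # and returns it instead of creating a second charge.
    if idempotency_key not in charges:
        charges[idempotency_key] = {"order_id": order_id, "amount": amount}
        return charges[idempotency_key], True    # newly created
    return charges[idempotency_key], False       # retry: existing record

first = charge("key-abc", 1024, 599.00)
retry = charge("key-abc", 1024, 599.00)  # timed-out client retries
print(len(charges))  # 1 — the customer is charged exactly once
```

The key property: the retry path returns the same record as the original attempt, so the caller cannot distinguish "created now" from "created earlier" except by the boolean flag — which is exactly what makes retries safe.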

    Not monitoring long-running idle in transaction sessions
    Symptom

    Lock accumulation over hours causes gradually degrading query latency. Queries that normally take 10ms start taking 30 seconds as they wait for locks held by abandoned transactions. The degradation is gradual, making it difficult to identify the root cause without monitoring.

    Fix

    Set idle_in_transaction_session_timeout = '30s' in postgresql.conf to automatically terminate sessions that sit in an open transaction without executing queries for more than 30 seconds. Monitor with: SELECT pid, now() - xact_start AS duration FROM pg_stat_activity WHERE state = 'idle in transaction'. Alert on any session exceeding 10 seconds. The most common cause is application code that opens a transaction, makes an HTTP call to an external service, and does not commit or rollback if the HTTP call times out.

Interview Questions on This Topic

  • Q: Explain the Lost Update anomaly. How would you solve it in a Spring Boot application using @Version (Optimistic Locking) versus SELECT FOR UPDATE (Pessimistic Locking)? (Mid-level)
    A: A Lost Update occurs when two transactions read the same row, both compute a new value based on the old value, and the second commit silently overwrites the first commit's changes. Example: Admin A reads stock=10 and decrements to 9. Admin B also reads stock=10 and decrements to 9. The final value is 9 instead of 8 — one decrement is silently lost.
    Optimistic Locking with @Version: add a version column to the JPA entity annotated with @Version. Hibernate automatically includes the version in the WHERE clause of every UPDATE: UPDATE inventory SET stock=9, version=2 WHERE id=1 AND version=1. If another transaction modified the row first, the version no longer matches, the UPDATE affects zero rows, and Hibernate throws OptimisticLockException. The application catches the exception and retries. This approach holds no database locks during the read phase — best for low-conflict scenarios where retries are rare.
    Pessimistic Locking with SELECT FOR UPDATE: the first transaction acquires an exclusive row lock at read time. The second transaction blocks on its SELECT until the first commits or rolls back. In Spring Data JPA: @Lock(LockModeType.PESSIMISTIC_WRITE) on the repository method, or @Transactional with a native query using FOR UPDATE. This guarantees no lost update but reduces concurrency because blocked transactions wait.
    Choose optimistic when write conflicts are rare (less than 5% of transactions). Choose pessimistic when write conflicts are frequent (inventory decrements during a flash sale).
  • Q: How does the database ensure Consistency when a transaction violates a Foreign Key constraint? Does it roll back the entire transaction or just the offending statement? (Mid-level)
    A: The database rolls back the entire transaction, not just the offending statement. When a constraint violation occurs — foreign key, unique, check, or not null — the database raises an error and aborts the transaction. All changes made by every statement within the current BEGIN/COMMIT block are undone via the undo log or WAL rollback mechanism.
    This is because consistency is tied to atomicity — the transaction is an indivisible unit. If the database rolled back only the violating statement, the remaining statements might reference data that logically depended on the rolled-back statement, creating an inconsistent state. For example, if an INSERT into order_items violates a foreign key to orders, rolling back only that INSERT would leave an order with no items — itself an inconsistent state if the business rule requires at least one item per order.
    The application receives a SQLSTATE error code — 23503 in PostgreSQL for FK violations, 23505 for unique violations — and must handle it appropriately. For an idempotency key violation, the correct response is to return the existing record. For a foreign key violation, the correct response is to log the error with full context and return an application-level error to the caller.
  • Q: Describe a scenario where READ COMMITTED isolation would cause an application bug that REPEATABLE READ would prevent. (Mid-level)
    A: Scenario: a financial report computes the total balance across all accounts by reading them in several statements within one transaction. Under READ COMMITTED, each statement takes a fresh snapshot of committed data, so different statements can observe different points in time:
    1. The report reads Account A: $1,000
    2. A concurrent transaction transfers $500 from Account A to Account B and commits
    3. The report reads Account B: $2,500 (includes the $500 transfer)
    4. The report's total: $3,500 — but the correct total is $3,000
    The $500 transfer was counted twice: once in Account A's pre-transfer balance and once in Account B's post-transfer balance. This is a non-repeatable read — different statements in the same transaction saw different snapshots. (A single SELECT ... SUM statement would not exhibit this in PostgreSQL, because each statement sees one consistent snapshot; the bug appears when the report spans multiple statements.)
    Under REPEATABLE READ, the entire transaction sees a consistent snapshot from the moment the first statement executed. Account A reads $1,000 and Account B reads $2,000 (both pre-transfer values). Total: $3,000 — correct for the snapshot point in time.
    This matters for any operation that reads multiple rows across statements and depends on their mutual consistency: financial reports, inventory reconciliation, capacity planning, and any aggregation where rows are related by a conservation law (money transferred, inventory moved, seats reserved).
  • Q: What is the performance impact of setting synchronous_commit=off in PostgreSQL? What specific ACID property are you compromising, and when is this trade-off acceptable? (Senior)
    A: Setting synchronous_commit=off disables the requirement that WAL records be flushed to non-volatile storage via fsync before COMMIT returns to the client. Instead, WAL records are written to the OS page cache (volatile RAM) and flushed to disk asynchronously by the wal_writer background process every wal_writer_delay milliseconds (default 200ms).
    Performance impact: throughput increases 5-10x because COMMIT no longer blocks on fsync. Each fsync takes 1-5ms on SSD, and removing that wait allows the database to batch multiple transactions' WAL records into a single fsync operation.
    Durability compromise: if the server crashes (power loss, kernel panic, hardware failure), any transactions that committed within the last wal_writer_delay window may be permanently lost. The OS page cache is volatile — it does not survive power loss. In practice, you can lose up to approximately 3x the wal_writer_delay value because the wal_writer may not have completed even one flush cycle before the crash.
    Atomicity, consistency, and isolation are unaffected — only durability changes. The transaction still either fully commits or fully rolls back, constraints are still enforced, and isolation still prevents concurrent interference. The only change is when the committed data becomes permanent on non-volatile storage.
    This trade-off is acceptable for analytics event logging, session state caches, non-critical application logs, and page view counters — any data where losing the last 200-600ms of writes after a crash has no user-facing impact and can be regenerated. It is never acceptable for payment transactions, account balances, audit trails, or medical records — any data where a user has been shown a success confirmation or where regulatory compliance requires persistence guarantees.
  • Q: Why do many NoSQL databases adopt BASE instead of ACID? When would you choose BASE over ACID, and what specific guarantees are you giving up? (Senior)
    A: BASE stands for Basically Available, Soft state, Eventually consistent. NoSQL databases adopt BASE because ACID's guarantees — especially strong consistency and full isolation — are extremely expensive to maintain across distributed nodes connected by a network. The CAP theorem formalizes this: during a network partition, a distributed system must choose between Consistency (every read returns the most recent write) and Availability (every request receives a response). BASE chooses availability — the system remains responsive during partitions but may return stale data temporarily.
    Specifically, BASE gives up: immediate consistency (a write on node A may not be visible on node B for seconds or minutes), isolation (concurrent operations may see intermediate states), and sometimes durability (some systems acknowledge writes before replication completes).
    Choose BASE when: (1) high write throughput across geographic regions matters more than immediate global consistency — social media likes, view counts, analytics events where a few seconds of staleness is invisible to users; (2) the system can tolerate and self-correct temporary inconsistency — a product catalog where a price update propagates in seconds, not milliseconds; (3) availability during network partitions is more important than consistency — a CDN or content platform where serving slightly stale content is better than serving an error page.
    Choose ACID when: (1) financial transactions where any inconsistency is a compliance violation and any data loss is a regulatory incident; (2) inventory management where overselling causes customer-facing failures that damage trust; (3) medical records where data integrity is a legal requirement with audit trail obligations.
    The choice is not ACID versus NoSQL — it is about matching the consistency model to the business requirement. PostgreSQL with synchronous_commit=off is already a partial BASE trade-off within an ACID system. Many modern distributed databases (CockroachDB, TiDB, YugabyteDB) offer ACID guarantees across distributed nodes at the cost of higher write latency — proving that the trade-off is a spectrum, not a binary choice.

Frequently Asked Questions

How does PostgreSQL implement MVCC (Multi-Version Concurrency Control)?

PostgreSQL implements MVCC by keeping multiple physical versions of the same logical row simultaneously. Every row has two hidden system columns: xmin (the transaction ID that created this version) and xmax (the transaction ID that deleted or replaced this version, or zero if the row is still current). When you run a query, PostgreSQL compares your transaction's snapshot against these xmin and xmax values to determine which version of each row was visible at your snapshot point. This is why readers never block writers in PostgreSQL — the reader sees the old version of the row while the writer creates a new version. Both operate on different physical tuples without any locking conflict. The old versions (dead tuples) are eventually cleaned up by the VACUUM process, which reclaims the space they occupied.
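The xmin/xmax rule can be captured in a few lines. The following is a deliberately simplified Python model of the visibility check — real PostgreSQL visibility also consults the snapshot's in-progress transaction list, hint bits, and subtransactions, none of which are modelled here:

```python
def visible(row, snapshot_xid, committed):
    # A version is visible if its creating transaction committed before
    # our snapshot, and it was not yet deleted from our point of view.
    created = row["xmin"] in committed and row["xmin"] <= snapshot_xid
    deleted = (row["xmax"] != 0
               and row["xmax"] in committed
               and row["xmax"] <= snapshot_xid)
    return created and not deleted

old = {"xmin": 100, "xmax": 105, "stock": 10}  # replaced by tx 105
new = {"xmin": 105, "xmax": 0,   "stock": 9}   # current version

# A reader whose snapshot predates tx 105 still sees the old row:
print(visible(old, snapshot_xid=104, committed={100}))       # True
print(visible(new, snapshot_xid=104, committed={100}))       # False
# A later reader sees only the new version:
print(visible(old, snapshot_xid=106, committed={100, 105}))  # False
print(visible(new, snapshot_xid=106, committed={100, 105}))  # True
```

Both physical versions coexist on disk; each reader simply filters for the one version visible at its snapshot — which is why neither reader blocked the writer that created tx 105's version.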

What is the difference between ACID and the CAP theorem?

ACID and CAP address different scopes and different problems. ACID describes the internal properties of transactions within a single logical database — it is about integrity guarantees for individual operations. CAP describes the fundamental trade-offs in distributed systems — it is about how data behaves when it is replicated across multiple nodes connected by a network that can partition. You can have a fully ACID-compliant database running on a single node with no CAP concerns. The moment you shard or replicate that database across a network, CAP dictates that during a network partition, you must choose between Consistency (all nodes see the same data simultaneously) and Availability (every request receives a response even if some nodes are unreachable). ACID and CAP are not alternatives — they operate at different architectural layers.

What is a Write-Ahead Log (WAL) and why is it faster than writing directly to data files?

A WAL is a sequential, append-only log file where the database records the intent of every data modification before applying it to the actual data files. Writing to data files is random I/O — the database must seek to different physical locations on disk to update specific rows in different tables and pages. WAL is sequential I/O — it simply appends each new record to the end of a single file. Sequential writes are dramatically faster than random writes on both SSDs and spinning disks because they avoid seek latency and can fully utilize the storage device's write bandwidth. By writing the intent to the WAL first and fsyncing it, the database achieves durability quickly. The actual data files (heap pages, index pages) are updated later in the background by the checkpoint process, which can batch and optimize the random writes without any time pressure.

Can I use synchronous_commit=off on a read replica?

synchronous_commit controls whether the primary server waits for its own WAL to be flushed to its own disk before returning COMMIT to the client. This setting is irrelevant on read replicas because replicas do not accept commits — they replay WAL records received from the primary. What matters on replicas is different: max_standby_streaming_delay controls how far behind the replica can fall before it starts cancelling long-running queries that conflict with incoming WAL replay. hot_standby_feedback tells the primary that the replica still needs certain row versions, preventing the primary's VACUUM from removing rows the replica's queries are reading. Monitor replication lag with: SELECT now() - pg_last_xact_replay_timestamp() AS lag. If lag exceeds your application's staleness tolerance, route critical reads to the primary.

Naren — Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.
