Q: Is MongoDB CP or AP?

MongoDB is generally classified as CP. A replica set has a single primary that accepts writes; if a network partition leaves the primary on the minority side, it steps down and that side cannot accept writes. Reads from secondaries may be stale, but you can route reads to the primary for stronger consistency. During a partition, the system prioritizes consistency over availability.

Senior 10 min · March 05, 2026

CAP Theorem — Why R=3, W=3, N=3 Broke Banking Reads

Q: What is CAP Theorem in simple terms?

CAP Theorem says that a distributed database can only guarantee two out of three properties: Consistency (all nodes see the same data), Availability (every request gets a response, even if stale), and Partition Tolerance (system works even when network nodes can't talk to each other). Because network failures (partitions) are guaranteed, you must choose between Consistency and Availability when a partition occurs.

Q: Can you have both CP and AP in the same database?

Yes, many databases allow per-operation consistency levels. For example, Cassandra lets you set consistency per query: ALL gives CP (if any node down, write fails), ONE gives AP. This way, you can use CP for critical operations (e.g., payments) and AP for less critical ones (e.g., logs).

Q: What is the difference between CAP and PACELC?

PACELC extends CAP by stating that even when there is no partition (Else), you still trade off Consistency and Latency. CAP only covers the case when a partition exists. PACELC makes the trade-off explicit at all times, not just during failures.

Q: Why is partition tolerance not optional?

Because network partitions happen in any distributed deployment — cloud provider outages, hardware failures, even brief network blips. If you design a system that is not partition tolerant (CA), it will become either inconsistent or unavailable when a partition inevitably occurs. The only way to avoid partitions is to run a single-node system.

Strict quorum config (R=3, W=3, N=3) blocked all reads when one node partitioned.

Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Notes here come from systems that actually shipped.

Last updated: May 24, 2026


⚡Quick Answer

CAP Theorem says a distributed system can guarantee at most two of Consistency, Availability, and Partition Tolerance.
Network partitions are inevitable — you must choose between CP (consistent, but may reject requests) or AP (available, but may return stale data).
Databases like Cassandra (AP) and Zookeeper (CP) make opposite trade-offs intentionally.
The biggest mistake: treating partition tolerance as optional — it's not; the choice is CP or AP during a partition.
Performance insight: CP systems add ~2-5x latency during partitions due to quorum waits; AP systems handle them with no extra latency but risk stale reads.

✦ Definition

What is CAP Theorem and Databases?

CAP Theorem defines the fundamental trade-offs in distributed databases. It states that a distributed data store cannot simultaneously provide Consistency, Availability, and Partition Tolerance. Understanding CAP is essential for choosing the right database for your workload and for designing systems that survive network failures without data corruption.


Plain-English First

Imagine you and your twin run identical noticeboards at opposite ends of a school. If someone pins a new message on yours, it takes a minute to walk to your twin and update theirs. During that minute, a student asking your twin sees old info — you can't have both boards perfectly in sync AND instantly available AND survive the case where the hallway between you is blocked. CAP Theorem says the same thing about databases spread across servers: when the network between them breaks (and it will), you must choose whether to give users a possibly-stale answer right now or make them wait until you're sure the answer is correct. You can never fully dodge that choice.

Every time Netflix keeps streaming while a data center goes dark, or your bank refuses to show your balance during a network blip, a silent architectural decision is playing out — one that was formally described by Eric Brewer in 2000 and proved by Gilbert and Lynch in 2002. CAP Theorem isn't an abstract academic curiosity; it's the load-bearing wall of every distributed system you'll ever build or operate. Ignore it and you'll ship a system that silently returns wrong data, drops writes, or locks up entirely under exactly the conditions you need it most.

The theorem states that a distributed data store can guarantee at most two of three properties simultaneously: Consistency (every read returns the most recent write or an error), Availability (every request receives a non-error response, though it may be stale), and Partition Tolerance (the system continues operating even when network messages between nodes are lost or delayed). The critical insight most engineers miss is that network partitions are not optional: they happen on every cloud platform, in every data center, and on every kind of network hardware. This means the real choice is between Consistency and Availability while a partition is in progress: do you go CP or AP when the network fails?

By the end of this article you'll be able to explain exactly which guarantees Cassandra, MongoDB, Zookeeper, etcd, CockroachDB and DynamoDB sacrifice and why, design a partition-handling strategy for a real production system, understand PACELC (CAP's more honest successor), spot the five most common architectural mistakes engineers make when reasoning about CAP, and walk into any senior engineering interview with precise, battle-tested answers. Let's get into it.

What is CAP Theorem and Databases?

Let's see a concrete example. Suppose you're building a distributed key-value store with three replicas. You want every read to return the latest write (Consistency). You also want every request to get a response (Availability). And you want the system to work even if the network between replicas goes down (Partition Tolerance). The theorem says you can't have all three. The key insight: network partitions are inevitable in any real deployment — cloud providers have documented failures, switches fail, cables get cut. So you must choose: do you sacrifice Consistency (become AP) or Availability (become CP) during a partition?

DistributedStore.javaJAVA

package io.thecodeforge.cap;

/**
 * Simplified model of a distributed key-value store showing CAP trade-offs.
 */
public class DistributedStore {
    enum ConsistencyModel { CP, AP }

    private final ConsistencyModel model;
    private final int replicas;

    public DistributedStore(ConsistencyModel model, int replicas) {
        this.model = model;
        this.replicas = replicas;
    }

    // In a real system, network partition detection & quorum logic would go here.
    public String read(String key) {
        if (model == ConsistencyModel.CP) {
            // Must read from majority quorum; if not possible, return error.
            return readWithConsistency(key, replicas / 2 + 1);
        } else {
            // AP: return from fastest replica, even if stale.
            return readFromAnyReplica(key);
        }
    }

    private String readWithConsistency(String key, int quorum) { return "latest_value"; }
    private String readFromAnyReplica(String key) { return "possibly_stale_value"; }
}

Forge Tip:

Type this code yourself rather than copy-pasting. The muscle memory of writing it will help it stick.

Production Insight

In production, a CP system with quorum reads (R+W > N) can still reject requests if the partition isolates too many nodes.

Always test quorum failure scenarios with chaos engineering tools like Chaos Monkey.

Rule: Never deploy a CP system without a circuit breaker that degrades to AP for read-only operations.

Key Takeaway

CAP Theorem forces a choice during network partitions.

You cannot have all three — pick the one that matches your business requirements.

CP gives consistency at the cost of availability; AP gives availability at the cost of consistency.


The Three Properties: Consistency, Availability, Partition Tolerance

Let's define each property precisely, because the textbook definitions hide production nuances.

Consistency — Every read receives the most recent write or an error. In a distributed system, this means all nodes see the same data at the same time. Linearizability is the strictest form. But there are weaker consistency models like eventual consistency that CAP does not forbid — CAP's Consistency is specifically linearizability (or atomic consistency). If your system only guarantees eventual consistency, it's not providing CAP-Consistency; it's effectively AP.

Availability — Every request receives a (non-error) response, without the guarantee that it contains the most recent write. The key: the response must not be a timeout or 500. Some systems interpret availability as "the system is up," but CAP defines it precisely: every request gets a response, even if it's stale. That's why AP systems like DynamoDB always succeed reads — they may return old data.

Partition Tolerance — The system continues to operate despite arbitrary message loss or delay between nodes. This is the one property you cannot sacrifice in a distributed system. You could theoretically build a system that is CA as long as no partition ever occurs, but in practice partitions happen. So the choice is CP or AP.
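To make the three definitions concrete, here is a minimal sketch of a three-replica register queried from the minority side of a partition. The `ToyStore` class, its `reachable` parameter, and the values are all hypothetical and exist only for illustration; a real AP store would also need conflict resolution for writes accepted on both sides, which this toy leaves out.

```python
class Unavailable(Exception):
    """Raised when a CP store refuses a request rather than risk stale data."""


class ToyStore:
    """Toy replicated register with n replicas (hypothetical, for illustration).

    Each replica holds a (version, value) pair. `reachable` is the set of
    replica indexes the client can talk to on its side of the partition.
    """

    def __init__(self, mode, n=3):
        self.mode = mode                      # "CP" or "AP"
        self.n = n
        self.replicas = [(0, None)] * n

    def _check_quorum(self, reachable):
        # CP refuses to act without a strict majority; AP never refuses.
        if self.mode == "CP" and len(reachable) <= self.n // 2:
            raise Unavailable("cannot reach a majority of replicas")

    def write(self, value, reachable):
        self._check_quorum(reachable)
        version = max(self.replicas[i][0] for i in reachable) + 1
        for i in reachable:
            self.replicas[i] = (version, value)

    def read(self, reachable):
        self._check_quorum(reachable)
        # Return the newest value among the replicas we can see.
        return max((self.replicas[i] for i in reachable), key=lambda r: r[0])[1]


def demo(mode):
    store = ToyStore(mode)
    store.write(500, reachable={0, 1, 2})     # healthy network
    store.write(450, reachable={0, 1})        # replica 2 is now isolated
    try:
        return store.read(reachable={2})      # client stuck on the minority side
    except Unavailable:
        return "error"


print(demo("CP"))   # error
print(demo("AP"))   # 500 (stale; the majority side holds 450)
```

The CP store gives up availability on the minority side, while the AP store answers with the old balance. Neither outcome is a bug; each is the trade-off the mode was chosen for.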

consistency_check.sqlSQL

-- Checking consistency in a replicated database after a network partition.
-- TheCodeForge example: Compare timestamps across replicas.

SELECT node_id, last_updated, value
FROM system.consistency_check
WHERE key = 'customer_balance_123'
ORDER BY last_updated DESC;

-- If timestamps differ across nodes, you have stale reads.
-- The difference in milliseconds tells you how far behind the AP system is.

Output

node_id | last_updated | value

A | 2026-04-22 10:00:00 | 500

B | 2026-04-22 10:00:05 | 500

C | 2026-04-22 09:59:55 | 450
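As a rough sketch of how the lag could be computed from rows like the ones above, the snippet below parses the timestamps and reports how far each node trails the newest one. The `rows` list simply copies the sample output, and `replica_lag_ms` is a hypothetical helper. The comparison is only meaningful if the nodes' clocks are reasonably well synchronized.

```python
from datetime import datetime

# Sample rows copied from the output above: (node_id, last_updated, value).
rows = [
    ("A", "2026-04-22 10:00:00", 500),
    ("B", "2026-04-22 10:00:05", 500),
    ("C", "2026-04-22 09:59:55", 450),
]


def replica_lag_ms(rows):
    """Return how many milliseconds each node trails the most recent update."""
    parsed = {
        node: datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        for node, ts, _ in rows
    }
    newest = max(parsed.values())
    return {
        node: int((newest - ts).total_seconds() * 1000)
        for node, ts in parsed.items()
    }


print(replica_lag_ms(rows))  # {'A': 5000, 'B': 0, 'C': 10000}
```

Here node C is ten seconds behind and still holds the old value of 450, which is the stale read the query is meant to surface.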

Mental Model: CAP as a Triangle

C + A: CA system that is not partition tolerant — unrealistic at scale.
C + P: CP system — consistent and partition tolerant, but may reject requests.
A + P: AP system — always returns a response, but may return stale data.
In production, you always end up at one of the bottom corners: CP or AP.
The top corner (CA) is a theoretical ideal that only exists in single-node databases.

Production Insight

Many engineers assume 'eventually consistent' means AP, but some eventually consistent systems can be configured for strong consistency (e.g., Cassandra with ALL read consistency).

The real trade-off is often between read repair cost and write availability.

Rule: Document your consistency model explicitly for each microservice — don't assume developers understand the defaults.

Key Takeaway

Consistency means linearizability in CAP context.

Availability means every request gets a response, not that the response is correct.

Partition tolerance is not a choice — you must plan for partitions.

Why Partition Tolerance Is Not Optional

Every distributed system will experience network partitions. Cloud providers have regional outages. Network switches fail. Even a brief burst of packet loss can leave two groups of nodes unable to see each other, and if both groups keep accepting writes you get a split-brain scenario. If you design a system that is CA (no partition tolerance), you are implicitly saying: "I will shut down the entire system if a partition occurs." That is often unacceptable for production systems that require high availability.

Consider the scenario: two data centers connected by a WAN link. The link goes down. If your database is CA, both data centers must stop serving to maintain consistency. But if you're running an e-commerce platform, you'd rather serve stale product inventory than show an error page. That's the AP choice.

Therefore, the real decision is not "if" you handle partitions, but "how" you handle them. Do you want your system to maintain consistency at the cost of rejecting some requests (CP)? Or do you want to serve every request even if some nodes have stale data (AP)? This choice must be made consciously and per operation, not as an afterthought.

io/thecodeforge/cap/PartitionDecision.javaJAVA

package io.thecodeforge.cap;

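public class PartitionDecision {
    public enum Decision { SERVE_CONSISTENT, REJECT_READ, SERVE_STALE }

    /** Decides how to answer a read while a partition may be in progress. */
    public Decision handlePartitionRead(boolean needConsistency, boolean canReachMajority) {
        if (!needConsistency) {
            return Decision.SERVE_STALE;      // AP behavior: answer from any replica, possibly stale
        }
        if (canReachMajority) {
            return Decision.SERVE_CONSISTENT; // A quorum read can still return the latest value
        }
        return Decision.REJECT_READ;          // CP behavior: fail rather than risk stale data
    }
}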

Warning: Ignoring partitions is dangerous

Don't assume your network is reliable. The cloud will fail. Your switch will fail. A deployment script can accidentally isolate a node. Test partition scenarios regularly with chaos engineering.

Production Insight

The worst production incident I saw was a CA system that stopped all traffic during a 30-second network blip — the SLA breach cost $2M.

The team had never tested partition behavior because they believed 'the network is redundant.'

Rule: Always assume partitions will happen; test your system's behavior under partition at least quarterly.

Key Takeaway

You cannot opt out of partition tolerance in a distributed system.

If you try, you'll end up with an unavailable system during the first real partition.

Design for partition recovery — not just partition detection.

CP vs AP: The Real Trade-off

The classic CAP choice is between CP (Consistent but possibly unavailable during partition) and AP (Available but possibly inconsistent during partition). Let's see how real databases choose.

CP Systems

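ZooKeeper: Uses a leader-based quorum (Zab protocol). If the leader loses connection to a majority of followers, it steps down and no writes are possible until a new leader is elected. Reads are served locally by whichever server the client is connected to, so they can lag slightly behind the leader unless the client issues a sync first. By default, a server that cannot reach a quorum stops serving clients, so the minority side of a partition is unavailable for both reads and writes. ZooKeeper chooses consistency over availability.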
etcd (used by Kubernetes): Raft consensus. Requires a majority of nodes to make progress. During a partition, the minority partition becomes unavailable. etcd is CP by design — consistent data is more important than availability.
HBase: Uses HDFS and ZooKeeper. It's CP — during a region server failure, the region becomes unavailable until recovery completes.
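The majority rule shared by these CP systems can be sketched in a few lines. The function name and node names below are hypothetical; the only fact assumed is that a strict majority of the full cluster is needed to make progress.

```python
def majority_side(cluster_size, partition_groups):
    """Return the group of nodes that holds a strict majority, or None.

    cluster_size counts every configured node, reachable or not.
    partition_groups is a list of sets, one per side of the partition.
    """
    needed = cluster_size // 2 + 1
    for group in partition_groups:
        if len(group) >= needed:
            return group
    return None


# Five nodes split 3/2: the three-node side can keep serving.
print(majority_side(5, [{"n1", "n2", "n3"}, {"n4", "n5"}]))

# Four nodes split 2/2: neither side has a majority, so nothing makes progress.
print(majority_side(4, [{"n1", "n2"}, {"n3", "n4"}]))  # None
```

The second case is one reason consensus clusters are usually sized with an odd number of nodes: an even split leaves both sides unavailable.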

AP Systems

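Cassandra: Uses Dynamo-style replication with tunable consistency. The consistency level is chosen per request; cqlsh defaults to ONE, and many deployments raise it to QUORUM or LOCAL_QUORUM. At low consistency levels, both sides of a partition can keep serving reads and writes. After the partition heals, hinted handoff, read repair, and anti-entropy repair bring replicas back in sync, with conflicting writes resolved by last-write-wins timestamps. Cassandra is usually classified as AP.
DynamoDB: AP by default (eventually consistent reads). You can request strongly consistent reads per call; they consume more read capacity, can have higher latency, and can fail with a server error during a network delay or outage. DynamoDB is designed for high availability.
Couchbase: Within a single cluster each key has one active copy, so writes for keys on an unreachable node fail until failover. Across clusters, cross data center replication (XDCR) behaves as AP: each cluster accepts writes during a partition, and conflicts are resolved automatically by a configured policy such as timestamp-based last-write-wins.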

The middle ground: Many databases allow per-operation trade-off. For example, Cassandra's consistency level can be set per query: ALL (CP) vs ONE (AP). This lets you mix depending on the criticality of the data.
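A minimal sketch of the per-operation idea, without any driver code: each consistency level translates into a number of replica acknowledgements, and an operation succeeds only if that many replicas are reachable. The `POLICY` mapping and operation names are hypothetical; the acknowledgement counts follow the usual definitions of ONE, QUORUM, and ALL.

```python
# Replica acknowledgements each level needs for a given replication factor (rf).
REQUIRED_ACKS = {
    "ONE": lambda rf: 1,
    "QUORUM": lambda rf: rf // 2 + 1,
    "ALL": lambda rf: rf,
}

# Hypothetical mapping from operation type to consistency level.
POLICY = {"payment": "ALL", "log": "ONE"}


def can_serve(operation, rf, live_replicas):
    """True if enough replicas are reachable for this operation's level."""
    level = POLICY[operation]
    return live_replicas >= REQUIRED_ACKS[level](rf)


# With rf=3 and one replica unreachable, payments fail but logs still succeed.
for op in POLICY:
    print(op, can_serve(op, rf=3, live_replicas=2))
```

Swapping the payment policy to QUORUM would keep payments available with one replica down, at the cost of relying on quorum overlap rather than every replica agreeing.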

docker-compose-cassandra.ymlYAML

# TheCodeForge example: Demonstrating Cassandra AP behavior with docker compose
version: '3.8'
services:
  cassandra1:
    image: cassandra:4.1
    environment:
      - CASSANDRA_SEEDS=cassandra1,cassandra2,cassandra3
      - CASSANDRA_CLUSTER_NAME=APDemo
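  cassandra2:
    image: cassandra:4.1
    environment:
      - CASSANDRA_SEEDS=cassandra1,cassandra2,cassandra3
      # Every node needs the same cluster name, or it will refuse to join.
      - CASSANDRA_CLUSTER_NAME=APDemo
  cassandra3:
    image: cassandra:4.1
    environment:
      - CASSANDRA_SEEDS=cassandra1,cassandra2,cassandra3
      - CASSANDRA_CLUSTER_NAME=APDemo
# To simulate a partition: docker network disconnect <network> <cassandra3 container>
# Compose prefixes names with the project name; check `docker network ls` and `docker ps`.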

Forge Info:

You can simulate a network partition in Docker by disconnecting a container from the network. Observe how Cassandra handles the partition differently than Zookeeper.

Production Insight

I've seen teams choose AP because 'we need 100% uptime,' then discover stale reads causing financial reconciliation failures.

The key is to understand which operations require strong consistency and which can tolerate staleness.

Rule: Use CP for critical numeric data (account balances), AP for content (product descriptions).

Key Takeaway

CP sacrifices availability for consistency — use for financial or transactional data.

AP sacrifices consistency for availability — use for content delivery or caching.

Mix both within the same system using per-operation consistency levels.

PACELC: CAP's Honest Extension

Even when the network is healthy (no partition), there's still a trade-off between Consistency and Latency. PACELC (pronounced 'pass-elk') extends CAP: if there is a Partition (P), trade off Availability (A) against Consistency (C); Else (E), trade off Latency (L) against Consistency (C).

This is the practical reality: your database is never free from trade-offs. Even in normal operation, you must decide whether to pay the latency cost of strong consistency or accept lower latency with eventual consistency.

Examples

Cassandra: In normal operation (no partition), if you choose consistency level ALL, you pay high latency because all replicas must acknowledge. This is the 'Else' branch of PACELC: you choose C over L.
DynamoDB: Default eventually consistent reads give lower latency. If you request strongly consistent reads, latency increases — again C over L.
Spanner (Google's globally distributed DB): Uses TrueTime to provide external consistency. Commits wait out the clock uncertainty interval before becoming visible, so in normal operation Spanner chooses C over L.

The key insight: in a replicated system, strong consistency requires coordination between replicas, and coordination costs latency even under normal conditions. The trade-off persists.

ConsistencyLatencyTradeoff.javaJAVA

package io.thecodeforge.cap;

public class ConsistencyLatencyTradeoff {
    // PACELC: Else (no partition), trade off between Consistency and Latency.
    // This method simulates the latency cost of strong consistency.
    public void simulateRead(boolean strongConsistency) {
        if (strongConsistency) {
            // Wait for every replica, so the slowest one sets the latency.
            long allReplicasLatency = synchronizeWithAllReplicas();
            System.out.println("Strong consistency latency: " + allReplicasLatency + " ms");
        } else {
            // Answer from the nearest replica without coordination.
            long localLatency = readFromFastestReplica();
            System.out.println("Eventual consistency latency: " + localLatency + " ms");
        }
    }

    // Hard-coded illustrative latencies in milliseconds, not measurements.
    private long synchronizeWithAllReplicas() { return 75; }
    private long readFromFastestReplica() { return 5; }

    public static void main(String[] args) {
        ConsistencyLatencyTradeoff demo = new ConsistencyLatencyTradeoff();
        demo.simulateRead(true);
        demo.simulateRead(false);
    }
}

Output

Strong consistency latency: 75 ms

Eventual consistency latency: 5 ms

Mental Model: PACELC Decision Matrix

Partition? → choose between C and A (CP or AP).
No partition? → choose between C and L (strong consistency or low latency).
For critical writes: use strong consistency even if latency increases.
For reads of volatile data (like stocks): use strong consistency.
For reads of static data (like user profiles): use eventual consistency for speed.
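One way to see where the latency cost comes from: a read that waits for k replicas finishes when the k-th fastest replica answers. The sketch below models that with made-up round-trip times for a local, a same-region, and a cross-region replica; the numbers and the function name are assumptions, not measurements.

```python
def read_latency_ms(replica_latencies_ms, acks_needed):
    """Latency of a read that waits for the acks_needed fastest replicas."""
    if not 1 <= acks_needed <= len(replica_latencies_ms):
        raise ValueError("acks_needed must be between 1 and the replica count")
    return sorted(replica_latencies_ms)[acks_needed - 1]


# Hypothetical round-trip times: local, same region, cross region.
latencies = [5, 12, 75]

print(read_latency_ms(latencies, 1))  # 5   (ONE: fastest replica)
print(read_latency_ms(latencies, 2))  # 12  (QUORUM of 3: second fastest)
print(read_latency_ms(latencies, 3))  # 75  (ALL: slowest replica)
```

Under this model, ALL is always bound by the slowest replica, which is why a single distant or overloaded node can dominate read latency even when nothing has failed.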

Production Insight

I once saw a team set all Cassandra reads to ALL consistency 'to be safe,' causing 80th percentile latency to jump from 2ms to 150ms.

They hadn't realized that PACELC means you pay the latency cost even when there's no partition.

Rule: Audit your read patterns — set the weakest consistency level that satisfies your data correctness needs.

Key Takeaway

PACELC extends CAP to normal operating conditions.

Consistency always costs latency — even without network failures.

Design per-operation consistency levels based on data freshness requirements.

How Real Databases Choose: Cassandra, MongoDB, Zookeeper, etcd, CockroachDB, DynamoDB

Let's examine how popular databases handle the CAP trade-off in practice.

Cassandra: AP by design. Uses a Dynamo-style replication model with multi-master writes. At consistency level ONE, a write succeeds as long as one replica for the key is reachable. During a partition, each side continues to accept writes. After the partition heals, read repair and hinted handoff propagate updates. You can raise the consistency level to ALL for strong consistency, but then you give up availability: if any replica is down or unreachable, the request fails. The cqlsh default is ONE; QUORUM is the common middle ground, tolerating the loss of a minority of replicas while keeping read and write sets overlapping.

MongoDB: CP with replica sets. Has a single primary that accepts writes; secondaries replicate asynchronously. If the primary goes down or is isolated, the remaining majority elects a new primary. An isolated primary steps down once it can no longer reach a majority, and any writes it accepted that never reached a majority are rolled back when it rejoins. Reads from secondaries may be stale. MongoDB's default read preference is primary. You can allow reads from secondaries for better availability, at the cost of potential staleness. So MongoDB is CP during a partition (the minority side cannot write).

ZooKeeper: Strict CP. Uses Zab atomic broadcast. Requires a majority (quorum) to commit writes. By default, a server that is cut off from the majority stops serving clients entirely; an optional read-only mode lets it keep answering reads. ZooKeeper is never AP: it prefers to be unavailable rather than inconsistent.

etcd: Also CP using Raft. Similar to Zookeeper — a majority must be present. If you lose majority, etcd stops serving. This is deliberate for Kubernetes etcd — losing consistency would be catastrophic.

CockroachDB: CP (strong consistency). Uses Raft to replicate each range of keys and keeps a range available as long as a majority of its replicas can communicate. CockroachDB offers serializable isolation. It orders transactions with hybrid logical clocks and a configured maximum clock offset instead of specialized clock hardware like TrueTime. If a range's majority is lost, that range is unavailable, so it's CP within each range.

DynamoDB: AP. DynamoDB's always-on availability is its primary feature. It provides eventually consistent reads by default. Strongly consistent reads are available per request, but they can fail with a server error during a network delay or outage. DynamoDB is commonly cited as an AP system.

Production Insight

I've seen teams migrate from MongoDB to CockroachDB thinking they'd get the same consistency model, but CockroachDB's Raft-based CP caused write failures during multi-region partitions.

Another team used Zookeeper for a global leader election but didn't realize that a WAN partition would cause the minority data center to go completely down.

Rule: Test each database's actual partition behavior before committing to it — read the documentation and run chaos experiments.

Key Takeaway

No database is 'better' — each makes explicit CAP trade-offs.

Cassandra = AP; Zookeeper = CP; MongoDB = CP with read staleness options.

CockroachDB = CP per range; DynamoDB = AP with eventual consistency.

Common Mistakes Engineers Make With CAP

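Assuming all databases in the same category behave identically. Many engineers file Redis under a single label, but it depends on the deployment: a single Redis instance has no partition to tolerate, while Redis Cluster replicates asynchronously, so it does not guarantee strong consistency and can lose acknowledged writes during a failover, even though the minority side of a partition stops accepting writes.
Believing 'eventual consistency' always means AP. Cassandra with consistency level ALL behaves like CP: if you configure it that way, requests fail whenever a replica is unreachable, so you give up availability.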
Treating CAP as a permanent choice. You can design a system that switches between CP and AP based on the type of operation (e.g., use strongly consistent reads for payment, eventually consistent for product images).
Ignoring the cost of consistency in normal operation (PACELC). Using strong consistency for every read will increase database load and latency significantly.
Not testing partition recovery. The most common production issue is not the partition itself but the behavior after it heals — e.g., read repair storms, data conflicts, or slow convergence.

Warning: Partition recovery is not trivial

After a partition heals, systems like Cassandra run read repair, which can cause high CPU and I/O. In one production incident, a 2-minute partition caused a 30-minute read repair storm that degraded performance. Always test recovery paths.

Production Insight

A common mistake: using strong consistency for logging data. Logs don't need linearizability. Use eventual consistency so log writes stay available and never wait on a quorum.

Another: deploying a CP system in a multi-region setup without understanding that cross-region partitions are more common than intra-region ones.

Rule: Map each data type to a consistency requirement: only enforce CP for immutable or critical data.

Key Takeaway

CAP is not a one-size-fits-all decision.

You can mix CP and AP per operation.

Test partition recovery as thoroughly as partition behavior.

Why Consistency Models Matter More Than CAP Labels

Stop wasting time arguing whether your system is CP or AP. The real world runs on shades of consistency. CAP's binary choice is a thought experiment, not an architecture guide. When you pick CP, you get linearizability—every read returns the latest write. That's expensive in latency and availability. AP systems use weaker models: causal consistency, read-your-writes, or eventual consistency. DynamoDB lets you choose per request with conditional writes and consistent reads. Cassandra offers tunable consistency per query. The mistake? Implementing strong consistency everywhere because you fear stale data. Most user-facing apps handle seconds of staleness fine. Financial ledgers don't. Map your consistency needs to each data path, not your entire database. That's how you build systems that scale without burning down on a Friday night.

consistency_choice.pyPYTHON

# io.thecodeforge
import boto3

dynamodb = boto3.client('dynamodb', region_name='us-east-1')

def read_user_balance(user_id: str, strong: bool = False) -> dict:
    # Production: payment reads need strong consistency
    kwargs = {
        'TableName': 'UserAccounts',
        'Key': {'UserId': {'S': user_id}}
    }
    if strong:
        kwargs['ConsistentRead'] = True  # Blocking, higher latency
        
    return dynamodb.get_item(**kwargs)

# Leaderboard reads tolerate 1-second stale data
# Balance reads for checkout cannot
balance = read_user_balance('u_42', strong=True)
leaderboard_view = read_user_balance('u_42', strong=False)

Output

$ python consistency_choice.py

# No output: the script prints nothing. Running it requires AWS credentials and an existing UserAccounts table.

Production Trap:

Randomly turning on strong consistency for all reads kills P99 latency. Measure baseline. Apply strongly consistent reads only after confirming the data flow requires it. Your SRE will thank you.

Key Takeaway

Tune consistency per query, not per database. Most reads can be eventually consistent; only payments, inventory, and auth demand strong.

Network Partitions Happen: How to Survive Without Picking a Side

You can't avoid partitions in distributed systems. Cloud regions go down. Switches fail. Someone pulls the wrong cable. CAP says you must choose: CP (reject requests) or AP (serve possibly stale data). But real systems cheat. They use hybrid approaches. CockroachDB defaults to CP, but lets you opt into slightly stale follower reads when latency and availability matter more than freshness. Cassandra uses hinted handoff and read repair to heal after a partition. The trick is compensating mechanisms. During a partition, AP systems accept writes on both sides, then merge them with conflict resolution once the partition heals. CRDTs (Conflict-free Replicated Data Types) eliminate manual conflict handling. Or use an external coordinator like ZooKeeper to maintain quorum and reject writes to minority partitions. The worst move? Ignoring partitions until they bite you. Test failure modes with chaos engineering. Run a network partition in staging before your customers do it for you.
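To illustrate how a CRDT avoids manual conflict handling, here is a minimal sketch of a grow-only counter (G-Counter), one of the simplest CRDTs. The class and node names are hypothetical. Each node only increments its own slot, and merging takes the per-node maximum, so merges can be applied in any order and any number of times.

```python
class GCounter:
    """Grow-only counter CRDT: one slot per node, merged with per-slot max."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}

    def increment(self, amount=1):
        # A node only ever writes to its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Per-slot max is commutative, associative, and idempotent.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


east = GCounter("east")
west = GCounter("west")

# During a partition, both sides keep accepting increments.
east.increment(3)
west.increment(2)

# After the partition heals, the replicas exchange state.
east.merge(west)
west.merge(east)
print(east.value(), west.value())  # 5 5
```

This only works for data types whose merge can be defined this way. A counter that must also decrement, or a balance that must never go negative, needs a different CRDT or a CP path.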

partition_handler.goGO

// io.thecodeforge
package main

import "fmt"

type Node struct {
	ID      string
	IsAlive bool
}

// Write simulates sending a write to one replica. An unreachable replica
// returns an error so the caller can queue the write for later replay.
func (n *Node) Write(data string) error {
	if !n.IsAlive {
		fmt.Printf("[%s] Partition detected. Queuing write: %s\n", n.ID, data)
		return fmt.Errorf("partition: write queued")
	}
	fmt.Printf("[%s] Write committed: %s\n", n.ID, data)
	return nil
}

func main() {
	nodes := []Node{
		{ID: "us-east-1a", IsAlive: true},
		{ID: "us-east-1b", IsAlive: true},
		{ID: "us-west-2a", IsAlive: false}, // Simulated partition
	}

	for _, node := range nodes {
		err := node.Write("order_123: status=paid")
		if err != nil {
			// A real system would push to a durable queue here and replay after the partition heals.
			fmt.Printf("Replay required for node %s\n", node.ID)
		}
	}
}

Output

$ go run partition_handler.go

[us-east-1a] Write committed: order_123: status=paid

[us-east-1b] Write committed: order_123: status=paid

[us-west-2a] Partition detected. Queuing write: order_123: status=paid

Replay required for node us-west-2a

Design Pattern:

Use a write-ahead log per node that replays after partition recovery. Combine with idempotent write keys so replay doesn't duplicate data.
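A minimal sketch of that pattern, assuming each queued write carries a unique ID: the replica remembers which IDs it has applied, so replaying the same log twice changes nothing. The `Replica` class, `replay` helper, and log entries are hypothetical.

```python
class Replica:
    """Toy replica that applies each write ID at most once."""

    def __init__(self):
        self.state = {}
        self.applied_ids = set()

    def apply(self, write_id, key, value):
        if write_id in self.applied_ids:
            return False              # duplicate delivery, safely ignored
        self.state[key] = value
        self.applied_ids.add(write_id)
        return True


def replay(log, replica):
    """Replay queued writes in order; return how many were newly applied."""
    return sum(replica.apply(write_id, key, value) for write_id, key, value in log)


# Writes queued while the replica was unreachable: (write_id, key, value).
queued = [
    ("w-1", "order_123", "paid"),
    ("w-2", "order_124", "pending"),
]

replica = Replica()
print(replay(queued, replica))  # 2
print(replay(queued, replica))  # 0 (second replay is a no-op)
```

In a real system the set of applied IDs would need to be stored durably alongside the data, otherwise a restart would forget which writes were already applied.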

Key Takeaway

Don't choose CP or AP blindly. Build compensating mechanisms: write queues, conflict resolution, and external quorum managers. Partitions are inevitable; your recovery plan isn't.

● Production incidentPOST-MORTEMseverity: high

The Banking Blackout: A CP System Refuses Reads During a Network Blip

Symptom

Customers could not view balances or initiate transfers for 12 minutes. The service returned HTTP 503 for all read requests.

Assumption

The engineering team assumed that because the system was designed as CP (consistency over availability), it would still serve stale reads during a partition if the majority partition was healthy.

Root cause

The system used a strict quorum read (R + W > N) with R=3, W=3, N=3. During a network partition, one node was isolated. The surviving two nodes formed a majority, but with R=3, a read required all three nodes. Since only two were reachable, the read quorum could not be met, and the system returned an error. The team had configured quorum sizes without considering partition survival.

Fix

Changed the read quorum to R=2 and the write quorum to W=2. During the same partition, the two-node majority can now serve both reads and writes. Strong consistency is preserved because R + W = 4 > N = 3, so every read quorum overlaps every write quorum, and the isolated minority node cannot form a quorum on its own. The system stays CP, with better availability during partitions.
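The arithmetic behind the incident and the fix can be checked with a small helper. This is a sketch with a hypothetical function name; it only encodes the overlap rule R + W > N and the observation that a read needs R reachable replicas and a write needs W.

```python
def quorum_report(n, r, w):
    """Summarize consistency and fault tolerance for a quorum configuration."""
    return {
        # Read and write quorums overlap, so a read sees the latest write.
        "overlap": r + w > n,
        # Replicas that can be unreachable before reads or writes start failing.
        "read_failures_tolerated": n - r,
        "write_failures_tolerated": n - w,
    }


# Before the fix: consistent, but a single unreachable node blocks everything.
print(quorum_report(n=3, r=3, w=3))

# After the fix: still consistent, and one node can be lost.
print(quorum_report(n=3, r=2, w=2))
```

Both configurations satisfy the overlap rule. The difference is that R=3, W=3 tolerates zero unreachable replicas, which is what turned a one-node partition into a full read outage.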

Key lesson

Always test your quorum configuration under a simulated network partition.
CP doesn't have to mean total unavailability — you can tune quorum sizes to survive minority failures while preserving consistency.
Business impact of strict CP must be agreed with product owners before going to production.

Production debug guide

Symptom → Action guide for engineers debugging consistency or availability issues.

Symptom · 01

Reads return stale data after a network blip (AP system)

→

Fix

Run anti-entropy repair: in Cassandra, run nodetool repair -pr on each node. Check that hinted handoff is delivering hints. If stale reads persist, increase read consistency to LOCAL_QUORUM.

Symptom · 02

Writes are rejected during partition (CP system like Zookeeper)

→

Fix

Check which nodes are in the majority partition. ZK: run zkServer.sh status on each node. Verify leader election. If a minority partition is trying to write, reconfigure client connection string to exclude isolated nodes.

Symptom · 03

Inconsistent data between replicas after partition heals

→

Fix

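Run anti-entropy repair: In Cassandra, nodetool repair. In DynamoDB, compare last-updated timestamps or application-level version attributes. For MongoDB, check replication state with rs.status(), and look for writes that were rolled back from a former primary, which need manual reconciliation.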

Symptom · 04

Read latency spikes during partition (both CP and AP)

Fix

Check whether reads are blocking while waiting for a quorum. In CP systems, you may need to reduce the read quorum temporarily, keeping R + W > N if consistency must hold. In AP systems, ensure read repair is not running synchronously on the read path. Investigate network latency between nodes.

Quick Debug Cheat Sheet for CAP Scenarios

Immediate commands and actions when you suspect a CAP-related issue in production.

All reads returning errors (503)

Immediate action

Check if system is CP and quorum is too high. Reduce read consistency level temporarily.

Commands

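For Cassandra: run CONSISTENCY ONE inside the cqlsh session before reading. The setting applies only to that session, so a separate cqlsh -e 'CONSISTENCY ONE' invocation does not affect later reads.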

Check nodetool status to see node liveness.

Fix now

Reduce quorum size or switch to AP for reads until partition resolves.

CAP Choices of Popular Databases

Database	CAP Classification	Default Behavior During Partition	Tunable?	Production Use Case
Cassandra	AP	Both partitions serve reads/writes; conflict resolution via LWW	Yes (consistency levels)	Time-series data, shopping cart events
MongoDB	CP	Only primary partition serves; minority partition cannot write	Yes (read preference, read/write concern)	Transactional apps with occasional stale reads
Zookeeper	CP	Minority partition stops serving requests (no quorum)	No	Leader election, configuration management
etcd	CP	Minority partition unavailable	No	Kubernetes cluster state
CockroachDB	CP per range	Each range's majority must be available; otherwise range unavailable	Limited (survival goals)	Multi-region OLTP requiring serializability
DynamoDB	AP	Both partitions continue; eventually consistent reads	Yes (strongly consistent reads option)	Real-time web apps, gaming leaderboards

Key takeaways

CAP forces a choice during partitions: CP or AP. You cannot have all three in a distributed system.

Partition tolerance is mandatory: you must design for network failures, not assume they won't happen.

PACELC extends CAP: even in normal operation, you trade consistency for latency.

Use per-operation consistency levels to mix CP and AP for different data within the same system.

Each database makes specific CAP trade-offs: test them in your environment before choosing.

Common mistakes to avoid

Confusing CAP Consistency with ACID Consistency

Symptom

Developers assume a database that is ACID-compliant is also CAP-consistent. But ACID consistency is about preserving invariants within a transaction; CAP consistency is about linearizability across nodes. The two are independent: an ACID database with asynchronous replicas can serve stale reads from those replicas, and a store without ACID transactions can still be linearizable in CAP terms.

Fix

Distinguish between ACID consistency (database rules) and CAP consistency (distributed freshness). Document which one you mean in architecture discussions.

Using strong consistency for all reads in an AP database

Symptom

Latency spikes and timeouts because the database must coordinate with a quorum of replicas, or all of them, on every read. This defeats the purpose of choosing an AP system.

Fix

Use consistency levels per operation: strong for critical data, eventual for non-critical. Benchmark the latency difference.
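One way to keep per-operation levels explicit is a small policy table that the data-access layer consults. The sketch below is an assumption-laden illustration: the operation names and the `POLICY` table are hypothetical, and only the quorum formula (a majority of the replication factor) reflects how QUORUM is defined in Cassandra.

```python
from enum import Enum


class Level(Enum):
    ONE = "ONE"
    QUORUM = "QUORUM"


# Hypothetical mapping from operation type to consistency level.
POLICY = {
    "payment_write": Level.QUORUM,
    "balance_read": Level.QUORUM,
    "activity_log_write": Level.ONE,
    "recommendation_read": Level.ONE,
}


def level_for(operation):
    """Look up the level; unknown operations default to the safer QUORUM."""
    return POLICY.get(operation, Level.QUORUM)


def replicas_required(level, replication_factor):
    """Number of replicas that must respond for the given level."""
    if level is Level.ONE:
        return 1
    return replication_factor // 2 + 1  # majority


for op in ("payment_write", "activity_log_write", "unknown_op"):
    level = level_for(op)
    print(op, level.value, replicas_required(level, replication_factor=3))
```

Defaulting unknown operations to the stronger level means a forgotten entry costs latency rather than correctness.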

Assuming partition tolerance means 'I don't have to worry about partitions'

Symptom

System behaves unpredictably during network failures because no partition handling strategy was implemented. For example, an AP system may accumulate unbounded conflict logs.

Fix

Implement partition detection, conflict resolution, and recovery procedures. Test with chaos engineering.

Not considering the PACELC trade-off in normal operation

Symptom

System performance is worse than expected because strong consistency is used even when no partition exists. The latency cost is paid all the time.

Fix

Use eventual consistency for reads that don't need absolute freshness. Profile your read-heavy endpoints.
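The latency side of PACELC is easy to estimate: a read that waits for R acknowledgements is as slow as the R-th fastest replica. The sketch below uses made-up latencies (two nearby replicas and one remote one) and a hypothetical helper, `read_latency_ms`, purely to show the shape of the trade-off.

```python
def read_latency_ms(replica_latencies_ms, r):
    """Latency of a read that waits for the r fastest replicas to respond."""
    if not 1 <= r <= len(replica_latencies_ms):
        raise ValueError("r must be between 1 and the number of replicas")
    return sorted(replica_latencies_ms)[r - 1]


# Hypothetical round-trip times: two local replicas, one cross-region.
latencies = [2.0, 3.5, 48.0]

for r in (1, 2, 3):
    print(f"R={r}: {read_latency_ms(latencies, r)} ms")
```

With these numbers, moving from R=1 to R=2 costs little, while R=3 puts the slowest replica on the critical path of every read, even when no partition exists.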

INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · SENIOR

Explain CAP Theorem and give an example of a CP system vs an AP system. ...

Q02 · SENIOR

What is PACELC and how does it extend CAP Theorem? Give a real example w...

Q03 · SENIOR

You're designing a global e-commerce platform. The product catalog must ...

Q04 · SENIOR

How does MongoDB handle the CAP trade-off compared to Cassandra? When wo...

Q05 · SENIOR

Explain why you can't have a CA system in practice and why the term 'CA'...

Q01 of 05 · SENIOR

Explain CAP Theorem and give an example of a CP system vs an AP system. Which would you choose for a banking ledger?

ANSWER

CAP says a distributed system can guarantee at most two of Consistency, Availability, Partition Tolerance. In practice, partitions are inevitable, so the choice is CP or AP. A CP system like Zookeeper prioritizes consistency over availability — it will reject requests during a partition. An AP system like Cassandra prioritizes availability — it will serve requests even if some nodes have stale data. For a banking ledger, you need strong consistency to prevent double-spending, so you'd choose CP (e.g., having a quorum-based consensus like Raft or using a CP database like Zookeeper for transaction coordination). However, you might design the ledger to accept deposits even during a partition (AP for inserts) but reject withdrawals until consistency is restored. The answer should show awareness that even in financial systems, not all operations require strict CP.
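The split between deposits and withdrawals described in the answer can be sketched as follows. This is a toy, single-process model under stated assumptions: `Ledger`, `has_quorum`, and `on_partition_healed` are hypothetical names, and a real ledger would persist pending entries and reconcile them through its consensus layer.

```python
class LedgerUnavailable(Exception):
    """Raised when an operation needs a quorum that is not available."""


class Ledger:
    def __init__(self, balance=0):
        self.confirmed_balance = balance
        self.pending_deposits = []  # accepted during a partition, not yet confirmed
        self.has_quorum = True

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        if self.has_quorum:
            self.confirmed_balance += amount
        else:
            # AP behaviour: accept the deposit and reconcile it later.
            self.pending_deposits.append(amount)

    def withdraw(self, amount):
        if not self.has_quorum:
            # CP behaviour: refuse rather than risk a double spend.
            raise LedgerUnavailable("withdrawals need a quorum")
        if amount > self.confirmed_balance:
            raise ValueError("insufficient funds")
        self.confirmed_balance -= amount

    def on_partition_healed(self):
        self.confirmed_balance += sum(self.pending_deposits)
        self.pending_deposits.clear()
        self.has_quorum = True


ledger = Ledger(balance=100)
ledger.has_quorum = False          # simulate being on the minority side
ledger.deposit(50)                 # accepted, held as pending
try:
    ledger.withdraw(10)
except LedgerUnavailable as exc:
    print("rejected:", exc)
ledger.on_partition_healed()
ledger.withdraw(120)
print(ledger.confirmed_balance)    # 30
```

Deposits are safe to accept optimistically because they can only raise the balance. Withdrawals depend on the current balance, so they wait for a quorum.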

Naren, Founder & Principal Engineer

20+ years shipping high-throughput database systems. Notes here come from systems that actually shipped.

Last updated: May 24, 2026 · 1,554 articles, all by Naren
