- NoSQL databases trade ACID for scalability, flexibility, and speed
- Four main types: document, key-value, column-family, graph — each solves a different problem
- CAP theorem governs the consistency/availability trade-off every NoSQL system faces
- Schema-on-read allows storing polymorphic data without migrations
- Production pitfall: choosing NoSQL for relational data leads to painful query workarounds
- Performance insight: key-value stores can do sub-millisecond reads; column stores excel at range scans over wide columns
Quick Debug Cheat Sheet: NoSQL Performance & Failures
Inconsistent reads across replicas in MongoDB
- db.collection.find({}).readPref('primary').readConcern('majority')
- rs.printSlaveReplicationInfo() to check replication lag (rs.printSecondaryReplicationInfo() in mongosh)
Redis memory exhaustion – OOM kills
- redis-cli MEMORY STATS
- redis-cli --bigkeys to find large keys
Cassandra high read latency – 99th percentile >500ms
- nodetool cfstats keyspace.table
- nodetool tablestats keyspace.table | grep 'Read'
Production Debug Guide
Symptom → Action guide for the four NoSQL families
- explain(). Look for collection scans vs index scans. If index is not used, verify query shape matches the index. MongoDB cannot use partial indexes on regex or negation.
- fork() latency during BGSAVE.

Every app you use daily, from Instagram's feed to Netflix's recommendations to Uber's driver locations, leans on data stores beyond the neat rows-and-columns world of SQL. These systems handle millions of writes per second, store wildly different shapes of data, and must stay online across data centers on different continents. Traditional relational databases are incredible tools, but they were designed in an era when a server rack cost more than a house and the web didn't exist yet. The world changed; the data layer had to change with it.
The core problem SQL solves — enforcing a rigid schema and guaranteeing ACID transactions — is exactly what becomes a bottleneck at web scale or when your data shape is unpredictable. When every user profile has a different set of preferences, when a social graph has billions of edges, or when you need to read a user's session in under a millisecond from any region on earth, forcing data into tables with foreign keys and JOINs creates real pain: slow migrations, expensive hardware scaling, and query planners that simply give up.
By the end of this article you'll understand the four main NoSQL families and what problem each one was built to solve, how CAP theorem governs the trade-offs every NoSQL system makes, and how to look at a real-world requirement and choose the right database — or know when to stick with Postgres.
What Is NoSQL — and Why Did It Emerge?
NoSQL stands for 'Not Only SQL'. It's a category of database systems designed to handle data that doesn't fit neatly into fixed tables. The need emerged in the mid-2000s when internet giants like Google, Amazon, and Facebook hit walls with traditional relational databases. Their workloads demanded horizontal scaling across thousands of servers, flexible schemas for rapidly changing product features, and sub-millisecond access times for billions of users. NoSQL systems sacrificed strict ACID guarantees in exchange for these properties.
Think of NoSQL as a set of purpose-built tools rather than a single approach. Each type — document, key-value, column-family, graph — optimises for a different data access pattern. The common thread: all of them avoid the rigid table-join-index model of SQL. They're not better or worse; they're built for different jobs.
Production reality: most organisations end up running multiple NoSQL databases alongside a relational system. A typical architecture uses Postgres for core business transactions, Redis for caching and session storage, MongoDB for product catalogues, and Elasticsearch for search. Understanding the trade-offs helps you pick the right tool without overcomplicating your infrastructure.
    version: '3.8'
    services:
      postgres:
        image: postgres:15
        environment:
          POSTGRES_DB: orders
        # ...
      mongodb:
        image: mongo:7  # stores product catalog - flexible schema
      redis:
        image: redis:7-alpine  # session cache - sub-millisecond reads
- SQL is the filing cabinet: rigid drawers, you must know all fields upfront, but finding exactly what you need is fast and guaranteed consistent
- NoSQL is the backpack: throw things in, no upfront planning, but you might have to rummage through the whole bag to find something, and occasionally you'll get stale contents
- You wouldn't carry a filing cabinet on a hike; you wouldn't store legal records in a backpack. Pick the storage that matches the job.
Document Stores — MongoDB, Couchbase
Document stores save data as self-contained JSON/BSON documents. Each document can have its own structure — one user may have 3 fields, another 20. This is ideal for product catalogues, user profiles, and content management systems where the schema evolves rapidly.
MongoDB is the most popular example. It stores documents in collections (similar to tables) but doesn't enforce a schema. Queries use a rich JSON-based query language with indexes, aggregations, and geospatial support. Document stores support secondary indexes, but joins are expensive — you typically denormalise related data into a single document.
Performance: reads are fast when a single document contains all the data needed for a page. Writes can become a bottleneck if you frequently update large documents, because the entire document is rewritten. Atomic operations on single documents are supported. Multi-document transactions (available since MongoDB 4.0 for replica sets and 4.2 for sharded clusters) offer snapshot isolation rather than full serializability and add performance overhead, so keep them short.
    // Connect to MongoDB and query the product catalog
    const { MongoClient } = require('mongodb');

    const uri = 'mongodb://localhost:27017';
    const client = new MongoClient(uri);

    async function run() {
      try {
        await client.connect();
        const db = client.db('shop');
        const products = db.collection('products');

        // Insert a document; note the different fields per category
        await products.insertOne({
          name: 'Wireless Mouse',
          price: 29.99,
          category: 'electronics',
          specs: { connectivity: 'Bluetooth', buttons: 6 }
          // another product might have 'size' and 'color' instead of 'specs'
        });

        // Range query on price (an index on { price: 1 } avoids a collection scan)
        const cursor = products.find({ price: { $gte: 20 } });
        const results = await cursor.toArray();
        console.log(results.length);
      } finally {
        // Always release the connection, even if a query throws
        await client.close();
      }
    }

    run().catch(console.error);
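As a rough illustration of schema-on-read, the sketch below models a collection as a plain Python list of dictionaries. No MongoDB driver is involved; the second product and the helper names `describe` and `at_least` are assumptions made up for this example.

```python
from typing import Any

# Two products in the same "collection" with different shapes.
# The first mirrors the insertOne example; the second is hypothetical.
products: list[dict[str, Any]] = [
    {
        "name": "Wireless Mouse",
        "price": 29.99,
        "category": "electronics",
        "specs": {"connectivity": "Bluetooth", "buttons": 6},
    },
    {
        "name": "Cotton T-Shirt",
        "price": 14.50,
        "category": "apparel",
        "size": "M",
        "color": "navy",
    },
]


def describe(product: dict[str, Any]) -> str:
    """Interpret a document at read time, tolerating missing fields."""
    specs = product.get("specs")
    if specs is not None:
        detail = f"{specs.get('connectivity', 'unknown')}, {specs.get('buttons', '?')} buttons"
    else:
        detail = f"{product.get('size', '?')} / {product.get('color', '?')}"
    return f"{product['name']} ({detail})"


def at_least(docs: list[dict[str, Any]], min_price: float) -> list[dict[str, Any]]:
    """Similar in spirit to find({ price: { $gte: min_price } })."""
    return [d for d in docs if d.get("price", 0) >= min_price]


for doc in products:
    print(describe(doc))
```

The point of the sketch is that the reader, not the store, decides how to interpret each document, which is why no migration is needed when a new category introduces new fields.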
Key-Value Stores — Redis, DynamoDB, Riak
Key-value stores are the simplest NoSQL family — a map from a unique key to a blob of data (string, JSON, binary). They're built for lightning-fast lookups by primary key. Redis, DynamoDB, and Memcached are the heavy hitters.
Redis is an in-memory data structure server, not just a cache. It supports strings, hashes, lists, sets, sorted sets, and streams. Each command executes atomically because Redis runs data operations on a single thread; to make a group of commands atomic, wrap them in MULTI/EXEC or a Lua script. Persistence is optional. Production use cases: session stores, rate limiter counters, leaderboards, real-time messaging via Pub/Sub.
DynamoDB is a fully managed key-value and document store by AWS. It scales horizontally automatically using consistent hashing. It offers single-digit millisecond latency at any scale. But query flexibility is limited — you must model access patterns upfront (primary key, sort key, secondary indexes). The pricing model (read/write capacity units) can be surprising.
    import json
    from datetime import datetime, timezone

    import redis

    r = redis.Redis(host='localhost', port=6379, decode_responses=True)

    # Store user session (auto-expire after 3600s)
    user_session = {'user_id': 42, 'role': 'admin', 'iat': 1712345678}
    r.setex('session:abc123', 3600, json.dumps(user_session))

    # Retrieve: O(1)
    session = json.loads(r.get('session:abc123'))
    print(session['role'])  # 'admin'

    # Atomic counter for rate limiting, one key per user per hour
    key = f'ratelimit:user:42:{datetime.now(timezone.utc):%Y%m%d%H}'
    current = r.incr(key)
    if current == 1:
        r.expire(key, 3600)  # set the TTL once, when the window opens
    if current > 1000:
        print('Rate limit exceeded')
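DynamoDB's requirement to model access patterns upfront is easier to see with a toy model. The sketch below is plain Python, not the AWS SDK: `KeyedTable` and its methods are hypothetical names, and the `user#42` / `order#...` key format is only an assumed convention.

```python
from bisect import bisect_left, bisect_right, insort
from collections import defaultdict


class KeyedTable:
    """Toy table addressed by (partition key, sort key)."""

    def __init__(self):
        self._items = defaultdict(dict)      # partition key -> {sort key: item}
        self._sort_keys = defaultdict(list)  # partition key -> sorted sort keys

    def put(self, pk, sk, item):
        if sk not in self._items[pk]:
            insort(self._sort_keys[pk], sk)  # keep sort keys ordered
        self._items[pk][sk] = item

    def get(self, pk, sk):
        """Point lookup by full primary key."""
        return self._items.get(pk, {}).get(sk)

    def query(self, pk, sk_from, sk_to):
        """Range query inside ONE partition, bounds inclusive."""
        keys = self._sort_keys.get(pk, [])
        lo = bisect_left(keys, sk_from)
        hi = bisect_right(keys, sk_to)
        items = self._items.get(pk, {})
        return [items[k] for k in keys[lo:hi]]


table = KeyedTable()
table.put("user#42", "order#2024-01-05", {"total": 30})
table.put("user#42", "order#2024-02-11", {"total": 12})
table.put("user#42", "order#2024-03-20", {"total": 99})
table.put("user#7", "order#2024-02-01", {"total": 5})

# Supported pattern: "orders for user 42 between February and March"
print(table.query("user#42", "order#2024-02-01", "order#2024-03-31"))
```

A question like "all orders with total above 50" has no key to drive it in this model, so it would need a full scan or a secondary index designed for it. That is the sense in which access patterns have to be known before the table is designed.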
Column-Family Stores — Cassandra, HBase, Scylla
Column-family stores (often called wide-column stores) store data in rows but allow each row to have different columns. The key idea: data is indexed by row key and sorted by column key within each row. This makes them excellent for time-series data, IoT streaming, and analytics workloads that scan large ranges of a known row.
Apache Cassandra is the standard-bearer. It offers tunable consistency — choose how many replicas must respond before the read/write is considered successful. Its architecture is masterless: every node can accept reads/writes. Data is partitioned via consistent hashing and replicated across nodes. Writes are designed to be blazing fast (append-only commit log + memtable + periodic SSTable flush).
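The write path described above can be sketched in a few lines of Python. This is a simplified model, not Cassandra's implementation: `TinyLsm`, the flush threshold, and the use of dictionaries for SSTables are all assumptions, and compaction, tombstones, and on-disk formats are left out.

```python
class TinyLsm:
    """Simplified log-structured write path: commit log, memtable, SSTables."""

    def __init__(self, memtable_limit=3):
        self.commit_log = []   # append-only record of writes not yet flushed
        self.memtable = {}     # in-memory view of the most recent writes
        self.sstables = []     # immutable sorted runs, newest last
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # sequential append, no read needed
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Freeze the memtable into a sorted, immutable run.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs the log

    def read(self, key):
        # Newest data wins: memtable first, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            if key in table:
                return table[key]
        return None


db = TinyLsm(memtable_limit=2)
db.write("a", 1)
db.write("b", 2)   # reaches the limit and triggers a flush
db.write("a", 3)   # newer value stays in the memtable
print(db.read("a"), db.read("b"))
```

Writes never modify existing runs, which is why they stay cheap. Reads may have to consult several runs, and that cost is what compaction exists to keep in check.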
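HBase (on top of HDFS) offers strong consistency because each region is served by a single region server at a time; the cost is reduced availability while a failed server's regions are reassigned. ScyllaDB is a C++ reimplementation of Cassandra whose vendor claims up to 10x better performance on the same hardware.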
    -- Keyspace with network topology replication
    CREATE KEYSPACE iot_data
      WITH replication = { 'class': 'NetworkTopologyStrategy', 'dc1': '3' };

    CREATE TABLE iot_data.sensor_readings (
      sensor_id   uuid,
      day         text,       -- part of the partition key: YYYY-MM-DD
      ts          timestamp,  -- clustering column for sorting
      temperature float,
      humidity    float,
      PRIMARY KEY ((sensor_id, day), ts)
    ) WITH CLUSTERING ORDER BY (ts DESC);

    -- Efficient: fetch latest 100 readings for a sensor on a given day
    SELECT * FROM iot_data.sensor_readings
    WHERE sensor_id = ? AND day = '2026-05-01'
    ORDER BY ts DESC
    LIMIT 100;
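To show how a partition key such as (sensor_id, day) could map to replica nodes, here is a minimal consistent-hashing sketch in Python. It is an approximation: Cassandra's default partitioner uses Murmur3 rather than MD5, and `HashRing`, the node names, and the virtual-node count are assumptions for illustration.

```python
import bisect
import hashlib


def _token(value: str) -> int:
    """Map a string to a 64-bit position on the ring (MD5 as a stand-in hash)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes=64):
        # Each node owns several positions (virtual nodes) to even out load.
        self._ring = sorted(
            (_token(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._tokens = [token for token, _ in self._ring]

    def replicas(self, partition_key: str, replication_factor: int = 3):
        """Walk clockwise from the key's token, collecting distinct nodes."""
        start = bisect.bisect_right(self._tokens, _token(partition_key))
        found = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in found:
                found.append(node)
            if len(found) == replication_factor:
                break
        return found


ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("sensor-1:2026-05-01"))
```

The useful property is that adding a node only takes over the keys that now fall just before its tokens; every other key keeps its owner, so a cluster can grow without reshuffling all data.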
nodetool cfhistograms. If you see reads scanning thousands of tombstones per query, reduce TTL or use TWCS compaction strategy.
Graph Databases — Neo4j, Amazon Neptune
Graph databases model data as nodes (entities) and edges (relationships). This makes them the natural choice for social networks, recommendation engines, fraud detection, and any domain where the connections between data points are as important as the data itself.
Neo4j is the most mature graph database. It uses the property graph model: nodes and edges can have key-value properties. Queries are expressed in Cypher, a declarative language that looks like ASCII art. Relationships are first-class citizens — they always have a direction and a type. This avoids the costly join tables and recursive queries needed in SQL to traverse relationships.
Performance: traversing relationships is O(1) per hop because edges are stored as pointers. For graph queries like 'find friends of friends of friends who like this movie', graph databases are orders of magnitude faster than SQL joins across multiple tables.
    // Find movie recommendations for a user based on friends' ratings
    // (assumes the rating is stored as a property on the RATED relationship)
    MATCH (u:User {id: 'alice'})-[:FRIEND]->(f:User)
    MATCH (f)-[r:RATED]->(m:Movie)
    WHERE NOT EXISTS { (u)-[:RATED]->(m) }
    RETURN m.title, avg(r.rating) AS avg_rating
    ORDER BY avg_rating DESC
    LIMIT 10
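The same recommendation can be approximated in plain Python with adjacency lists, which is roughly why each hop is cheap: following an edge is a direct lookup rather than a scan of a join table. The data and the `recommend` function are invented for this sketch and do not use any Neo4j API.

```python
from collections import defaultdict

# FRIEND edges: user -> list of friends (hypothetical data)
friends = {
    "alice": ["bob", "carol"],
    "bob": ["alice"],
    "carol": ["alice"],
}

# RATED edges: user -> {movie: rating} (hypothetical data)
ratings = {
    "alice": {"Heat": 5},
    "bob": {"Heat": 4, "Alien": 5, "Up": 3},
    "carol": {"Alien": 4, "Up": 5},
}


def recommend(user, limit=10):
    """Average friends' ratings for movies the user has not rated yet."""
    already_rated = set(ratings.get(user, {}))
    scores = defaultdict(list)
    for friend in friends.get(user, []):                       # hop 1: FRIEND
        for movie, rating in ratings.get(friend, {}).items():  # hop 2: RATED
            if movie not in already_rated:
                scores[movie].append(rating)
    averaged = [(movie, sum(rs) / len(rs)) for movie, rs in scores.items()]
    # Highest average first; ties broken alphabetically for stable output
    averaged.sort(key=lambda pair: (-pair[1], pair[0]))
    return averaged[:limit]


print(recommend("alice"))  # [('Alien', 4.5), ('Up', 4.0)]
```

The work done is proportional to the edges actually visited around the starting node, not to the total number of users or ratings in the dataset.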
CAP Theorem and Trade-offs in NoSQL
The CAP theorem states a distributed data store can provide at most two of three guarantees: Consistency (every read sees the latest write), Availability (every request receives a non-error response), and Partition Tolerance (system continues despite network splits). In practice, partitions are inevitable in any distributed system, so you must choose between CP (Consistency + Partition Tolerance) and AP (Availability + Partition Tolerance).
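NoSQL databases make explicit trade-offs: MongoDB is CP by default (primary reads), but can be configured for eventual consistency (AP). Cassandra is AP by default: it prefers availability over consistency. Redis Cluster replicates asynchronously, so it does not guarantee strong consistency: a write acknowledged by a primary can be lost if that primary fails before its replica receives the write. Multi-key operations work only when all the keys hash to the same slot.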
This is not a theoretical exercise. In production, the CAP choice determines how your system behaves during a network partition. If a node is isolated but available, it may accept writes that conflict with writes accepted by the rest of the cluster. When the partition heals, you need conflict resolution (last-write-wins, CRDTs, or manual reconciliation). Many teams discover CAP the hard way — when their 'eventually consistent' system fails to converge for hours.
    package io.thecodeforge.nosql;

    // Illustrating CAP trade-off in code (not a real API)
    public class CapExample {
        enum CapChoice { CP, AP, CA_UNREALISTIC }

        static class Config {
            CapChoice choice;
            int readConcern;   // e.g., 1 (local), majority
            int writeConcern;  // e.g., 1 (ack from one), all

            static Config forMongoDb(String consistencyLevel) {
                Config c = new Config();
                if ("strong".equals(consistencyLevel)) {
                    c.choice = CapChoice.CP;
                    c.readConcern = 3;   // majority
                    c.writeConcern = 3;  // majority
                } else {
                    c.choice = CapChoice.AP;
                    c.readConcern = 1;   // local
                    c.writeConcern = 1;  // local
                }
                return c;
            }
        }

        public static void main(String[] args) {
            Config prod = Config.forMongoDb("strong");
            System.out.println("Production config: " + prod.choice);
            // Under partition: strong consistency means some writes may fail
        }
    }
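A common rule of thumb for tunable consistency in quorum-replicated stores such as Cassandra is that a read is guaranteed to overlap the latest acknowledged write when R + W > N, where N is the replication factor. The helper names below are assumptions for illustration, not part of any driver.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a majority (quorum)."""
    return replicas // 2 + 1


def reads_see_latest_write(replicas: int, write_acks: int, read_acks: int) -> bool:
    """True when every read set must intersect every write set (R + W > N)."""
    return read_acks + write_acks > replicas


def failures_tolerated(replicas: int, acks_required: int) -> int:
    """How many replicas can be down while the operation still succeeds."""
    return replicas - acks_required


n = 3
q = majority(n)
print(q)                                 # 2
print(reads_see_latest_write(n, q, q))   # True: quorum writes + quorum reads
print(reads_see_latest_write(n, 1, 1))   # False: fast, but reads may be stale
print(failures_tolerated(n, q))          # 1
```

Quorum reads plus quorum writes with three replicas buy overlap while still tolerating one failed node. Requiring all three acknowledgements also overlaps, but a single node outage then blocks the operation.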
- CP systems (e.g., HBase, MongoDB with majority concern): will reject writes during a partition to maintain consistency
- AP systems (e.g., Cassandra, DynamoDB): will accept writes during a partition, but you may read stale or conflicting data after healing
- The choice determines your on-call experience: CP systems cause write failures; AP systems cause data reconciliation nightmares
- No 'right' answer — it depends on your business: do you tolerate lost writes or inconsistent reads?
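To make the reconciliation cost of AP systems concrete, the sketch below compares two merge strategies after a partition heals: last-write-wins and a grow-only counter (a simple CRDT). Both are simplified models with hypothetical function names, and the timestamps are made up.

```python
def merge_lww(a, b):
    """Last-write-wins register: keep the (timestamp, value) pair with the
    larger timestamp. The other side's update is silently discarded."""
    return max(a, b)  # tuples compare by timestamp first, then value


def merge_gcounter(a, b):
    """Grow-only counter: each node tracks its own increments, and merging
    takes the per-node maximum, so no increment is lost or double-counted."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def gcounter_value(counter):
    return sum(counter.values())


# During a partition, side A records 3 page views and side B records 2.

# Last-write-wins: each side stores (timestamp, total it has seen)
side_a_lww = (100, 3)
side_b_lww = (105, 2)
print(merge_lww(side_a_lww, side_b_lww))  # (105, 2): three views are lost

# G-Counter: each side stores per-node counts
side_a = {"node-a": 3}
side_b = {"node-b": 2}
merged = merge_gcounter(side_a, side_b)
print(gcounter_value(merged))  # 5
```

Last-write-wins is simple but discards concurrent updates. The counter converges to the same total regardless of merge order, at the price of a data type that only supports increments.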
| Feature | Document (MongoDB) | Key-Value (Redis) | Column-Family (Cassandra) | Graph (Neo4j) |
|---|---|---|---|---|
| Data Model | JSON/BSON documents | Key → value blob | Rows with flexible columns | Nodes & relationships |
| Query Language | MQL (MongoDB Query Language) | Commands (SET, GET, etc.) | CQL (Cassandra Query Language) | Cypher |
| Best Use Case | Product catalogs, user profiles | Session cache, counters, leaderboards | Time series, IoT, event logging | Social graphs, recommendations, fraud |
| Scalability Model | Replica sets + sharding | Redis Cluster (hash slots) | Masterless ring (consistent hashing) | Read replicas, causal clustering |
| Consistency (default) | CP (primary reads, majority writes) | Strong on a single node; Redis Cluster replicates asynchronously, so acknowledged writes can be lost on failover | AP (tunable per query) | CP (single-instance writes) / AP (cluster reads) |
| Latency (p50 read) | 1-5ms (indexed, local) | <1ms (in-memory) | 2-10ms (tunable consistency) | 5-20ms (indexed traversal) |
| Primary Limitation | Multi-document transactions slow | Memory-bound, no complex queries | Ad-hoc queries hard; tombstone overhead | Bad for bulk aggregations; write performance |
🎯 Key Takeaways
- NoSQL is a family of purpose-built databases — pick the one that matches your data access pattern
- Document stores: flexible schemas, embedded data, best for catalogs and profiles
- Key-value stores: fastest reads by primary key, memory-bound, ideal for caching and counters
- Column-family stores: write-optimized, masterless, great for time-series and high-throughput writes
- Graph databases: relationship-first queries, natural for social and recommendation systems
- CAP theorem forces a choice between consistency and availability during partitions — know which one your business needs
- Most production systems use polyglot persistence: SQL for transactions, NoSQL for specific workloads
Interview Questions on This Topic
- Explain the CAP theorem and how it affects NoSQL database choices. Give a real-world example of a CP vs AP choice. (Senior)
- When would you choose MongoDB over PostgreSQL? Give a concrete scenario. (Mid-level)
- Describe how Cassandra handles writes and how it achieves high write throughput. (Senior)
- What are the trade-offs of using an in-memory key-value store like Redis vs a disk-backed one like DynamoDB? (Mid-level)
- Explain the concept of a tombstone in Cassandra. Why is it a performance problem and how do you mitigate it? (Senior)
- When would you use a graph database over a document store? (Senior)
Frequently Asked Questions
What is NoSQL in simple terms?
NoSQL stands for 'Not Only SQL'. It refers to databases that don't use the traditional table-based relational model. Instead, they store data as documents, key-value pairs, wide columns, or graphs. They're built to handle large scale, flexible schemas, and high availability — often sacrificing some ACID guarantees in exchange.
Which NoSQL database should I learn first?
Start with MongoDB (document store) — it's the most popular, has a rich query language, and the concepts transfer to other NoSQL systems. Then learn Redis (key-value) for caching and real-time workloads. For interview preparation, also understand Cassandra (column-family) and Neo4j (graph) at a high level. The key is understanding the trade-offs, not just syntax.
Can I use NoSQL for a banking application?
Not without careful design. Banking requires ACID transactions across multiple entities (accounts, ledgers, transactions). Most NoSQL systems sacrifice strong consistency or multi-record atomicity. While some NoSQL databases (e.g., MongoDB with majority write concern on a replica set) can be configured for strong consistency, the operational complexity and the potential for data loss in edge cases make relational databases a safer choice for core financial systems.
What is the difference between MongoDB and Cassandra?
MongoDB is a document store — data is stored as JSON-like documents with flexible schemas. It supports secondary indexes, rich queries, and aggregation pipelines. Cassandra is a column-family store — data is stored in rows with flexible columns, but it's designed for high write throughput and horizontal scaling across many nodes. Cassandra has a more limited query language (CQL) — you must model your data around the queries you'll run. MongoDB is easier to start with; Cassandra is better for time-series and write-heavy workloads.
Is NoSQL faster than SQL?
It depends on the query pattern. For simple key lookups, key-value stores can be 10-100x faster than SQL because they avoid parsing, planning, and joining. But for complex queries involving multiple conditions, aggregations, and relationships, a well-indexed SQL database often outperforms NoSQL, which would require application-level logic. The speed difference is a function of access pattern, not the storage engine itself.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.