Intermediate 6 min · March 06, 2026

Database Selection — The $200k MongoDB Transaction Failure

MongoDB's write concern caused duplicate debits and lost credits in production.

N
Naren · Founder
Plain-English first. Then code. Then the interview question.
About
Quick Answer
  • Relational DBs (PostgreSQL, MySQL) win when data has clear relationships and ACID matters
  • Document DBs (MongoDB) shine when schema varies per record and reads need no JOINs
  • Key-value stores (Redis, DynamoDB) are for sub-millisecond lookups by a single key
  • Graph DBs (Neo4j) excel at relationship traversals like 'friends of friends'
  • Time-series DBs (InfluxDB, TimescaleDB) crush metrics and sensor data by 90% compression
  • The biggest mistake: picking one database for everything — real systems use multiple types per access pattern

Every system design decision you make flows downstream from one foundational choice: where and how you store your data. Pick the wrong database and you'll spend years fighting your own infrastructure — slow queries you can't optimize, schema migrations that take weekends, or consistency guarantees you can't afford. The engineers who built Instagram, Discord, and Uber didn't just grab the nearest database; they matched the shape of their data to the shape of their tool.

The real problem isn't that developers don't know databases exist — it's that most tutorials teach you the syntax of one and leave you guessing about the rest. You end up defaulting to PostgreSQL for everything (which isn't always wrong, but often is suboptimal), or worse, you swing to MongoDB because someone on Reddit said 'NoSQL scales better' without understanding what that actually means or under what conditions it's true.

By the end of this article you'll be able to look at a system design problem — whether it's a ride-sharing app, a real-time analytics dashboard, or a social network — and confidently justify your database choice with concrete trade-offs. You'll know not just what each database type is, but when it earns its place and when it becomes a liability.

The 5 Database Categories Every System Designer Must Know

Before you can choose a database, you need to understand what categories exist and what problem each one was invented to solve. Each type emerged because engineers hit a wall with the previous solution.

Relational databases (PostgreSQL, MySQL) were built when data was structured, relationships mattered, and correctness was non-negotiable — think bank transactions, inventory systems, HR records. They enforce a rigid schema so your data stays consistent even when your application code has bugs.

Document databases (MongoDB, Firestore) emerged when product teams needed to iterate fast — adding a new user field shouldn't require a migration script. They store data as JSON-like documents, which mirrors how most application code already thinks about objects.

Key-value stores (Redis, DynamoDB) optimize for one thing: get me a value in under a millisecond given a key. Session storage, caching, leaderboards, rate limiting — anything where lookup speed is everything.

Graph databases (Neo4j, Amazon Neptune) flip the model entirely. In a relational DB, joining five tables to find 'friends of friends who bought X' is expensive. In a graph DB, that traversal is native — relationships are first-class citizens stored as edges.

Time-series databases (InfluxDB, TimescaleDB) are purpose-built for append-heavy, time-stamped data: metrics, sensor readings, financial ticks. They compress data intelligently and make range queries over time windows blazingly fast.

SQL vs NoSQL: The Trade-off Nobody Explains Clearly Enough

The SQL vs NoSQL debate gets oversimplified into 'SQL is old, NoSQL scales' — which is dangerously wrong. The real difference comes down to three axes: consistency guarantees, query flexibility, and data shape.

SQL databases enforce ACID — Atomicity, Consistency, Isolation, Durability. When you transfer $500 between bank accounts, you need both the debit and credit to succeed or both to fail. You need every reader to see the same balance. ACID gives you that. NoSQL databases often trade some of these guarantees for throughput and horizontal scalability.

NoSQL isn't one thing — it's four different philosophies (document, key-value, graph, time-series) that all happen to not use SQL syntax. MongoDB isn't 'like PostgreSQL but faster' — it's a fundamentally different access pattern optimized for different query shapes.

The honest answer: use SQL when your data has clear relationships, when correctness is critical, or when you need ad-hoc queries. Use NoSQL when your access pattern is predictable and simple, your schema is genuinely unstable, or when you need to scale writes horizontally across many machines — which is a real problem, but later than most engineers think.

Polyglot Persistence: Why Real Systems Use Multiple Databases

Here's the insight that separates junior from senior system design thinking: no single database is optimal for every access pattern in a complex system. The industry term for intentionally using multiple database types in one system is polyglot persistence — and it's the norm at scale, not the exception.

Consider a social media platform. User profiles need ACID transactions (you can't have a half-saved account). The activity feed needs sub-millisecond reads with a predictable key (userId → feed). The 'People You May Know' feature lives entirely in relationship traversal. Post engagement metrics over time are time-series. Forcing all four into PostgreSQL works, but you're swimming upstream on three of them.

The trade-off is operational complexity. Every additional database type means another deployment, another monitoring stack, another on-call runbook, another thing that can go down at 3am. For a team of three engineers, one PostgreSQL instance is almost always the right answer. For a team of fifty with clear bottlenecks, polyglot persistence pays for itself.

The decision framework: introduce a new database type only when you have a measured, specific bottleneck that your current database genuinely cannot solve with indexing, caching, or schema changes.

The CAP Theorem — What It Actually Means for Database Choice

The CAP theorem says a distributed database can only guarantee two of three properties simultaneously: Consistency (every read gets the most recent write), Availability (every request gets a response, even if it's stale), and Partition Tolerance (the system keeps working even if network messages between nodes are lost).

Here's the part most tutorials skip: partition tolerance isn't optional in any real distributed system. Networks fail. So the real choice is always: during a network partition, do you want your system to stay consistent or stay available? That's it. CA vs CP vs AP.

Banking: choose CP. If two ATMs briefly can't talk to each other, it's better to reject a transaction than to dispense cash twice. Postgres, HBase.

Shopping cart: choose AP. If Amazon's recommendation service is partitioned, better to show a slightly stale 'items in cart' than to refuse to load the page. DynamoDB in AP mode, Cassandra.

This isn't just academic — it directly affects which database product you reach for and how you configure it. Redis in standalone mode is CA. Redis Cluster sacrifices some consistency for availability. DynamoDB lets you choose per-operation with its 'strongly consistent reads' flag.

A Step-by-Step Decision Framework for Choosing Your Database Stack

By now you understand the categories, the SQL vs NoSQL trade-offs, polyglot persistence, and CAP. But how do you actually decide? Here's a repeatable framework we've used at three different companies to make database decisions that didn't need undoing two years later.

Step 1: List every distinct access pattern in your system. For each — what data is read, what key is used, how often, what latency is acceptable, whether consistency matters, whether schema changes frequently.

Step 2: For each access pattern, map it to the best-fit database category using the decision tree from Section 1. If multiple patterns map to the same category, that's a candidate.

Step 3: Evaluate the operational load of each candidate. A second database type adds deployment, monitoring, backups, and expertise requirements. Ask: is the performance gain worth the Ops tax?

Step 4: If you choose multiple databases, design clear boundaries. Each service owns its data store. No cross-store transactions — use event-driven patterns (Saga, event sourcing) to maintain consistency across stores.

Step 5: Prototype the critical path. Don't just benchmark with empty datasets. Load test with realistic data sizes and concurrent access patterns. Measure p99 latency, not just median.

This framework prevents the two most common errors: picking a database based on hype, and over-engineering for scale that never materializes.

Database Type Comparison
Database TypeBest Use CaseWorst Use CaseConsistency ModelHorizontal ScaleQuery FlexibilityExample Products
Relational (SQL)Transactions, complex joins, audit trailsRapidly changing schema, massive write throughputACID (strong)Hard — requires sharding or read replicasVery High — ad-hoc SQLPostgreSQL, MySQL, SQLite
DocumentNested objects, CMS, product catalogs with variable attributesComplex multi-entity relationships, heavy aggregationsEventual (tunable)Easy — built-in shardingMedium — needs indexes for non-primary queriesMongoDB, Firestore, CouchDB
Key-ValueCaching, sessions, rate limiting, feature flagsComplex queries, rich data relationshipsConfigurable per opVery Easy — partitioned by keyVery Low — only key lookupRedis, DynamoDB, Memcached
GraphSocial networks, recommendation engines, fraud detectionSimple CRUD, time-series, or bulk analyticsACID (most)Hard — graph partitioning is unsolvedVery High — relationship traversalNeo4j, Amazon Neptune, JanusGraph
Time-SeriesMetrics, IoT sensors, financial ticks, monitoringRelational queries, document-style lookupsEventual (append-only)Easy — time-partitioned by designLow — optimized for time-window onlyInfluxDB, TimescaleDB, Prometheus
Wide-ColumnMassive write throughput, sparse data, multi-datacenterAd-hoc queries, small datasets, ACID transactionsTunable per queryExcellent — designed for linear scaleLow — must design queries into schemaCassandra, HBase, ScyllaDB

Key Takeaways

  • Each database category exists because another type failed for a specific access pattern.
  • Start with PostgreSQL and only add a new database when you have a measured, specific bottleneck.
  • Polyglot persistence is the norm at scale, but it comes with an operational tax.
  • CAP gives you a binary choice: during a partition, either reject requests (CP) or serve stale data (AP).
  • The best database choice is the one your team can operate and that matches your actual access patterns.

Common Mistakes to Avoid

  • Picking MongoDB for transactional workloads because it's 'faster'
    Symptom: Concurrent writes cause data loss: two balance updates overwrite each other, transactions fail silently. Recovery requires manual reconciliation.
    Fix: Use PostgreSQL for any operation that needs atomicity across multiple documents or rows. Keep MongoDB for read-heavy, single-document workloads with no atomic cross-document requirements.
  • Assuming NoSQL means 'no schema' — then discovering your app logic IS the schema
    Symptom: A 'user' document missing an 'email' field crashes the profile page. No migration error, just a runtime exception. Different collections have inconsistent field names.
    Fix: Use MongoDB's $jsonSchema validation to enforce structure. Or, if your schema is actually stable (most are after v1), use PostgreSQL and avoid the hidden schema problem entirely.
  • Choosing a graph DB because you heard it's 'great for social networks' — without having any traversal queries
    Symptom: You store user profiles and 2-3 relationships. Query performance is worse than PostgreSQL because graph DBs optimize for traversal, not simple lookups. Your BI team can't run ad-hoc SQL.
    Fix: Only use a graph DB if your primary access pattern is multi-hop traversals (e.g., 'friends of friends of friends'). If you're just storing one-to-many relationships, a relational DB with JOINs is faster and easier.
  • Adding a second database before the first one is optimized
    Symptom: The team spends 40% of sprints maintaining two databases: backups, connection pools, monitoring dashboards, and expertise. The original database was never indexed correctly.
    Fix: Measure first: run EXPLAIN ANALYZE on slow queries. Add indexes, caching (Redis), or read replicas before adding a second database type. The second DB should be a last resort, not a first instinct.

Interview Questions on This Topic

  • QHow would you design the database for a ride-sharing app like Uber? Consider driver location updates, ride matching, and payment processing.SeniorReveal
    I'd use polyglot persistence. For payment processing and ride history, use PostgreSQL for ACID compliance. For real-time driver location updates (high write throughput, simple key-value pattern), use Redis or a key-value store. For ride matching (finding nearby drivers), use a spatial index in PostgreSQL (PostGIS) or a specialized service. Each database serves the access pattern it's built for. The key is to keep strong consistency where money is involved, and eventual consistency everywhere else.
  • QWhat does the CAP theorem mean for a database selection decision in a distributed system?SeniorReveal
    In any distributed system, partition tolerance is mandatory — networks fail. So the real choice is between CP (consistency) and AP (availability). For a payment system, CP is correct: during a network partition, reject transactions rather than risk double-spends. For a social media feed, AP is fine: serve slightly stale data rather than show an error page. You can also have per-operation consistency choices, like DynamoDB's strongly consistent reads. The decision should match the business criticality of each operation.
  • QWhen would you choose a document database over a relational database for an e-commerce system?Mid-levelReveal
    I'd choose a document database for the product catalog when products have widely varying attributes (a laptop has processor speed, a shirt has size and color) and the UI always loads a complete product in one go. A single document read returns the full product without JOINs. But I'd keep orders and payments in PostgreSQL because they require ACID transactions across multiple rows. So the catalog is in MongoDB, and the transactional layer stays in PostgreSQL. Each database handles the access pattern it was built for.

Frequently Asked Questions

Should I always use PostgreSQL for everything if I'm a startup?

Yes, almost always. PostgreSQL handles millions of users with proper indexing. Instagram, one of the largest image-sharing platforms, ran on PostgreSQL until they had 13 employees. Only introduce a second database when you hit a measurable, specific bottleneck that PostgreSQL cannot solve with indexes, caching, or read replicas.

Is MongoDB faster than PostgreSQL?

It depends on the query. For reading a single document with all its nested data, MongoDB is faster because it avoids JOINs. But for complex aggregations across multiple entities, PostgreSQL is faster because SQL's query planner optimizes JOINs and aggregations. The performance difference is about matching the database to the query shape, not raw speed.

What is polyglot persistence and when should I use it?

Polyglot persistence is the practice of using multiple database types in one system. Use it when you have clearly distinct access patterns that each benefit from a different database category (e.g., transactional data in PostgreSQL, caching in Redis, time-series in InfluxDB). But only add each additional database when the operational cost is justified by a measured performance or feature gain.

Can I use a graph database for hierarchical data like a company org chart?

You can, but a relational database with recursive CTEs (e.g., PostgreSQL's WITH RECURSIVE) is often simpler and equally performant for tree hierarchies up to thousands of nodes. Graph databases shine when you need multi-hop traversals across many connections (e.g., 'find all employees who are within 2 degrees of separation from a specific manager'). For a simple org chart, stick with PostgreSQL.

What is the biggest mistake engineers make when choosing a database?

The biggest mistake is choosing a database based on hype or an expected future scale that never happens. Engineers migrate from PostgreSQL to MongoDB because 'NoSQL scales better', but later find that MongoDB's lack of ACID causes data loss in transactional operations. The fix: prototype with realistic data, measure bottlenecks, and default to a relational database until you prove you need something else.

🔥

That's Databases in Design. Mark it forged?

6 min read · try the examples if you haven't

Previous
Design a Content Moderation System
1 / 5 · Databases in Design
Next
Choosing Between Redis and Memcached