RDS vs DynamoDB — Eventual Consistency Failures at Scale
At 50k writes/sec, DynamoDB returned stale balances.
20+ years shipping production infrastructure and CI/CD at scale. Written from production experience, not tutorials.
- RDS: relational, SQL, ACID, vertical scaling, joins, foreign keys
- DynamoDB: NoSQL, key-value, eventual consistency, horizontal scaling, single-table access patterns
- Performance: DynamoDB offers single-digit millisecond latency at any scale; RDS latency increases with table joins and index depth
- Production insight: the wrong choice leads to 10x cost overruns or impossible schema migrations mid-project
- Biggest mistake: putting relational data (orders, invoices) into DynamoDB and forcing complex joins in app code
Think of RDS like a giant spreadsheet where every row must follow strict column rules — you can't just add a random extra column to one row without updating the whole sheet. DynamoDB is more like a folder of sticky notes — each note can have completely different information on it, and you can find any note almost instantly because they're all sorted by a label you chose. One is rigid and relational, the other is flexible and blazing fast. The trick is knowing which one your app actually needs.
Every application needs somewhere to store data. But the database decision you make on day one can haunt you for years — choosing the wrong engine means rewriting queries, hitting performance walls, or paying five times more in cloud costs than you should. AWS gives you two wildly different database philosophies under one roof: RDS (Relational Database Service) and DynamoDB. Understanding the difference isn't just academic — it directly affects how fast your app scales, how much it costs, and how easy it is to maintain when traffic triples overnight.
RDS solves the problem of structured, relationship-heavy data. Your users table needs to join your orders table, which joins your products table — and you need those JOINs to be consistent, transactional, and correct. DynamoDB solves a completely different problem: massive throughput at predictable latency. When you're storing session tokens, IoT sensor readings, or user activity events where you need single-digit millisecond reads at any scale, DynamoDB is built for exactly that.
By the end of this article you'll be able to provision both services with infrastructure-as-code, write idiomatic queries against each, understand the cost model differences, and — most importantly — make a confident architectural decision when someone in a design review asks 'should we use RDS or DynamoDB for this?'
Why RDS and DynamoDB Consistency Models Differ at Scale
RDS (Relational Database Service) and DynamoDB are both managed database services on AWS, but they operate on fundamentally different consistency principles. RDS provides strong consistency by default through ACID transactions on a single primary instance; a Multi-AZ instance deployment replicates synchronously to a standby, while read replicas are asynchronous and can lag. DynamoDB, by contrast, serves eventually consistent reads by default and offers strongly consistent reads as an opt-in, trading immediate accuracy for lower latency and higher throughput. That trade-off becomes critical at scale.
In practice, DynamoDB's eventual consistency means that after a write, subsequent reads may return stale data for a short window, usually under one second. This is not a bug but a design choice: DynamoDB stores copies of each item in multiple Availability Zones, and an eventually consistent read can be served by a copy that has not yet applied the latest write. For reads that require the latest data, you must explicitly request strongly consistent reads, which cost twice the read capacity units and may fail with a server error (HTTP 500) during a network delay or outage. RDS guarantees that a committed write is immediately visible to subsequent reads on the primary instance (read replicas are asynchronous and can lag), but this locks you into a single-writer model that limits write throughput.
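The doubling of read cost can be checked with simple arithmetic. The sketch below assumes the standard sizing rule (one read capacity unit covers one strongly consistent read per second of an item up to 4 KB, and an eventually consistent read costs half of that); the function name is hypothetical and nothing here calls AWS.

```python
import math

RCU_BLOCK_BYTES = 4 * 1024  # one read capacity unit covers up to 4 KB


def read_capacity_units(item_size_bytes: int, strongly_consistent: bool) -> float:
    """Capacity consumed by reading one item of the given size."""
    # Item size is rounded up to the next 4 KB block before pricing.
    blocks = max(1, math.ceil(item_size_bytes / RCU_BLOCK_BYTES))
    # Eventually consistent reads are charged at half the strongly consistent rate.
    return float(blocks) if strongly_consistent else blocks / 2


for size in (1_000, 4_096, 10_000):
    eventual = read_capacity_units(size, strongly_consistent=False)
    strong = read_capacity_units(size, strongly_consistent=True)
    print(f"{size} bytes: eventual={eventual} RCU, strong={strong} RCU")
```

Multiplying the per-read figure by the expected reads per second gives a first estimate of the provisioned capacity a strongly consistent workload would need.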
Choose RDS when your application demands strict consistency — financial transactions, inventory systems, or any workflow where stale reads cause data corruption. Choose DynamoDB when you need single-digit millisecond latency at any scale and can tolerate eventual consistency for most operations, such as session stores, gaming leaderboards, or IoT telemetry. The wrong choice leads to either throttled writes (RDS at scale) or silent data corruption (DynamoDB with strong consistency assumed).
Data Model: Rigid vs Flexible
RDS requires a fixed schema. Every table has defined columns with data types, constraints, and relationships enforced via foreign keys. That's great when your data actually fits a relational model — invoices have line items, users have addresses. But it's painful when you need to add a new attribute to a subset of rows: you either add a nullable column or create a separate extension table.
DynamoDB has no schema constraints (except a required partition key and optional sort key). Items in the same table can have completely different attributes. You can add a new field to one item without touching existing items. That flexibility comes at a cost: no automatic referential integrity, no JOINs, and complex multi-item operations require careful application logic.
The rule: if your data has rich relationships you need to enforce at the database level, pick RDS. If your access patterns are primarily by primary key and you need schema evolution without downtime, DynamoDB wins.
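To make the contrast concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for a relational engine and a plain dict as a stand-in for a schemaless key-value table. Neither is RDS or DynamoDB; the table names, keys, and attributes are assumptions chosen for illustration.

```python
import sqlite3

# Relational side: sqlite3 stands in for a relational engine here.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL REFERENCES users(id),"
    " total_cents INTEGER NOT NULL)"
)
conn.execute("INSERT INTO users (id, email) VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO orders (id, user_id, total_cents) VALUES (10, 1, 2500)")

try:
    # User 99 does not exist, so the database rejects the orphan order.
    conn.execute("INSERT INTO orders (id, user_id, total_cents) VALUES (11, 99, 100)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Key-value side: items are plain dicts keyed by a partition key.
items = {}
items["USER#1"] = {"pk": "USER#1", "email": "a@example.com"}
# A second item can carry an extra attribute without touching the first.
items["USER#2"] = {"pk": "USER#2", "email": "b@example.com", "loyalty_tier": "gold"}
# Nothing stops an item from pointing at a user that was never written.
items["ORDER#11"] = {"pk": "ORDER#11", "user_pk": "USER#99", "total_cents": 100}

dangling = [
    item["pk"]
    for item in items.values()
    if "user_pk" in item and item["user_pk"] not in items
]

print("orphan rejected by relational engine:", orphan_rejected)
print("dangling references in key-value items:", dangling)
```

The relational side refuses the inconsistent row at write time, while the key-value side accepts it and leaves the integrity check to application code.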
Query Capabilities: SQL vs Key-Value
RDS supports full SQL: SELECT with JOINs, WHERE clauses on any column, GROUP BY, aggregations, subqueries, window functions. You can ask complex analytical questions in a single query. DynamoDB's core read operations are GetItem (by primary key), Query (by partition key + optional sort key conditions), and Scan (full table, expensive). Any other filtering happens either in a filter expression, which runs after the items are read and so still consumes read capacity for every item examined, or in application code.
That means DynamoDB forces you to design your access patterns before you write code. You can't spontaneously run a query to find all users who signed up in March and have at least three orders. You'd need a secondary index (GSI) specifically designed for that query, which adds complexity and cost.
Production reality: most teams adopting DynamoDB underestimate how much their query needs will evolve. They end up adding GSIs, creating materialized views, or streaming to a search engine. The trade-off is predictable performance at scale versus ad-hoc query flexibility.
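The access-pattern constraint is easier to see in a toy model. The class below is an in-memory sketch, not the DynamoDB API: `ToyTable`, the `USER#`/`ORDER#` key layout, and the sample data are all assumptions. It supports a lookup by full key, a query within one partition by sort-key prefix, and a scan that touches every item.

```python
from collections import defaultdict


class ToyTable:
    """In-memory model: items grouped by partition key, addressed by sort key."""

    def __init__(self):
        self._partitions = defaultdict(dict)  # pk -> {sk: item}
        self.items_examined = 0  # rough proxy for read cost

    def put_item(self, pk, sk, **attrs):
        self._partitions[pk][sk] = {"pk": pk, "sk": sk, **attrs}

    def get_item(self, pk, sk):
        self.items_examined += 1
        return self._partitions.get(pk, {}).get(sk)

    def query(self, pk, sk_prefix=""):
        """Read one partition, returning items whose sort key starts with sk_prefix."""
        rows = self._partitions.get(pk, {})
        matches = [rows[sk] for sk in sorted(rows) if sk.startswith(sk_prefix)]
        self.items_examined += len(matches)
        return matches

    def scan(self, predicate):
        """Examine every item in every partition; cost grows with table size."""
        results = []
        for rows in self._partitions.values():
            for item in rows.values():
                self.items_examined += 1
                if predicate(item):
                    results.append(item)
        return results


table = ToyTable()
for user in range(1, 101):
    table.put_item(f"USER#{user}", "PROFILE", signup_month="2024-03" if user % 10 == 0 else "2024-01")
    table.put_item(f"USER#{user}", "ORDER#2024-03-02", total_cents=1200)
    table.put_item(f"USER#{user}", "ORDER#2024-04-01", total_cents=800)

# Designed-for pattern: one user's March orders, answered from a single partition.
march_orders = table.query("USER#7", "ORDER#2024-03")
cost_of_query = table.items_examined

# Ad-hoc pattern: users who signed up in March needs a scan over all 300 items.
table.items_examined = 0
march_signups = table.scan(lambda i: i["sk"] == "PROFILE" and i.get("signup_month") == "2024-03")
cost_of_scan = table.items_examined
```

In this model the keyed query examines one item while the ad-hoc question examines the whole table, which is the gap a purpose-built secondary index is meant to close.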
Scaling: Vertical vs Horizontal
RDS scales vertically: you upgrade the instance size (db.r5.large -> db.r5.xlarge) with some downtime. Read-heavy workloads use read replicas (up to 15 for Aurora). Write scaling is harder: writes go to a single writer instance, so scaling beyond it means application-level sharding. Aurora Serverless v2 adjusts the writer's capacity automatically, but it does not shard writes.
DynamoDB scales horizontally from day one. Each partition can serve up to 3,000 RCU and 1,000 WCU. If you exceed that, DynamoDB splits partitions automatically, but that only helps if your partition key is well-distributed. A hot partition key (e.g., a single tenant that gets 90% of traffic) will throttle regardless of total table capacity.
The scaling axis decision is critical. If you expect unpredictable write spikes (e.g., Black Friday), DynamoDB's on-demand capacity mode is a huge win. If you need complex transactional writes across multiple tables (e.g., inventory deduction + order creation), RDS's single-writer architecture with ACID transactions is simpler to reason about.
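One common mitigation for a hot partition key is write sharding: appending a suffix so that one logical key is spread over several physical partition keys. The following is a minimal sketch under assumptions; the `TENANT#big` key, the shard count of 8, and both function names are hypothetical, and readers of such a table have to query every shard and merge the results.

```python
import hashlib
from collections import Counter


def sharded_partition_key(base_key: str, item_id: str, shard_count: int) -> str:
    """Append a deterministic shard suffix derived from the item id."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{base_key}#{shard}"


def all_shard_keys(base_key: str, shard_count: int) -> list[str]:
    """Every key a reader must query to see the whole logical partition."""
    return [f"{base_key}#{n}" for n in range(shard_count)]


SHARDS = 8
keys = [sharded_partition_key("TENANT#big", f"event-{i}", SHARDS) for i in range(8_000)]
distribution = Counter(keys)

for key in all_shard_keys("TENANT#big", SHARDS):
    print(key, distribution[key])
```

The trade-off is read amplification: writes spread out, but a read of the full tenant now costs one query per shard.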
Performance and Latency
DynamoDB is designed to deliver single-digit millisecond latency for GetItem and Query operations at any scale, as long as your partition key is designed well; treat this as a design target rather than a contractual guarantee. Reads served from cache (DAX) can be even faster. RDS latency varies: a simple PK lookup can be 1-5ms, but a complex JOIN with full table scans can be 100ms or more. RDS performance depends on query design, indexes, and instance size.
The key difference: DynamoDB latency is predictable regardless of data size (within partition limits). RDS latency grows with data volume and query complexity. For latency-sensitive applications (user sessions, real-time leaderboards), DynamoDB shines. For analytical queries where 500ms is acceptable, RDS is fine.
Cost Model: Provisioned vs On-Demand
RDS charges per instance hour + storage + IOPS. You pay for the capacity you provision (even if idle) plus storage costs (GP2, GP3, io1). Reserved instances reduce cost for steady workloads. DynamoDB charges per read/write unit (provisioned or on-demand). On-demand lets you pay per request, ideal for variable workloads. But per-request costs are higher than provisioned for sustained traffic.
A common mistake: underestimating DynamoDB costs for heavy read/write workloads. At scale, a table doing 10,000 writes/second with on-demand pricing can cost over $10,000/month. RDS for similar write throughput would be cheaper with a large instance (e.g., db.r5.12xlarge ~$6,000/month reserved). But RDS can't sustain that write throughput for complex writes involving multiple indexes and triggers.
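The on-demand figure is worth working through. The sketch below assumes a 30-day month and one write request unit per 1 KB written; the price per million write request units is a placeholder parameter, not a quoted AWS price, so check the current price list for your Region before relying on the output.

```python
import math

SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month, an assumption


def monthly_on_demand_write_cost(
    writes_per_second: float,
    item_size_kb: float,
    price_per_million_wru: float,
) -> float:
    """Estimated monthly cost of sustained on-demand writes."""
    # Each write consumes one write request unit per 1 KB, rounded up.
    wru_per_write = max(1, math.ceil(item_size_kb))
    total_wru = writes_per_second * SECONDS_PER_MONTH * wru_per_write
    return total_wru / 1_000_000 * price_per_million_wru


PLACEHOLDER_PRICE = 1.0  # hypothetical dollars per million write request units

cost = monthly_on_demand_write_cost(10_000, item_size_kb=1, price_per_million_wru=PLACEHOLDER_PRICE)
print(f"10,000 writes/s of 1 KB items: ${cost:,.0f} per month at the placeholder price")
```

At 10,000 writes per second the table performs about 25.9 billion writes in a month, so under these assumptions any price above roughly $0.39 per million units puts the bill over $10,000.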
Key trade-off: DynamoDB costs are directly tied to throughput — you can't have high throughput for cheap. RDS costs are tied to compute — you can get moderate throughput at lower cost, but you hit a vertical scaling ceiling.
When to Use Each: Decision Framework
Here's a practical decision tree: If your data has clear relationships (foreign keys, joins in every query) and you need ACID transactions across multiple entities, choose RDS (specifically Aurora for better performance). If your access patterns are primarily by primary key or a single partition key prefix, you need single-digit millisecond latency at any scale, and you can tolerate eventual consistency (or pay for strong consistency), choose DynamoDB.
If you need both? Use a hybrid: store transactional data (orders, users) in RDS, and operational data (sessions, events, pre-computed aggregates) in DynamoDB. Many production systems do exactly this — the key is not forcing one database to do everything.
There's also a middle ground: Amazon Aurora with MySQL compatibility offers up to 5X throughput of standard MySQL and can handle some key-value patterns with the right index design. Don't ignore it as a compromise if your team is familiar with SQL.
The Replication Gap: Why Your Multi-Region Setup Is Lying to You
You think because you enabled DynamoDB global tables or RDS cross-Region read replicas that your data is consistent everywhere. It's not. At least, not in the way your application probably assumes. The disconnect between marketing slides and actual behavior has melted production databases for teams who didn't read the fine print.
DynamoDB global tables use last-writer-wins (LWW) conflict resolution. If two simultaneous writes hit different Regions for the same item, the later timestamp wins—period. Your carefully sequenced business logic doesn't matter. RDS cross-Region replicas are asynchronous. A primary failure can drop seconds of committed writes that never made it to the replica. That's not a failover; that's data loss.
You need to audit every request your application makes after a Region failover. Are you reading from the new primary assuming it has the same state as the old? It doesn't. You need idempotency keys, defensive read-repair, and a clear understanding of your actual recovery point objective (RPO), not the one you wish you had.
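A few lines of code show how last-writer-wins drops an update. This is a toy model of the resolution rule only, not of global tables themselves; the `Version` type, Region names, timestamps, and balances are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    region: str
    balance: int
    written_at_ms: int  # write timestamp used for conflict resolution


def last_writer_wins(a: Version, b: Version) -> Version:
    """Keep whichever version has the later timestamp; the other is discarded."""
    return a if a.written_at_ms >= b.written_at_ms else b


start = 100

# Two Regions each read balance=100 and apply a withdrawal locally.
us = Version("us-east-1", start - 30, written_at_ms=1_000)
eu = Version("eu-west-1", start - 50, written_at_ms=1_005)

converged = last_writer_wins(us, eu)
expected_if_serialized = start - 30 - 50
lost_amount = converged.balance - expected_if_serialized

print("converged balance:", converged.balance)
print("balance if both withdrawals had applied:", expected_if_serialized)
print("withdrawal silently lost:", lost_amount)
```

Both Regions end up agreeing on a balance of 50, yet the 30-unit withdrawal has vanished. Routing all writes for a given item to one Region is one way to avoid this class of conflict.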
Cost Tetris: The Signal-to-Noise Ratio Nobody Optimizes
Most teams optimize for read cost or write cost. They miss the real killer: the interaction between access patterns and pricing models. A single DynamoDB scan on a large table can cost more in consumed RCUs than a month of normal traffic. An RDS query that triggers a full table scan on a 500GB table will spike your IOPS bill and degrade every other query.
The problem isn't the database. It's that your application issues queries the database wasn't designed for, and you pay for that mismatch in compute, storage I/O, and unnecessary transfers. You need to instrument every query's cost per execution, not just latency. A fast query that eats 1000 RCUs is worse than a slow query that eats 1.
Start by measuring the actual Cost Per Query (CPQ). For DynamoDB, set ReturnConsumedCapacity on each request and log the ConsumedCapacity value returned in the response; the ConsumedReadCapacityUnits and ConsumedWriteCapacityUnits CloudWatch metrics only give you table-level and index-level totals. For RDS, enable Performance Insights and monitor the top SQL patterns by total IO cost. When you see a pattern costing 10x the average, rewrite the query or add an index before the bill arrives.
Don't rely on the AWS Cost Explorer alone. It aggregates too much. You need per-query granularity to catch the silent spenders.
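Per-query granularity can start as a small aggregation over your own request logs. The sketch below assumes each log record is a (pattern name, consumed capacity units) pair; the pattern names and numbers are invented, and the 10x threshold is taken as a multiple of the overall mean cost per execution.

```python
from collections import defaultdict
from statistics import mean


def expensive_patterns(records, factor=10.0):
    """Return {pattern: mean units} for patterns costing more than factor x the overall mean.

    records: non-empty iterable of (pattern_name, consumed_units) pairs.
    """
    by_pattern = defaultdict(list)
    for pattern, units in records:
        by_pattern[pattern].append(units)

    # Overall mean cost of a single execution, across all patterns.
    overall = mean(u for units in by_pattern.values() for u in units)
    per_pattern = {p: mean(units) for p, units in by_pattern.items()}
    return {p: avg for p, avg in per_pattern.items() if avg > factor * overall}


log = (
    [("get_session", 0.5)] * 1000
    + [("list_orders", 4.0)] * 100
    + [("export_all", 900.0)] * 2
)

flagged = expensive_patterns(log)
print(flagged)
```

Here the export runs only twice but accounts for two thirds of all consumed units, which is exactly the kind of silent spender that an aggregated bill hides.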
Backup and Restore: RDS vs DynamoDB Recovery Semantics
Backup strategies differ fundamentally between RDS and DynamoDB due to their underlying architectures. RDS offers automated backups with point-in-time recovery (PITR) up to 35 days, restoring an entire DB instance. DynamoDB provides on-demand backups and continuous backups with PITR. In both services a restore creates a new resource (a new DB instance or a new table) rather than overwriting in place, so plan for repointing the application. RDS backups are storage volume snapshots, enabling cross-region copy and instance cloning for testing. DynamoDB backups are table-level, and cross-region backup copy relies on AWS Backup integration. Separately, DynamoDB's export to S3 feature writes table data in DynamoDB JSON or Amazon Ion format. Restore time depends on data size in both cases: multi-TB RDS instances can take hours, and large DynamoDB tables can also take hours, while small tables are typically much quicker. Choose RDS when you need to roll back a whole database with its relationships intact. Choose DynamoDB when table-level isolation and recovery of specific datasets are critical.
Security and Access Control: IAM vs Network Boundaries
RDS and DynamoDB enforce security through different primitives. RDS relies on VPC security groups and subnet groups—network-level access control that isolates database traffic from the public internet. DynamoDB operates outside the VPC boundary, using IAM policies and VPC endpoints (Gateway or Interface) for private access. RDS authenticates via database users and passwords (or IAM database authentication), while DynamoDB uses IAM roles and resource-level policies exclusively. This creates a critical distinction: RDS requires managing database credentials and rotating them; DynamoDB eliminates password management entirely. RDS supports encryption at rest via KMS per DB instance; DynamoDB encrypts by default with an AWS-owned key or customer-managed KMS key per table. For access control, RDS provides granularity at the database user level; DynamoDB at the item and attribute level using fine-grained access control (FGAC) in IAM policies. Choose RDS when compliance demands network segmentation; choose DynamoDB when per-item authorization is required.
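As an illustration of item-level and attribute-level control, the sketch below builds an IAM policy document as a Python dict and serializes it. The table ARN, account ID, attribute names, and the choice of a Cognito identity variable as the partition key value are assumptions for illustration; the condition keys `dynamodb:LeadingKeys`, `dynamodb:Attributes`, and `dynamodb:Select` are the ones DynamoDB defines for fine-grained access control.

```python
import json

# Hypothetical table; replace with a real ARN before use.
TABLE_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/UserData"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query"],
            "Resource": TABLE_ARN,
            "Condition": {
                "ForAllValues:StringEquals": {
                    # Item level: the partition key must equal the caller's identity.
                    "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"],
                    # Attribute level: only these attributes may be requested.
                    "dynamodb:Attributes": ["user_id", "display_name", "preferences"],
                },
                # Require callers to name attributes instead of reading whole items.
                "StringEqualsIfExists": {"dynamodb:Select": "SPECIFIC_ATTRIBUTES"},
            },
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

A relational engine would usually express the same restriction with views or row-level security inside the database, whereas here it lives entirely in the IAM layer.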
Cost Optimization: Provisioning Waste You Are Paying For
RDS costs come from instance hours, storage class (gp3, io2), and provisioned IOPS; idle instances still burn money. DynamoDB costs come from read/write capacity units (RCU/WCU) in provisioned mode or per-request pricing in on-demand mode, plus storage per GB. The waste pattern differs: RDS over-provisions IOPS for burst needs, then pays for unused I/O; DynamoDB over-provisions RCU/WCU for peak load, then wastes idle capacity. RDS allows stopping instances temporarily to save compute costs, with restrictions (for example, instances that have read replicas cannot be stopped, and a stopped instance restarts automatically after seven days). An on-demand DynamoDB table has no idle capacity charge because you pay per request, while a provisioned table bills for its capacity whether or not it is used.
Use RDS Reserved Instances for predictable workloads (up to 60% savings). Use DynamoDB reserved capacity (capacity units pre-purchased) for steady-state tables. Monitor RDS CloudWatch metrics: if CPU and IOPS headroom stays above 30%, downsize the instance class. For DynamoDB, switch to on-demand if RCU utilization drops below 10% for 7 days. RDS storage auto-scaling prevents over-provisioning but spikes costs if misconfigured. DynamoDB auto scaling tracks a target utilization but reacts over minutes, so sudden spikes can still be throttled before capacity catches up.
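The 10%-for-7-days rule is simple to encode as a check over daily utilization figures. This is a sketch of that rule only; the function names are hypothetical, and the utilization values would come from your own metrics (consumed capacity divided by provisioned capacity per day).

```python
def utilization(consumed_units: float, provisioned_units: float) -> float:
    """Fraction of provisioned capacity that was actually used."""
    return consumed_units / provisioned_units if provisioned_units else 0.0


def should_switch_to_on_demand(daily_utilization, threshold=0.10, window_days=7):
    """True when each of the last window_days days stayed below threshold."""
    if len(daily_utilization) < window_days:
        return False  # not enough history to decide
    return all(u < threshold for u in daily_utilization[-window_days:])


# Ten days of made-up readings: consumed RCU against 1,000 provisioned RCU.
consumed_per_day = [400, 350, 120, 60, 55, 70, 40, 65, 50, 45]
history = [utilization(c, 1_000) for c in consumed_per_day]

print("switch to on-demand:", should_switch_to_on_demand(history))
```

Requiring the full window to be below the threshold keeps one quiet day from triggering a capacity-mode change.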
The switch broke when we hit 50k writes per second
The team first enabled the ConsistentRead parameter for all balance queries. This doubled read costs and halved throughput, revealing the underlying scaling issue. They then moved to RDS PostgreSQL with read replicas, which provided ACID transactional reads for balance operations and eventual consistency for reporting queries.
- Match consistency model to the data: financial data needs strong consistency.
- Don't chase horizontal scalability before verifying your access patterns fit key-value models.
- Always test with production-scale traffic before committing to a database decision.
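The stale-balance failure can be reproduced with a toy model. The class below is a deliberately simplified sketch (one leader, one replica, replication only when `replicate()` is called) and does not reflect how DynamoDB replicates internally; the `consistent_read` flag merely mimics the effect of a strongly consistent read.

```python
class ToyReplicatedStore:
    """Leader plus one lagging replica; the replica updates only on replicate()."""

    def __init__(self):
        self._leader = {}
        self._replica = {}

    def write(self, key, value):
        # The write is acknowledged before the replica has caught up.
        self._leader[key] = value

    def replicate(self):
        self._replica = dict(self._leader)

    def read(self, key, consistent_read=False):
        source = self._leader if consistent_read else self._replica
        return source.get(key)


store = ToyReplicatedStore()
store.write("acct-1", 100)
store.replicate()
store.write("acct-1", 40)  # withdrawal of 60, not yet replicated

stale = store.read("acct-1")                        # served by the lagging replica
fresh = store.read("acct-1", consistent_read=True)  # served by the leader
store.replicate()
eventually = store.read("acct-1")                   # replica has caught up

print("eventually consistent read right after write:", stale)
print("consistent read right after write:", fresh)
print("eventually consistent read after replication:", eventually)
```

A balance check that reads the stale value would approve a second withdrawal against money that is already gone, which is why the first lesson above ties the consistency model to the data.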
DynamoDB: Set `ConsistentRead: true` in your API call; RDS: Check transaction isolation level
Check application code for missing read-after-write guarantees
Key takeaways
Common mistakes to avoid
- Using DynamoDB for complex relational data
- Choosing RDS for single-table, high-throughput key-value workloads
- Over-indexing your DynamoDB table
- Ignoring partition key design in DynamoDB
- Not testing with realistic read/write loads before choosing
Interview Questions on This Topic
When would you choose DynamoDB over RDS for a new application?