Intermediate 12 min · March 06, 2026

DynamoDB Basics Explained: Tables, Keys, and Real-World Patterns

Q: What is the difference between DynamoDB GetItem and Query?

GetItem fetches a single item when you know its complete primary key (partition key + sort key) — it's always O(1) and the fastest possible read. Query fetches multiple items that share the same partition key, optionally filtered by sort key conditions. Query is efficient because it reads only one partition, but it returns a collection rather than a single item. Never use Scan when Query will do the job.

Q: Can I change my DynamoDB partition key after creating the table?

No — the primary key (partition key and sort key) is set at table creation and cannot be changed. If you need a different key structure, you have to create a new table and migrate your data to it. This is the most important reason to design your access patterns carefully before creating your DynamoDB table.

Q: Is DynamoDB always eventually consistent, or can I get strongly consistent reads?

DynamoDB offers both. By default, reads are eventually consistent, meaning you might read slightly stale data for a few milliseconds after a write. You can request strongly consistent reads by setting ConsistentRead=True in your GetItem or Query call — this guarantees you see the latest committed data. Strongly consistent reads cost twice as many read capacity units, and they are not available on Global Secondary Indexes at all.

Q: How do I handle time-series data in DynamoDB?

Use a composite primary key: the device or sensor ID as the partition key and an ISO 8601 timestamp (e.g., 'YYYY-MM-DDTHH:MM:SS') as the sort key. This lets you query 'get all readings from sensor-42 in the last hour' efficiently. For very high write rates, use write sharding: append a shard suffix to the partition key (for example 'sensor-42#3') so that one busy device spreads across several partitions, keep the timestamp as the sort key, and query every shard when reading. For historical data, use Time to Live (TTL) to auto-expire old records, and archive the expired items to S3 by consuming the deletions from DynamoDB Streams.
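
One common way to shard writes is a suffix on the partition key. The sketch below is only an illustration under stated assumptions: the shard count of 4, the `PK`/`SK`/`expires_at` attribute names, and the helper functions are all hypothetical, and SHA-256 is simply a convenient way to pick a shard deterministically.

```python
import hashlib
from datetime import datetime, timedelta, timezone

NUM_SHARDS = 4  # hypothetical; size this to your write rate


def sharded_partition_key(sensor_id: str, reading_id: str) -> str:
    """Spread one sensor's writes over NUM_SHARDS partition key values."""
    digest = hashlib.sha256(reading_id.encode("utf-8")).digest()
    shard = digest[0] % NUM_SHARDS
    return f"{sensor_id}#{shard}"


def all_shard_keys(sensor_id: str) -> list:
    """Partition keys to Query (one call per key) when reading a sensor back."""
    return [f"{sensor_id}#{shard}" for shard in range(NUM_SHARDS)]


def ttl_epoch(written_at: datetime, keep_days: int) -> int:
    """TTL attribute value: the expiry time as epoch seconds."""
    return int((written_at + timedelta(days=keep_days)).timestamp())


now = datetime(2024, 6, 10, 12, 0, tzinfo=timezone.utc)
item = {
    "PK": sharded_partition_key("sensor-42", "reading-0001"),
    "SK": now.strftime("%Y-%m-%dT%H:%M:%S"),
    "expires_at": ttl_epoch(now, keep_days=30),
}
print(item)
print(all_shard_keys("sensor-42"))
```

The cost of sharding is on the read side: answering "last hour for sensor-42" now takes one Query per shard key, with the results merged in the application.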

Q: What is the difference between PAY_PER_REQUEST and provisioned capacity?

PAY_PER_REQUEST (on-demand) charges per read/write request with no capacity planning, which suits unpredictable workloads or new applications. Provisioned capacity lets you specify RCUs and WCUs, which gives cost predictability and can be more cost-effective at steady, high throughput. On-demand typically costs more per request than well-utilized provisioned capacity; the exact ratio depends on current pricing and on how fully you use what you provision. AWS limits how often a table can switch between modes within a 24-hour window, so check the current quota before relying on frequent switches. Note that on-demand still has a per-partition throughput limit (about 1,000 write units per second per partition) that can cause throttling if traffic is extremely skewed.

DynamoDB basics demystified: learn partition keys, sort keys, single-table design, and when to choose DynamoDB over SQL with battle-tested code examples.

Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Notes here come from systems that actually shipped.

Last updated: July 19, 2026

Before you start (⏱ 25 min)

✓ Solid grasp of fundamentals
✓ Comfortable reading code examples
✓ Basic production concepts

⚡Quick Answer

DynamoDB is a fully managed NoSQL key-value and document database with single-digit-millisecond latency at any scale
Partition key determines which physical node stores your data; choose high-cardinality attributes to avoid hot partitions
Sort key enables efficient range queries within a partition; composite sort keys like 'DATE#ID' unlock powerful access patterns
Global Secondary Indexes let you query by a different key, but they add cost and eventual consistency lag
Single-table design co-locates related entities (customers + orders) in one table to replace joins with fast partition queries
The biggest mistake: modeling tables like SQL schemas instead of mapping access patterns first

✦ Definition~90s read

What Is DynamoDB?

DynamoDB is a fully managed NoSQL key-value and document database from AWS, designed for single-digit-millisecond latency at any scale. Unlike traditional relational databases that normalize data across many tables with joins, DynamoDB forces you to think in terms of a single table with a composite primary key (partition key + optional sort key) and uses secondary indexes to support alternative access patterns.

It's serverless—you don't provision servers, just throughput capacity—and it scales horizontally by automatically partitioning data across multiple nodes based on the partition key's hash. This makes it ideal for high-traffic applications like gaming leaderboards, session stores, and e-commerce catalogs, but it's a poor fit for complex ad-hoc queries, multi-row transactions (though DynamoDB Transactions exist, they're limited), or workloads requiring relational integrity.

AWS has advertised that DynamoDB supports peaks of more than 20 million requests per second, and companies like Lyft, Snapchat, and Airbnb run large workloads on it. That kind of scale is realistic only when you design around DynamoDB's access-pattern-first model rather than trying to map SQL patterns onto it.

Plain-English First

Imagine a massive library with billions of books. Instead of organizing them alphabetically (which slows down as the library grows), you assign every book a unique locker number and go straight to that locker instantly — no browsing required. DynamoDB works the same way: it uses a 'key' to jump directly to your data in milliseconds, no matter if you have 100 rows or 100 billion. The catch? You have to decide how to label your lockers BEFORE you start filling them.

Every app eventually hits a wall with its database. SQL tables start choking on hundreds of millions of rows, joins get expensive, and scaling horizontally becomes an engineering nightmare. Amazon DynamoDB was built specifically to shatter that wall — it's the database that powers Amazon's own shopping cart, which handles millions of writes per second during Prime Day without breaking a sweat. If you're building anything at scale on AWS, DynamoDB will cross your path sooner or later.

The core problem DynamoDB solves is predictable performance at any scale. Traditional relational databases slow down as data grows because query planners scan more rows and indexes get larger. DynamoDB sidesteps this entirely by requiring you to declare your access patterns upfront. You design your keys so that every query hits exactly one partition — think of it as pre-building the fast path before traffic arrives, not scrambling to optimize after the fact.

By the end of this article you'll understand how DynamoDB's partition and sort keys actually work under the hood, how to model a real e-commerce order system without a single SQL join, why single-table design exists and when it's overkill, and the two mistakes that will silently ruin your DynamoDB performance in production.

What DynamoDB Actually Is: A Managed Key-Value Store with a Single-Table Mindset

DynamoDB is a fully managed NoSQL database that stores data as items in tables, each uniquely identified by a primary key. The core mechanic is simple: you get or put an item by its key in single-digit millisecond latency at any scale. There are no joins, no foreign keys, no secondary indexes by default. Every query must target a specific partition key or, at best, a range of sort keys within one partition. This constraint is not a limitation; it's the design that enables predictable performance.

DynamoDB achieves consistent single-digit millisecond latency by distributing data across partitions using a hash of the partition key. Provisioned throughput is divided evenly across partitions, so a hot key (one accessed far more than others) throttles requests even if the table has spare capacity. The only way to avoid this is to design your partition key for high cardinality and uniform access. On-demand capacity auto-scales but still has a per-partition throughput limit.

The real power emerges when you model one-to-many and many-to-many relationships using composite sort keys, sparse indexes, and the single-table design pattern. Use DynamoDB when your workload is read- or write-heavy with predictable access patterns: user sessions, game state, IoT telemetry, or metadata caches. Avoid it when you need ad-hoc queries across multiple attributes, complex aggregations, or transactions spanning unrelated entities. In production, DynamoDB shines when you accept its constraints and model your data to match its access patterns, not when you try to make it behave like a relational database.

⚠ Hot Partition Trap

A partition key with low cardinality (e.g., 'status' with only 'active'/'inactive') will throttle all requests to that single partition, regardless of total table throughput.
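
To make the trap concrete, here is a toy simulation rather than DynamoDB's real placement logic. It assumes a hypothetical 8 partitions and uses SHA-256 as a stand-in hash (DynamoDB's internal hash function is not something you can call), then counts how many partitions actually receive writes for a two-value key versus a per-customer key.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical; DynamoDB decides the real partition count


def partition_for(partition_key: str) -> int:
    """Map a key to a partition number using SHA-256 as a stand-in hash."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


def load_per_partition(keys):
    """Count how many writes land on each partition."""
    return Counter(partition_for(k) for k in keys)


# 10,000 writes keyed by a two-value 'status' attribute
low_cardinality = ["active" if i % 2 else "inactive" for i in range(10_000)]
# 10,000 writes keyed by a distinct customer ID each
high_cardinality = [f"CUST-{i}" for i in range(10_000)]

low_load = load_per_partition(low_cardinality)
high_load = load_per_partition(high_cardinality)

print("status key   ->", len(low_load), "partition(s) used, busiest takes", max(low_load.values()))
print("customer key ->", len(high_load), "partition(s) used, busiest takes", max(high_load.values()))
```

With only two key values, at most two partitions can ever do work, no matter how many partitions the table has or how much capacity is provisioned.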

📊 Production Insight

A gaming leaderboard table used a single partition key 'leaderboard' for all scores, causing 100% request throttling under 50k writes/second.

Symptom: ProvisionedThroughputExceededException on a table with 90% unused capacity.

Rule: Always design partition keys to have at least 1000 distinct values for every 1000 WCU provisioned.

🎯 Key Takeaway

DynamoDB is not a relational database — model for access patterns, not entities.

Partition key cardinality and uniform access are the single most important design decisions.

Single-table design with composite sort keys and sparse indexes replaces joins and ad-hoc queries.

thecodeforge.io

Dynamodb Basics

Partition Keys and Sort Keys: The Engine Under the Hood

Every DynamoDB table has a primary key. That key comes in two flavors: a simple primary key (just a partition key) or a composite primary key (partition key + sort key). Understanding the difference isn't just syntax trivia — it determines what queries you can run efficiently.

The partition key is hashed by DynamoDB internally to decide which physical server (partition) stores your item. This is why it's sometimes called a hash key. Every read and write for that item goes to exactly that one partition. The golden rule: items with the same partition key live on the same server. That's powerful — it means related data is co-located — but it also means if one partition key gets hit with 90% of your traffic, you have a 'hot partition' problem and performance tanks.

The sort key (also called a range key) doesn't affect which partition stores the item, but it determines the order of items within that partition. This lets you query a range — 'give me all orders for customer-42 placed in the last 30 days' — in a single, efficient request. That range query is only possible because the sort key values are stored in sorted order on disk within each partition. Think of the partition key as the drawer label in a filing cabinet, and the sort key as the alphabetical tabs inside that drawer.
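
The filing-cabinet picture can be sketched in a few lines of plain Python. This is a toy model, not DynamoDB's storage engine: it assumes one partition's sort keys are held in a sorted list, and the `range_query` helper is hypothetical. The point is that a range query over sorted keys is two binary searches plus a slice, not a walk over every item.

```python
import bisect

# One partition's items for customer-42, kept sorted by sort key
# (a toy stand-in for how a partition orders its items).
sort_keys = sorted([
    "2024-06-10#ORD-003",
    "2024-01-15#ORD-001",
    "2024-03-22#ORD-002",
])


def range_query(keys, low, high):
    """Return sort keys k with low <= k < high using two binary searches."""
    start = bisect.bisect_left(keys, low)
    end = bisect.bisect_left(keys, high)
    return keys[start:end]


spring_orders = range_query(sort_keys, "2024-03-01", "2024-07-01")
print(spring_orders)  # ['2024-03-22#ORD-002', '2024-06-10#ORD-003']
```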

create_orders_table.py (Python)

import boto3

# Create a DynamoDB resource pointing at a local DynamoDB instance for dev
# In production, remove 'endpoint_url' and use your AWS region
dynamodb = boto3.resource(
    'dynamodb',
    region_name='us-east-1',
    endpoint_url='http://localhost:8000'  # local DynamoDB for safe testing
)

# Create the Orders table with a COMPOSITE primary key:
#   partition_key = customer_id  (groups all orders for one customer together)
#   sort_key      = order_date#order_id  (allows range queries by date)
table = dynamodb.create_table(
    TableName='Orders',
    KeySchema=[
        {
            'AttributeName': 'customer_id',   # partition key
            'KeyType': 'HASH'
        },
        {
            'AttributeName': 'order_date_order_id',  # sort key (composite value)
            'KeyType': 'RANGE'
        }
    ],
    AttributeDefinitions=[
        # Only KEY attributes go here — DynamoDB is schemaless for everything else
        {'AttributeName': 'customer_id',        'AttributeType': 'S'},  # S = String
        {'AttributeName': 'order_date_order_id','AttributeType': 'S'}
    ],
    BillingMode='PAY_PER_REQUEST'  # on-demand pricing — no capacity planning needed for dev
)

# Wait until the table is actually ready before writing to it
table.wait_until_exists()
table.reload()  # refresh cached attributes so table_status reflects the current state
print(f"Table status: {table.table_status}")
print(f"Table name:   {table.table_name}")

Output

Table status: ACTIVE

Table name: Orders

⚠ Watch Out: Attribute Definitions Are NOT Your Schema

You only list attributes in AttributeDefinitions if they are used in a key (primary key or GSI/LSI key). Listing every attribute there is a common mistake that confuses people coming from SQL — DynamoDB is schemaless for non-key attributes, so each item can have completely different fields.
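
A small pre-flight check can catch this mistake before the request is sent. The helper below is hypothetical (it is not part of boto3); it only compares the attribute names used in key schemas against the names declared in AttributeDefinitions, using the same dictionary shapes as the create_table call above.

```python
def check_attribute_definitions(key_schemas, attribute_definitions):
    """Compare declared attributes against the attributes used in key schemas.

    key_schemas: list of KeySchema lists (base table plus any indexes).
    Returns (missing, extra) as sorted lists of attribute names.
    """
    key_attrs = {entry["AttributeName"] for schema in key_schemas for entry in schema}
    declared = {d["AttributeName"] for d in attribute_definitions}
    return sorted(key_attrs - declared), sorted(declared - key_attrs)


table_keys = [
    {"AttributeName": "customer_id", "KeyType": "HASH"},
    {"AttributeName": "order_date_order_id", "KeyType": "RANGE"},
]
# 'total_usd' is a plain attribute, so declaring it here is the mistake.
definitions = [
    {"AttributeName": "customer_id", "AttributeType": "S"},
    {"AttributeName": "order_date_order_id", "AttributeType": "S"},
    {"AttributeName": "total_usd", "AttributeType": "N"},
]

missing, extra = check_attribute_definitions([table_keys], definitions)
print("missing:", missing, "extra:", extra)  # missing: [] extra: ['total_usd']
```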

📊 Production Insight

Hot partitions are the #1 DynamoDB killer. One skewed partition key (like 'status' with 3 values) can throttle your entire table.

Always validate partition key cardinality against your peak traffic per key.

Rule: if you cannot predict the max writes per partition key value, redesign before deploying.

🎯 Key Takeaway

Partition key determines server, sort key determines order.

High cardinality = good. Low cardinality = hot partition.

Design your keys so every query hits exactly one partition.

Writing and Reading Data — With a Real E-Commerce Pattern

Now that the table exists, let's populate it with real data and then query it the way a production app would. The key insight here is that DynamoDB offers two fundamentally different operations: GetItem (fetch by exact primary key — always fast, O(1)) and Query (fetch all items sharing a partition key, optionally filtered by sort key — still fast because it stays within one partition). There's also Scan, which reads every item in the table — treat this like a last resort in production.

For our e-commerce example, the pattern is: partition key = customer_id, sort key = a string that combines the date and the order ID separated by a # symbol. That separator trick is deliberately chosen. Because DynamoDB sorts strings lexicographically, the ISO date format (YYYY-MM-DD) sorts chronologically. So 'give me all orders after 2024-01-01' becomes a simple sort key condition — begins_with or between — without scanning the whole table.

This is the core discipline of DynamoDB design: your sort key isn't just an ID, it's a query tool. Every character in it is a deliberate choice about what queries you want to be able to answer efficiently.
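
Because the whole scheme rests on string ordering, it is worth checking that ordering directly. The sketch below uses a hypothetical `order_sort_key` helper and plain string comparisons, no DynamoDB calls, to show why zero-padded ISO dates sort chronologically and where two easy mistakes come from.

```python
from datetime import date


def order_sort_key(order_date: date, order_id: str) -> str:
    """Build 'YYYY-MM-DD#ORDER-ID'; isoformat() zero-pads month and day."""
    return f"{order_date.isoformat()}#{order_id}"


keys = [
    order_sort_key(date(2024, 12, 31), "ORD-004"),
    order_sort_key(date(2024, 3, 22), "ORD-002"),
    order_sort_key(date(2024, 1, 15), "ORD-001"),
]

# Plain string sorting puts the keys in chronological order.
print(sorted(keys))

# Pitfall 1: a bare date sorts BEFORE any key that starts with that date,
# so an upper bound of '2024-12-31' excludes orders placed on 31 December.
print("2024-12-31#ORD-004" <= "2024-12-31")  # False

# Pitfall 2: unpadded dates break the ordering ('2024-3-22' > '2024-12-31').
print("2024-3-22" > "2024-12-31")  # True
```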

orders_crud.py (Python)

import boto3
from boto3.dynamodb.conditions import Key
from decimal import Decimal  # DynamoDB uses Decimal, not float, for numbers

dynamodb = boto3.resource(
    'dynamodb',
    region_name='us-east-1',
    endpoint_url='http://localhost:8000'
)
table = dynamodb.Table('Orders')

# ─── WRITE: Put three orders for the same customer ─────────────────────────
# Sort key format: "YYYY-MM-DD#ORDER-ID" — date first so range queries work
orders_to_insert = [
    {
        'customer_id':         'CUST-42',
        'order_date_order_id': '2024-01-15#ORD-001',
        'total_usd':           Decimal('59.99'),
        'status':              'DELIVERED',
        'items':               ['Mechanical Keyboard', 'USB Hub']
    },
    {
        'customer_id':         'CUST-42',
        'order_date_order_id': '2024-03-22#ORD-002',
        'total_usd':           Decimal('149.00'),
        'status':              'SHIPPED',
        'items':               ['Monitor Stand']
    },
    {
        'customer_id':         'CUST-42',
        'order_date_order_id': '2024-06-10#ORD-003',
        'total_usd':           Decimal('29.99'),
        'status':              'PROCESSING',
        'items':               ['Desk Mat']
    }
]

for order in orders_to_insert:
    table.put_item(Item=order)  # put_item: create or fully overwrite an item

print("Inserted 3 orders for CUST-42")

# ─── READ: Get ONE specific order by its full primary key ───────────────────
response = table.get_item(
    Key={
        'customer_id':         'CUST-42',
        'order_date_order_id': '2024-03-22#ORD-002'
    }
)
specific_order = response.get('Item', {})
print(f"\nSpecific order: {specific_order['order_date_order_id']} — ${specific_order['total_usd']}")

# ─── QUERY: Get all orders for CUST-42 placed on or after 2024-03-01 ────────
# A sort key condition stays inside one partition, so this is efficient.
# gte() is used instead of between('2024-03-01', '2024-12-31') because the
# bare upper bound sorts before '2024-12-31#...' and would drop that day's orders.
range_response = table.query(
    KeyConditionExpression=
        Key('customer_id').eq('CUST-42') &
        Key('order_date_order_id').gte('2024-03-01')
)

print(f"\nOrders from March 2024 onwards ({range_response['Count']} found):")
for order in range_response['Items']:
    print(f"  {order['order_date_order_id']} | ${order['total_usd']} | {order['status']}")

# ─── UPDATE: Change just the status field without rewriting the whole item ───
table.update_item(
    Key={
        'customer_id':         'CUST-42',
        'order_date_order_id': '2024-03-22#ORD-002'
    },
    UpdateExpression='SET #s = :new_status',  # #s avoids 'status' reserved word conflict
    ExpressionAttributeNames={'#s': 'status'},
    ExpressionAttributeValues={':new_status': 'DELIVERED'}
)
print("\nUpdated ORD-002 status to DELIVERED")

Output

Inserted 3 orders for CUST-42

Specific order: 2024-03-22#ORD-002 — $149.00

Orders from March 2024 onwards (2 found):

2024-03-22#ORD-002 | $149.00 | SHIPPED

2024-06-10#ORD-003 | $29.99 | PROCESSING

Updated ORD-002 status to DELIVERED

💡Pro Tip: 'status' Is a Reserved Word in DynamoDB

DynamoDB has hundreds of reserved words (STATUS, NAME, DATE, etc.). If you use them directly in an UpdateExpression or FilterExpression you'll get a cryptic validation error. Always use ExpressionAttributeNames with a # prefix alias for any attribute that might be reserved. Keeping this habit prevents a whole class of mysterious runtime bugs.
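
One way to make the aliasing habit automatic is to generate the expression instead of writing it by hand. The helper below is a hypothetical sketch, not a boto3 feature: it aliases every attribute name and value, so it never matters whether a given name is reserved.

```python
def build_set_update(changes: dict):
    """Return (UpdateExpression, ExpressionAttributeNames, ExpressionAttributeValues).

    Every attribute gets a '#n<i>' alias and every value a ':v<i>' placeholder,
    so reserved words never appear in the expression itself.
    """
    names, values, clauses = {}, {}, []
    for i, (attr, value) in enumerate(changes.items()):
        name_alias, value_alias = f"#n{i}", f":v{i}"
        names[name_alias] = attr
        values[value_alias] = value
        clauses.append(f"{name_alias} = {value_alias}")
    return "SET " + ", ".join(clauses), names, values


expression, names, values = build_set_update({"status": "DELIVERED", "name": "Alex"})
print(expression)  # SET #n0 = :v0, #n1 = :v1
print(names)       # {'#n0': 'status', '#n1': 'name'}
print(values)      # {':v0': 'DELIVERED', ':v1': 'Alex'}
```

The three return values line up with the UpdateExpression, ExpressionAttributeNames, and ExpressionAttributeValues arguments of update_item.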

📊 Production Insight

Always test your composite sort key format with real data before production.

A format slip such as an unpadded date ('2024-3-22') or a different separator breaks range queries silently: the keys still sort, just not chronologically.

Rule: document the exact sort key pattern for your team — one person's separator choice is another's bug.

🎯 Key Takeaway

GetItem = O(1) by exact key. Query = efficient partition-level scan.

Design your sort key to be a query tool, not just an ID.

Never use Scan in production — pre-plan access patterns.

Global Secondary Indexes: Querying Data a Different Way

Here's the problem you'll hit in week two of using DynamoDB: your table is perfectly designed for 'get all orders by customer', but product management just asked for 'get all orders with status SHIPPED'. That query doesn't match your partition key at all. In SQL, you'd add an index. In DynamoDB, you add a Global Secondary Index (GSI).

A GSI is essentially a separate, automatically maintained copy of your table organized around a different key. You define a new partition key (and optionally a sort key), and DynamoDB replicates writes to that index asynchronously. Queries against a GSI are just as fast as queries against the base table — because the same partitioning logic applies.

The word 'global' means the index spans the entire table, not just one partition. There's also a Local Secondary Index (LSI), which must share the base table's partition key but can have a different sort key — and it must be defined at table creation time, not added later. GSIs can be added to existing tables, making them far more flexible in practice.

The design trade-off: every GSI costs you money (storage + write capacity) and adds a few milliseconds of eventual consistency lag. You're not getting something for free — you're choosing which query patterns deserve their own fast path.

orders_gsi.py (Python)

import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource(
    'dynamodb',
    region_name='us-east-1',
    endpoint_url='http://localhost:8000'
)

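# Add a GSI to the existing Orders table so we can query by order status
# GSI partition key: status  |  GSI sort key: order_date_order_id
# This lets us ask: "give me all SHIPPED orders, sorted by date"
# Caveat: 'status' has only a few distinct values, so at scale this index key
# is itself a hot-partition risk; it is used here to keep the demo simple.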
dynamodb.meta.client.update_table(
    TableName='Orders',
    AttributeDefinitions=[
        {'AttributeName': 'status',              'AttributeType': 'S'},
        {'AttributeName': 'order_date_order_id', 'AttributeType': 'S'}
    ],
    GlobalSecondaryIndexUpdates=[
        {
            'Create': {
                'IndexName': 'StatusByDate-GSI',
                'KeySchema': [
                    {'AttributeName': 'status',              'KeyType': 'HASH'},
                    {'AttributeName': 'order_date_order_id', 'KeyType': 'RANGE'}
                ],
                'Projection': {
                    # INCLUDE only the fields we actually need — saves cost vs ALL
                    'ProjectionType': 'INCLUDE',
                    'NonKeyAttributes': ['customer_id', 'total_usd']
                }
            }
        }
    ]
)

print("GSI creation initiated (takes a moment to backfill)...")

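# Wait for the GSI itself to become ACTIVE before querying it.
# boto3 has no dedicated waiter for index status, so poll describe_table.
import time

while True:
    description = dynamodb.meta.client.describe_table(TableName='Orders')['Table']
    index_statuses = {
        gsi['IndexName']: gsi.get('IndexStatus')
        for gsi in description.get('GlobalSecondaryIndexes', [])
    }
    if index_statuses.get('StatusByDate-GSI') == 'ACTIVE':
        break
    time.sleep(2)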

table = dynamodb.Table('Orders')

# ─── Query via GSI: all SHIPPED orders ──────────────────────────────────────
gsi_response = table.query(
    IndexName='StatusByDate-GSI',   # tell DynamoDB to use our new index
    KeyConditionExpression=Key('status').eq('SHIPPED')
)

print(f"\nAll SHIPPED orders ({gsi_response['Count']} found):")
for order in gsi_response['Items']:
    print(f"  Customer: {order['customer_id']} | "
          f"Order: {order['order_date_order_id']} | "
          f"Total: ${order['total_usd']}")

Output

GSI creation initiated (takes a moment to backfill)...

All SHIPPED orders (1 found):

Customer: CUST-42 | Order: 2024-03-22#ORD-002 | Total: $149.00

🔥Interview Gold: GSI Eventual Consistency

GSIs are eventually consistent: a write to the base table usually propagates to the GSI within milliseconds, but not instantly, and GSI queries cannot request strong consistency. This means a query on a GSI immediately after a write might not reflect the latest data. For use cases where that matters (e.g., financial transactions), read from the base table with ConsistentRead=True. Base-table reads are eventually consistent by default, so the flag is what guarantees you see the latest committed write.

📊 Production Insight

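In provisioned mode, each GSI has its own write capacity, separate from the table's, and an under-provisioned GSI throttles writes to the base table. That back-pressure is a common surprise.

If you have 3 GSIs, a write that touches all of them consumes at least 4 write capacity units in total (1 on the table + 1 on each index, for items up to 1 KB). Plan accordingly.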

Rule: use ProjectionType INCLUDE to keep GSI storage costs down — projecting ALL attributes is expensive.
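
The write amplification from indexes can be estimated with simple arithmetic. The sketch below is a rough model with loud assumptions: it covers inserting a new item with a standard (non-transactional) write, it assumes the item carries the key attributes of every index listed, and the projected sizes are numbers you supply. Both functions are hypothetical helpers, not an AWS pricing API.

```python
import math


def write_units(size_bytes: int) -> int:
    """Write units for one standard write: 1 unit per 1 KB, rounded up."""
    return max(1, math.ceil(size_bytes / 1024))


def put_cost(item_bytes: int, gsi_projected_bytes: list) -> int:
    """Estimate total write units for inserting one NEW item.

    Assumes a standard (non-transactional) write and that the item has the
    key attributes of every index listed, so each index receives one write.
    """
    return write_units(item_bytes) + sum(write_units(b) for b in gsi_projected_bytes)


# A 3.5 KB item with one ALL-projection index and two slim INCLUDE projections
print(put_cost(3584, [3584, 300, 300]))  # 4 + 4 + 1 + 1 = 10
# A 500-byte item with three indexes: the "1 base + 3 GSI" case
print(put_cost(500, [500, 500, 500]))    # 4
```

Under these assumptions, projecting ALL on the first index doubles the table's own write cost for that item, while the two slim INCLUDE projections add one unit each.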

🎯 Key Takeaway

GSIs let you query by a different key, but cost money and add eventual consistency.

LSIs share the partition key but must be defined at table creation — plan ahead.

Design GSI projections carefully: include only the attributes you need.

Single-Table Design: Why One Table Often Beats Many

Single-table design (STD) is the concept that trips up virtually every developer migrating from SQL. The instinct is to create one DynamoDB table per entity — an Orders table, a Customers table, a Products table — mirroring what you'd do in a relational database. But DynamoDB wasn't designed for that, and doing it that way throws away its biggest advantage.

In SQL, you join across tables at query time. DynamoDB has no joins. Instead, single-table design stores heterogeneous entity types in the same table, using generic attribute names like PK and SK (partition key and sort key) whose values encode the entity type. For example: PK='CUSTOMER#CUST-42', SK='PROFILE' for a customer record, and PK='CUSTOMER#CUST-42', SK='ORDER#2024-03-22#ORD-002' for an order belonging to that customer. Now a single Query call with PK='CUSTOMER#CUST-42' returns the customer profile AND all their orders in one network round-trip.

This is powerful for read-heavy access patterns, but it comes with a real cost: the model is harder to reason about, harder to query ad-hoc (e.g., for analytics), and painful if your access patterns change. Single-table design is a deliberate trade: you sacrifice flexibility for performance and cost efficiency. It's the right call for a microservice with stable, well-understood access patterns. It's the wrong call for an exploratory analytics workload — use Athena or Redshift for that.

single_table_design.py (Python)

import boto3
from boto3.dynamodb.conditions import Key
from decimal import Decimal

dynamodb = boto3.resource(
    'dynamodb',
    region_name='us-east-1',
    endpoint_url='http://localhost:8000'
)

# One table to rule them all — stores Customers AND Orders together
# Generic key names PK / SK are conventional in single-table design
table = dynamodb.create_table(
    TableName='ECommerceApp',
    KeySchema=[
        {'AttributeName': 'PK', 'KeyType': 'HASH'},
        {'AttributeName': 'SK', 'KeyType': 'RANGE'}
    ],
    AttributeDefinitions=[
        {'AttributeName': 'PK', 'AttributeType': 'S'},
        {'AttributeName': 'SK', 'AttributeType': 'S'}
    ],
    BillingMode='PAY_PER_REQUEST'
)
table.wait_until_exists()

# ─── Write a CUSTOMER record ─────────────────────────────────────────────────
# PK and SK both describe what the item IS — the value encodes entity type
table.put_item(Item={
    'PK':           'CUSTOMER#CUST-42',   # partition key identifies the customer
    'SK':           'PROFILE',            # sort key distinguishes record type
    'entity_type':  'Customer',           # explicit type tag — helps with filtering
    'full_name':    'Alex Rivera',
    'email':        'alex@example.com',
    'loyalty_tier': 'Gold'
})

# ─── Write two ORDERS for the same customer ──────────────────────────────────
# Same PK as the customer — they share a partition, so one Query fetches both
table.put_item(Item={
    'PK':          'CUSTOMER#CUST-42',
    'SK':          'ORDER#2024-01-15#ORD-001',   # ORDER# prefix separates orders from profile
    'entity_type': 'Order',
    'total_usd':   Decimal('59.99'),
    'status':      'DELIVERED'
})
table.put_item(Item={
    'PK':          'CUSTOMER#CUST-42',
    'SK':          'ORDER#2024-06-10#ORD-003',
    'entity_type': 'Order',
    'total_usd':   Decimal('29.99'),
    'status':      'PROCESSING'
})

print("Inserted customer profile and 2 orders into single table")

# ─── Fetch customer profile + ALL orders in ONE query call ───────────────────
# No sort key condition: every item in the partition comes back, ordered by SK
all_items_response = table.query(
    KeyConditionExpression=Key('PK').eq('CUSTOMER#CUST-42')
)
print(f"\nAll items for CUST-42 ({all_items_response['Count']} total):")
for item in all_items_response['Items']:
    print(f"  [{item['entity_type']}] SK={item['SK']}")

# Get ONLY the orders with a begins_with filter on the sort key
orders_only_response = table.query(
    KeyConditionExpression=
        Key('PK').eq('CUSTOMER#CUST-42') &
        Key('SK').begins_with('ORDER#')
)
print(f"\nOrders only ({orders_only_response['Count']} found):")
for order in orders_only_response['Items']:
    print(f"  {order['SK']} | ${order['total_usd']} | {order['status']}")

Output

Inserted customer profile and 2 orders into single table

All items for CUST-42 (3 total):

[Order] SK=ORDER#2024-01-15#ORD-001

[Order] SK=ORDER#2024-06-10#ORD-003

[Customer] SK=PROFILE

Orders only (2 found):

ORDER#2024-01-15#ORD-001 | $59.99 | DELIVERED

ORDER#2024-06-10#ORD-003 | $29.99 | PROCESSING
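
Since one Query returns mixed entity types, application code usually needs a small step to separate them. The sketch below assumes the `entity_type` attribute written in the example above; the key-builder and `split_by_entity` helpers are hypothetical names, and the `items` list is hand-written to match the shape of `response['Items']` rather than fetched from a table.

```python
from collections import defaultdict


def customer_pk(customer_id: str) -> str:
    """Partition key for everything that belongs to one customer."""
    return f"CUSTOMER#{customer_id}"


def order_sk(order_date: str, order_id: str) -> str:
    """Sort key for an order: the ORDER# prefix, then date, then order ID."""
    return f"ORDER#{order_date}#{order_id}"


def split_by_entity(items):
    """Group a mixed Query result by its 'entity_type' attribute."""
    grouped = defaultdict(list)
    for item in items:
        grouped[item["entity_type"]].append(item)
    return dict(grouped)


# Shaped like response['Items'] from a single-partition Query
items = [
    {"PK": customer_pk("CUST-42"), "SK": order_sk("2024-01-15", "ORD-001"), "entity_type": "Order"},
    {"PK": customer_pk("CUST-42"), "SK": order_sk("2024-06-10", "ORD-003"), "entity_type": "Order"},
    {"PK": customer_pk("CUST-42"), "SK": "PROFILE", "entity_type": "Customer"},
]

grouped = split_by_entity(items)
profile = grouped["Customer"][0]
orders = grouped["Order"]
print(profile["SK"], len(orders))  # PROFILE 2
```

Centralizing key construction in helpers like these also keeps the separator and prefix conventions in one place.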

💡Pro Tip: Map Your Access Patterns Before You Touch the Console

With DynamoDB, your data model is a direct function of your access patterns. Before creating a single table, write down every query your app needs to make (e.g., 'get customer by ID', 'get all orders for a customer', 'get all shipped orders'). Each access pattern should map to either a direct key lookup or a Query on the base table or a GSI. If you skip this step and model by entity like a SQL developer, you'll redesign your table at 2am under production pressure.

📊 Production Insight

Single-table design can reduce read costs dramatically, but it's brittle when access patterns evolve.

Adding a new query pattern later may require a full data migration — not just a new index.

Rule: only use single-table design when your access patterns are stable and you understand the trade-off. Otherwise, multiple tables with GSIs are safer.

🎯 Key Takeaway

One table with PK/SK encoding = replaces SQL joins with partition queries.

Great for stable access patterns, painful for ad-hoc analysis.

Trade flexibility for performance: know when to walk away from STD.

Error Handling, Retries, and the Cost of Not Planning for Throttling

DynamoDB is reliable, but it's not magic. When you exceed your provisioned throughput or hit a hot partition, DynamoDB returns ProvisionedThroughputExceededException. The SDK handles retries with exponential backoff by default — but that default can be too slow for user-facing requests. Understanding how to configure retries and handle errors is critical for production.

Every AWS SDK client includes a retry mechanism, but the defaults differ by SDK and retry mode, so check what your client is configured to do (in boto3 this is the retries setting on botocore's Config). Backoff delays add up quickly: a schedule of 1, 2, and 4 seconds would hold a request for 1 + 2 + 4 = 7 seconds before the SDK gives up. For a customer-facing checkout, that's an unacceptable delay. For user-facing paths, cap the attempts, add jitter to the backoff, and, more importantly, detect the root cause: is it a burst of traffic (transient) or a sustained pattern (needs rearchitecting)?

Another silent killer: conditional writes that fail because of a stale version. If you use conditional writes for optimistic locking (e.g., update_item with ConditionExpression), a concurrent write will throw ConditionalCheckFailedException. Your code must catch that and retry the entire read-modify-write cycle. This is where the 'lost update' problem hides — never assume a conditional write succeeds.
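
The read-modify-write retry cycle is easier to see without the network in the way. The sketch below is an in-memory simulation, not DynamoDB code: `FakeTable` and `ConditionFailed` are hypothetical stand-ins for a table and for ConditionalCheckFailedException, and the version check mirrors what a ConditionExpression such as 'version = :expected' would enforce.

```python
class ConditionFailed(Exception):
    """Stand-in for DynamoDB's ConditionalCheckFailedException."""


class FakeTable:
    """In-memory stand-in for a table holding one versioned item."""

    def __init__(self):
        self.item = {"stock": 10, "version": 1}

    def get(self):
        return dict(self.item)

    def conditional_put(self, new_item, expected_version):
        # Mirrors a ConditionExpression such as 'version = :expected'
        if self.item["version"] != expected_version:
            raise ConditionFailed()
        self.item = dict(new_item)


def decrement_stock(table, max_attempts=5, before_write=None):
    """Read-modify-write with a version check; re-read on every conflict."""
    for attempt in range(1, max_attempts + 1):
        current = table.get()  # always re-read, never reuse a stale copy
        updated = {"stock": current["stock"] - 1, "version": current["version"] + 1}
        if before_write:
            before_write(attempt)  # demo hook: lets us inject a competing write
        try:
            table.conditional_put(updated, expected_version=current["version"])
            return attempt
        except ConditionFailed:
            continue
    raise RuntimeError("gave up after repeated write conflicts")


table = FakeTable()


def competing_writer(attempt):
    if attempt == 1:  # another client sneaks in once, between our read and write
        table.item = {"stock": 9, "version": 2}


attempts = decrement_stock(table, before_write=competing_writer)
print(attempts, table.item)  # 2 {'stock': 8, 'version': 3}
```

Both decrements survive: the first attempt is rejected because the version moved, and the second attempt starts from the fresh read instead of overwriting the competing write.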

DynamoDB also has a 1MB limit on Query/Scan results. If your query returns more than 1MB of data, you'll get paginated results via LastEvaluatedKey. Failing to paginate is a classic bug: you get the first page, assume it's complete, and silently miss data. Always check LastEvaluatedKey in a loop until it's null.

retry_with_jitter.py (Python)

import boto3
import time
import random
from boto3.dynamodb.conditions import Key  # needed by get_all_orders below
from botocore.exceptions import ClientError

dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('Orders')

def put_item_with_retry(item, max_retries=3):
    """Put an item with exponential backoff and jitter.
    
    AWS SDK default retry is fine for background jobs, but for user-facing
    endpoints we want faster fallback and better logging.
    """
    for attempt in range(max_retries):
        try:
            table.put_item(Item=item)
            return True
        except ClientError as e:
            code = e.response['Error']['Code']
            if code in ('ProvisionedThroughputExceededException', 'ThrottlingException'):
                # Exponential backoff with jitter: base = 0.1s, factor = 2, cap = 2s
                sleep_time = min(0.1 * (2 ** attempt) + random.random() * 0.1, 2.0)
                print(f"Retry {attempt+1} after {sleep_time:.2f}s — throttled")
                time.sleep(sleep_time)
            else:
                # Non-throttling error — re-raise immediately
                raise
    print("Failed after max retries — consider failing over or returning 503")
    return False

# Usage: put_item_with_retry({'customer_id': 'CUST-42', ...})

# ─── Full pagination example for Query ───────────────────────────────────────
def get_all_orders(customer_id):
    """Handle pagination correctly — never assume one page is all the data."""
    all_items = []
    last_key = None
    while True:
        query_kwargs = {
            'KeyConditionExpression': Key('customer_id').eq(customer_id)
        }
        if last_key:
            query_kwargs['ExclusiveStartKey'] = last_key
        
        response = table.query(**query_kwargs)
        all_items.extend(response['Items'])
        
        last_key = response.get('LastEvaluatedKey')
        if not last_key:
            break
    return all_items

Output

(No direct output — run in your environment. The retry function prints retry attempts.)

Mental Model

Mental Model: DynamoDB Throttling vs SQL Deadlock

Think of throttling as a traffic jam, not a crash.

A SQL deadlock blocks the transactions involved until one of them is aborted and rolled back.
DynamoDB throttling only affects requests to a hot partition; other partitions keep working.
The fix for throttling is almost never to increase total capacity — it's to redistribute the load.
Exponential backoff with jitter (rather than fixed retries) prevents thundering herd when the partition recovers.

📊 Production Insight

Default SDK retries will mask hot partitions for minutes — you lose observability.

Always log the partition key of throttled requests so you can spot skew patterns.

Rule: instrument your code with metrics on retry count per partition key. Alert if any key exceeds 1% throttle rate.
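
One way to act on this rule is to keep request and throttle counts per partition key and flag any key above the 1% threshold. The sketch below is a minimal in-memory version; the `ThrottleTracker` class and its method names are hypothetical, and a real service would more likely emit these counts to its metrics system instead of holding them in process memory.

```python
from collections import Counter


class ThrottleTracker:
    """Counts requests and throttles per partition key (in memory only)."""

    def __init__(self, alert_threshold=0.01):
        self.alert_threshold = alert_threshold
        self.requests = Counter()
        self.throttles = Counter()

    def record(self, partition_key, throttled):
        self.requests[partition_key] += 1
        if throttled:
            self.throttles[partition_key] += 1

    def hot_keys(self, min_requests=100):
        """Return {key: throttle_rate} for keys above the alert threshold."""
        rates = {}
        for key, total in self.requests.items():
            if total < min_requests:
                continue  # too few samples to judge
            rate = self.throttles[key] / total
            if rate > self.alert_threshold:
                rates[key] = rate
        return rates


tracker = ThrottleTracker()
for i in range(1000):
    tracker.record("CUST-42", throttled=(i % 20 == 0))  # 5% of requests throttled
    tracker.record("CUST-7", throttled=False)
print(tracker.hot_keys())
```

In the retry function earlier in this section, `record(...)` would be called once per attempt, with `throttled=True` inside the ProvisionedThroughputExceededException branch.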

🎯 Key Takeaway

Retries with jitter are mandatory, but they are a band-aid — fix the partition key skew.

Always paginate Query results; LastEvaluatedKey is the check.

Conditional write failures need full read-modify-write retry — log and alert on pattern.
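
The read-modify-write retry in the last takeaway can be sketched without AWS access. The code below uses a hypothetical `InMemoryTable` and a `ConditionFailed` exception as stand-ins; with boto3 the equivalent signal is a `ClientError` whose error code is `ConditionalCheckFailedException`. The `version` attribute is an assumed optimistic-locking field, not something DynamoDB adds for you.

```python
class ConditionFailed(Exception):
    """Stand-in for ClientError with code ConditionalCheckFailedException."""


class InMemoryTable:
    """Hypothetical fake with just enough behaviour to show the retry loop."""

    def __init__(self):
        self.items = {}

    def get(self, key):
        return dict(self.items[key])

    def put_if_version(self, key, item, expected_version):
        if self.items[key]["version"] != expected_version:
            raise ConditionFailed(key)
        self.items[key] = dict(item)


def update_with_retry(table, key, mutate, max_attempts=3):
    """Re-read, re-apply the change, and retry when the conditional write loses a race."""
    for attempt in range(1, max_attempts + 1):
        current = table.get(key)  # fresh read on every attempt
        updated = mutate(dict(current))
        updated["version"] = current["version"] + 1
        try:
            table.put_if_version(key, updated, expected_version=current["version"])
            return updated
        except ConditionFailed:
            print(f"conditional write lost the race on {key}, attempt {attempt}")
    raise RuntimeError(f"gave up on {key} after {max_attempts} attempts")


def add_credit(item):
    item["credit"] += 10
    return item


table = InMemoryTable()
table.items["CUST-42"] = {"credit": 0, "version": 1}
result = update_with_retry(table, "CUST-42", add_credit)
```

The important detail is that the read happens inside the loop: retrying the same write with stale data would just fail the condition again.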

Who This Actually Applies To — And Who Should Walk Away

DynamoDB isn't a general-purpose database. It's a hammer for a specific set of nails. If you're building an e-commerce cart, a gaming leaderboard, a session store, or an IoT telemetry pipeline — you're in the right room. If you're looking for complex joins, ad-hoc analytical queries, or ACID transactions across five tables with rollbacks — you need Postgres or Aurora, and you need to leave before the architecture astronauts convince you otherwise.

This assumes you've run a production database before. You should understand partitioning, indexes, and the difference between OLTP and OLAP. If the phrase "eventual consistency" makes you nervous, go read the DynamoDB paper first. We're not selling you a magic button. We're teaching you a tool that punishes ignorance at scale.

You should be comfortable with JSON document modeling and have a strong opinion about when denormalization makes sense. If your first instinct is "but normalization is always better," you'll fight the tool and lose. The prerequisite is humility — accepting that relational orthodoxy hurts in a key-value world.

BeforeYouStart.sqlSQL

-- io.thecodeforge: database tutorial

-- The only query you'll run to validate you're ready:
-- If this feels wrong, you need a different database

CREATE TABLE orders (
  order_id UUID PRIMARY KEY,
  user_id UUID,
  total DECIMAL(10,2),
  status VARCHAR(20)
);

CREATE INDEX idx_user_status ON orders(user_id, status);

-- In DynamoDB, you'd model this as:
-- PK: USER#<user_id>
-- SK: ORDER#<order_id>
-- Plus a GSI on status if you need it.
-- Different paradigm. Same outcome. Different cost structure.

Output

CREATE TABLE

CREATE INDEX

// Query planner uses index for: WHERE user_id = ? AND status = ?

// In DynamoDB: Query on PK=USER#abc with SK begins_with ORDER#

// Then filter client-side or filter expression

// Both return data. Only one bills you per read operation.

⚠ The 'Works On My Machine' Trap:

DynamoDB Local is not equivalent to production. It has no throttling, no partition splitting, no hot key behavior. If you only test locally with 10k items, you haven't tested. Period.

🎯 Key Takeaway

DynamoDB is for workloads that fit a single-table design with predictable access patterns. If you need arbitrary joins or ad-hoc queries, you're using the wrong tool from day one.

What You Actually Need Before Touching the Console

Prerequisites aren't a checkbox exercise. They're a survival kit. First, you need a clear mental model of your access patterns — not your schema. Write them down. "Get order history for user" is a pattern. "Show me all orders from last week" is a different pattern. If you can't enumerate every query your app will make before writing a single PutItem, you're not ready.

Second, you need a basic understanding of throughput math. One RCU buys one strongly consistent read per second of an item up to 4KB, or two eventually consistent reads per second. One WCU buys one write per second of an item up to 1KB. Larger items consume proportionally more units, rounded up. This isn't trivia: it's how you estimate your bill before your manager asks why the AWS cost tripled.
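
The unit arithmetic is easy to get wrong by hand, so a small helper can be worth keeping around. This is a sketch under the rules stated above (4KB read blocks, 1KB write blocks, rounded up); the function names are made up for illustration, and the transactional multipliers (2x for both reads and writes) are included for completeness.

```python
import math

KB = 1024


def read_capacity_units(item_bytes, mode="strong"):
    """RCUs for one read per second of an item of the given size."""
    blocks = math.ceil(item_bytes / (4 * KB))  # reads are billed in 4 KB blocks
    multiplier = {"eventual": 0.5, "strong": 1, "transactional": 2}[mode]
    return blocks * multiplier


def write_capacity_units(item_bytes, transactional=False):
    """WCUs for one write per second of an item of the given size."""
    blocks = math.ceil(item_bytes / KB)  # writes are billed in 1 KB blocks
    return blocks * (2 if transactional else 1)


print(read_capacity_units(4 * KB, "strong"))
print(read_capacity_units(4 * KB, "eventual"))
print(write_capacity_units(2 * KB))
```

Multiply the per-request figure by your expected requests per second to get the capacity to provision.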

Third, bring a realistic dataset. Not 10 records. Not 10 million. But enough to see your hot keys emerge. 100k items with skewed distribution will show you what your production 10M items look like. If you test with uniform distribution, you're testing a lie.
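
A skewed test set does not need real customer data. The sketch below draws customer IDs with Zipf-like weights so that a handful of keys dominate; the exponent, customer count, and seed are arbitrary assumptions to tune against whatever you know about your own traffic.

```python
import random
from collections import Counter


def skewed_customer_ids(n_items, n_customers=1000, exponent=1.2, seed=7):
    """Draw n_items customer IDs with Zipf-like weights (rank ** -exponent)."""
    rng = random.Random(seed)
    customers = [f"CUST-{i}" for i in range(1, n_customers + 1)]
    weights = [rank ** -exponent for rank in range(1, n_customers + 1)]
    return rng.choices(customers, weights=weights, k=n_items)


ids = skewed_customer_ids(100_000)
counts = Counter(ids)
top_key, top_count = counts.most_common(1)[0]
print(f"{top_key} receives {top_count / len(ids):.1%} of all items")
```

Loading items keyed by these IDs into a test table gives you a hot key to observe, which a uniformly random dataset never will.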

If you can't articulate your read-to-write ratio and your peak concurrency within 20% accuracy, go back to the whiteboard. DynamoDB punishes vague requirements with vague bills.

CapacityPlanning.sqlSQL

-- io.thecodeforge: database tutorial

-- Before you write code, estimate capacity

-- Assumptions:
-- 1000 users, each writes 5 orders/hour (avg 2KB each)
-- Reads: 10x reads per write (user checks order status often)

-- Write capacity:
-- 5000 writes/hour = ~1.4 writes/second
-- Each write needs 2 WCU (2KB item / 1KB per WCU)
-- Total WCU/s: 2.8 (round to 3)

-- Read capacity (eventually consistent):
-- 50000 reads/hour = ~13.9 reads/second
-- Each read = half an RCU (eventually consistent, item <= 4KB)
-- 13.9 * 0.5 = ~7 RCU/s (with headroom: 10 RCU/s)

-- Burst capacity (up to 300 seconds of unused capacity, best effort): 10 * 300 = 3000 RCU
-- At 10x peak (139 reads/second = ~70 RCU/s) the bucket drains at 70 - 10 = 60 RCU/s
-- 3000 / 60 = ~50 seconds before throttling
-- That's your emergency window. Don't waste it.

Output

Estimated WCU/s: 3

Estimated RCU/s: 10

5-minute burst capacity: 3000 RCU (6000 eventually consistent reads)

Time to throttle at 10x peak: ~50 seconds

Consider: On-demand vs provisioned? At this scale both cost very little. On-demand spares you the capacity math; provisioned is usually cheaper for steady traffic.

💡Senior Shortcut: The 80/20 Rule

80% of your read capacity will be consumed by 20% of your users — the 'hot keys.' Identify them early. If one user's partition key gets hammered, you have a hot key. Use partition key design to spread load. Or use DynamoDB Accelerator (DAX) to cache reads and absorb spikes.

🎯 Key Takeaway

Prerequisites aren't about knowing DynamoDB. They're about knowing your workload cold. Access patterns first, schema second, capacity third. Reverse that order and you'll be throttled before you ship.

Advanced Features You'll Actually Use: TTL, Transactions, and Streams

DynamoDB isn't just get and put. Three advanced features separate a toy from a production system. Time To Live (TTL) lets you expire records automatically. Set a TTL attribute on your order items — DynamoDB deletes them after the timestamp passes. No cron jobs, no Lambda triggers to clean up stale data.

Transactions give you ACID across up to 100 items in a single request. Need to debit one account and credit another? Wrap both operations in a TransactWriteItems call. If either fails, neither happens. This is your escape hatch when your business demands atomicity across items.

DynamoDB Streams capture every write in order. Feed them to a Lambda for real-time indexing, search sync, or event-driven workflows. Combined with the Change Data Capture pattern, streams replace half your polling code. Production teams use streams to materialize views in Elasticsearch or sync to a Redshift replica without touching the source table.

Ignore these three and you're writing workarounds DynamoDB already handles.

Example.sqlSQL

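-- io.thecodeforge: database tutorial
-- Conceptual pseudo-SQL only: DynamoDB has no ALTER TABLE or BEGIN TRANSACTION.
-- The real operations are UpdateTimeToLive and TransactWriteItems.

-- Set TTL on the 'expiresAt' attribute (epoch seconds)
ALTER TABLE orders
TTL = expiresAt;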

-- Transaction: update inventory and order status atomically
BEGIN TRANSACTION
  UPDATE inventory SET quantity = quantity - 1
  WHERE product_id = 'P123' AND quantity > 0;
  UPDATE orders SET status = 'fulfilled'
  WHERE order_id = 'O789' AND status = 'pending';
COMMIT;

Output

-- TTL enabled on orders.expiresAt

-- Transaction committed successfully
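
The pseudo-SQL above maps onto two real API calls, UpdateTimeToLive and TransactWriteItems. The sketch below only builds the request payloads so it runs without credentials. The table names and the single-attribute keys (`product_id`, `order_id`) are assumptions carried over from the pseudo-SQL, and the helper function names are hypothetical.

```python
import time


def ttl_request(table_name="orders", attribute="expiresAt"):
    """Keyword arguments for the UpdateTimeToLive API."""
    return {
        "TableName": table_name,
        "TimeToLiveSpecification": {"Enabled": True, "AttributeName": attribute},
    }


def expires_at(days_from_now, now=None):
    """TTL values are epoch seconds stored in a Number attribute."""
    now = int(time.time()) if now is None else now
    return now + days_from_now * 24 * 60 * 60


def fulfil_order_transaction(product_id, order_id):
    """Keyword arguments for TransactWriteItems: both updates apply or neither does."""
    return {
        "TransactItems": [
            {
                "Update": {
                    "TableName": "inventory",
                    "Key": {"product_id": {"S": product_id}},
                    "UpdateExpression": "SET #q = #q - :one",
                    "ConditionExpression": "#q > :zero",
                    "ExpressionAttributeNames": {"#q": "quantity"},
                    "ExpressionAttributeValues": {":one": {"N": "1"}, ":zero": {"N": "0"}},
                }
            },
            {
                "Update": {
                    "TableName": "orders",
                    "Key": {"order_id": {"S": order_id}},
                    # 'status' is a DynamoDB reserved word, so it needs an alias
                    "UpdateExpression": "SET #s = :fulfilled",
                    "ConditionExpression": "#s = :pending",
                    "ExpressionAttributeNames": {"#s": "status"},
                    "ExpressionAttributeValues": {
                        ":fulfilled": {"S": "fulfilled"},
                        ":pending": {"S": "pending"},
                    },
                }
            },
        ]
    }


request = fulfil_order_transaction("P123", "O789")
# With credentials configured, the calls would look like:
#   client = boto3.client("dynamodb")
#   client.update_time_to_live(**ttl_request())
#   client.transact_write_items(**request)
```

If either condition fails (no stock left, or the order is no longer pending), the whole transaction is cancelled and neither item changes.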

⚠ Production Trap:

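TTL deletion is a background process, not an instant one. Expired items are typically removed within a few days of their expiry time and can still appear in reads until then. Never rely on TTL for immediate capacity recovery, and make your batch jobs filter out items whose expiry timestamp has already passed.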
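
Handling that stale data can be as simple as comparing the TTL attribute with the current time after each read. This is a minimal sketch; `drop_expired` is a hypothetical helper, and `expiresAt` is the attribute name used in the example above.

```python
import time


def drop_expired(items, ttl_attribute="expiresAt", now=None):
    """Keep items whose TTL is missing or still in the future."""
    now = int(time.time()) if now is None else now
    return [item for item in items if item.get(ttl_attribute, now + 1) > now]


items = [
    {"order_id": "O1", "expiresAt": 1_000},          # expired, not yet deleted
    {"order_id": "O2", "expiresAt": 9_999_999_999},  # still live
    {"order_id": "O3"},                              # no TTL attribute: never expires
]
live = drop_expired(items, now=1_700_000_000)
```

The same check can be pushed to the server with a filter expression on the TTL attribute, which saves bandwidth but not read capacity.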

🎯 Key Takeaway

Use TTL for automatic cleanup, transactions for atomic multi-item writes, and streams for real-time event processing.

Why Your RDBMS Reflexes Break on DynamoDB

You learned SQL joins, normalized schemas, and vertical scaling. Throw that out. DynamoDB is a different beast with different trade-offs.

In an RDBMS, you model for query flexibility. Normalize into tables, join at read time. In DynamoDB, you model for access patterns. Denormalize into a single table, fetch everything in one query. Joins don't exist. You prep your data at write time so reads are fast. That single-table design isn't a quirk — it's the point.

RDBMS scales vertically. More CPU, more RAM on one box. DynamoDB scales horizontally. One partition can handle 3000 RCU / 1000 WCU. Beyond that, it splits automatically. You don't provision for peaks — you design partition keys that distribute traffic evenly. A bad key (like a timestamp) creates a hot partition and throttled customers.

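The cost model flips too. An RDBMS bills you for the instance whether or not you query it. DynamoDB bills per read and write unit consumed. A full table scan does not cost the same as a single-item get: a Scan is charged for every item it reads, so scanning a large table can burn thousands of RCUs where a GetItem burns one. Optimize for key-based requests that touch only the items you need.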

Stop trying to make DynamoDB act like PostgreSQL. Learn its rules, or get burned.

Example.sqlSQL

-- io.thecodeforge: database tutorial

-- RDBMS: normalized query with joins
SELECT o.*, i.product_name
FROM orders o
JOIN order_items i ON o.order_id = i.order_id
WHERE o.customer_id = 'C42';

-- DynamoDB: single table, denormalized
SELECT * FROM shop_table
WHERE PK = 'CUSTOMER#C42'
  AND SK BETWEEN 'ORDER#' AND 'ORDER#ZZZ';

Output

-- RDBMS: returns 5 rows across 2 tables

-- DynamoDB: returns 5 items, all data in one query

🔥Senior Shortcut:

If your DynamoDB query needs a JOIN, you designed the partition key wrong. Model for the query, not the data.

🎯 Key Takeaway

DynamoDB trades relational flexibility for predictable performance at scale. Denormalize, pre-join, and optimize for access patterns.

Who Should Read This — And Who’s Wasting Their Time

This guide is built for backend engineers, senior developers, and architects designing systems where predictable latency at scale matters more than relational integrity. You’ll get the most out of DynamoDB if you’re building high-traffic e-commerce platforms, IoT event pipelines, real-time leaderboards, or session stores: any workload that demands single-digit millisecond reads and writes on terabytes of data with automatic scaling.

You should stop reading now if your primary need is complex joins, ad-hoc analytical queries, multi-row transactions across five tables, or strict referential integrity enforced by a schema. Likewise, if your team is small and your database fits on a laptop, a managed relational database like PostgreSQL will save you months of cognitive overhead. DynamoDB is not a general-purpose hammer; it is a specialized power tool for access patterns you know in advance.

This guide assumes you already understand partition keys and sort keys, and that DynamoDB punishes scans. If you’re still writing WHERE clauses naturally, start with NoSQL fundamentals first.

WhoThisIsFor.sqlSQL

-- io.thecodeforge: database tutorial
-- Bad fit: complex joins and reporting
SELECT orders.id, customers.name
FROM orders
JOIN customers ON orders.customer_id = customers.id
WHERE orders.status = 'shipped';

-- Good fit: single-table access by known key
-- Table: orders-table
-- Partition key: CustomerId
-- Sort key: OrderDate
-- Query: get all orders for one customer
SELECT * FROM "orders-table"
WHERE CustomerId = 'abc123'
AND OrderDate BETWEEN '2025-01-01' AND '2025-01-31';

Output

-- Returns all orders for customer abc123 in January 2025

⚠ Production Trap:

Do not adopt DynamoDB just because your CTO heard it scales infinitely. If your access patterns are unknown or your schema is relational, you will pay in complexity and monthly bills.

🎯 Key Takeaway

Match the database to your access pattern, not the hype.

Prerequisites: Tools and Concepts You Need First

Before you open the AWS console, make sure you’ve installed the AWS CLI version 2 and configured credentials with an IAM user that has dynamodb:GetItem, dynamodb:PutItem, dynamodb:Query, and dynamodb:CreateTable permissions. You should also have a local DynamoDB instance running via Docker for fast iteration without cost surprises.

Conceptually, you must be comfortable with the fact that DynamoDB is schemaless beyond the primary key: you define only the key attributes at table creation, and any other attribute can appear on any item. That means your application code enforces data shape, not the database. You also need to understand read capacity units (RCUs) and write capacity units (WCUs) as financial levers, not just technical ones. A single strongly consistent read of a 4KB item costs 1 RCU; a transactional read costs 2 RCUs. Estimate your traffic before creating a table. Finally, adopt a mindset shift: you design for known query patterns first, then build the table structure. If you cannot list your top five access patterns on a whiteboard, go back to product spec, because DynamoDB will punish you for guessing.
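
Because the database will accept any attribute on any item, a small validation step before each write is a cheap safeguard. The sketch below assumes an order item keyed by `CustomerId` and `OrderDate` (the key names used in the CLI example in this section); the other attributes, the allowed statuses, and the `validate_order` helper are made up for illustration. Numbers are checked as `Decimal` because the boto3 resource API rejects Python floats.

```python
from decimal import Decimal

REQUIRED_ORDER_FIELDS = {
    "CustomerId": str,
    "OrderDate": str,
    "total": Decimal,
    "status": str,
}
ALLOWED_STATUSES = {"pending", "shipped", "fulfilled"}


def validate_order(item):
    """Raise ValueError if the item does not match the shape the app expects."""
    for field, expected_type in REQUIRED_ORDER_FIELDS.items():
        if field not in item:
            raise ValueError(f"missing attribute: {field}")
        if not isinstance(item[field], expected_type):
            raise ValueError(f"{field} must be {expected_type.__name__}")
    if item["status"] not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {item['status']}")
    return item


order = validate_order({
    "CustomerId": "abc123",
    "OrderDate": "2025-01-15",
    "total": Decimal("99.99"),
    "status": "pending",
})
```

Calling `validate_order` immediately before `put_item` keeps malformed items out of the table, which matters because nothing downstream will reject them for you.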

PrerequisitesCheck.sqlSQL

# io.thecodeforge: database tutorial
# Check AWS CLI configuration
aws configure list

# Start local DynamoDB (Docker)
docker run -p 8000:8000 amazon/dynamodb-local

# List tables on local instance
aws dynamodb list-tables --endpoint-url http://localhost:8000

# Create table with partition + sort key
aws dynamodb create-table \
    --table-name orders \
    --attribute-definitions AttributeName=CustomerId,AttributeType=S AttributeName=OrderDate,AttributeType=S \
    --key-schema AttributeName=CustomerId,KeyType=HASH AttributeName=OrderDate,KeyType=RANGE \
    --billing-mode PAY_PER_REQUEST \
    --endpoint-url http://localhost:8000

Output

{
  "TableDescription": {
    "TableName": "orders",
    "TableStatus": "ACTIVE"
  }
}

⚠ Production Trap:

Never use the root account credentials for the CLI. Create an IAM user with least-privilege policies. A mistake in the console can cost thousands.

🎯 Key Takeaway

Local testing and IAM hygiene prevent expensive cloud surprises.

DynamoDB Single-Table Design: Complete Guide

Single-table design is a core pattern for DynamoDB that leverages its key-value and document nature to store multiple entity types in one table. Unlike relational databases, DynamoDB doesn't support joins, so you denormalize data and use composite keys (partition key + sort key) to model relationships. For example, in an e-commerce app, you might store users, orders, and products in one table. The partition key could be a customer ID, and the sort key could be a prefix like 'USER#', 'ORDER#', or 'PRODUCT#' followed by a unique identifier. This allows you to fetch all related data for a customer with a single query. Access patterns drive the design: you define how you'll query (e.g., get all orders for a user, get product details) and structure keys accordingly. Use Global Secondary Indexes (GSIs) for alternative access patterns. This approach reduces costs (fewer tables, less throughput) and complexity, but requires careful planning. For instance, to store a user and their orders:

```sql
-- Single-table schema example (conceptual)
-- Partition Key: CustomerID (e.g., 'CUST#123')
-- Sort Key: EntityType#ID (e.g., 'USER#123', 'ORDER#456')
-- Attributes: entityType, name, email, orderDate, totalAmount

-- Query all entities for a customer
SELECT * FROM ecommerce WHERE PK = 'CUST#123';

-- Query only orders for a customer
SELECT * FROM ecommerce WHERE PK = 'CUST#123' AND begins_with(SK, 'ORDER#');
```

This pattern is powerful but requires upfront design; changing access patterns later is difficult. Use tools like NoSQL Workbench to model.

single-table-design.sqlSQL

-- Example: Single-table design for e-commerce
-- Table: ecommerce
-- Partition Key: PK (String)
-- Sort Key: SK (String)

-- Insert a user
INSERT INTO ecommerce VALUE {'PK': 'CUST#123', 'SK': 'USER#123', 'entityType': 'User', 'name': 'Alice', 'email': 'alice@example.com'};

-- Insert an order for the user
INSERT INTO ecommerce VALUE {'PK': 'CUST#123', 'SK': 'ORDER#456', 'entityType': 'Order', 'orderDate': '2024-01-15', 'totalAmount': 99.99};

-- Query all items for customer 123
SELECT * FROM ecommerce WHERE PK = 'CUST#123';

-- Query only orders (PartiQL uses the begins_with function, not a BEGINS_WITH operator)
SELECT * FROM ecommerce WHERE PK = 'CUST#123' AND begins_with(SK, 'ORDER#');

-- Query a specific order
SELECT * FROM ecommerce WHERE PK = 'CUST#123' AND SK = 'ORDER#456';
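
The same access patterns can be expressed through the Query API. The sketch below builds the request arguments for the low-level client so it runs without AWS access; `customer_query` is a hypothetical helper, and the `ecommerce` table with `PK` and `SK` attributes is the one assumed throughout this example.

```python
def customer_query(customer_id, entity_prefix=None, table_name="ecommerce"):
    """Keyword arguments for the low-level Query API on the PK/SK table."""
    kwargs = {
        "TableName": table_name,
        "KeyConditionExpression": "PK = :pk",
        "ExpressionAttributeValues": {":pk": {"S": f"CUST#{customer_id}"}},
    }
    if entity_prefix is not None:
        # Narrow to one entity type by matching the sort key prefix
        kwargs["KeyConditionExpression"] += " AND begins_with(SK, :prefix)"
        kwargs["ExpressionAttributeValues"][":prefix"] = {"S": entity_prefix}
    return kwargs


everything = customer_query("123")
orders_only = customer_query("123", entity_prefix="ORDER#")
# With credentials configured: boto3.client("dynamodb").query(**orders_only)
```

Both requests hit a single partition; the prefix only changes how much of that partition is read.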

💡Plan Access Patterns First

📊 Production Insight

In production, use a single-table design to reduce costs and complexity. Monitor query patterns and adjust GSIs as needed. Avoid overloading a single partition key; distribute writes across partitions.

🎯 Key Takeaway

Single-table design uses composite keys to store multiple entity types in one table, enabling efficient queries without joins, but requires upfront access pattern planning.

DAX: DynamoDB Accelerator for Caching

DAX (DynamoDB Accelerator) is a fully managed, in-memory cache for DynamoDB that delivers up to 10x read performance improvement. It sits between your application and DynamoDB, caching frequently accessed data to reduce response times from milliseconds to microseconds. DAX is ideal for read-heavy workloads like gaming leaderboards, social media feeds, or e-commerce product catalogs. It is a write-through cache: a write sent through DAX goes to DynamoDB first and then updates the cached item, so you don't invalidate entries by hand. To use DAX, you create a cluster (choose node type and node count) and switch your application to a DAX SDK client that points to the cluster endpoint. Eventually consistent reads are served from the cache; strongly consistent reads are passed through to DynamoDB and are not cached. For example, in a product catalog:

```sql
-- Without DAX: direct DynamoDB query
SELECT * FROM products WHERE PK = 'PROD#123';

-- With DAX: same query, but SDK points to DAX endpoint
-- DAX caches the result; subsequent reads are faster
```

DAX is not free: you pay per node-hour, so cost depends on node type and node count. Use it only when sub-millisecond reads are critical. For write-heavy or rarely accessed data, DAX may not be cost-effective. Also, DAX does not support every DynamoDB operation (for example, table-management calls such as CreateTable go to DynamoDB directly). Test your workload before deploying.

dax-usage.sqlSQL

-- Example: Using DAX with DynamoDB (conceptual SQL)
-- Assume DAX cluster endpoint is 'mycluster.dax.amazonaws.com:8111'

-- Application code (not SQL) must use a DAX SDK client rather than the
-- standard DynamoDB client; the regular SDK cannot talk to a DAX endpoint.

-- Query via DAX (same DynamoDB API)
SELECT * FROM products WHERE PK = 'PROD#123';

-- DAX caches the result; subsequent identical queries are served from cache.
-- Cache TTL is 5 minutes by default.
-- For strongly consistent reads, DAX bypasses cache and reads from DynamoDB.

-- Writes made through DAX are write-through:
UPDATE products SET price = 19.99 WHERE PK = 'PROD#123';
-- DAX writes to DynamoDB first, then updates its item cache for that key.
-- Cached Query and Scan results are not refreshed; they expire with the TTL.

-- Note: DAX does not serve table-management operations such as CreateTable.
-- Use DynamoDB directly for those operations.

🔥When to Use DAX

📊 Production Insight

In production, monitor DAX cache hit rates via CloudWatch. If hit rate is low, reconsider using DAX. For critical reads, use strongly consistent reads sparingly as they bypass cache. Plan for DAX cluster sizing based on data size and request rate.

🎯 Key Takeaway

DAX is an in-memory cache that speeds up DynamoDB reads by up to 10x, ideal for read-heavy workloads, but adds cost and doesn't support all features.

DynamoDB Streams and Lambda Integration

DynamoDB Streams capture a time-ordered sequence of item-level changes (inserts, updates, deletes) in a DynamoDB table. Depending on the view type, each stream record contains the item's keys, its old image, its new image, or both images. You can integrate Streams with AWS Lambda to trigger functions on data changes, enabling event-driven architectures like real-time analytics, cross-region replication, or cache invalidation. For example, when a new order is placed, a Lambda function can send a notification or update a search index. To set up, enable Streams on your table (choose view type: KEYS_ONLY, NEW_IMAGE, OLD_IMAGE, or NEW_AND_OLD_IMAGES), then create a Lambda trigger from the stream. The Lambda receives a batch of records and processes them. Lambda retries failed batches, so a record can be delivered more than once and your function must be idempotent. For example:

```sql
-- Conceptual pseudo-SQL: DynamoDB has no ALTER TABLE; streams are enabled via UpdateTable.
-- Enable Streams on table 'orders' with NEW_AND_OLD_IMAGES
ALTER TABLE orders ENABLE STREAMS WITH VIEW_TYPE = 'NEW_AND_OLD_IMAGES';

-- Lambda function (pseudo-code) processes stream records:
-- exports.handler = async (event) => {
--   for (const record of event.Records) {
--     if (record.eventName === 'INSERT') {
--       const newOrder = record.dynamodb.NewImage;
--       // Send notification
--     }
--   }
-- };
```

Streams have a 24-hour retention period. Use them for real-time processing, not for long-term storage. Costs are based on read request units and Lambda invocations.

streams-lambda.sqlSQL

// Example: Enable DynamoDB Streams (conceptual SQL-like syntax)
// Table: orders
// DynamoDB has no ALTER TABLE; in practice streams are enabled with UpdateTable
// and a StreamSpecification. Shown here only as shorthand:
// ALTER TABLE orders ENABLE STREAMS WITH VIEW_TYPE = 'NEW_AND_OLD_IMAGES';

// Sample stream record (JSON) sent to Lambda:
// {
//   "eventID": "1",
//   "eventName": "INSERT",
//   "eventSource": "aws:dynamodb",
//   "dynamodb": {
//     "Keys": {"PK": {"S": "ORDER#123"}},
//     "NewImage": {"PK": {"S": "ORDER#123"}, "total": {"N": "99.99"}},
//     "SequenceNumber": "111",
//     "SizeBytes": 26,
//     "StreamViewType": "NEW_AND_OLD_IMAGES"
//   }
// }

// Lambda handler (Node.js example):
exports.handler = async (event) => {
  for (const record of event.Records) {
    console.log('Event ID: ' + record.eventID);
    console.log('Event Name: ' + record.eventName);
    if (record.eventName === 'INSERT') {
      const newItem = record.dynamodb.NewImage;
      // Process new item (e.g., send to SQS)
    }
  }
  return `Successfully processed ${event.Records.length} records.`;
};

⚠ Idempotency and Error Handling
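
Since a stream record can reach your function more than once, the handler should remember which events it has already processed. Below is a minimal Python sketch of that idea. The in-memory `processed_event_ids` set is only a stand-in: a real function would need a durable store, because Lambda execution environments are recycled. `notify_new_order` is a hypothetical side effect.

```python
processed_event_ids = set()  # stand-in for a durable store such as a DynamoDB table


def notify_new_order(new_image):
    """Hypothetical side effect; replace with SQS, SNS, a search index, etc."""
    print("new order:", new_image["PK"]["S"])


def handler(event, context=None):
    handled = 0
    for record in event["Records"]:
        event_id = record["eventID"]
        if event_id in processed_event_ids:
            continue  # duplicate delivery: skip the side effect
        if record["eventName"] == "INSERT":
            notify_new_order(record["dynamodb"]["NewImage"])
        processed_event_ids.add(event_id)  # mark done only after success
        handled += 1
    return {"handled": handled}


sample_event = {
    "Records": [
        {
            "eventID": "1",
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"PK": {"S": "ORDER#123"}},
                "NewImage": {"PK": {"S": "ORDER#123"}, "total": {"N": "99.99"}},
            },
        }
    ]
}
first = handler(sample_event)
second = handler(sample_event)  # simulated redelivery
```

If `notify_new_order` raises, the event ID is never recorded, so the retry processes the record again instead of silently dropping it.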

📊 Production Insight

In production, monitor stream iterator age via CloudWatch to detect processing delays. Use batch size and parallelization factors to tune throughput. For high-volume streams, consider using Kinesis as a buffer before Lambda.

🎯 Key Takeaway

DynamoDB Streams capture changes in real-time and integrate with Lambda for event-driven processing, enabling reactive architectures.

● Production incidentPOST-MORTEMseverity: high

Hot Partition Killed Our Checkout Flow on Prime Day

Symptom

During a flash sale, the 'Orders' table started returning ProvisionedThroughputExceededException on writes for a specific subset of orders. Read latencies for those items spiked from 5ms to over 2 seconds. The rest of the table worked fine.

Assumption

The dev team assumed that using PAY_PER_REQUEST billing would automatically handle traffic spikes and that the partition key 'customer_id', having many distinct values, would distribute load evenly.

Root cause

The partition key was 'customer_id', and one customer (a large reseller) was placing 200+ orders per second via an automated API, while all other customers combined generated only 50 writes/sec. The automated reseller bot hit the same partition every time, far exceeding that partition's throughput capacity even though the table-level limits were not exceeded.

Fix

Switch the table's partition key to 'order_id', which is high-cardinality and evenly written, and add a GSI keyed on 'customer_id' for customer-centric queries. Shard that customer key: append a random suffix (0-9) to the customer_id before writing it, then query all ten shards when fetching by customer. Also implement exponential backoff and client-side retry with jitter to handle transient throttling.

Key lesson

Never trust PAY_PER_REQUEST to save you from a single skewed access pattern. Re-evaluate partition key cardinality before any high-traffic event.
If you must write under a skewed key like customer_id, implement write sharding with a random suffix and fan-out queries.
Always test with realistic traffic distribution in a staging environment, not just uniform load.
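Set up CloudWatch alarms on the table's throttling metrics (ThrottledRequests, WriteThrottleEvents), and enable CloudWatch Contributor Insights for DynamoDB to see which partition keys are throttled most, so you catch hot partitions early.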
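
The write-sharding fix from this incident can be sketched in a few lines. The key format (`customer_id#shard`) and the function names are assumptions for illustration; `query_partition` stands in for whatever paginated Query wrapper your code already has.

```python
import random

SHARD_COUNT = 10  # suffixes 0-9, as in the fix described above


def sharded_partition_key(customer_id, rng=random):
    """Partition key for a write: the customer ID plus a random shard suffix."""
    return f"{customer_id}#{rng.randrange(SHARD_COUNT)}"


def all_shard_keys(customer_id):
    """Every key that must be queried to read one customer's full history."""
    return [f"{customer_id}#{shard}" for shard in range(SHARD_COUNT)]


def fetch_customer_orders(customer_id, query_partition):
    """Fan out one query per shard and merge the results.

    query_partition is any callable that takes a partition key and returns a
    list of items, for example a wrapper around a paginated Query.
    """
    orders = []
    for key in all_shard_keys(customer_id):
        orders.extend(query_partition(key))
    return orders


fake_store = {"CUST-42#3": [{"order_id": "O1"}], "CUST-42#7": [{"order_id": "O2"}]}
orders = fetch_customer_orders("CUST-42", lambda key: fake_store.get(key, []))
```

The trade-off is explicit: writes spread across ten partitions, and every customer read becomes ten queries. Pick the shard count from the write rate you need to absorb, not by habit.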

Production debug guide

Learn the exact commands and checks to diagnose the most common DynamoDB issues in production.

Symptom · 01

GetItem or Query returns inconsistent results immediately after a write

→

Fix

Check if you're using eventual consistency (default). If the read must see the latest write, add ConsistentRead=True to the request. Remember: strongly consistent reads cost 2x RCUs and are not available on GSIs.

Symptom · 02

Write requests fail with ProvisionedThroughputExceededException

→

Fix

Inspect CloudWatch metrics for ThrottledRequests. Check if a single partition key is receiving disproportionate traffic. If using provisioned capacity, consider switching to on-demand temporarily or increasing write capacity. If the pattern persists, redesign the partition key.

Symptom · 03

Query returns fewer items than expected

→

Fix

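Verify the KeyConditionExpression and filter expressions, and confirm you are following LastEvaluatedKey across pages. Remember: FilterExpression removes items AFTER the query reads them, and Limit is applied before the filter, so a page can come back short or even empty while more matches remain. You're still paying for the read capacity of filtered-out items. Use a GSI with the filter attribute as the sort key to eliminate that wasted read capacity.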

Symptom · 04

Scan consumes 100% of read capacity and takes ages

→

Fix

Stop the scan immediately. Scans should never run on production tables during business hours. If you must scan, use a parallel scan with Segment and TotalSegments, limit the result set with Limit parameter, and run during off-peak hours with low read capacity settings.
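
If a scan is unavoidable, a parallel scan splits the table into segments that separate workers read independently. The sketch below takes the scan function as a parameter so it can be exercised with a fake; with boto3 you would pass `table.scan`, which accepts the `Segment`, `TotalSegments`, `Limit`, and `ExclusiveStartKey` arguments used here. The `fake_scan` function is a hypothetical in-memory stand-in.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_segment(scan_fn, segment, total_segments, page_limit=100):
    """Scan one segment to completion, following LastEvaluatedKey."""
    items, start_key = [], None
    while True:
        kwargs = {"Segment": segment, "TotalSegments": total_segments, "Limit": page_limit}
        if start_key:
            kwargs["ExclusiveStartKey"] = start_key
        page = scan_fn(**kwargs)
        items.extend(page["Items"])
        start_key = page.get("LastEvaluatedKey")
        if not start_key:
            return items


def parallel_scan(scan_fn, total_segments=4):
    """Run one worker per segment; scan_fn would be table.scan on a boto3 Table."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, scan_fn, segment, total_segments)
            for segment in range(total_segments)
        ]
        return [item for future in futures for item in future.result()]


def fake_scan(Segment, TotalSegments, Limit, ExclusiveStartKey=None):
    """In-memory stand-in for table.scan: 10 items split across segments."""
    data = [{"id": i} for i in range(10) if i % TotalSegments == Segment]
    start = ExclusiveStartKey["offset"] if ExclusiveStartKey else 0
    page = {"Items": data[start:start + Limit]}
    if start + Limit < len(data):
        page["LastEvaluatedKey"] = {"offset": start + Limit}
    return page


items = parallel_scan(fake_scan, total_segments=4)
```

More segments finish sooner but consume read capacity faster, so keep `total_segments` and `page_limit` small on a table that is also serving live traffic.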

Symptom · 05

GSI query returns no results even though base table has matching data

→

Fix

Check GSI status — it may still be backfilling after creation. Also note that GSIs are eventually consistent: after a write to the base table, there is a propagation delay (usually <1 second). If you need immediate consistency, read from the base table using the primary key.

★ DynamoDB Quick Debug Cheat Sheet

The top 3 DynamoDB problems and their immediate fix commands. Run these when you get paged.

Throttled writes: ProvisionedThroughputExceededException on a few items only

Immediate action

Increase write capacity units 5x via CLI or console. Then investigate partition key skew.

Commands

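aws dynamodb update-table --table-name Orders --provisioned-throughput ReadCapacityUnits=100,WriteCapacityUnits=500 --region us-east-1

(ProvisionedThroughput requires both values; ReadCapacityUnits=100 is a placeholder for your table's current read capacity.)

aws cloudwatch get-metric-statistics --namespace AWS/DynamoDB --metric-name WriteThrottleEvents --dimensions Name=TableName,Value=Orders --statistics Sum --period 60 --start-time "$(date -u -d '5 minutes ago' +%Y-%m-%dT%H:%M:%SZ)" --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)"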

Fix now

Temporarily switch to PAY_PER_REQUEST billing if the spike is legitimate. Update billing mode: aws dynamodb update-table --table-name Orders --billing-mode PAY_PER_REQUEST. Monitor costs after.

Query on GSI returns empty results after a recent write

Large Scan operation timing out or costing too many RCUs

DynamoDB vs PostgreSQL (Relational SQL)

Feature / Aspect	DynamoDB (NoSQL)	PostgreSQL (Relational SQL)
Data Model	Key-value + document (schemaless per item)	Fixed schema with typed columns
Query Flexibility	Only by primary key or GSI — patterns must be pre-planned	Any column at any time via ad-hoc SQL
Join Support	None — use single-table design or denormalization	Full JOIN support across tables
Horizontal Scaling	Automatic partitioning as data and traffic grow; no sharding to manage	Complex: requires read replicas or sharding
Read Consistency	Eventually consistent by default; strongly consistent optional	ACID-compliant, always strongly consistent
Schema Changes	Add attributes anytime — no migration needed	ALTER TABLE required — can lock tables at scale
Pricing Model	Pay per read/write unit or on-demand	Pay for server/instance size (always on)
Best For	Predictable, high-throughput access patterns at scale	Complex queries, reporting, transactional integrity

⚙ Quick Reference

13 commands from this guide

File	Command / Code	Purpose
create_orders_table.py	dynamodb = boto3.resource(	Partition Keys and Sort Keys
orders_crud.py	from boto3.dynamodb.conditions import Key	Writing and Reading Data
orders_gsi.py	from boto3.dynamodb.conditions import Key	Global Secondary Indexes
single_table_design.py	from boto3.dynamodb.conditions import Key	Single-Table Design
retry_with_jitter.py	from botocore.exceptions import ClientError	Error Handling, Retries, and the Cost of Not Planning for Th
BeforeYouStart.sql	CREATE TABLE orders (	Who This Actually Applies To
Example.sql	ALTER TABLE orders	Advanced Features You'll Actually Use
Example.sql	SELECT o.*, i.product_name	Why Your RDBMS Reflexes Break on DynamoDB
WhoThisIsFor.sql	SELECT orders.id, customers.name	Who Should Read This
PrerequisitesCheck.sql	aws configure list	Prerequisites
single-table-design.sql	INSERT INTO ecommerce VALUE {'PK': 'CUST#123', 'SK': 'USER#123', 'entityType': '...	DynamoDB Single-Table Design
dax-usage.sql	SELECT * FROM products WHERE PK = 'PROD#123';	DAX
streams-lambda.sql	ALTER TABLE orders ENABLE STREAMS WITH VIEW_TYPE = 'NEW_AND_OLD_IMAGES';	DynamoDB Streams and Lambda Integration

Key takeaways

DynamoDB's partition key is hashed to determine physical storage location: choose high-cardinality values or you'll create hot partitions that throttle your entire application.

The sort key is a query tool, not just an identifier: use composite sort key values like 'DATE#ID' to enable efficient range queries that stay within a single partition.

Global Secondary Indexes let you query by a different key, but they cost money, add eventual consistency lag, and are not a substitute for proper upfront key design: they supplement it.

Single-table design co-locates related entities under one partition key, eliminating the need for joins and reducing network round-trips, but it only pays off when your access patterns are stable and well-defined.

Always implement retry logic with exponential backoff and jitter, but treat high retry counts as a symptom of a partition key problem, not a normal condition.

INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR

Can you explain the difference between a partition key and a sort key, a...

Q02SENIOR

What is a hot partition in DynamoDB, how does it happen, and what strate...

Q03SENIOR

If a colleague proposes creating separate DynamoDB tables for each entit...

Q04SENIOR

What are the key differences between strongly consistent and eventually ...

Q01 of 04SENIOR

Can you explain the difference between a partition key and a sort key, and give an example of when you'd use a composite primary key over a simple one?

ANSWER

A partition key determines which physical partition stores the item — DynamoDB hashes it to distribute load. A sort key defines the order of items within that partition. Use a composite primary key (partition + sort) when you need to query a range of items that share the same partition key. For example, in an e-commerce system, customer_id as partition key and order_date as sort key lets you efficiently fetch all orders for a customer within a date range. A simple primary key (partition only) is fine when you always access items by a single unique ID, like a session token table where each session is fetched individually.

FAQ · 5 QUESTIONS

Frequently Asked Questions

What is the difference between DynamoDB GetItem and Query?

Can I change my DynamoDB partition key after creating the table?

Is DynamoDB always eventually consistent, or can I get strongly consistent reads?

How do I handle time-series data in DynamoDB?

What is the difference between PAY_PER_REQUEST and provisioned capacity?

Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Notes here come from systems that actually shipped.

✓ Verified

production tested

July 19, 2026

last updated

2,466

articles · all by Naren
