DynamoDB Basics Explained: Tables, Keys, and Real-World Patterns
DynamoDB basics demystified — learn partition keys, sort keys, single-table design, and when to choose DynamoDB over SQL with battle-tested code examples.
Naren · Founder
Plain-English first. Then code. Then the interview question.
DynamoDB is a fully managed NoSQL key-value and document database with single-digit-millisecond latency at any scale
Partition key determines which physical node stores your data; choose high-cardinality attributes to avoid hot partitions
Sort key enables efficient range queries within a partition; composite sort keys like 'DATE#ID' unlock powerful access patterns
Global Secondary Indexes let you query by a different key, but they add cost and eventual consistency lag
Single-table design co-locates related entities (customers + orders) in one table to replace joins with fast partition queries
The biggest mistake: modeling tables like SQL schemas instead of mapping access patterns first
Plain-English First
Imagine a massive library with billions of books. Instead of organizing them alphabetically (which slows down as the library grows), you assign every book a unique locker number and go straight to that locker instantly — no browsing required. DynamoDB works the same way: it uses a 'key' to jump directly to your data in milliseconds, no matter if you have 100 rows or 100 billion. The catch? You have to decide how to label your lockers BEFORE you start filling them.
Every app eventually hits a wall with its database. SQL tables start choking on hundreds of millions of rows, joins get expensive, and scaling horizontally becomes an engineering nightmare. Amazon DynamoDB was built specifically to shatter that wall — it's the database that powers Amazon's own shopping cart, which handles millions of writes per second during Prime Day without breaking a sweat. If you're building anything at scale on AWS, DynamoDB will cross your path sooner or later.
The core problem DynamoDB solves is predictable performance at any scale. Traditional relational databases slow down as data grows because query planners scan more rows and indexes get larger. DynamoDB sidesteps this entirely by requiring you to declare your access patterns upfront. You design your keys so that every query hits exactly one partition — think of it as pre-building the fast path before traffic arrives, not scrambling to optimize after the fact.
By the end of this article you'll understand how DynamoDB's partition and sort keys actually work under the hood, how to model a real e-commerce order system without a single SQL join, why single-table design exists and when it's overkill, and the common mistakes that will silently ruin your DynamoDB performance in production.
Partition Keys and Sort Keys: The Engine Under the Hood
Every DynamoDB table has a primary key. That key comes in two flavors: a simple primary key (just a partition key) or a composite primary key (partition key + sort key). Understanding the difference isn't just syntax trivia — it determines what queries you can run efficiently.
The partition key is hashed by DynamoDB internally to decide which physical server (partition) stores your item. This is why it's sometimes called a hash key. Every read and write for that item goes to exactly that one partition. The golden rule: items with the same partition key live on the same server. That's powerful — it means related data is co-located — but it also means if one partition key gets hit with 90% of your traffic, you have a 'hot partition' problem and performance tanks.
The sort key (also called a range key) doesn't affect which partition stores the item, but it determines the order of items within that partition. This lets you query a range — 'give me all orders for customer-42 placed in the last 30 days' — in a single, efficient request. That range query is only possible because the sort key values are stored in sorted order on disk within each partition. Think of the partition key as the drawer label in a filing cabinet, and the sort key as the alphabetical tabs inside that drawer.
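The filing-cabinet picture can be sketched as a toy in-memory model. This is only an illustration: DynamoDB's real hash function and partition layout are internal, so the MD5 hash, the four partitions, and the names `ToyTable`, `put`, and `query_range` are all assumptions made for this example.

```python
import bisect
import hashlib
from collections import defaultdict


class ToyTable:
    """Toy model: the partition key picks a bucket, the sort key orders items inside it."""

    def __init__(self, partition_count=4):
        self.partition_count = partition_count
        # partition index -> {partition key -> (sorted sort keys, items in the same order)}
        self.partitions = defaultdict(lambda: defaultdict(lambda: ([], [])))

    def partition_for(self, partition_key):
        # Stand-in hash (assumption): the real DynamoDB hash function is internal.
        digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
        return int(digest, 16) % self.partition_count

    def put(self, partition_key, sort_key, item):
        keys, items = self.partitions[self.partition_for(partition_key)][partition_key]
        position = bisect.bisect_left(keys, sort_key)  # keep sort keys in order
        keys.insert(position, sort_key)
        items.insert(position, item)

    def query_range(self, partition_key, low, high):
        """Return items with low <= sort key <= high, touching one partition only."""
        keys, items = self.partitions[self.partition_for(partition_key)][partition_key]
        start = bisect.bisect_left(keys, low)
        end = bisect.bisect_right(keys, high)
        return items[start:end]


table = ToyTable()
table.put("CUST-42", "2024-06-10#ORD-003", {"total": "29.99"})
table.put("CUST-42", "2024-01-15#ORD-001", {"total": "59.99"})
table.put("CUST-42", "2024-03-22#ORD-002", {"total": "149.00"})
table.put("CUST-7", "2024-02-01#ORD-900", {"total": "5.00"})

recent = table.query_range("CUST-42", "2024-03-01", "2024-12-31")
print([item["total"] for item in recent])  # ['149.00', '29.99']
```

Because the sort keys are kept in order, the range lookup is two binary searches and a slice rather than a pass over every item, which is the property the sort key gives you in the real service.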
create_orders_table.py (Python)
import boto3

# Create a DynamoDB resource pointing at a local DynamoDB instance for dev.
# In production, remove 'endpoint_url' and use your AWS region.
dynamodb = boto3.resource(
    'dynamodb',
    region_name='us-east-1',
    endpoint_url='http://localhost:8000'  # local DynamoDB for safe testing
)

# Create the Orders table with a COMPOSITE primary key:
#   partition key = customer_id (groups all orders for one customer together)
#   sort key      = order_date_order_id (allows range queries by date)
table = dynamodb.create_table(
    TableName='Orders',
    KeySchema=[
        {
            'AttributeName': 'customer_id',          # partition key
            'KeyType': 'HASH'
        },
        {
            'AttributeName': 'order_date_order_id',  # sort key (composite value)
            'KeyType': 'RANGE'
        }
    ],
    AttributeDefinitions=[
        # Only KEY attributes go here; DynamoDB is schemaless for everything else
        {'AttributeName': 'customer_id', 'AttributeType': 'S'},  # S = String
        {'AttributeName': 'order_date_order_id', 'AttributeType': 'S'}
    ],
    BillingMode='PAY_PER_REQUEST'  # on-demand pricing: no capacity planning needed for dev
)

# Wait until the table is actually ready before writing to it
table.wait_until_exists()
table.reload()  # refresh the cached description so table_status is current

print(f"Table status: {table.table_status}")
print(f"Table name: {table.table_name}")
Output
Table status: ACTIVE
Table name: Orders
Watch Out: Attribute Definitions Are NOT Your Schema
You only list attributes in AttributeDefinitions if they are used in a key (primary key or GSI/LSI key). Listing every attribute there is a common mistake that confuses people coming from SQL — DynamoDB is schemaless for non-key attributes, so each item can have completely different fields.
Production Insight
Hot partitions are the #1 DynamoDB killer. One skewed partition key (like 'status' with 3 values) can throttle your entire table.
Always validate partition key cardinality against your peak traffic per key.
Rule: if you cannot predict the max writes per partition key value, redesign before deploying.
High cardinality = good. Low cardinality = hot partition.
Design your keys so every query hits exactly one partition.
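One way to check cardinality against traffic is to measure how much of a sample of requests lands on the busiest key. The sketch below is a minimal illustration: the function name `hottest_key_share` and the sample data are hypothetical, not taken from a real workload.

```python
from collections import Counter


def hottest_key_share(partition_keys):
    """Fraction of requests that land on the single busiest partition key."""
    counts = Counter(partition_keys)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(partition_keys)


# Hypothetical sample of 1,000 writes, keyed two different ways.
by_status = ["PROCESSING"] * 700 + ["SHIPPED"] * 200 + ["DELIVERED"] * 100
by_customer = [f"CUST-{n}" for n in range(1000)]

print(hottest_key_share(by_status))    # 0.7
print(hottest_key_share(by_customer))  # 0.001
```

With 'status' as the key, 70% of the sample hits a single key value; with a per-customer key, no value receives more than 0.1%. Running the same calculation on a real request log is a cheap sanity check before committing to a key.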
Writing and Reading Data — With a Real E-Commerce Pattern
Now that the table exists, let's populate it with real data and then query it the way a production app would. The key insight here is that DynamoDB offers two fundamentally different operations: GetItem (fetch by exact primary key — always fast, O(1)) and Query (fetch all items sharing a partition key, optionally filtered by sort key — still fast because it stays within one partition). There's also Scan, which reads every item in the table — treat this like a last resort in production.
For our e-commerce example, the pattern is: partition key = customer_id, sort key = a string that combines the date and the order ID separated by a # symbol. That separator trick is deliberately chosen. Because DynamoDB sorts strings lexicographically, the ISO date format (YYYY-MM-DD) sorts chronologically. So 'give me all orders after 2024-01-01' becomes a simple sort key condition — begins_with or between — without scanning the whole table.
This is the core discipline of DynamoDB design: your sort key isn't just an ID, it's a query tool. Every character in it is a deliberate choice about what queries you want to be able to answer efficiently.
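The ordering argument can be checked with plain string comparisons before any table exists. The helper name `order_sort_key` below is hypothetical; the point is only that zero-padded ISO dates sort chronologically as strings.

```python
from datetime import date


def order_sort_key(order_date, order_id):
    """Build 'YYYY-MM-DD#ORDER-ID'; isoformat() zero-pads month and day."""
    return f"{order_date.isoformat()}#{order_id}"


keys = [
    order_sort_key(date(2024, 6, 10), "ORD-003"),
    order_sort_key(date(2024, 1, 15), "ORD-001"),
    order_sort_key(date(2024, 12, 31), "ORD-004"),
]
print(sorted(keys))
# ['2024-01-15#ORD-001', '2024-06-10#ORD-003', '2024-12-31#ORD-004']

# Unpadded dates break the ordering: as a string, March sorts after October.
print("2024-3-22" < "2024-10-01")   # False
print("2024-03-22" < "2024-10-01")  # True

# A full key is longer than a bare date, so it sorts after that date.
print("2024-12-31#ORD-004" <= "2024-12-31")  # False
```

The last comparison matters for range conditions: an upper bound of '2024-12-31' excludes orders placed on that day, because their keys continue past the date.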
orders_crud.py (Python)
import boto3
from boto3.dynamodb.conditions import Key
from decimal import Decimal  # DynamoDB uses Decimal, not float, for numbers

dynamodb = boto3.resource(
    'dynamodb',
    region_name='us-east-1',
    endpoint_url='http://localhost:8000'
)
table = dynamodb.Table('Orders')

# ─── WRITE: Put three orders for the same customer ─────────────────────────
# Sort key format: "YYYY-MM-DD#ORDER-ID", date first so range queries work
orders_to_insert = [
    {
        'customer_id': 'CUST-42',
        'order_date_order_id': '2024-01-15#ORD-001',
        'total_usd': Decimal('59.99'),
        'status': 'DELIVERED',
        'items': ['Mechanical Keyboard', 'USB Hub']
    },
    {
        'customer_id': 'CUST-42',
        'order_date_order_id': '2024-03-22#ORD-002',
        'total_usd': Decimal('149.00'),
        'status': 'SHIPPED',
        'items': ['Monitor Stand']
    },
    {
        'customer_id': 'CUST-42',
        'order_date_order_id': '2024-06-10#ORD-003',
        'total_usd': Decimal('29.99'),
        'status': 'PROCESSING',
        'items': ['Desk Mat']
    }
]

for order in orders_to_insert:
    table.put_item(Item=order)  # put_item: create or fully overwrite an item
print("Inserted 3 orders for CUST-42")

# ─── READ: Get ONE specific order by its full primary key ───────────────────
response = table.get_item(
    Key={
        'customer_id': 'CUST-42',
        'order_date_order_id': '2024-03-22#ORD-002'
    }
)
specific_order = response.get('Item', {})
print(f"\nSpecific order: {specific_order['order_date_order_id']} — ${specific_order['total_usd']}")

# ─── QUERY: Get all orders for CUST-42 placed on or after 2024-03-01 ────────
# A sort key condition stays inside one partition, so this is efficient.
# gte is used rather than between('2024-03-01', '2024-12-31'): a key such as
# '2024-12-31#ORD-009' sorts after '2024-12-31' and would be excluded.
range_response = table.query(
    KeyConditionExpression=
        Key('customer_id').eq('CUST-42') &
        Key('order_date_order_id').gte('2024-03-01')
)
print(f"\nOrders from March 2024 onwards ({range_response['Count']} found):")
for order in range_response['Items']:
    print(f"  {order['order_date_order_id']} | ${order['total_usd']} | {order['status']}")

# ─── UPDATE: Change just the status field without rewriting the whole item ───
table.update_item(
    Key={
        'customer_id': 'CUST-42',
        'order_date_order_id': '2024-03-22#ORD-002'
    },
    UpdateExpression='SET #s = :new_status',  # #s avoids the 'status' reserved word conflict
    ExpressionAttributeNames={'#s': 'status'},
    ExpressionAttributeValues={':new_status': 'DELIVERED'}
)
print("\nUpdated ORD-002 status to DELIVERED")
Output
Inserted 3 orders for CUST-42
Specific order: 2024-03-22#ORD-002 — $149.00
Orders from March 2024 onwards (2 found):
2024-03-22#ORD-002 | $149.00 | SHIPPED
2024-06-10#ORD-003 | $29.99 | PROCESSING
Updated ORD-002 status to DELIVERED
Pro Tip: 'status' Is a Reserved Word in DynamoDB
DynamoDB has hundreds of reserved words (STATUS, NAME, DATE, etc.). If you use them directly in an UpdateExpression or FilterExpression you'll get a cryptic validation error. Always use ExpressionAttributeNames with a # prefix alias for any attribute that might be reserved. Keeping this habit prevents a whole class of mysterious runtime bugs.
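A small helper can make the aliasing habit automatic. This is a sketch under stated assumptions: `build_set_update` is a hypothetical name, and it only covers simple SET updates. It returns keyword arguments that can be passed on to `update_item`.

```python
def build_set_update(changes):
    """Build update_item keyword arguments that alias every attribute name.

    `changes` maps attribute names to new values. Because every name is
    aliased, it does not matter whether a name happens to be a reserved word.
    """
    names, values, assignments = {}, {}, []
    for index, (attribute, value) in enumerate(changes.items()):
        name_alias = f"#n{index}"
        value_alias = f":v{index}"
        names[name_alias] = attribute
        values[value_alias] = value
        assignments.append(f"{name_alias} = {value_alias}")
    return {
        "UpdateExpression": "SET " + ", ".join(assignments),
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }


kwargs = build_set_update({"status": "DELIVERED", "name": "Alex Rivera"})
print(kwargs["UpdateExpression"])          # SET #n0 = :v0, #n1 = :v1
print(kwargs["ExpressionAttributeNames"])  # {'#n0': 'status', '#n1': 'name'}
```

With a boto3 Table resource this would be used as `table.update_item(Key={...}, **kwargs)`.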
Production Insight
Always test your composite sort key format with real data before production.
An inconsistent value format, such as '2024-3-22' instead of '2024-03-22', breaks lexicographic ordering, so range queries silently return the wrong items.
Rule: document the exact sort key pattern for your team — one person's separator choice is another's bug.
Design your sort key to be a query tool, not just an ID.
Never use Scan in production — pre-plan access patterns.
Global Secondary Indexes: Querying Data a Different Way
Here's the problem you'll hit in week two of using DynamoDB: your table is perfectly designed for 'get all orders by customer', but product management just asked for 'get all orders with status SHIPPED'. That query doesn't match your partition key at all. In SQL, you'd add an index. In DynamoDB, you add a Global Secondary Index (GSI).
A GSI is essentially a separate, automatically maintained copy of your table organized around a different key. You define a new partition key (and optionally a sort key), and DynamoDB replicates writes to that index asynchronously. Queries against a GSI are just as fast as queries against the base table — because the same partitioning logic applies.
The word 'global' means the index spans the entire table, not just one partition. There's also a Local Secondary Index (LSI), which must share the base table's partition key but can have a different sort key — and it must be defined at table creation time, not added later. GSIs can be added to existing tables, making them far more flexible in practice.
The design trade-off: every GSI costs you money (storage + write capacity) and adds a few milliseconds of eventual consistency lag. You're not getting something for free — you're choosing which query patterns deserve their own fast path.
orders_gsi.py (Python)
import time

import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource(
    'dynamodb',
    region_name='us-east-1',
    endpoint_url='http://localhost:8000'
)
client = dynamodb.meta.client

# Add a GSI to the existing Orders table so we can query by order status
# GSI partition key: status | GSI sort key: order_date_order_id
# This lets us ask: "give me all SHIPPED orders, most recent first"
client.update_table(
    TableName='Orders',
    AttributeDefinitions=[
        {'AttributeName': 'status', 'AttributeType': 'S'},
        {'AttributeName': 'order_date_order_id', 'AttributeType': 'S'}
    ],
    GlobalSecondaryIndexUpdates=[
        {
            'Create': {
                'IndexName': 'StatusByDate-GSI',
                'KeySchema': [
                    {'AttributeName': 'status', 'KeyType': 'HASH'},
                    {'AttributeName': 'order_date_order_id', 'KeyType': 'RANGE'}
                ],
                'Projection': {
                    # INCLUDE only the fields we actually need; saves cost vs ALL
                    'ProjectionType': 'INCLUDE',
                    'NonKeyAttributes': ['customer_id', 'total_usd']
                }
            }
        }
    ]
)
print("GSI creation initiated (takes a moment to backfill)...")

# Wait for the GSI itself to become ACTIVE before querying it.
# The table can report ACTIVE while the index is still backfilling,
# so poll the index status rather than the table status.
while True:
    description = client.describe_table(TableName='Orders')['Table']
    indexes = description.get('GlobalSecondaryIndexes', [])
    status = next(
        (i['IndexStatus'] for i in indexes if i['IndexName'] == 'StatusByDate-GSI'),
        None
    )
    if status == 'ACTIVE':
        break
    time.sleep(2)

table = dynamodb.Table('Orders')

# ─── Query via GSI: all SHIPPED orders ──────────────────────────────────────
gsi_response = table.query(
    IndexName='StatusByDate-GSI',  # tell DynamoDB to use our new index
    KeyConditionExpression=Key('status').eq('SHIPPED'),
    ScanIndexForward=False         # descending sort key order: most recent first
)
print(f"\nAll SHIPPED orders ({gsi_response['Count']} found):")
for order in gsi_response['Items']:
    print(f"  Customer: {order['customer_id']} | "
          f"Order: {order['order_date_order_id']} | "
          f"Total: ${order['total_usd']}")
Output
GSI creation initiated (takes a moment to backfill)...
GSIs are eventually consistent: a write to the base table usually propagates to the GSI within milliseconds, but not instantly. This means a query on a GSI immediately after a write might not reflect the latest data. For use cases where that matters (e.g., financial transactions), read from the base table using the full primary key with ConsistentRead=True. Base-table reads are eventually consistent by default, and strongly consistent reads are not available on GSIs at all.
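A read that must see the latest write can be sketched as below. Reads are eventually consistent unless `ConsistentRead=True` is passed, and that option exists only for the base table. The function name `read_order_after_write` and the `FakeTable` stand-in are hypothetical; the stand-in is there only so the sketch runs without AWS, and a boto3 Table resource would take its place.

```python
def read_order_after_write(table, customer_id, order_sort_key):
    """Fetch one order from the base table with a strongly consistent read.

    `table` is expected to behave like a boto3 Table resource.
    """
    response = table.get_item(
        Key={"customer_id": customer_id, "order_date_order_id": order_sort_key},
        ConsistentRead=True,  # without this, the read is eventually consistent
    )
    return response.get("Item")  # None if the item does not exist


class FakeTable:
    """Stand-in for a Table resource; records the arguments it receives."""

    def __init__(self):
        self.last_call = None

    def get_item(self, **kwargs):
        self.last_call = kwargs
        return {"Item": {"customer_id": kwargs["Key"]["customer_id"], "status": "DELIVERED"}}


fake = FakeTable()
order = read_order_after_write(fake, "CUST-42", "2024-03-22#ORD-002")
print(order["status"], fake.last_call["ConsistentRead"])  # DELIVERED True
```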
Production Insight
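In provisioned mode, each GSI has its own write capacity, separate from the base table's. The common surprise is that an under-provisioned GSI throttles writes to the base table too.
If you have 3 GSIs, a write to the base table can consume 4 write capacity units in total (1 on the base table + 1 on each GSI), assuming the item is at most 1 KB and appears in every index. Plan accordingly.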
Rule: use ProjectionType INCLUDE to keep GSI storage costs down — projecting ALL attributes is expensive.
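The "1 base + 3 GSI" arithmetic can be turned into a rough estimator. The function name `write_units_per_put` is hypothetical, and the model is deliberately simplified: it assumes one write unit covers up to 1 KB, that every index entry is as large as the base item, and that no existing index entry has to be removed first. Real index entries hold only the projected attributes, so actual index cost is often lower.

```python
import math


def write_units_per_put(item_size_bytes, gsi_count):
    """Rough write units for one new item that appears in every GSI."""
    units_per_copy = max(1, math.ceil(item_size_bytes / 1024))
    return {
        "base_table": units_per_copy,
        "indexes": units_per_copy * gsi_count,
        "total": units_per_copy * (1 + gsi_count),
    }


print(write_units_per_put(800, gsi_count=3))
# {'base_table': 1, 'indexes': 3, 'total': 4}
print(write_units_per_put(2500, gsi_count=3))
# {'base_table': 3, 'indexes': 9, 'total': 12}
```

The second call shows why item size matters as much as index count: a 2.5 KB item triples every term.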
Key Takeaway
GSIs let you query by a different key, but cost money and add eventual consistency.
LSIs share the partition key but must be defined at table creation — plan ahead.
Design GSI projections carefully: include only the attributes you need.
Single-Table Design: Why One Table Often Beats Many
Single-table design (STD) is the concept that trips up virtually every developer migrating from SQL. The instinct is to create one DynamoDB table per entity — an Orders table, a Customers table, a Products table — mirroring what you'd do in a relational database. But DynamoDB wasn't designed for that, and doing it that way throws away its biggest advantage.
In SQL, you join across tables at query time. DynamoDB has no joins. Instead, single-table design stores heterogeneous entity types in the same table, using generic attribute names like PK and SK (partition key and sort key) whose values encode the entity type. For example: PK='CUSTOMER#CUST-42', SK='PROFILE' for a customer record, and PK='CUSTOMER#CUST-42', SK='ORDER#2024-03-22#ORD-002' for an order belonging to that customer. Now a single Query call with PK='CUSTOMER#CUST-42' returns the customer profile AND all their orders in one network round-trip.
This is powerful for read-heavy access patterns, but it comes with a real cost: the model is harder to reason about, harder to query ad-hoc (e.g., for analytics), and painful if your access patterns change. Single-table design is a deliberate trade: you sacrifice flexibility for performance and cost efficiency. It's the right call for a microservice with stable, well-understood access patterns. It's the wrong call for an exploratory analytics workload — use Athena or Redshift for that.
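The key-encoding convention is easier to keep consistent when it lives in small helper functions. The names `customer_pk`, `profile_sk`, and `order_sk` below are hypothetical; sorting the list by SK imitates the ascending sort key order in which a Query returns items by default.

```python
def customer_pk(customer_id):
    return f"CUSTOMER#{customer_id}"


def profile_sk():
    return "PROFILE"


def order_sk(order_date, order_id):
    return f"ORDER#{order_date}#{order_id}"


items = [
    {"PK": customer_pk("CUST-42"), "SK": profile_sk(), "entity_type": "Customer"},
    {"PK": customer_pk("CUST-42"), "SK": order_sk("2024-06-10", "ORD-003"), "entity_type": "Order"},
    {"PK": customer_pk("CUST-42"), "SK": order_sk("2024-01-15", "ORD-001"), "entity_type": "Order"},
]

# Imitate the order a Query would return: ascending by sort key.
for item in sorted(items, key=lambda i: i["SK"]):
    print(item["entity_type"], item["SK"])
# Order ORDER#2024-01-15#ORD-001
# Order ORDER#2024-06-10#ORD-003
# Customer PROFILE

# The equivalent of a begins_with('ORDER#') sort key condition.
orders_only = [i for i in items if i["SK"].startswith("ORDER#")]
print(len(orders_only))  # 2
```

Note that 'ORDER#...' sorts before 'PROFILE', so the profile item comes last in the partition. If the profile should come first, the sort key values have to be chosen with that ordering in mind.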
single_table_design.py (Python)
import boto3
from boto3.dynamodb.conditions import Key
from decimal import Decimal

dynamodb = boto3.resource(
    'dynamodb',
    region_name='us-east-1',
    endpoint_url='http://localhost:8000'
)

# One table to rule them all: stores Customers AND Orders together
# Generic key names PK / SK are conventional in single-table design
table = dynamodb.create_table(
    TableName='ECommerceApp',
    KeySchema=[
        {'AttributeName': 'PK', 'KeyType': 'HASH'},
        {'AttributeName': 'SK', 'KeyType': 'RANGE'}
    ],
    AttributeDefinitions=[
        {'AttributeName': 'PK', 'AttributeType': 'S'},
        {'AttributeName': 'SK', 'AttributeType': 'S'}
    ],
    BillingMode='PAY_PER_REQUEST'
)
table.wait_until_exists()

# ─── Write a CUSTOMER record ─────────────────────────────────────────────────
# PK and SK both describe what the item IS; the value encodes the entity type
table.put_item(Item={
    'PK': 'CUSTOMER#CUST-42',    # partition key identifies the customer
    'SK': 'PROFILE',             # sort key distinguishes record type
    'entity_type': 'Customer',   # explicit type tag; helps with filtering
    'full_name': 'Alex Rivera',
    'email': 'alex@example.com',
    'loyalty_tier': 'Gold'
})

# ─── Write two ORDERS for the same customer ──────────────────────────────────
# Same PK as the customer: they share a partition, so one Query fetches both
table.put_item(Item={
    'PK': 'CUSTOMER#CUST-42',
    'SK': 'ORDER#2024-01-15#ORD-001',  # ORDER# prefix separates orders from profile
    'entity_type': 'Order',
    'total_usd': Decimal('59.99'),
    'status': 'DELIVERED'
})
table.put_item(Item={
    'PK': 'CUSTOMER#CUST-42',
    'SK': 'ORDER#2024-06-10#ORD-003',
    'entity_type': 'Order',
    'total_usd': Decimal('29.99'),
    'status': 'PROCESSING'
})
print("Inserted customer profile and 2 orders into single table")

# ─── Fetch customer profile + ALL orders in ONE query call ───────────────────
all_items_response = table.query(
    KeyConditionExpression=Key('PK').eq('CUSTOMER#CUST-42')
)
print(f"\nAll items for CUST-42 ({all_items_response['Count']} total):")
for item in all_items_response['Items']:
    print(f"  [{item['entity_type']}] SK={item['SK']}")

# Get ONLY the orders: begins_with('ORDER#') on the sort key skips the PROFILE item
orders_only_response = table.query(
    KeyConditionExpression=
        Key('PK').eq('CUSTOMER#CUST-42') &
        Key('SK').begins_with('ORDER#')
)
print(f"\nOrders only ({orders_only_response['Count']} found):")
for order in orders_only_response['Items']:
    print(f"  {order['SK']} | ${order['total_usd']} | {order['status']}")
Output
Inserted customer profile and 2 orders into single table
All items for CUST-42 (3 total):
[Order] SK=ORDER#2024-01-15#ORD-001
[Order] SK=ORDER#2024-06-10#ORD-003
[Customer] SK=PROFILE
Orders only (2 found):
ORDER#2024-01-15#ORD-001 | $59.99 | DELIVERED
ORDER#2024-06-10#ORD-003 | $29.99 | PROCESSING
Pro Tip: Map Your Access Patterns Before You Touch the Console
With DynamoDB, your data model is a direct function of your access patterns. Before creating a single table, write down every query your app needs to make (e.g., 'get customer by ID', 'get all orders for a customer', 'get all shipped orders'). Each access pattern should map to either a direct key lookup or a Query on the base table or a GSI. If you skip this step and model by entity like a SQL developer, you'll redesign your table at 2am under production pressure.
Production Insight
Single-table design can reduce read costs dramatically, but it's brittle when access patterns evolve.
Adding a new query pattern later may require a full data migration — not just a new index.
Rule: only use single-table design when your access patterns are stable and you understand the trade-off. Otherwise, multiple tables with GSIs are safer.
Key Takeaway
One table with PK/SK encoding = replaces SQL joins with partition queries.
Great for stable access patterns, painful for ad-hoc analysis.
Trade flexibility for performance: know when to walk away from STD.
Error Handling, Retries, and the Cost of Not Planning for Throttling
DynamoDB is reliable, but it's not magic. When you exceed your provisioned throughput or hit a hot partition, DynamoDB returns ProvisionedThroughputExceededException. The SDK handles retries with exponential backoff by default — but that default can be too slow for user-facing requests. Understanding how to configure retries and handle errors is critical for production.
Every AWS SDK client includes a retry mechanism with exponential backoff. The exact defaults (number of attempts, base delay, cap) depend on the SDK, its version, and the configured retry mode, so check yours rather than assuming. The waits add up quickly: three retries with delays of 1, 2, and 4 seconds mean a single failed request can take 1 + 2 + 4 = 7 seconds before the SDK gives up. For a customer-facing checkout, that's an unacceptable delay. For those paths, configure tighter retry limits or implement your own retry logic with jitter, and more importantly, detect the root cause: is it a burst of traffic (transient) or a sustained pattern (needs rearchitecting)?
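The 1 + 2 + 4 = 7 arithmetic generalizes to any base delay and cap. The sketch below uses the "full jitter" variant, where each wait is drawn at random between zero and the exponential ceiling. The function names and default values are assumptions for illustration, not SDK settings.

```python
import random


def backoff_ceilings(retries, base=0.1, cap=2.0):
    """Largest possible wait before each retry: min(cap, base * 2**attempt)."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]


def backoff_delays(retries, base=0.1, cap=2.0, rng=random.random):
    """Full jitter: each wait is random between 0 and its ceiling."""
    return [rng() * ceiling for ceiling in backoff_ceilings(retries, base, cap)]


def worst_case_wait(retries, base=0.1, cap=2.0):
    """Upper bound on total sleep time, ignoring the time the requests themselves take."""
    return sum(backoff_ceilings(retries, base, cap))


print(worst_case_wait(3, base=1.0, cap=20.0))  # 7.0  (1 + 2 + 4)
print(backoff_ceilings(5, base=1.0, cap=5.0))  # [1.0, 2.0, 4.0, 5.0, 5.0]
print(backoff_delays(3))                       # three random waits, each below its ceiling
```

Choosing the base and cap from the latency budget of the endpoint, rather than accepting whatever the defaults are, keeps the worst case predictable.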
Another silent killer: conditional writes that fail because of a stale version. If you use conditional writes for optimistic locking (e.g., update_item with ConditionExpression), a concurrent write will throw ConditionalCheckFailedException. Your code must catch that and retry the entire read-modify-write cycle. This is where the 'lost update' problem hides — never assume a conditional write succeeds.
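The retry-the-whole-cycle rule can be demonstrated without AWS. In this sketch `FakeTable` and `ConditionFailed` are in-memory stand-ins (assumptions) for a real table and for ConditionalCheckFailedException, and `competing_writer` is a hook that simulates another process writing between our read and our write.

```python
class ConditionFailed(Exception):
    """Stand-in for DynamoDB's ConditionalCheckFailedException."""


class FakeTable:
    """In-memory stand-in for a table holding a single versioned item."""

    def __init__(self):
        self.item = {"stock": 10, "version": 1}

    def read(self):
        return dict(self.item)

    def conditional_write(self, new_item, expected_version):
        # Mirrors a condition such as 'version = :expected_version'
        if self.item["version"] != expected_version:
            raise ConditionFailed()
        self.item = dict(new_item)


def decrement_stock(table, max_attempts=5, before_write=None):
    """Retry the whole read-modify-write cycle when the condition fails."""
    for attempt in range(1, max_attempts + 1):
        current = table.read()                                   # 1. read the latest copy
        updated = {"stock": current["stock"] - 1,                # 2. modify
                   "version": current["version"] + 1}
        if before_write:
            before_write(attempt)                                # simulate a competing writer
        try:
            table.conditional_write(updated, current["version"])  # 3. conditional write
            return attempt
        except ConditionFailed:
            continue  # someone else won the race: start again from the read
    raise RuntimeError("gave up after repeated conflicts")


table = FakeTable()


def competing_writer(attempt):
    if attempt == 1:  # another process sneaks in a write during our first attempt
        table.item = {"stock": 9, "version": 2}


attempts_used = decrement_stock(table, before_write=competing_writer)
print(attempts_used, table.item)  # 2 {'stock': 8, 'version': 3}
```

Both decrements survive: the competing write took the stock from 10 to 9, and the retried cycle took it from 9 to 8. Retrying only the write with the stale value would have lost one of them.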
DynamoDB also has a 1 MB limit on the data a single Query or Scan call returns. If your query matches more than 1 MB of data, you get paginated results, and the response includes a LastEvaluatedKey. Failing to paginate is a classic bug: you get the first page, assume it's complete, and silently miss data. Always loop, passing LastEvaluatedKey as ExclusiveStartKey, until the response no longer contains one.
retry_with_jitter.py (Python)
import random
import time

import boto3
from boto3.dynamodb.conditions import Key
from botocore.exceptions import ClientError

dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('Orders')


def put_item_with_retry(item, max_retries=3):
    """Put an item with exponential backoff and jitter.

    The AWS SDK default retry is fine for background jobs, but for user-facing
    endpoints we want faster fallback and better logging.
    """
    for attempt in range(max_retries):
        try:
            table.put_item(Item=item)
            return True
        except ClientError as e:
            code = e.response['Error']['Code']
            if code == 'ProvisionedThroughputExceededException':
                # Exponential backoff with jitter: base = 0.1s, factor = 2, cap = 2s
                sleep_time = min(0.1 * (2 ** attempt) + random.random() * 0.1, 2.0)
                print(f"Retry {attempt+1} after {sleep_time:.2f}s (throttled)")
                time.sleep(sleep_time)
            else:
                # Non-throttling error: re-raise immediately
                raise
    print("Failed after max retries: consider failing over or returning 503")
    return False

# Usage: put_item_with_retry({'customer_id': 'CUST-42', ...})


# ─── Full pagination example for Query ───────────────────────────────────────
def get_all_orders(customer_id):
    """Handle pagination correctly: never assume one page is all the data."""
    all_items = []
    last_key = None
    while True:
        query_kwargs = {
            'KeyConditionExpression': Key('customer_id').eq(customer_id)
        }
        if last_key:
            query_kwargs['ExclusiveStartKey'] = last_key
        response = table.query(**query_kwargs)
        all_items.extend(response['Items'])
        last_key = response.get('LastEvaluatedKey')
        if not last_key:
            break
    return all_items
Output
(No direct output — run in your environment. The retry function prints retry attempts.)
Mental Model: DynamoDB Throttling vs SQL Deadlock
SQL deadlock stops everything — you need to kill a transaction.
DynamoDB throttling only affects requests to a hot partition; other partitions keep working.
The fix for throttling is almost never to increase total capacity — it's to redistribute the load.
Exponential backoff with jitter (rather than fixed retries) prevents thundering herd when the partition recovers.
Production Insight
Default SDK retries will mask hot partitions for minutes — you lose observability.
Always log the partition key of throttled requests so you can spot skew patterns.
Rule: instrument your code with metrics on retry count per partition key. Alert if any key exceeds 1% throttle rate.
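Per-key instrumentation does not need much machinery to start with. The class below is a minimal in-process sketch; `ThrottleTracker`, the key names, and the simulated traffic are hypothetical. In production the counts would be emitted to a metrics system rather than held in memory.

```python
from collections import Counter


class ThrottleTracker:
    """Count requests and throttles per partition key to expose skew."""

    def __init__(self, alert_threshold=0.01):
        self.alert_threshold = alert_threshold  # 1% throttle rate
        self.requests = Counter()
        self.throttles = Counter()

    def record(self, partition_key, throttled):
        self.requests[partition_key] += 1
        if throttled:
            self.throttles[partition_key] += 1

    def keys_over_threshold(self):
        """Partition keys whose throttle rate exceeds the alert threshold."""
        return sorted(
            key for key, total in self.requests.items()
            if self.throttles[key] / total > self.alert_threshold
        )


tracker = ThrottleTracker()
for n in range(1000):
    tracker.record("CUST-RESELLER", throttled=(n % 10 == 0))  # 10% of these are throttled
    tracker.record(f"CUST-{n}", throttled=False)

print(tracker.keys_over_threshold())  # ['CUST-RESELLER']
```

Calling `record` from the throttling branch of a retry wrapper is enough to surface a single skewed key among a thousand healthy ones.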
Key Takeaway
Retries with jitter are mandatory, but they are a band-aid — fix the partition key skew.
Always paginate Query results; LastEvaluatedKey is the check.
Conditional write failures need full read-modify-write retry — log and alert on pattern.
Production incident · Post-mortem · Severity: high
Hot Partition Killed Our Checkout Flow on Prime Day
Symptom
During a flash sale, the 'Orders' table started returning ProvisionedThroughputExceededException on writes for a specific subset of orders. Read latencies for those items spiked from 5ms to over 2 seconds. The rest of the table worked fine.
Assumption
The dev team assumed that using PAY_PER_REQUEST billing would automatically handle traffic spikes and that 'customer_id', a high-cardinality partition key, would distribute load evenly.
Root cause
The partition key was 'customer_id', and one customer (a large reseller) was placing 200+ orders per second via an automated API, while all other customers combined generated only 50 writes/sec. The automated reseller bot hit the same partition every time, far exceeding that partition's throughput capacity even though the table-level limits were not exceeded.
Fix
Switch to a high-cardinality partition key: 'order_id' itself, with a GSI on 'customer_id' for customer-centric queries. Also add write sharding: append a random suffix (0-9) to the customer_id before using it as a partition key, then query all shards when fetching by customer. Finally, implement exponential backoff and client-side retry with jitter to handle transient throttling.
Key lesson
Never trust PAY_PER_REQUEST to save you from a single skewed access pattern. Re-evaluate partition key cardinality before any high-traffic event.
If you must write under a key with skewed traffic, like customer_id here, implement write sharding with a random suffix and fan-out queries.
Always test with realistic traffic distribution in a staging environment, not just uniform load.
Set up CloudWatch alarms on throttling metrics such as ThrottledRequests and WriteThrottleEvents, and enable CloudWatch Contributor Insights for DynamoDB to see which keys are throttled most, so you catch hot partitions early.
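Write sharding with a random 0-9 suffix can be sketched in a few lines. The '#' separator, the constant `SHARD_COUNT`, and both function names are assumptions made for this example.

```python
import random

SHARD_COUNT = 10  # matches the 0-9 suffix described above


def sharded_partition_key(customer_id, shard_count=SHARD_COUNT):
    """Pick a random shard so one customer's writes spread over several keys."""
    return f"{customer_id}#{random.randrange(shard_count)}"


def all_shard_keys(customer_id, shard_count=SHARD_COUNT):
    """Every key that must be queried to read the customer's full history."""
    return [f"{customer_id}#{shard}" for shard in range(shard_count)]


print(sharded_partition_key("CUST-RESELLER"))  # e.g. CUST-RESELLER#7
print(all_shard_keys("CUST-RESELLER")[:3])
# ['CUST-RESELLER#0', 'CUST-RESELLER#1', 'CUST-RESELLER#2']
```

The cost of this design is on the read side: fetching by customer now takes one Query per shard key, with the results merged in application code. That is why sharding is usually reserved for the few keys that need it.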
Production debug guide
The checks to run when diagnosing the most common DynamoDB issues in production. 5 entries.
Symptom · 01
GetItem or Query returns inconsistent results immediately after a write
→
Fix
Check if you're using eventual consistency (default). If the read must see the latest write, add ConsistentRead=True to the request. Remember: strongly consistent reads cost 2x RCUs and are not available on GSIs.
Symptom · 02
Write requests fail with ProvisionedThroughputExceededException
→
Fix
Inspect CloudWatch metrics for ThrottledRequests. Check if a single partition key is receiving disproportionate traffic. If using provisioned capacity, consider switching to on-demand temporarily or increasing write capacity. If the pattern persists, redesign the partition key.
Symptom · 03
Query returns fewer items than expected
→
Fix
Verify the KeyConditionExpression and filter expressions. Remember: FilterExpression removes items AFTER the query reads them — you're still paying for the read capacity of filtered-out items. Use a GSI with the filter attribute as the sort key to eliminate scan waste.
Symptom · 04
Scan consumes 100% of read capacity and takes ages
→
Fix
Stop the scan immediately. Scans should never run on production tables during business hours. If you must scan, use a parallel scan with Segment and TotalSegments, limit the result set with Limit parameter, and run during off-peak hours with low read capacity settings.
Symptom · 05
GSI query returns no results even though base table has matching data
→
Fix
Check GSI status — it may still be backfilling after creation. Also note that GSIs are eventually consistent: after a write to the base table, there is a propagation delay (usually <1 second). If you need immediate consistency, read from the base table using the primary key.
DynamoDB Quick Debug Cheat Sheet
The top 3 DynamoDB problems and their immediate fixes. Run through these when you get paged.
Throttled writes: ProvisionedThroughputExceededException on a few items only
Immediate action
Increase write capacity units 5x via CLI or console. Then investigate partition key skew.
If your use case demands strong consistency on GSI, restructure the base table to include that attribute in the primary key or switch to DynamoDB Streams + materialized view pattern.
Large Scan operation timing out or costing too many RCUs
Immediate action
Cancel the scan if it's running. Add a Limit clause and use pagination with LastEvaluatedKey. Never let a scan run unlimited.
Create a GSI that matches the filter condition so you can use Query instead of Scan. Alternatively, use DynamoDB Export to S3 (which requires point-in-time recovery to be enabled) for analytical workloads.
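To see why an unbounded Scan is expensive, it helps to estimate its cost from the table size alone. This sketch assumes 4 KB per read unit and half cost for eventually consistent reads; it ignores per-request rounding, so treat the result as a ballpark figure. The function name `scan_read_units` is hypothetical.

```python
import math


def scan_read_units(table_size_bytes, strongly_consistent=False):
    """Approximate read units consumed by scanning a whole table once."""
    units = math.ceil(table_size_bytes / 4096)
    return units if strongly_consistent else math.ceil(units / 2)


ten_gb = 10 * 1024 ** 3
print(scan_read_units(ten_gb))                            # 1310720
print(scan_read_units(ten_gb, strongly_consistent=True))  # 2621440
```

The estimate depends only on table size, not on how many items match a filter, which is why a Query against a suitable key or GSI is almost always cheaper.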
DynamoDB vs PostgreSQL (Relational SQL)
| Feature / Aspect | DynamoDB (NoSQL) | PostgreSQL (Relational SQL) |
| Data Model | Key-value + document (schemaless per item) | Fixed schema with typed columns |
| Query Flexibility | Only by primary key or GSI; patterns must be pre-planned | Any column at any time via ad-hoc SQL |
| Join Support | None; use single-table design or denormalization | Full JOIN support across tables |
| Horizontal Scaling | Automatic, infinite; no config needed | Complex; requires read replicas or sharding |
| Read Consistency | Eventually consistent by default; strongly consistent optional | ACID-compliant, always strongly consistent |
| Schema Changes | Add attributes anytime; no migration needed | ALTER TABLE required; can lock tables at scale |
| Pricing Model | Pay per read/write unit or on-demand | Pay for server/instance size (always on) |
| Best For | Predictable, high-throughput access patterns at scale | |
Key Takeaways
1. DynamoDB's partition key is hashed to determine physical storage location: choose high-cardinality values or you'll create hot partitions that throttle your entire application.
2. The sort key is a query tool, not just an identifier: use composite sort key values like 'DATE#ID' to enable efficient range queries that stay within a single partition.
3. Global Secondary Indexes let you query by a different key, but they cost money, add eventual consistency lag, and are not a substitute for proper upfront key design; they supplement it.
4. Single-table design co-locates related entities under one partition key, eliminating the need for joins and reducing network round-trips, but it only pays off when your access patterns are stable and well-defined.
5. Always implement retry logic with exponential backoff and jitter, but treat high retry counts as a symptom of a partition key problem, not a normal condition.
Common mistakes to avoid
4 patterns
×
Choosing a low-cardinality partition key (e.g., 'status' with only 3 values)
Symptom
One or two partitions get all the traffic (hot partition), throughput is throttled even though you're under your overall capacity limit, and you see ProvisionedThroughputExceededException on specific items.
Fix
Use a high-cardinality partition key like user_id or order_id. If you truly need to query by status, add a GSI with status as the partition key and accept that traffic will be distributed across those few partitions — or use write sharding by appending a random suffix to the partition key and querying all shards in parallel.
Using Scan instead of Query for production reads
Symptom
Reads are slow and cost spikes dramatically as the table grows: a Scan reads every item, so scanning a 10 GB table consumes read capacity for all 10 GB (roughly 1.3 million RCUs with eventually consistent reads) every time, regardless of how many items match the filter.
Fix
Redesign your access pattern to use a Query (requires a known partition key). If you genuinely need to search without a partition key, add a GSI whose partition key matches your search field. Reserve Scan strictly for one-off data migrations or admin scripts that run during off-peak hours, and rate-limit them by reading small pages (the Limit parameter caps the items evaluated per request, not the throughput) with a pause between pages.
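To put numbers on the Scan versus Query difference, here is a back-of-the-envelope estimate. It assumes the standard read pricing of one RCU per 4 KB for strongly consistent reads and half that for eventually consistent reads; the item sizes and counts are invented for illustration, and real consumption also depends on per-page rounding.

```python
import math

RCU_BLOCK_BYTES = 4 * 1024  # one RCU covers up to 4 KB read with strong consistency


def read_capacity_units(bytes_read, strongly_consistent=False):
    """Estimated RCUs for a Query or Scan that reads bytes_read of item data."""
    blocks = math.ceil(bytes_read / RCU_BLOCK_BYTES)
    return blocks if strongly_consistent else blocks / 2


full_scan = read_capacity_units(10 * 1024**3)  # every item in a 10 GB table
one_query = read_capacity_units(50 * 400)      # 50 items of ~400 bytes in one partition

print(full_scan, one_query)
```

The Scan is charged for everything it reads before any filter is applied, while the Query is charged only for the items in the one partition it touches.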
Treating update_item like a safe partial update without condition expressions
Symptom
Concurrent Lambda functions both read an item, both compute a new value, and both write back — the last write wins silently, losing the first update (classic lost update race condition).
Fix
Use ConditionExpression in update_item to implement optimistic locking. Add a version attribute to each item, increment it on every write, and include ConditionExpression='version = :expected_version' in your update. DynamoDB rejects the write with a ConditionalCheckFailedException if another process already incremented the version; when that happens, re-read the item, recompute the new value, and retry.
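A minimal sketch of the request such an optimistic-locking update might send. The helper name, the `status` attribute, and the table and key values in the usage note are assumptions for illustration; the dictionary keys follow the low-level UpdateItem request shape. Placeholders like `#s` and `#v` are used for attribute names so the expression does not collide with reserved words.

```python
def versioned_update_params(table, key, new_status, expected_version):
    """Build kwargs for an update that only succeeds at expected_version."""
    return {
        "TableName": table,
        "Key": key,
        # Set the new value and bump the version in the same atomic write.
        "UpdateExpression": "SET #s = :status, #v = :next",
        # Reject the write if someone else changed the version first.
        "ConditionExpression": "#v = :expected",
        "ExpressionAttributeNames": {"#s": "status", "#v": "version"},
        "ExpressionAttributeValues": {
            ":status": {"S": new_status},
            ":expected": {"N": str(expected_version)},
            ":next": {"N": str(expected_version + 1)},
        },
    }
```

With a boto3 client you would pass this as `client.update_item(**params)`. For example, `versioned_update_params("Orders", {"PK": {"S": "ORDER#1001"}}, "SHIPPED", 3)` only succeeds if the stored version is still 3.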
Assuming GSIs provide strong consistency
Symptom
A query on a GSI immediately after a write returns stale or missing data, causing user-facing errors (e.g., showing an order as 'not yet shipped' right after updating it).
Fix
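Never rely on GSI queries for strongly consistent reads; GSIs do not support them. If the read must reflect the latest write, read from the base table with GetItem on the full primary key and ConsistentRead=True. If you need a strongly consistent alternate sort order within the same partition key, a Local Secondary Index supports ConsistentRead=True. DynamoDB Streams feeding a separate cache or materialized view can serve other lookups, but that path is asynchronous too, so treat it as eventually consistent.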
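The two read paths differ by a single request parameter. The sketch below builds both requests side by side; the function names are hypothetical, and the dictionaries follow the low-level GetItem and Query request shapes.

```python
def fresh_read_params(table, key):
    """GetItem on the base table with the full primary key, strongly consistent."""
    return {"TableName": table, "Key": key, "ConsistentRead": True}


def gsi_query_params(table, index_name, key_condition, values):
    """Query on a GSI. ConsistentRead is left out on purpose: GSIs only
    support eventually consistent reads, so results may lag recent writes."""
    return {
        "TableName": table,
        "IndexName": index_name,
        "KeyConditionExpression": key_condition,
        "ExpressionAttributeValues": values,
    }
```

Use the first for the "did my update land?" check right after a write, and the second for listings where a short lag is acceptable.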
INTERVIEW PREP · PRACTICE MODE
Interview Questions on This Topic
Q01 of 04 · SENIOR
Can you explain the difference between a partition key and a sort key, and give an example of when you'd use a composite primary key over a simple one?
ANSWER
A partition key determines which physical partition stores the item — DynamoDB hashes it to distribute load. A sort key defines the order of items within that partition. Use a composite primary key (partition + sort) when you need to query a range of items that share the same partition key. For example, in an e-commerce system, customer_id as partition key and order_date as sort key lets you efficiently fetch all orders for a customer within a date range. A simple primary key (partition only) is fine when you always access items by a single unique ID, like a session token table where each session is fetched individually.
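The customer and date-range example from this answer might translate into a Query request like the one below. This is a sketch: the helper name and the table name in the usage note are made up, and it assumes order_date is stored as an ISO-8601 string so that string order matches chronological order.

```python
def orders_in_range_params(table, customer_id, start_date, end_date):
    """Build kwargs for 'all orders for one customer between two dates'."""
    return {
        "TableName": table,
        # Equality on the partition key, range condition on the sort key.
        "KeyConditionExpression": (
            "customer_id = :cid AND order_date BETWEEN :start AND :end"
        ),
        "ExpressionAttributeValues": {
            ":cid": {"S": customer_id},
            ":start": {"S": start_date},
            ":end": {"S": end_date},
        },
    }
```

For example, `orders_in_range_params("Orders", "cust-42", "2024-01-01", "2024-03-31")` reads only the cust-42 partition. If a customer can place two orders with the same order_date value, a composite sort key such as 'DATE#ID' keeps the items distinct.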
Q02 of 04 · SENIOR
What is a hot partition in DynamoDB, how does it happen, and what strategies do you use to prevent it in a high-traffic application?
ANSWER
A hot partition occurs when one partition key value receives significantly more traffic than others, causing that partition's throughput ceiling to be hit even though the table-level capacity is not exhausted. Common causes: using a low-cardinality key like 'status' (only 3 values), or a single customer generating disproportionate traffic. Prevention strategies: 1) Choose a high-cardinality partition key (e.g., order_id, user_id). 2) Implement write sharding: append a random suffix (0-9) to a natural key like customer_id, then query all shards in parallel when reading. 3) If you must query by the 'hot' attribute, serve that pattern from a GSI and accept that the GSI itself can become hot, so shard its key if traffic is skewed. 4) Cache frequently read data in ElastiCache to offload reads. 5) Monitor the CloudWatch ReadThrottleEvents and WriteThrottleEvents metrics, and enable CloudWatch Contributor Insights for DynamoDB to see which keys are most accessed and most throttled.
Q03 of 04 · SENIOR
If a colleague proposes creating separate DynamoDB tables for each entity type in a new microservice — one for Users, one for Orders, one for Products — how would you evaluate that approach and what alternative would you consider?
ANSWER
I'd start by asking: what are the primary access patterns? If most queries are 'get user by ID' and 'get order by ID' independently, multiple tables might be fine — simple, easy to reason about, no single-table complexity. But if the app frequently needs to fetch a user together with their recent orders or purchase history, a single-table design would be more efficient because it co-locates related entities under the same partition key, eliminating the need for client-side joins and multiple round trips. I'd list all access patterns first. If there are many cross-entity queries, I'd recommend single-table design with PK/SK encoding. If the patterns are isolated per entity, I'd stick with multiple tables — but I'd caution against the SQL-schema mindset. Also, with multiple tables you can still use GSIs per table. The trap is creating 5 tables before you understand the access patterns — that's when you refactor at 2am.
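To make the PK/SK encoding concrete, here is a small in-memory sketch. The `query` function is a stand-in that mimics how a Query behaves (one partition, optional begins_with on the sort key, results ordered by sort key); it is not the DynamoDB API, and the item values are invented.

```python
ITEMS = [
    {"PK": "CUSTOMER#42", "SK": "PROFILE", "name": "Ada"},
    {"PK": "CUSTOMER#42", "SK": "ORDER#2024-01-15#o-1001", "total": 30},
    {"PK": "CUSTOMER#42", "SK": "ORDER#2024-03-02#o-1002", "total": 75},
    {"PK": "CUSTOMER#77", "SK": "PROFILE", "name": "Lin"},
]


def query(items, pk, sk_prefix=""):
    """In-memory stand-in for a Query on one partition, sorted by sort key."""
    matches = [i for i in items if i["PK"] == pk and i["SK"].startswith(sk_prefix)]
    return sorted(matches, key=lambda i: i["SK"])


customer_and_orders = query(ITEMS, "CUSTOMER#42")    # profile plus orders, one call
orders_only = query(ITEMS, "CUSTOMER#42", "ORDER#")  # just the orders, oldest first
```

Both calls touch a single partition, which is the point of the design: the "join" between a customer and their orders is replaced by one partition read.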
Q04 of 04 · SENIOR
What are the key differences between strongly consistent and eventually consistent reads in DynamoDB, and when should you use each?
ANSWER
Eventually consistent reads are the default: they may be served by any replica and can return stale data, usually for less than a second after a write. They cost 0.5 read capacity units per read (for items up to 4 KB). Strongly consistent reads are served by the partition's leader replica and reflect all writes acknowledged before the read; they cost 1 RCU per read. Use strongly consistent reads when stale data would cause problems: payment status, account balance, any item used to compute a subsequent write (like a counter). Use eventual consistency for read-heavy, staleness-tolerant data: product listings, user profiles, analytics dashboards. Strongly consistent reads are not available on GSIs, which is a key limitation.
FAQ · 5 QUESTIONS
Frequently Asked Questions
01
What is the difference between DynamoDB GetItem and Query?
GetItem fetches a single item when you know its complete primary key (partition key + sort key) — it's always O(1) and the fastest possible read. Query fetches multiple items that share the same partition key, optionally filtered by sort key conditions. Query is efficient because it reads only one partition, but it returns a collection rather than a single item. Never use Scan when Query will do the job.
02
Can I change my DynamoDB partition key after creating the table?
No — the primary key (partition key and sort key) is set at table creation and cannot be changed. If you need a different key structure, you have to create a new table and migrate your data to it. This is the most important reason to design your access patterns carefully before creating your DynamoDB table.
03
Is DynamoDB always eventually consistent, or can I get strongly consistent reads?
DynamoDB offers both. By default, reads are eventually consistent, meaning you might read stale data for a short window after a write (usually well under a second). You can request strongly consistent reads by setting ConsistentRead=True in your GetItem or Query call, which guarantees you see the latest committed data. Strongly consistent reads cost twice as many read capacity units, and they are not available on Global Secondary Indexes at all.
04
How do I handle time-series data in DynamoDB?
Use the device or sensor ID as the partition key and a timestamp (e.g., 'YYYY-MM-DDTHH:MM:SS') as the sort key. This lets you query 'get all readings from sensor-42 in the last hour' efficiently. If a single device writes faster than one partition can absorb, use write sharding: append a bounded shard suffix to the partition key (for example 'sensor-42#3') and query every shard when reading. A fully random partition key such as a UUID spreads writes evenly but makes per-device range queries impossible. For historical data, use Time to Live (TTL) to auto-expire old records; TTL deletions appear in DynamoDB Streams, so a stream consumer such as a Lambda function can archive them to S3.
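Here is a minimal sketch of how a reading and a "last hour" window might be built. The attribute names (`sensor_id`, `taken_at`, `expires_at`) and the 30-day retention are assumptions for illustration. The one firm requirement is that a TTL attribute holds an epoch timestamp in seconds.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%dT%H:%M:%SZ"      # sortable ISO-8601; callers must pass UTC datetimes
RETENTION = timedelta(days=30)  # assumption: how long readings stay before TTL expiry


def reading_item(sensor_id, taken_at, value):
    """One reading: sensor as partition key, UTC timestamp string as sort key."""
    return {
        "sensor_id": sensor_id,
        "taken_at": taken_at.strftime(FMT),
        "value": value,
        # TTL attributes must hold an epoch timestamp in seconds.
        "expires_at": int((taken_at + RETENTION).timestamp()),
    }


def last_hour_bounds(now):
    """Sort-key bounds for 'readings in the last hour', usable with BETWEEN."""
    return (now - timedelta(hours=1)).strftime(FMT), now.strftime(FMT)
```

The two bounds slot into a key condition of the form `sensor_id = :id AND taken_at BETWEEN :start AND :end`, so the query reads only the requested hour from one partition.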
05
What is the difference between PAY_PER_REQUEST and provisioned capacity?
PAY_PER_REQUEST (on-demand) charges per read/write request with no capacity planning, which suits unpredictable workloads and new applications. Provisioned capacity lets you specify RCUs and WCUs, which gives cost predictability and can be more cost-effective at steady, high throughput. On-demand typically costs more per request than well-utilized provisioned capacity; the exact ratio depends on region, utilization, and current pricing. Switching between modes is rate-limited (AWS restricts how often a table can move to on-demand within a 24-hour window), so it is not a tool for intraday tuning. Neither mode removes per-partition limits: a single partition serves roughly 3,000 RCU and 1,000 WCU per second, so extremely skewed traffic can still be throttled on an on-demand table.
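In the CreateTable request, the two modes differ only in the billing fields. The sketch below builds either variant; the helper name, the generic PK/SK key names, and the default capacity values are assumptions for illustration.

```python
def table_params(name, on_demand, read_units=5, write_units=5):
    """Build CreateTable kwargs for an on-demand or a provisioned table."""
    params = {
        "TableName": name,
        "AttributeDefinitions": [
            {"AttributeName": "PK", "AttributeType": "S"},
            {"AttributeName": "SK", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "PK", "KeyType": "HASH"},   # partition key
            {"AttributeName": "SK", "KeyType": "RANGE"},  # sort key
        ],
    }
    if on_demand:
        params["BillingMode"] = "PAY_PER_REQUEST"
    else:
        # Provisioned mode requires explicit read and write capacity.
        params["BillingMode"] = "PROVISIONED"
        params["ProvisionedThroughput"] = {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        }
    return params
```

With a boto3 client this would be passed as `client.create_table(**table_params("Orders", on_demand=True))`. Starting on-demand and moving to provisioned once traffic is understood is a common path, but that is a judgment call rather than a rule.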