Idempotency — How a Missing Key Doubled Customer Charges
Retried POSTs sent without idempotency keys created duplicate orders and charged customers twice.
- Idempotency guarantees retries produce the same result as the first attempt
- HTTP methods GET, PUT, DELETE are idempotent; POST is not by default
- Idempotency keys (Idempotency-Key header) make POST operations safe to retry
- Store processed keys with the full response to avoid side effects on duplicates
- Performance cost: one extra storage lookup per request, negligible compared to duplicate chargebacks
- Always include request method and path in the lookup to prevent cross-endpoint collisions
Imagine you're pressing the elevator button for floor 5. Whether you press it once or ten times in frustration, the elevator makes one trip to floor 5; extra presses don't summon extra elevators. That's idempotency: doing the same action multiple times produces the same result as doing it once. In API terms, if your network hiccups and your app accidentally fires the same 'place order' request three times, an idempotent API makes sure you are charged once and receive one order.
Every distributed system lives with one uncomfortable truth: networks lie. Requests time out. Connections drop. Load balancers retry. Your mobile client sends a payment request, the server processes it, but the response never makes it back. The client, seeing no answer, tries again. Now you've potentially charged someone twice for one purchase. This isn't a theoretical edge case — it's one of the most common and costly bugs in production. Idempotency is the engineering principle that prevents it.
Idempotency solves the retry problem. When an operation is idempotent, clients can safely resend a request any number of times without worrying about duplicating side effects. The server recognises a repeated request and returns the same outcome it gave the first time, rather than blindly executing again. This shifts the complexity from the client — which shouldn't have to agonise over whether to retry — to the server, which is the right place to handle it.
By the end of this article you'll understand exactly which HTTP methods are idempotent and why, how to implement idempotency keys for non-idempotent operations like payments, what to store server-side to make retries safe, and the subtle mistakes that make developers think they've built idempotency when they haven't. You'll be able to design APIs that stay correct even when the network doesn't cooperate.
What is Idempotency in API Design?
Idempotency means that sending the same request multiple times produces the same server state as sending it once. The key is server state, not response code. A DELETE that returns 404 on the second call is still idempotent because the resource is still gone. In production, retries happen all the time: network timeouts, client-side retry logic, load balancers resending a request to another backend. If your API isn't idempotent, every retry risks duplicating side effects.
Think of it this way: idempotency is the server's responsibility to handle retries safely. Clients shouldn't have to track which requests succeeded. They just need to know they can resend until they get a valid response. This shifts the complexity to where it belongs — the server.
The elevator button analogy works but misses a subtle point: elevators are naturally idempotent because the control system tracks which floor is already selected. Your API needs to do the same for every mutation operation. That's what idempotency keys provide.
A payment API without idempotency is a support ticket factory. Every duplicate charge triggers a refund request, eats into margins, and erodes customer trust. That's why payment providers build retry safety into their APIs: Stripe accepts an Idempotency-Key header on POST requests, and PayPal supports a PayPal-Request-Id header for the same purpose. For any endpoint that moves money, treat the key as mandatory. It's the difference between a reliable system and a constant firefighting operation.
HTTP Method Idempotency: Which Methods Are Safe to Retry?
The HTTP specification defines idempotency for methods, but many developers misinterpret it. GET, HEAD, PUT, DELETE, OPTIONS, and TRACE are idempotent by definition. POST, PATCH, and CONNECT are not. Crucially, idempotency in HTTP refers to the server's side effects from multiple identical requests, not the response codes. A PUT that creates a new resource if it doesn't exist is idempotent: calling it twice creates exactly one resource (the second call replaces it). A DELETE returns 404 on the second call, but the server state is unchanged — that's idempotent.
The danger zone is when developers assume idempotency based on method alone. For example, a PUT endpoint that increments a counter violates idempotency. Similarly, a PATCH that applies partial updates can be non-idempotent if the operation is not designed as an absolute change (e.g., "add 5 to balance" is not idempotent; "set balance to 100" is).
Always verify the actual side effects under retry scenarios. A good mental model: ask yourself "If I send this exact request twice, will the server's final state be identical to sending it once?" If yes, the operation is idempotent. For PATCH, use JSON Patch or merge patch semantics that replace fields entirely, not incrementally.
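The mental model above can be checked mechanically: apply an operation once, apply it twice, and compare the final state. The following is a minimal sketch; `set_balance`, `add_to_balance`, and `is_idempotent` are hypothetical names chosen for illustration, and a plain dict stands in for server state.

```python
from copy import deepcopy
from typing import Callable, Dict

Account = Dict[str, int]  # stand-in for server state, e.g. {"balance": 50}


def set_balance(account: Account, amount: int) -> None:
    """Absolute change: the final state does not depend on how often it runs."""
    account["balance"] = amount


def add_to_balance(account: Account, amount: int) -> None:
    """Relative change: every repetition moves the state again."""
    account["balance"] += amount


def is_idempotent(op: Callable[[Account, int], None], start: Account, arg: int) -> bool:
    """Compare the state after one application with the state after two."""
    once, twice = deepcopy(start), deepcopy(start)
    op(once, arg)
    op(twice, arg)
    op(twice, arg)
    return once == twice
```

Running `is_idempotent` against both operations shows why "set balance to 100" is safe to retry while "add 5 to balance" is not.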
One more note: HEAD and OPTIONS are idempotent by spec, and they are also classed as safe (read-only) methods. Never attach state mutations to them; they shouldn't have any side effects.
Many teams assume PATCH with JSON Patch is always idempotent. JSON Patch operations like "replace" are idempotent, but "add" to an array is not. Read the spec carefully — the operation type matters, not just the PATCH method.
Implementing Idempotency Keys: Generation, Transmission, and Storage
An idempotency key is a unique identifier that the client generates and sends with each request. The server uses the key to detect duplicates. The key must be globally unique (UUID v4 is standard) and must be generated client-side before the first attempt. Never let the server generate the key — if the request times out before the server responds, the client won't know the key and can't safely retry.
Transmission: Use the Idempotency-Key header. The server extracts it before processing. If the header is missing for a POST endpoint that requires idempotency, return 400 Bad Request. This forces clients to be explicit about retry safety.
Storage: Store the key along with the response status, body, and headers. The storage system must support atomic check-and-set. Redis with SET NX is the most common choice. In a relational database, use a table with a unique constraint on the key column. The response is cached until a TTL (typically 24 hours) to cover retry windows. After TTL expires, the key can be reused, but clients should not retry beyond that window. The TTL must be longer than the maximum expected retry duration.
One subtle detail: the same key must not be reused for different operations. Always include the request method and path in the lookup key to prevent cross-endpoint collisions. For example, a key used for a payment should not accidentally match a key used for a refund.
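As a rough sketch of the transmission and scoping rules above: the function below rejects a request without the header and otherwise builds a storage key that includes method and path. The names `scoped_lookup_key` and `resolve_lookup_key`, the `idem:` prefix, and the tuple return shape are all assumptions made for this example, not part of any framework.

```python
from typing import Mapping, Optional, Tuple


def scoped_lookup_key(method: str, path: str, idempotency_key: str) -> str:
    """Combine method, path, and client key so endpoints cannot collide."""
    return f"idem:{method.upper()}:{path}:{idempotency_key}"


def resolve_lookup_key(
    method: str, path: str, headers: Mapping[str, str]
) -> Tuple[int, Optional[str]]:
    """Return (status, lookup_key); status 400 means the header is missing."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    key = normalised.get("idempotency-key", "").strip()
    if not key:
        return 400, None
    return 200, scoped_lookup_key(method, path, key)
```

With this scoping, the client key `abc` sent to `/payments` and the same key sent to `/refunds` map to two different storage entries.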
Also, consider using ULID instead of UUID v4 if you need time-ordered keys for range scans in your storage. UUID v4 is random, which can cause index fragmentation in databases. ULID preserves sortability while maintaining uniqueness.
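To make the sortability claim concrete, here is a hand-rolled sketch of a ULID-style identifier: a 48-bit millisecond timestamp followed by 80 random bits, encoded as 26 Crockford base32 characters. `new_ulid` is a hypothetical helper written for this example; in production you would more likely use a maintained library, and this sketch does not guarantee ordering between two keys generated in the same millisecond.

```python
import os
import time
from typing import Optional

# Crockford base32 alphabet (no I, L, O, U); ASCII order matches numeric order.
CROCKFORD32 = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid(timestamp_ms: Optional[int] = None) -> str:
    """Build a 26-character, time-ordered identifier."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    randomness = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    value = (timestamp_ms << 80) | randomness  # 128 bits in total
    chars = []
    for _ in range(26):  # 26 characters x 5 bits covers the 128-bit value
        chars.append(CROCKFORD32[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))
```

Because the timestamp occupies the leading characters, keys created in later milliseconds sort after earlier ones as plain strings, which keeps index inserts close to append-only.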
Concurrency and Race Conditions: Protecting Idempotency Under Load
The hardest part of idempotency is handling concurrent requests with the same key. Imagine a client sends a request, but the server's response is delayed. The client times out and retries. Both requests arrive at the server at nearly the same time. Without atomicity, both pass the idempotency check and both execute the operation, causing duplicates.
To prevent this, the check and store must be atomic. In Redis, use SET NX with the key name and an expiry. If SET returns OK, this request is the first to claim the key; proceed with the operation. If it returns nil, another request already claimed the key; wait for the first request to complete and retrieve its cached response. In SQL, use INSERT ... ON CONFLICT DO NOTHING and check the number of rows affected.
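The SQL variant of the atomic claim can be sketched with Python's built-in `sqlite3` module. This is an illustration under stated assumptions: the table and column names are invented, and SQLite's `INSERT OR IGNORE` stands in for `INSERT ... ON CONFLICT DO NOTHING` in other databases. The important part is that the primary key constraint, not application code, decides which request wins.

```python
import sqlite3


def open_claim_store() -> sqlite3.Connection:
    """Create an in-memory table; the primary key enforces one row per key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE idempotency_keys ("
        " lookup_key TEXT PRIMARY KEY,"
        " state TEXT NOT NULL)"
    )
    return conn


def try_claim(conn: sqlite3.Connection, lookup_key: str) -> bool:
    """Return True only for the first request that inserts this key."""
    with conn:  # commits on success, rolls back if an exception is raised
        cursor = conn.execute(
            "INSERT OR IGNORE INTO idempotency_keys (lookup_key, state)"
            " VALUES (?, 'processing')",
            (lookup_key,),
        )
        # One affected row means we inserted; zero means the key already existed.
        claimed = cursor.rowcount == 1
    return claimed
```

A request that gets `False` back should not execute the operation; it should wait for, or look up, the response stored by the request that claimed the key.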
For even tighter guarantees, use distributed locks (e.g., Redlock) around the idempotency check for critical operations like payment processing. But locks add latency and increase complexity. Evaluate whether your business logic can tolerate a small window of potential duplicates before implementing locking.
Also consider: what if the first request fails after storing the key? You need a strategy to release the claim on failure. The simplest approach is to store the key in a provisional state (e.g., "processing") and delete it on error. But if the server crashes mid-processing, you'll leave a stale key. Mitigate by giving the provisional entry a short TTL (e.g., 10 seconds, provided that comfortably exceeds your worst-case processing time) and extending it to the full retention window once processing completes.
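The claim, complete, and release lifecycle described above might look like the following sketch. `InMemoryKeyStore` is a hypothetical single-process stand-in for a store such as Redis; it is not thread-safe and exists only to show the state transitions. The clock is injectable so that expiry can be exercised without sleeping.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class InMemoryKeyStore:
    """Single-threaded illustration of claim -> complete / release with TTLs."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (expires_at, response); response is None while processing
        self._entries: Dict[str, Tuple[float, Optional[dict]]] = {}

    def _live(self, key: str) -> Optional[Tuple[float, Optional[dict]]]:
        entry = self._entries.get(key)
        if entry is not None and entry[0] <= self._clock():
            del self._entries[key]  # lazily drop expired entries
            return None
        return entry

    def claim(self, key: str, processing_ttl: float = 10.0) -> bool:
        """Take the key with a short TTL so a crashed worker cannot hold it forever."""
        if self._live(key) is not None:
            return False
        self._entries[key] = (self._clock() + processing_ttl, None)
        return True

    def complete(self, key: str, response: dict, result_ttl: float = 86400.0) -> None:
        """Store the response and extend the TTL to the full retry window."""
        self._entries[key] = (self._clock() + result_ttl, response)

    def release(self, key: str) -> None:
        """Drop the claim after a failure so the client can retry."""
        self._entries.pop(key, None)

    def cached_response(self, key: str) -> Optional[dict]:
        entry = self._live(key)
        return entry[1] if entry else None
```

A handler would call `claim` first, then `complete` on success or `release` on a handled error; a crash leaves the provisional entry to expire on its own.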
Using distributed locks adds 10–50ms of latency per operation. For high-throughput systems processing millions of requests per day, that overhead adds up fast. Prefer atomic database operations or Redis SET NX over locking unless the cost of a duplicate is astronomical (e.g., financial settlements).
- SET NX is like taking the room key. If you get it, you're in.
- If someone else has it, you wait at the door until they leave the result.
- TTL is the checkout time — after that, the room is available again.
Production Pitfalls: TTL, Cleanup, and Error Responses
Idempotency keys don't live forever. They need a TTL (Time To Live) that covers the maximum expected retry window. Common choices are 24 hours for normal APIs and up to 7 days for payment-related endpoints. After TTL, the key can be reused — but a client that retries after the TTL might create a duplicate. Mitigate this by logging any late retry and considering a dead-letter queue.
Storage cleanup is essential. In Redis, TTL is handled automatically. In SQL, schedule a job to delete expired rows. Without cleanup, the table grows unbounded and performance degrades. Partitioning by creation date helps.
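A scheduled cleanup job for the SQL case could be as small as the sketch below, again using `sqlite3` so it runs anywhere. The table layout, the `expires_at` epoch column, and the batching strategy are assumptions for illustration; deleting in batches keeps each transaction short on large tables.

```python
import sqlite3


def open_ttl_store() -> sqlite3.Connection:
    """In-memory table with an index on expires_at to keep the purge cheap."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE idempotency_keys ("
        " lookup_key TEXT PRIMARY KEY,"
        " response_body TEXT,"
        " expires_at INTEGER NOT NULL)"
    )
    conn.execute("CREATE INDEX idx_expires_at ON idempotency_keys (expires_at)")
    return conn


def purge_expired(conn: sqlite3.Connection, now_epoch: int, batch_size: int = 1000) -> int:
    """Delete expired rows in batches and return how many were removed."""
    total = 0
    while True:
        with conn:  # one short transaction per batch
            cursor = conn.execute(
                "DELETE FROM idempotency_keys WHERE lookup_key IN ("
                " SELECT lookup_key FROM idempotency_keys"
                " WHERE expires_at <= ? LIMIT ?)",
                (now_epoch, batch_size),
            )
            deleted = cursor.rowcount
        total += deleted
        if deleted < batch_size:
            return total
```

A cron job or background worker would call `purge_expired` with the current epoch time at whatever interval suits the table's growth rate.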
Error responses matter: if the idempotency key is missing or invalid, return 400 Bad Request. If the incoming request body differs from the stored request for the same key, return 422 Unprocessable Entity. If a request arrives after the key has expired, return 429 Too Many Requests with a Retry-After header explaining the retry window; note that the server can only recognise an expired key if it retains a record of it past the TTL, because otherwise a late retry is indistinguishable from a brand-new request. These clear semantics prevent clients from misinterpreting failures.
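The missing-key and mismatched-body checks can be sketched as a single pre-processing step. Everything here is an assumption made for the example: the `check_request` name, the shape of the stored record, and the choice to fingerprint a canonical JSON encoding of the body with SHA-256.

```python
import hashlib
import json
from typing import Dict, Optional, Tuple


def fingerprint(body: dict) -> str:
    """Hash a canonical JSON encoding so key order does not change the result."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_request(
    key: Optional[str], body: dict, stored: Dict[str, dict]
) -> Optional[Tuple[int, dict]]:
    """Return a (status, body) response, or None if the request should be processed."""
    if not key:
        return 400, {"error": "Idempotency-Key header is required"}
    record = stored.get(key)
    if record is None:
        return None  # unseen key: go ahead and execute the operation
    if record["fingerprint"] != fingerprint(body):
        return 422, {"error": "Idempotency-Key was used with a different request body"}
    # Same key, same body: replay the response from the first attempt.
    return record["status"], record["body"]
```

Storing the fingerprint next to the cached response is what makes the 422 case detectable; a bare "processed" flag cannot tell a true retry from a different request that reuses the key.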
One more trap: compressing the response body before caching. If you cache a compressed response, make sure the content-encoding header is also stored. Serving a compressed response to a client that expects uncompressed data will break parsing.
Choosing the right TTL is a trade-off: too short and you risk duplicate processing, too long and you waste storage. For most APIs, 24 hours is the sweet spot. For payment APIs, 7 days gives more headroom for delayed retries and reconciliation. Add a 1-hour grace period where expired keys are kept for audit logging but not used for deduplication; this helps in forensic investigations.
Idempotency in Event-Driven and Asynchronous Systems
Idempotency doesn't stop at synchronous HTTP APIs. In event-driven systems, a single event can be delivered multiple times (at-least-once semantics). In their default configurations, Kafka consumers and standard SQS queues guarantee delivery but not deduplication. Your consumer must handle duplicate events idempotently.
The pattern is similar: use a deduplication identifier in the event payload (e.g., event_id). Before processing, check if that ID has been processed. Store processed IDs in a database or cache with a TTL that covers the event retention period. This is especially important for side-effecting events like "payment captured" or "order fulfilled".
A key difference: in event-driven systems, you often don't have a client to retry — the system retries automatically if processing fails. So your deduplication window must be longer than the total retry duration across all retry attempts. Using an infinite retention policy is possible but costly. Practical approach: store processed events for at least 7 days, and use a background job to archive older entries.
Important: ensure the dedup check and the event processing happen in the same transaction if possible. In Kafka, you can store the consumed offset in the same transaction as the dedup key, which gets you effectively exactly-once processing. Otherwise, an event that was processed but whose offset was never committed can be processed again after a rebalance.
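Here is one way the "same transaction" advice might look for a consumer that writes to a relational database. The sketch uses `sqlite3` and invented table names (`processed_events`, `balances`); the event shape with `event_id`, `account`, and `amount` fields is likewise an assumption.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO balances VALUES ('acct_1', 0)")
    conn.commit()
    return conn


def handle_payment_captured(conn: sqlite3.Connection, event: dict) -> bool:
    """Apply the event once; return False if it was already processed."""
    with conn:  # dedup marker and side effect commit or roll back together
        cursor = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
            (event["event_id"],),
        )
        if cursor.rowcount == 0:
            return False  # duplicate delivery: skip the side effect
        conn.execute(
            "UPDATE balances SET amount = amount + ? WHERE account = ?",
            (event["amount"], event["account"]),
        )
    return True
```

Because the marker and the balance update share a transaction, a failure in the update also rolls back the marker, so a redelivered event is processed rather than wrongly skipped.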
Idempotency in Distributed Transactions (Saga Pattern)
When a single operation spans multiple services (e.g., an e-commerce order that charges a customer, deducts inventory, and schedules shipping), you need a saga to coordinate. Each step in a saga can fail, and the saga must compensate (undo) previous steps. Idempotency is critical here because the coordinator may retry a step after a timeout, and compensating actions must also be idempotent to avoid double refunds.
The key idea: each saga step must have its own idempotency key, derived from the saga ID and the step name (e.g., 'saga_123:charge', 'saga_123:refund'). This prevents a retry of the charge step from accidentally being treated as a new charge. A common mistake is using the same idempotency key for a charge and its corresponding refund. This could cause the refund to be skipped if the charge key is still in the dedup store. Always use different keys for forward and compensating actions.
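The key derivation is simple enough to show in a few lines. In this sketch, `saga_step_key` and `run_step` are hypothetical helpers, and a plain dict plays the role of the dedup store; a real coordinator would use a durable store with an atomic claim.

```python
from typing import Any, Callable, Dict


def saga_step_key(saga_id: str, step: str) -> str:
    """Derive a deterministic key so every retry of a step reuses the same key."""
    if not saga_id or not step:
        raise ValueError("saga_id and step are both required")
    return f"{saga_id}:{step}"


def run_step(done: Dict[str, Any], saga_id: str, step: str, action: Callable[[], Any]) -> Any:
    """Execute a step at most once per saga; replays return the stored result."""
    key = saga_step_key(saga_id, step)
    if key not in done:
        done[key] = action()
    return done[key]
```

Since `charge` and `refund` produce different keys, a retried charge is deduplicated while the compensating refund still executes exactly once.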
Testing Idempotency in Your API
Idempotency is not something you think about once and forget. You need automated tests that verify retry safety under realistic conditions. The key scenarios: same request twice, concurrent requests with same key, request after key expiry, and request with different body for same key.
Write integration tests that send a request with an idempotency key, then send the exact same request again. Assert that the second response matches the first in status, headers, and body. For concurrent tests, use a barrier to send two identical requests simultaneously. This exposes race conditions in the idempotency check.
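A barrier-based concurrency test might be structured like this sketch. `Deduplicator` is a hypothetical in-process stand-in for the real idempotency store, and `race_same_key` is an invented helper; in an integration test the threads would issue HTTP requests instead of calling `try_claim` directly.

```python
import threading
from typing import List, Set


class Deduplicator:
    """In-process stand-in for the store: check-and-set under a single lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._claimed: Set[str] = set()

    def try_claim(self, key: str) -> bool:
        with self._lock:  # the check and the write happen atomically
            if key in self._claimed:
                return False
            self._claimed.add(key)
            return True


def race_same_key(dedup: Deduplicator, key: str, workers: int = 2) -> int:
    """Release all workers at once and count how many executed the operation."""
    barrier = threading.Barrier(workers)
    executions: List[str] = []

    def attempt() -> None:
        barrier.wait()  # every thread blocks here until all have arrived
        if dedup.try_claim(key):
            executions.append(key)

    threads = [threading.Thread(target=attempt) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(executions)
```

The assertion to make is that the count is exactly one regardless of the number of workers; removing the lock from `try_claim` is a quick way to confirm the test can actually fail.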
Also test negative cases: missing idempotency key should return 400, expired key should return 429, mismatched body should return 422. Your tests should cover both the happy path and the failure modes that cause production incidents.
For contract testing, use tools like Pact or Postman collections to enforce that all POST endpoints accept an Idempotency-Key header. This forces the team to implement idempotency from day one.
Don't forget to test the storage layer: simulate a Redis outage and verify that the idempotency check degrades gracefully (e.g., returns 503). A hard failure on storage should not silently accept duplicates.
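Failing closed on a storage outage can be sketched as below. `StoreUnavailable`, `handle_with_dedup`, and the callable-based wiring are assumptions for illustration; the 409 for an already-claimed key is a simplification of the wait-and-replay behaviour described earlier.

```python
from typing import Callable, Tuple


class StoreUnavailable(Exception):
    """Raised by the (hypothetical) store client when the backend is down."""


def handle_with_dedup(
    key: str,
    claim: Callable[[str], bool],
    process: Callable[[], dict],
) -> Tuple[int, dict]:
    """Run the operation only if the idempotency check itself succeeded."""
    try:
        first = claim(key)
    except StoreUnavailable:
        # Fail closed: without the store we cannot rule out a duplicate.
        return 503, {"error": "idempotency store unavailable, retry later"}
    if not first:
        return 409, {"error": "a request with this key is already in progress"}
    return 201, process()
```

A test for the outage path passes a `claim` callable that raises `StoreUnavailable` and asserts both the 503 status and that `process` was never invoked.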
Idempotency and Retry Strategies — How Clients Should Use Idempotency Keys
Idempotency is a server-side contract, but the client must follow rules too. The client generates the idempotency key before the first request and reuses it on every retry. The key must be unique per operation — never reuse a key across different operations. If the key is reused, the server may return the wrong cached response.
Retry strategy matters: clients should use exponential backoff with jitter to avoid overwhelming the server. Each retry sends the same idempotency key. If the server responds with 409 Conflict (key already used but first request still processing), the client should retry with the same key after a brief delay. If the server returns 429 Too Many Requests (key expired), the client must generate a new key and resubmit as a fresh operation.
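On the client side, the retry rules can be sketched as a small loop. This is an illustration, not a complete HTTP client: `send` is a hypothetical callable that performs the request and returns `(status, body)`, and the choice to retry on timeouts, connection errors, 409, and 5xx responses is an assumption. Other responses, including other 4xx codes, are handed back to the caller to decide.

```python
import random
import time
import uuid
from typing import Callable, Optional, Tuple


def submit_with_retries(
    send: Callable[[dict, dict], Tuple[int, Optional[dict]]],
    payload: dict,
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Optional[dict]]:
    """Send one logical operation, reusing a single key across all attempts."""
    key = str(uuid.uuid4())  # generated once, before the first attempt
    headers = {"Idempotency-Key": key}
    for attempt in range(max_attempts):
        status: Optional[int] = None
        body: Optional[dict] = None
        try:
            status, body = send(payload, headers)  # identical payload every time
        except (TimeoutError, ConnectionError):
            pass  # outcome unknown: retrying with the same key is safe
        if status is not None and status != 409 and status < 500:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter, capped at max_delay.
            sleep(random.uniform(0, min(max_delay, base_delay * 2 ** attempt)))
    raise RuntimeError("operation did not complete within the retry budget")
```

Generating the key outside the loop is the important design choice: a key created inside the loop would turn every retry into a new operation.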
Important: the client must not change the request body between retries. If the body changes, the idempotency key is invalid for the new request — the server should return 422. The client should always send the exact same payload on retries.
For mobile apps, store the key and request payload locally until a definitive success or irrecoverable failure is received. This prevents duplicate charges even if the app is killed and restarted.
Double Charge from Payment Gateway Retry
- Idempotency is not optional for any operation that creates side effects.
- Always assume clients will retry.
- Store the full response, not just a processed flag.
- Define a TTL that exceeds the longest expected retry window (at least 24 hours).
Common mistakes to avoid
- Using an idempotency key without storing the response body
- Reusing the same idempotency key for different operations
- Setting TTL too short (e.g., 60 seconds)
- Missing atomic check-and-set for concurrent requests
- Client regenerating the idempotency key on each retry
Interview Questions on This Topic
Explain idempotency in REST APIs. Which HTTP methods are idempotent and why?