Senior 5 min · June 25, 2026

Design a Digital Wallet: Avoiding Double-Spend and Data Loss at Scale

Design a digital wallet system that prevents double-spend, survives crashes, and scales to millions of users.

N
Naren Founder & Principal Engineer

20+ years shipping large-scale distributed systems. Everything here is grounded in real deployments.

Follow
Production
production tested
June 25, 2026
last updated
1,663
articles · all by Naren
 ● Production Incident 🔎 Debug Guide ⚙ Triage Commands
Quick Answer

Design a digital wallet by using an append-only ledger with idempotency keys, a balance cache with write-through, and a two-phase commit for cross-service transactions. Never update balances in place — always log transactions first.

✦ Definition~90s read
What is Design a Digital Wallet?

A digital wallet system is a service that manages user balances, processes transactions (credit/debit), and ensures every transaction is atomic, durable, and idempotent. It's the core ledger behind payments, rewards, and banking apps.

Think of a digital wallet like a bank teller's drawer.
Plain-English First

Think of a digital wallet like a bank teller's drawer. Every time you deposit or withdraw, the teller writes it in a logbook first, then updates the cash count. If the power goes out mid-transaction, the logbook tells them exactly where they left off. Your digital wallet does the same: it writes a transaction record before touching the balance. That way, if the server crashes, you never lose money or double-count.

I've seen a startup lose $200k in one night because their wallet system double-spent on concurrent withdrawals. The fix was a single index and a database constraint they'd skipped. Most wallet designs are wrong from day one — they treat balances as mutable values instead of derived sums. Here's what actually works in production.

The core problem is simple: you need to move money between accounts without ever losing a penny or creating money out of thin air. Banks solved this centuries ago with double-entry bookkeeping. Yet most developers reinvent it badly, using UPDATE balance = balance - amount without any protection against race conditions or crashes.

By the end of this, you'll be able to design a wallet that handles 10,000 transactions per second per shard, survives database failovers without data loss, and scales horizontally. You'll know exactly which database isolation level to use, how to implement idempotency, and when to reach for distributed transactions vs. eventual consistency.

The Ledger-First Pattern: Why You Never Update Balances Directly

The single biggest mistake in wallet design is treating a balance as a mutable field you UPDATE. That's how you lose money. The correct pattern is append-only ledger: every transaction is a row in a transactions table. The balance is a derived value — SELECT SUM(amount) FROM transactions WHERE account_id = ?. But that query is slow at scale, so you cache the balance with a write-through pattern.

Here's the flow: when a transaction comes in, you INSERT a row into the ledger with status='pending'. Then you atomically update the balance cache. If the INSERT succeeds but the cache update fails, you have a pending transaction that never settles. That's why you need a reconciler job that periodically processes pending transactions.

In production, you'll want a materialized balance column on the account table that you update in the same database transaction as the ledger insert. This gives you both the audit trail and fast reads. The key is the order: ledger first, balance second. If the server crashes after the ledger insert but before the balance update, the reconciler picks it up.

wallet_ledger.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
-- io.thecodeforge — System Design tutorial

-- Atomic transaction: ledger insert + balance update
BEGIN;

-- 1. Insert into ledger
INSERT INTO transactions (id, account_id, amount, type, status, idempotency_key, created_at)
VALUES ('txn_123', 'acct_456', -5000, 'WITHDRAWAL', 'COMPLETED', 'idem_789', NOW());

-- 2. Update balance (atomically, with check)
UPDATE accounts
SET balance = balance - 5000,
    version = version + 1
WHERE id = 'acct_456'
  AND balance >= 5000  -- prevent overdraft
  AND version = 42;    -- optimistic lock

-- If UPDATE affected 0 rows, rollback and retry or reject
COMMIT;
Output
Rows affected: 1 (ledger insert), 1 (balance update). If balance insufficient, UPDATE returns 0 rows, COMMIT fails, transaction rolled back.
Production Trap: The Lost Update
If you UPDATE balance without a WHERE clause that checks the current balance, you'll allow overdrafts. Always use WHERE balance >= amount for debits. Also, never use SELECT ... FOR UPDATE on the account row — it causes contention. Use optimistic locking with a version column instead.
Balance Update Strategy Decision Tree
IfSingle database, <10k TPS
UseUse materialized balance column with ledger insert in same transaction
IfSharded database, high throughput
UseUse ledger-only with periodic balance recomputation + write-through cache
IfNeed strict audit trail
UseAlways use append-only ledger, never UPDATE balance in place without a corresponding ledger entry
Digital Wallet: Avoiding Double-Spend & Data Loss THECODEFORGE.IO Digital Wallet: Avoiding Double-Spend & Data Loss Architecture patterns for safe, scalable wallet operations Ledger-First Pattern Append-only log; never update balance directly Idempotency Keys Unique key per request to safely retry Optimistic Locking Version check on balance before write Sharding the Wallet Partition by user ID for horizontal scale Reconciler Pattern Background job to detect and fix inconsistencies ⚠ Missing idempotency leads to double-spend on retry Always enforce idempotency keys at the API layer THECODEFORGE.IO
thecodeforge.io
Digital Wallet: Avoiding Double-Spend & Data Loss
Design Digital Wallet
Append-Only Ledger FlowTHECODEFORGE.IOAppend-Only Ledger FlowNever UPDATE balance directly — derive itClient RequestDeposit or withdrawal actionAppend TransactionINSERT row into transactions tableIdempotency CheckUNIQUE constraint on idempotency_keyDerive BalanceSELECT SUM(amount) FROM transactionsReturn ResultNew balance sent to client⚠ Never UPDATE balance — always recompute from ledgerTHECODEFORGE.IO
thecodeforge.io
Append-Only Ledger Flow
Design Digital Wallet

Idempotency Keys: The Only Way to Safely Retry

Networks fail. Clients retry. If you process the same withdrawal twice, you've created money. The fix is idempotency keys: a unique string the client sends with every request. Your service deduplicates based on that key. The key must be unique per account — a UNIQUE constraint on (idempotency_key, account_id) in the ledger table.

When a retry arrives, you look up the idempotency key. If it exists, return the previous result. If not, process normally. This is simple but has a gotcha: what if the first request timed out after the server committed the transaction but before the client got the response? The retry will find the key and return success — correct. But what if the first request never reached the server? The retry will process fresh — also correct.

The real danger is when the idempotency key is generated incorrectly. Never let the client generate it — they might reuse keys across different transactions. Generate it server-side from a unique request fingerprint, or require the client to provide a UUID that you validate is unique per account.

idempotency_middleware.pyPYTHON
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
# io.thecodeforge — System Design tutorial

import uuid
from flask import request, g
from your_db import db

def idempotency_middleware():
    idem_key = request.headers.get('Idempotency-Key')
    if not idem_key:
        # Generate server-side if client didn't provide
        idem_key = str(uuid.uuid4())
    
    # Check if already processed
    existing = db.execute(
        "SELECT result FROM idempotency_cache WHERE key = ? AND account_id = ?",
        (idem_key, g.account_id)
    ).fetchone()
    if existing:
        return existing['result']  # Return cached response
    
    # Store key with status 'processing' to prevent concurrent duplicates
    db.execute(
        "INSERT INTO idempotency_cache (key, account_id, status) VALUES (?, ?, 'processing')",
        (idem_key, g.account_id)
    )
    # ... process transaction ...
    # On success, update status to 'completed' and store result
Output
If key exists, returns cached result immediately. Otherwise, inserts key and proceeds. Duplicate inserts fail on UNIQUE constraint.
Senior Shortcut: Idempotency Key Expiry
Set a TTL on idempotency keys (e.g., 24 hours). After that, the client must generate a new key. This prevents the cache from growing unbounded. Use Redis with EXPIRE for automatic cleanup.

Concurrency Control: Optimistic Locking Beats Pessimistic Every Time

When two withdrawals hit the same account simultaneously, you need to prevent both from succeeding if the balance is insufficient. Pessimistic locking (SELECT ... FOR UPDATE) serializes access — it works but kills throughput. Optimistic locking with a version column is better: you read the current version, compute the new balance, and UPDATE WHERE version = old_version. If another transaction updated first, your UPDATE affects 0 rows, and you retry.

In practice, you'll combine this with the ledger-first pattern. The ledger insert doesn't need a lock — it's append-only. The balance update uses optimistic locking. If the update fails, you roll back the ledger insert (or mark it as 'failed'). This gives you high concurrency with minimal contention.

One gotcha: if you have a materialized balance column, the UPDATE must be in the same database transaction as the ledger insert. If you're using a separate cache (Redis), you can't easily roll back the cache. That's why write-through cache with database as source of truth is safer.

optimistic_lock.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
-- io.thecodeforge — System Design tutorial

-- Read current state
SELECT balance, version FROM accounts WHERE id = 'acct_456';
-- Assume balance=100, version=5

-- Attempt update with optimistic lock
UPDATE accounts
SET balance = balance - 30,
    version = version + 1
WHERE id = 'acct_456'
  AND version = 5
  AND balance >= 30;

-- Check rows affected
-- If 1: success. If 0: conflict or insufficient funds, retry from read.
Output
If no concurrent update, 1 row affected. If another transaction committed first, 0 rows affected — retry.
Never Do This: SELECT FOR UPDATE on Hot Accounts
Using SELECT ... FOR UPDATE on a popular account (e.g., a merchant account receiving many payments) will cause queueing and timeouts. Optimistic locking avoids this. If you must use pessimistic locking, keep the transaction short and consider sharding the account into sub-accounts.
Optimistic vs Pessimistic LockingTHECODEFORGE.IOOptimistic vs Pessimistic LockingConcurrency control for wallet withdrawalsOptimistic LockingUses version column on account rowUPDATE ... WHERE version = :old_versionRetry on version mismatch (low cost)High throughput under contentionBest for high-scale wallet systemsPessimistic LockingSELECT ... FOR UPDATE blocks rowsSerializes all writes to accountSimple to reason aboutKills throughput under loadRisk of deadlocks and timeoutsOptimistic wins for scale — retries are cheaper than locksTHECODEFORGE.IO
thecodeforge.io
Optimistic vs Pessimistic Locking
Design Digital Wallet

Sharding the Wallet: How to Scale Beyond a Single Database

When you have millions of accounts, a single database becomes a bottleneck. The natural shard key is account_id. All transactions for an account go to the same shard. This keeps the ledger and balance local to one shard, avoiding distributed transactions.

But sharding introduces new problems: cross-shard transactions (e.g., transferring from account A on shard 1 to account B on shard 2). You can't do a single database transaction across shards. The solution is a two-phase commit (2PC) or, more practically, a saga pattern with compensating transactions.

For most wallet systems, you avoid cross-shard transfers by routing through a central ledger (like a clearing house). Each shard debits the source account and credits a 'clearing' account on the same shard. A separate async process moves funds between clearing accounts across shards. This is eventual consistency, but it's safe if you track the state of each leg.

shard_routing.systemdesignSYSTEMDESIGN
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
// io.thecodeforge — System Design tutorial

// Pseudocode for shard routing
function getShard(accountId) {
    // Consistent hashing or range-based
    return hash(accountId) % NUM_SHARDS;
}

function transfer(fromAccount, toAccount, amount) {
    fromShard = getShard(fromAccount);
    toShard = getShard(toAccount);
    
    if (fromShard == toShard) {
        // Single shard: use local transaction
        shardDB[fromShard].execute("BEGIN; ... COMMIT;");
    } else {
        // Cross-shard: use saga pattern
        // 1. Debit fromAccount on fromShard
        // 2. If success, credit toAccount on toShard
        // 3. If credit fails, issue compensating credit back to fromAccount
        // Use idempotency keys to handle failures
    }
}
Output
Single-shard transfers are atomic. Cross-shard transfers use saga with compensating transactions.
Interview Gold: Shard Rebalancing
When you add shards, you need to move data. Use virtual nodes (consistent hashing) to minimize data movement. Or use a directory service that maps account_id to shard, allowing you to move accounts one at a time.

Handling Failures: The Reconciler Pattern

No matter how careful you are, transactions will get stuck in 'pending' state. Network partitions, database failovers, or bugs can leave a transaction half-processed. You need a reconciler: a background job that scans for pending transactions older than a threshold and either completes or reverses them.

The reconciler must be idempotent. It checks the current state of the account and decides whether the transaction was actually applied. For example, if a withdrawal has status='pending' but the balance already reflects the debit, the reconciler marks it 'completed'. If the balance doesn't reflect the debit, it either retries or marks it 'failed'.

In production, run the reconciler every minute on a small set of recent pending transactions. Use a database index on (status, created_at) to make the scan efficient. Alert if any transaction stays pending for more than 5 minutes.

reconciler.pyPYTHON
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
# io.thecodeforge — System Design tutorial

def reconcile_pending_transactions():
    # Find transactions pending for > 60 seconds
    pending = db.execute("""
        SELECT id, account_id, amount, type
        FROM transactions
        WHERE status = 'pending'
          AND created_at < NOW() - INTERVAL 60 SECOND
        LIMIT 100
    """).fetchall()
    
    for txn in pending:
        # Check current balance
        balance = db.execute(
            "SELECT balance FROM accounts WHERE id = ?",
            (txn.account_id,)
        ).fetchone()['balance']
        
        # Determine if transaction was already applied
        # This logic depends on your design; here we assume balance includes pending
        if (txn.type == 'WITHDRAWAL' and balance < expected_balance_without_txn) or \
           (txn.type == 'DEPOSIT' and balance > expected_balance_without_txn):
            # Already applied, mark completed
            db.execute("UPDATE transactions SET status = 'completed' WHERE id = ?", (txn.id,))
        else:
            # Not applied, attempt to apply now
            # Use optimistic lock to avoid race
            rows = db.execute("""
                UPDATE accounts
                SET balance = balance + ?
                WHERE id = ? AND version = ?
            """, (txn.amount if txn.type == 'DEPOSIT' else -txn.amount,
                   txn.account_id, current_version))
            if rows:
                db.execute("UPDATE transactions SET status = 'completed' WHERE id = ?", (txn.id,))
            else:
                # Conflict, skip for next run
                pass
Output
Pending transactions are resolved within 1-2 minutes. Alerts fire if any stay pending longer.
Production Trap: Reconciler Double-Processing
If your reconciler runs concurrently with the main transaction handler, you can get race conditions. Use a distributed lock (e.g., Redis Redlock) on the transaction ID to ensure only one process handles it. Or use a status field with atomic CAS (compare-and-swap) to claim ownership.

When Not to Use This: Simpler Alternatives for Low-Risk Systems

If you're building a loyalty points wallet or a in-game currency that can be reissued, you don't need this level of rigor. You can use a simple UPDATE balance with a database transaction and accept occasional inconsistencies that you fix with a daily reconciliation script. The overhead of idempotency keys, optimistic locking, and reconcilers is only justified when real money is on the line.

Also, if your transaction volume is under 100 TPS and you have a single database, you can skip sharding. Use a single ACID-compliant database with materialized balances and be done. Don't over-engineer.

But if you're handling payments, banking, or any system where a single error costs real money, follow the patterns above. The cost of implementing them upfront is far less than the cost of a production incident.

Senior Shortcut: Start Simple, Add Complexity Later
Start with a single database, materialized balance, and idempotency keys. Add sharding and reconcilers only when you hit performance limits. Premature sharding is the root of all evil in wallet design.
● Production incidentPOST-MORTEMseverity: high

The 4GB Container That Kept Dying

Symptom
Every 3 hours, the wallet service would OOM-kill its container. Heap dump showed millions of Transaction objects in memory.
Assumption
Memory leak in the transaction processing loop. Team spent a week hunting for unclosed connections.
Root cause
The balance cache was loading all transactions for an account into memory to compute balance on every read. For power users with 100k+ transactions, this blew the heap. No pagination, no streaming.
Fix
Added a materialized balance column updated atomically with each transaction. Cache now stores only the balance, not the transaction list. Balance computation uses SELECT SUM(amount) with index on (account_id, created_at) for occasional audits.
Key lesson
  • Never load the full ledger into memory to compute a balance — store the running total and update it atomically.
Production debug guideSystematic recovery paths for the failure modes engineers actually hit.3 entries
Symptom · 01
Duplicate transactions in ledger for same idempotency key
Fix
1. Check if UNIQUE constraint exists on (idempotency_key, account_id). 2. If not, add it. 3. Delete duplicates manually after verifying correct state. 4. Add application-level check before insert.
Symptom · 02
Balance goes negative despite CHECK constraint
Fix
1. Verify CHECK(balance >= 0) is enforced. 2. Check if multiple threads bypass the constraint by reading stale balance. 3. Switch to optimistic locking. 4. Run a full ledger replay to find the offending transaction.
Symptom · 03
Reconciler marks transactions as completed that were never applied
Fix
1. Check reconciler logic: it must verify balance change before marking completed. 2. Add a 'reconciled_at' timestamp. 3. Implement a dry-run mode that logs decisions without applying. 4. Fix the condition that incorrectly assumes applied.
★ Digital Wallet Triage Cheat SheetFirst-response commands for when things go wrong — copy-paste ready.
Duplicate transaction detected: `Duplicate entry 'idem_789' for key 'idempotency_unique'`
Immediate action
Check if the duplicate was from a retry or a bug.
Commands
SELECT * FROM transactions WHERE idempotency_key = 'idem_789';
SELECT * FROM idempotency_cache WHERE key = 'idem_789';
Fix now
If duplicate is real, delete the later one and ensure idempotency key is unique per account.
Balance negative for account 'acct_456'+
Immediate action
Check if CHECK constraint is missing or bypassed.
Commands
SELECT balance, version FROM accounts WHERE id = 'acct_456';
SELECT SUM(amount) FROM transactions WHERE account_id = 'acct_456';
Fix now
Add CHECK(balance >= 0) and audit recent transactions. Use optimistic locking to prevent race conditions.
Pending transactions not being reconciled+
Immediate action
Check reconciler is running and index exists.
Commands
SELECT COUNT(*) FROM transactions WHERE status = 'pending' AND created_at < NOW() - INTERVAL 5 MINUTE;
SHOW PROCESSLIST; -- check for reconciler thread
Fix now
Ensure index on (status, created_at). Restart reconciler if stuck. Add alert if pending count > threshold.
High database CPU due to balance queries+
Immediate action
Check if ledger-only balance computation is causing full table scans.
Commands
EXPLAIN SELECT SUM(amount) FROM transactions WHERE account_id = 'acct_456';
SHOW INDEX FROM transactions;
Fix now
Add index on (account_id, created_at). Consider materialized balance column with write-through cache.
Feature / AspectMaterialized Balance (Recommended)Ledger-Only Balance
Read performanceO(1) — single row readO(n) — SUM over all transactions (slow at scale)
Write complexityMust update balance atomically with ledger insertSimple append, but read is expensive
Data loss riskLow — balance can be recomputed from ledgerNone — ledger is source of truth
ConcurrencyOptimistic locking neededNo lock needed on balance, but SUM query may lock rows
Use caseHigh-read, high-write systemsAudit-heavy, low-write systems

Key takeaways

1
Always append to the ledger before updating the balance
the ledger is the source of truth.
2
Idempotency keys with a UNIQUE constraint are non-negotiable for safe retries.
3
Optimistic locking beats pessimistic locking for high-concurrency wallet systems.
4
A reconciler job is mandatory
no system is perfect, and stuck transactions will happen.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR
How would you handle concurrent withdrawals on the same account without ...
Q02SENIOR
When would you choose a materialized balance column over computing balan...
Q03SENIOR
What happens if a database failover occurs mid-transaction during a wall...
Q04JUNIOR
What is idempotency and why is it critical in wallet systems?
Q05SENIOR
A customer reports that their withdrawal succeeded but the balance didn'...
Q06SENIOR
Design a wallet system that supports 100k transactions per second across...
Q01 of 06SENIOR

How would you handle concurrent withdrawals on the same account without losing money?

ANSWER
Use optimistic locking with a version column. Read the current balance and version, compute new balance, then UPDATE accounts SET balance = new_balance, version = version + 1 WHERE id = ? AND version = old_version AND balance >= amount. If 0 rows affected, retry or reject. This avoids locks and scales better than SELECT FOR UPDATE.
FAQ · 4 QUESTIONS

Frequently Asked Questions

01
How do I prevent double-spending in a digital wallet?
02
What's the difference between a ledger and a balance in wallet design?
03
How do I handle database failover in a wallet system?
04
What is the best database for a digital wallet?
N
Naren Founder & Principal Engineer

20+ years shipping large-scale distributed systems. Everything here is grounded in real deployments.

Follow
Verified
production tested
June 25, 2026
last updated
1,663
articles · all by Naren
🔥

That's Real World. Mark it forged?

5 min read · try the examples if you haven't

Previous
Design Airbnb
35 / 40 · Real World
Next
Design a Stock Exchange