Senior 5 min · June 25, 2026

Design a Digital Wallet: Avoiding Double-Spend and Data Loss at Scale

Q: How do I prevent double-spending in a digital wallet?

Use idempotency keys and optimistic locking. Each transaction has a unique idempotency key stored with a UNIQUE constraint. On retry, the system returns the cached result. Optimistic locking on the balance update ensures that concurrent withdrawals don't both succeed if the balance is insufficient.

Q: What's the difference between a ledger and a balance in wallet design?

The ledger is an append-only log of every transaction. The balance is a derived value (SUM of ledger entries) or a materialized column. The ledger provides auditability and is the source of truth. The balance is a performance optimization for fast reads. Never update the balance without a corresponding ledger entry.

Q: How do I handle database failover in a wallet system?

Ensure transactions are idempotent. Use a reconciler that scans for pending transactions after failover and resolves them. The ledger insert must be durable before the balance update. Use synchronous replication for the ledger to avoid data loss on failover.

Q: What is the best database for a digital wallet?

A relational database with ACID transactions (PostgreSQL, MySQL with InnoDB) is the safest choice. NoSQL databases lack the transactional guarantees needed for accurate balances. For high throughput, shard PostgreSQL or use CockroachDB for distributed transactions.

Design a digital wallet system that prevents double-spend, survives crashes, and scales to millions of users.

Naren Founder & Principal Engineer

20+ years shipping large-scale distributed systems. Everything here is grounded in real deployments.

✓ Production

production tested

June 25, 2026

last updated

1,663

articles · all by Naren

● Production Incident 🔎 Debug Guide ⚙ Triage Commands

⚡Quick Answer

Design a digital wallet by using an append-only ledger with idempotency keys, a balance cache with write-through, and a two-phase commit for cross-service transactions. Never update balances in place — always log transactions first.

✦ Definition~90s read

What is Design a Digital Wallet?

A digital wallet system is a service that manages user balances, processes transactions (credit/debit), and ensures every transaction is atomic, durable, and idempotent. It's the core ledger behind payments, rewards, and banking apps.

★

Think of a digital wallet like a bank teller's drawer.

Plain-English First

Think of a digital wallet like a bank teller's drawer. Every time you deposit or withdraw, the teller writes it in a logbook first, then updates the cash count. If the power goes out mid-transaction, the logbook tells them exactly where they left off. Your digital wallet does the same: it writes a transaction record before touching the balance. That way, if the server crashes, you never lose money or double-count.

I've seen a startup lose $200k in one night because their wallet system double-spent on concurrent withdrawals. The fix was a single index and a database constraint they'd skipped. Most wallet designs are wrong from day one — they treat balances as mutable values instead of derived sums. Here's what actually works in production.

The core problem is simple: you need to move money between accounts without ever losing a penny or creating money out of thin air. Banks solved this centuries ago with double-entry bookkeeping. Yet most developers reinvent it badly, using UPDATE balance = balance - amount without any protection against race conditions or crashes.

By the end of this, you'll be able to design a wallet that handles 10,000 transactions per second per shard, survives database failovers without data loss, and scales horizontally. You'll know exactly which database isolation level to use, how to implement idempotency, and when to reach for distributed transactions vs. eventual consistency.

The Ledger-First Pattern: Why You Never Update Balances Directly

The single biggest mistake in wallet design is treating a balance as a mutable field you UPDATE. That's how you lose money. The correct pattern is append-only ledger: every transaction is a row in a transactions table. The balance is a derived value — SELECT SUM(amount) FROM transactions WHERE account_id = ?. But that query is slow at scale, so you cache the balance with a write-through pattern.

Here's the flow: when a transaction comes in, you INSERT a row into the ledger with status='pending'. Then you atomically update the balance cache. If the INSERT succeeds but the cache update fails, you have a pending transaction that never settles. That's why you need a reconciler job that periodically processes pending transactions.

In production, you'll want a materialized balance column on the account table that you update in the same database transaction as the ledger insert. This gives you both the audit trail and fast reads. The key is the order: ledger first, balance second. If the server crashes after the ledger insert but before the balance update, the reconciler picks it up.

wallet_ledger.sqlSQL

-- io.thecodeforge — System Design tutorial

-- Atomic transaction: ledger insert + balance update
BEGIN;

-- 1. Insert into ledger
INSERT INTO transactions (id, account_id, amount, type, status, idempotency_key, created_at)
VALUES ('txn_123', 'acct_456', -5000, 'WITHDRAWAL', 'COMPLETED', 'idem_789', NOW());

-- 2. Update balance (atomically, with check)
UPDATE accounts
SET balance = balance - 5000,
    version = version + 1
WHERE id = 'acct_456'
  AND balance >= 5000  -- prevent overdraft
  AND version = 42;    -- optimistic lock

-- If UPDATE affected 0 rows, rollback and retry or reject
COMMIT;

Output

Rows affected: 1 (ledger insert), 1 (balance update). If balance insufficient, UPDATE returns 0 rows, COMMIT fails, transaction rolled back.

Production Trap: The Lost Update

If you UPDATE balance without a WHERE clause that checks the current balance, you'll allow overdrafts. Always use WHERE balance >= amount for debits. Also, never use SELECT ... FOR UPDATE on the account row — it causes contention. Use optimistic locking with a version column instead.

Balance Update Strategy Decision Tree

IfSingle database, <10k TPS

→

UseUse materialized balance column with ledger insert in same transaction

IfSharded database, high throughput

→

UseUse ledger-only with periodic balance recomputation + write-through cache

IfNeed strict audit trail

→

UseAlways use append-only ledger, never UPDATE balance in place without a corresponding ledger entry

thecodeforge.io

Digital Wallet: Avoiding Double-Spend & Data Loss

Design Digital Wallet

thecodeforge.io

Append-Only Ledger Flow

Design Digital Wallet

Idempotency Keys: The Only Way to Safely Retry

Networks fail. Clients retry. If you process the same withdrawal twice, you've created money. The fix is idempotency keys: a unique string the client sends with every request. Your service deduplicates based on that key. The key must be unique per account — a UNIQUE constraint on (idempotency_key, account_id) in the ledger table.

When a retry arrives, you look up the idempotency key. If it exists, return the previous result. If not, process normally. This is simple but has a gotcha: what if the first request timed out after the server committed the transaction but before the client got the response? The retry will find the key and return success — correct. But what if the first request never reached the server? The retry will process fresh — also correct.

The real danger is when the idempotency key is generated incorrectly. Never let the client generate it — they might reuse keys across different transactions. Generate it server-side from a unique request fingerprint, or require the client to provide a UUID that you validate is unique per account.

idempotency_middleware.pyPYTHON

# io.thecodeforge — System Design tutorial

import uuid
from flask import request, g
from your_db import db

def idempotency_middleware():
    idem_key = request.headers.get('Idempotency-Key')
    if not idem_key:
        # Generate server-side if client didn't provide
        idem_key = str(uuid.uuid4())
    
    # Check if already processed
    existing = db.execute(
        "SELECT result FROM idempotency_cache WHERE key = ? AND account_id = ?",
        (idem_key, g.account_id)
    ).fetchone()
    if existing:
        return existing['result']  # Return cached response
    
    # Store key with status 'processing' to prevent concurrent duplicates
    db.execute(
        "INSERT INTO idempotency_cache (key, account_id, status) VALUES (?, ?, 'processing')",
        (idem_key, g.account_id)
    )
    # ... process transaction ...
    # On success, update status to 'completed' and store result

Output

If key exists, returns cached result immediately. Otherwise, inserts key and proceeds. Duplicate inserts fail on UNIQUE constraint.

Senior Shortcut: Idempotency Key Expiry

Set a TTL on idempotency keys (e.g., 24 hours). After that, the client must generate a new key. This prevents the cache from growing unbounded. Use Redis with EXPIRE for automatic cleanup.

Concurrency Control: Optimistic Locking Beats Pessimistic Every Time

When two withdrawals hit the same account simultaneously, you need to prevent both from succeeding if the balance is insufficient. Pessimistic locking (SELECT ... FOR UPDATE) serializes access — it works but kills throughput. Optimistic locking with a version column is better: you read the current version, compute the new balance, and UPDATE WHERE version = old_version. If another transaction updated first, your UPDATE affects 0 rows, and you retry.

In practice, you'll combine this with the ledger-first pattern. The ledger insert doesn't need a lock — it's append-only. The balance update uses optimistic locking. If the update fails, you roll back the ledger insert (or mark it as 'failed'). This gives you high concurrency with minimal contention.

One gotcha: if you have a materialized balance column, the UPDATE must be in the same database transaction as the ledger insert. If you're using a separate cache (Redis), you can't easily roll back the cache. That's why write-through cache with database as source of truth is safer.

optimistic_lock.sqlSQL

-- io.thecodeforge — System Design tutorial

-- Read current state
SELECT balance, version FROM accounts WHERE id = 'acct_456';
-- Assume balance=100, version=5

-- Attempt update with optimistic lock
UPDATE accounts
SET balance = balance - 30,
    version = version + 1
WHERE id = 'acct_456'
  AND version = 5
  AND balance >= 30;

-- Check rows affected
-- If 1: success. If 0: conflict or insufficient funds, retry from read.

Output

If no concurrent update, 1 row affected. If another transaction committed first, 0 rows affected — retry.

Never Do This: SELECT FOR UPDATE on Hot Accounts

Using SELECT ... FOR UPDATE on a popular account (e.g., a merchant account receiving many payments) will cause queueing and timeouts. Optimistic locking avoids this. If you must use pessimistic locking, keep the transaction short and consider sharding the account into sub-accounts.

thecodeforge.io

Optimistic vs Pessimistic Locking

Design Digital Wallet

Sharding the Wallet: How to Scale Beyond a Single Database

When you have millions of accounts, a single database becomes a bottleneck. The natural shard key is account_id. All transactions for an account go to the same shard. This keeps the ledger and balance local to one shard, avoiding distributed transactions.

But sharding introduces new problems: cross-shard transactions (e.g., transferring from account A on shard 1 to account B on shard 2). You can't do a single database transaction across shards. The solution is a two-phase commit (2PC) or, more practically, a saga pattern with compensating transactions.

For most wallet systems, you avoid cross-shard transfers by routing through a central ledger (like a clearing house). Each shard debits the source account and credits a 'clearing' account on the same shard. A separate async process moves funds between clearing accounts across shards. This is eventual consistency, but it's safe if you track the state of each leg.

shard_routing.systemdesignSYSTEMDESIGN

// io.thecodeforge — System Design tutorial

// Pseudocode for shard routing
function getShard(accountId) {
    // Consistent hashing or range-based
    return hash(accountId) % NUM_SHARDS;
}

function transfer(fromAccount, toAccount, amount) {
    fromShard = getShard(fromAccount);
    toShard = getShard(toAccount);
    
    if (fromShard == toShard) {
        // Single shard: use local transaction
        shardDB[fromShard].execute("BEGIN; ... COMMIT;");
    } else {
        // Cross-shard: use saga pattern
        // 1. Debit fromAccount on fromShard
        // 2. If success, credit toAccount on toShard
        // 3. If credit fails, issue compensating credit back to fromAccount
        // Use idempotency keys to handle failures
    }
}

Output

Single-shard transfers are atomic. Cross-shard transfers use saga with compensating transactions.

Interview Gold: Shard Rebalancing

When you add shards, you need to move data. Use virtual nodes (consistent hashing) to minimize data movement. Or use a directory service that maps account_id to shard, allowing you to move accounts one at a time.

Handling Failures: The Reconciler Pattern

No matter how careful you are, transactions will get stuck in 'pending' state. Network partitions, database failovers, or bugs can leave a transaction half-processed. You need a reconciler: a background job that scans for pending transactions older than a threshold and either completes or reverses them.

The reconciler must be idempotent. It checks the current state of the account and decides whether the transaction was actually applied. For example, if a withdrawal has status='pending' but the balance already reflects the debit, the reconciler marks it 'completed'. If the balance doesn't reflect the debit, it either retries or marks it 'failed'.

In production, run the reconciler every minute on a small set of recent pending transactions. Use a database index on (status, created_at) to make the scan efficient. Alert if any transaction stays pending for more than 5 minutes.

reconciler.pyPYTHON

# io.thecodeforge — System Design tutorial

def reconcile_pending_transactions():
    # Find transactions pending for > 60 seconds
    pending = db.execute("""
        SELECT id, account_id, amount, type
        FROM transactions
        WHERE status = 'pending'
          AND created_at < NOW() - INTERVAL 60 SECOND
        LIMIT 100
    """).fetchall()
    
    for txn in pending:
        # Check current balance
        balance = db.execute(
            "SELECT balance FROM accounts WHERE id = ?",
            (txn.account_id,)
        ).fetchone()['balance']
        
        # Determine if transaction was already applied
        # This logic depends on your design; here we assume balance includes pending
        if (txn.type == 'WITHDRAWAL' and balance < expected_balance_without_txn) or \
           (txn.type == 'DEPOSIT' and balance > expected_balance_without_txn):
            # Already applied, mark completed
            db.execute("UPDATE transactions SET status = 'completed' WHERE id = ?", (txn.id,))
        else:
            # Not applied, attempt to apply now
            # Use optimistic lock to avoid race
            rows = db.execute("""
                UPDATE accounts
                SET balance = balance + ?
                WHERE id = ? AND version = ?
            """, (txn.amount if txn.type == 'DEPOSIT' else -txn.amount,
                   txn.account_id, current_version))
            if rows:
                db.execute("UPDATE transactions SET status = 'completed' WHERE id = ?", (txn.id,))
            else:
                # Conflict, skip for next run
                pass

Output

Pending transactions are resolved within 1-2 minutes. Alerts fire if any stay pending longer.

Production Trap: Reconciler Double-Processing

If your reconciler runs concurrently with the main transaction handler, you can get race conditions. Use a distributed lock (e.g., Redis Redlock) on the transaction ID to ensure only one process handles it. Or use a status field with atomic CAS (compare-and-swap) to claim ownership.

When Not to Use This: Simpler Alternatives for Low-Risk Systems

If you're building a loyalty points wallet or a in-game currency that can be reissued, you don't need this level of rigor. You can use a simple UPDATE balance with a database transaction and accept occasional inconsistencies that you fix with a daily reconciliation script. The overhead of idempotency keys, optimistic locking, and reconcilers is only justified when real money is on the line.

Also, if your transaction volume is under 100 TPS and you have a single database, you can skip sharding. Use a single ACID-compliant database with materialized balances and be done. Don't over-engineer.

But if you're handling payments, banking, or any system where a single error costs real money, follow the patterns above. The cost of implementing them upfront is far less than the cost of a production incident.

Senior Shortcut: Start Simple, Add Complexity Later

Start with a single database, materialized balance, and idempotency keys. Add sharding and reconcilers only when you hit performance limits. Premature sharding is the root of all evil in wallet design.

● Production incidentPOST-MORTEMseverity: high

The 4GB Container That Kept Dying

Symptom

Every 3 hours, the wallet service would OOM-kill its container. Heap dump showed millions of Transaction objects in memory.

Assumption

Memory leak in the transaction processing loop. Team spent a week hunting for unclosed connections.

Root cause

The balance cache was loading all transactions for an account into memory to compute balance on every read. For power users with 100k+ transactions, this blew the heap. No pagination, no streaming.

Fix

Added a materialized balance column updated atomically with each transaction. Cache now stores only the balance, not the transaction list. Balance computation uses SELECT SUM(amount) with index on (account_id, created_at) for occasional audits.

Key lesson

Never load the full ledger into memory to compute a balance — store the running total and update it atomically.

Production debug guideSystematic recovery paths for the failure modes engineers actually hit.3 entries

Symptom · 01

Duplicate transactions in ledger for same idempotency key

→

Fix

1. Check if UNIQUE constraint exists on (idempotency_key, account_id). 2. If not, add it. 3. Delete duplicates manually after verifying correct state. 4. Add application-level check before insert.

Symptom · 02

Balance goes negative despite CHECK constraint

→

Fix

1. Verify CHECK(balance >= 0) is enforced. 2. Check if multiple threads bypass the constraint by reading stale balance. 3. Switch to optimistic locking. 4. Run a full ledger replay to find the offending transaction.

Symptom · 03

Reconciler marks transactions as completed that were never applied

→

Fix

1. Check reconciler logic: it must verify balance change before marking completed. 2. Add a 'reconciled_at' timestamp. 3. Implement a dry-run mode that logs decisions without applying. 4. Fix the condition that incorrectly assumes applied.

★ Digital Wallet Triage Cheat SheetFirst-response commands for when things go wrong — copy-paste ready.

Duplicate transaction detected: `Duplicate entry 'idem_789' for key 'idempotency_unique'`−

Immediate action

Check if the duplicate was from a retry or a bug.

Commands

SELECT * FROM transactions WHERE idempotency_key = 'idem_789';

SELECT * FROM idempotency_cache WHERE key = 'idem_789';

Fix now

If duplicate is real, delete the later one and ensure idempotency key is unique per account.

Balance negative for account 'acct_456'+

Pending transactions not being reconciled+

High database CPU due to balance queries+

Feature / Aspect	Materialized Balance (Recommended)	Ledger-Only Balance
Read performance	O(1) — single row read	O(n) — SUM over all transactions (slow at scale)
Write complexity	Must update balance atomically with ledger insert	Simple append, but read is expensive
Data loss risk	Low — balance can be recomputed from ledger	None — ledger is source of truth
Concurrency	Optimistic locking needed	No lock needed on balance, but SUM query may lock rows
Use case	High-read, high-write systems	Audit-heavy, low-write systems

Key takeaways

Always append to the ledger before updating the balance

the ledger is the source of truth.

Idempotency keys with a UNIQUE constraint are non-negotiable for safe retries.

Optimistic locking beats pessimistic locking for high-concurrency wallet systems.

A reconciler job is mandatory

no system is perfect, and stuck transactions will happen.

INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR

How would you handle concurrent withdrawals on the same account without ...

Q02SENIOR

When would you choose a materialized balance column over computing balan...

Q03SENIOR

What happens if a database failover occurs mid-transaction during a wall...

Q04JUNIOR

What is idempotency and why is it critical in wallet systems?

Q05SENIOR

A customer reports that their withdrawal succeeded but the balance didn'...

Q06SENIOR

Design a wallet system that supports 100k transactions per second across...

Q01 of 06SENIOR

How would you handle concurrent withdrawals on the same account without losing money?

ANSWER

Use optimistic locking with a version column. Read the current balance and version, compute new balance, then UPDATE accounts SET balance = new_balance, version = version + 1 WHERE id = ? AND version = old_version AND balance >= amount. If 0 rows affected, retry or reject. This avoids locks and scales better than SELECT FOR UPDATE.

FAQ · 4 QUESTIONS

Frequently Asked Questions

How do I prevent double-spending in a digital wallet?

What's the difference between a ledger and a balance in wallet design?

How do I handle database failover in a wallet system?

What is the best database for a digital wallet?

Naren Founder & Principal Engineer

20+ years shipping large-scale distributed systems. Everything here is grounded in real deployments.

✓ Verified

production tested

June 25, 2026

last updated

1,663

articles · all by Naren

🔥

That's Real World. Mark it forged?

5 min read · try the examples if you haven't