Event Sourcing Explained: Architecture, Patterns and Production Pitfalls
Event Sourcing demystified — learn how append-only event logs replace mutable state, how CQRS fits in, and the real gotchas that trip up engineers in production.
- Event Sourcing persists every state change as an immutable event, not the current snapshot.
- The event log becomes the sole source of truth; projections derive read models.
- CQRS separates write and read models — you query from projections, not from events.
- Snapshots prevent unbounded replay cost: take them every ~500 events.
- Schema evolution demands versioned deserialization — events outlive your code.
- Idempotency is non-negotiable: duplicate events must produce the same state.
Imagine your bank never just overwrites your balance — instead it keeps every single deposit and withdrawal in a ledger, and your current balance is just the total of all those entries. Event Sourcing works the same way: instead of saving 'the current state' of something, you save every change that ever happened to it as a sequence of immutable events. Want to know what your data looked like last Tuesday at 3pm? Just replay the events up to that point. It's like having a full undo history for your entire application.
One thing people miss: the ledger itself is write-only. You never erase or edit past entries. That means you also never lose intent — each event captures what changed and why, not just the final number.
Here's the practical difference: with CRUD, a buggy UPDATE can silently corrupt your data. With Event Sourcing, every state is derived from an auditable chain. You can always trace back and fix a projection without losing the original history. That's a game-changer for regulated industries.
But here's the catch: the ledger grows forever. If you don't plan for growth, replay times blow up. Snapshots are your escape hatch, but they add complexity. Don't let the simplicity of the analogy fool you — the operational cost is real.
Most databases are built around a lie of convenience: they store only the present. Your users table holds today's email address, not the five addresses the user had before it. Your orders table shows 'CANCELLED' but not who cancelled it, when, or why. This feels fine until the day your CEO asks "why did revenue drop on the 14th?" and your answer is silence — the data that would have told you is gone, overwritten by the next UPDATE statement. Event Sourcing is the architectural answer to that silence.
The problem Event Sourcing solves is deceptively simple: traditional CRUD systems treat every write as a destructive operation. State changes obliterate the history that caused them. This creates three compounding pain points — no audit trail, no ability to reconstruct past state, and a tight coupling between the write model and read model that makes complex business domains nearly impossible to express cleanly. Event Sourcing decouples all three by making the event log the source of truth, and deriving all state from it.
By the end of this article you'll understand how to design an event store from first principles, why snapshots exist and when to reach for them, how CQRS and Event Sourcing compose together, and — critically — the production gotchas around schema evolution, idempotency, and eventual consistency that separate engineers who've shipped this from engineers who've only read about it. Code examples are in Java but the patterns are language-agnostic.
Here's the thing: if your team isn't ready to own an immutable log, you'll reintroduce CRUD patterns inside your event store within six months. I've seen it. It gets ugly. Event Sourcing demands operational discipline — backup integrity checks, schema governance, and projection monitoring. It's not a library you drop in; it's an architectural commitment. Expect at least two sprints of learning curve before your team stops treating events as glorified INSERTs.
One more thing: don't start with Event Sourcing on day one of a greenfield project. Start with CRUD and a simple audit log. Migrate to full ES when the pain of lost history outweighs the operational overhead. I've watched teams adopt ES prematurely and spend more time managing the infrastructure than building features.
What is Event Sourcing?
Event Sourcing is a core concept in System Design. Rather than starting with a dry definition, let's see it in action and understand why it exists.
In a traditional CRUD system, you model the world with mutable state. A user's email is updated in place — the old value is gone. With Event Sourcing, every state change is captured as an event: UserEmailChanged, UserRelocated, OrderPlaced. These events are stored in append-only order. The current state is derived by replaying (or applying) all events for a given aggregate.
This means you can query the past. Want to know the user's email as of last Tuesday? Replay events up to that timestamp. The history is not lost — it's the primary data model.
Here's a concrete trap: I've seen teams store the current state in a separate table and then try to retroactively build an event log from it. That doesn't work — you've already lost the intent behind each change. Event Sourcing forces you to capture the 'why' alongside the 'what'.
Another trap: thinking Event Sourcing means you don't need a traditional database at all. You still need one — for projections, snapshots, and operational metadata. The event store is one piece of the puzzle; it doesn't replace relational storage for read models.
Here's the real gotcha: Event Sourcing inverts your mental model of 'saving data'. You stop thinking about 'what is the current state' and start thinking about 'what happened'. That shift takes your team a solid sprint or two to internalize. Don't expect it to click overnight. The engineers who treat events as just 'INSERT statements with extra columns' will fight the pattern the whole way.
One more thing: I've seen teams cram too much into a single event. Keep events granular. A 'CustomerUpdated' event with 15 optional fields is a code smell. Prefer separate event types for separate intents — EmailChanged, AddressChanged, StatusChanged. It makes projections easier to write and schema evolution less painful.
Additional insight from production: the most successful event-sourced systems I've seen treat event design as a first-class modelling exercise. Spend a sprint with your domain experts defining event boundaries before writing a single line of code. That investment pays back tenfold when you need to add new projections later.
Let me be blunt: if you're building an event store without snapshots from day one, you're designing a performance disaster. We'll get to that shortly.
Also consider: event naming matters. Use past-tense verbs (OrderPlaced, not PlaceOrder). This distinguishes the event (something that happened) from the command (something you want to happen). Mixing them up leads to confusion in the event handling pipeline.
Designing the Event Store
The event store is the backbone of any event-sourced system. It must provide: Append-only writes (no updates, no deletes), Optimistic concurrency control via aggregate version numbers, Efficient stream reading (by aggregate ID and optionally by event type), Snapshot support to avoid full replays.
A common implementation uses a relational database table with columns: aggregate_id, aggregate_type, version, event_type, event_data (JSON/JSONB), created_at, and a metadata JSON field. The primary key is (aggregate_id, version).
For high throughput, you might use a dedicated event store like EventStoreDB or Axon Server, but for many systems a standard PostgreSQL table with proper indexing works well up to thousands of events per second. One thing engineers miss: your event store needs to support transactional outbox if you're publishing events to a message bus — otherwise you risk publishing events that never get stored, or storing events that never get published.
Another nuance: event ordering is critical for aggregates. If you use a distributed event store, ensure all events for the same aggregate land on the same partition. I've seen teams use random sharding and then wonder why projections produce inconsistent results.
Indexing strategy matters: a composite index on (aggregate_id, version) is essential for stream reads. For global projections, an index on (created_at) or (event_type, created_at) helps. Don't over-index — writes are append-only, but indexes still add overhead.
Here's something nobody tells you: your event store will grow faster than you expect. A system handling 100 events per second with 1KB payloads generates ~8.6GB per day. At that rate, you'll hit 3TB in a year. Plan your retention and archival strategy before you go to production, not after.
Another practical detail: benchmark your event store's write throughput before going live. A simple test: generate 10,000 events with realistic payloads and measure p99 write latency. If it's above 50ms, your storage choice or indexing is wrong. Tune before you ship.
One more tip: use a separate schema or database for the event store to avoid accidental table drops or migrations affecting your read models. We keep our event store in a dedicated 'events' schema with restricted write permissions.
When choosing between a generic SQL store and a specialised event store, consider the team's familiarity and operational overhead. A specialised store gives you subscriptions, projections, and clustering built-in, but adds a new system to learn and maintain. The SQL approach is simpler but requires more boilerplate for features like stream subscriptions and snapshot management.
Enrichment: Partitioning the event store by aggregate ID is critical for scaling writes. Use hash-based partitioning to spread load while keeping per-aggregate ordering. Don't use range partitioning — it'll create hot spots on active aggregates.
Also: consider using a separate event store instance for high-frequency aggregates to isolate performance. We once had a single event store handling 50K events/s for order events and a few hundred for user events — contention on the primary key index caused latency spikes for both.
Another production pattern: use a write-ahead log (WAL) before the event store to absorb bursts. Write to a fast local log, then flush to the durable event store asynchronously. This smooths out latency spikes at the cost of a small window of data loss risk.
- Every write is an append to the log; there is no in-place update.
- Readers (projections) consume the log from their last known position.
- Snapshotting is like taking a checkpoint: you can start from the snapshot instead of the beginning.
- The log is immutable, so you never need locks for writes to different aggregates.
- Concurrency conflicts are detected at write time via version check, not at read time.
CQRS and Event Sourcing: A Symbiotic Pair
Event Sourcing naturally pairs with CQRS (Command Query Responsibility Segregation) because: Write side: commands produce events that are appended to the event store. Read side: events are consumed by one or more projections that build specialised read models.
This separation solves a core tension: you don't want complex read queries against your event store (which is optimised for write). Instead, you maintain dedicated read tables that are optimised for your query patterns. The trade-off is eventual consistency — the read model lags behind the write model by the time it takes to project the events.
CQRS is optional with Event Sourcing, but the combination unlocks powerful patterns like multiple read views of the same data, read model rebuilds from scratch (reprojection), and easy integration with different storage technologies for reads vs writes. One practical pattern you'll see in production: use a separate Elasticsearch index for full-text search, rebuilt from events, while keeping the main read model in PostgreSQL.
Don't blindly add CQRS. If your only read is "get the aggregate state by ID", you don't need it. The complexity of maintaining multiple projections isn't free.
I once worked on a system where the team built three separate projections for the same entity because each query needed a different shape. That's fine until a schema change requires updating all three. We ended up with a single canonical projection that served most queries, and only kept the extra ones for performance-critical paths.
The real pain point with CQRS isn't the initial setup — it's the ongoing cost. Every schema change to events means updating every projection that consumes that event. If you have 12 projections consuming OrderPlaced, you touch 12 files. That's fine until someone forgets one in a code review and the discrepancy silently corrupts that read model.
Here's another caution: when you use CQRS, never let a read model influence a write decision. If you check a read model to decide whether to allow a command, you've introduced a race condition. The write side must always validate against the event store, not a projection. I've debugged production bugs where a stale projection caused double-spending. The fix was to move the check to the command handler where it queried the event store directly.
Practical tip: start with one projection. Only add more when you measure a concrete performance need. Premature projection proliferation is a common source of technical debt.
Also consider: if your projections are slow because they do complex joins, you can pre-join data at projection time. For example, instead of joining Order and Customer tables at query time, denormalise customer info into the order read model when the order event is processed. This makes reads fast at the cost of storage and more projection logic.
Enrichment: Monitor per-projection lag separately. A global lag metric can hide one projection that's stuck. Use a dedicated table recording the last event ID processed by each projection. Alert if any projection hasn't advanced in more than 5 minutes.
Another pattern: use materialized views for projections that need to aggregate across streams. They can be refreshed periodically or on-demand, but be careful with refresh performance on large datasets.
Snapshots and Performance Optimisation
Without snapshots, every time you need the current state of an aggregate you replay every event since the beginning of time. For aggregates with 100k+ events, that's a hundred thousand database reads and object instantiations per request — catastrophic for latency.
Snapshots store the state of an aggregate at a specific version. When loading, you fetch the most recent snapshot (or create one if none exists) and then replay only the events after that snapshot version. This collapses the replay cost from O(total events) to O(post-snapshot events).
Snapshot frequency is a trade-off: too frequent and you waste write overhead; too rare and replay is still expensive. A good starting point is every 100–1000 events, or every hour for low-volume aggregates. You can also take snapshots on-demand after a state-changing command. In production, you'll often combine both: snapshot every N events plus a periodic time-based snapshot for aggregates that rarely change.
A common failure: snapshot corruption. Always store a version hash alongside the snapshot and verify it on every load. If it doesn't match, fall back to full replay and rebuild the snapshot.
Memory cache for snapshots? Use Redis with a TTL — but invalidate it when a new event for that aggregate is stored. Otherwise, you'll serve stale data from the in-memory cache while the snapshot store already has a newer version.
Here's the trap most teams hit: they don't back up snapshots because 'snapshots can be rebuilt from events'. True, but rebuilding 10M events for a single aggregate takes 45 minutes. If you lose all snapshots after a crash, your recovery time goes from minutes to hours. Back up snapshots to speed recovery — just don't treat them as the truth.
One more pitfall: taking a snapshot too early in an aggregate's lifecycle. If you snapshot after every event on a frequently changing aggregate, you turn your event store into a state store — losing the performance benefit. Snapshot every N events or when the aggregate reaches a certain version threshold.
Performance note: use a dedicated snapshot store (e.g., Redis, DynamoDB) that's fast for point reads. The event store can be slower but more durable. Keep snapshot serialization efficient — use a binary format if latency matters.
A practical tip: implement snapshot versioning. Store the snapshot version alongside the aggregate version. When you load, check that the snapshot version is not older than the last known event version. If it is, replay from snapshot version + 1. This gives you a safety net against snapshot drift.
Enrichment: Consider multi-level snapshots for aggregates with millions of events. Store a daily checkpoint snapshot plus per-1000-event snapshots. When loading, use the closest snapshot before the target version. This reduces worst-case replay to 1000 events.
Also: use a separate thread pool for snapshot materialization to avoid blocking the event writing path. Snapshots can be taken asynchronously after a certain threshold is reached.
Another nuance: if you use Redis for snapshot caching, set an eviction policy that favors high-traffic aggregates. LFU (Least Frequently Used) works better than LRU for workloads where a subset of aggregates sees most of the reads.
Schema Evolution: When Events Outlive Your Code
Events are immutable — once written, they must remain readable forever. This means your event schemas will outlive the code that created them. A common trap: you add a new field to an event and old events break the deserialization.
The solution: never rename, remove, or reorder fields. Only add optional fields. Use a serialization format that supports schema evolution (JSON, Avro, or Protobuf with forward/backward compatibility modes). Each event should carry a schema version (e.g., in metadata). Your event handler must be able to process multiple schema versions.
A robust pattern is to store events as JSONB and use a version-specific deserialization layer that fills defaults for missing fields. For example, v1 events may not have a 'status' field; v2 expects it. The v2 handler supplies a default when deserializing v1 events. You also need to handle the case where you deprecate a field: you cannot remove it from events, but your handlers can ignore it. If you absolutely must change event structure, plan an offline migration with a maintenance window.
Here's the pain point nobody tells you: when you have multiple services consuming the same events, schema evolution becomes a coordination problem. You can't just update one service — you need to ensure all consumers can handle the new schema before you start emitting it. That's where a schema registry (like Confluent's) shines. It lets you enforce compatibility rules and prevent breaking changes from reaching production.
Another trick: use a generic wrapper that includes a type discriminator and version. Then write version-specific deserializers that can be registered dynamically. This avoids massive if-else chains in your handler code.
I've seen teams try to solve schema evolution with a shared JAR that contains all event classes. That works until two services need different versions of the same event. Then you're stuck in dependency hell. A schema registry prevents that by decoupling the wire format from the class definition.
Don't forget to test schema evolution in your CI pipeline. Write a test that replays a set of old events (exported from production) against the latest handler code. If any event fails to deserialize, the pipeline fails. This catches breaking changes before they hit production.
One more thing: when you deprecate an event type, don't delete the event class immediately. Keep it in a 'legacy' package and let your deserialization layer know to skip it or transform it. Deleting the class too early will break replay from the beginning of time.
Consider using Avro with a schema registry from the start, even if you only have one service. It forces discipline and makes future multi-service integration much easier. The upfront cost is negligible compared to the pain of retrofitting.
Enrichment: Schema evolution also affects downstream BI systems. If you publish events to a data lake via Kafka, the schema must be compatible for all consumers. Coordinate schema changes in a separate deployment with monitoring for consumer lag.
Also: think about backward vs forward compatibility. Backward compatibility (new reader can read old data) is easier and usually sufficient. Forward compatibility (old reader can read new data) is harder but necessary when you can't upgrade all consumers simultaneously.
One real-world example: we had an event with a 'price' field that was a decimal. We needed to add a 'currency' field. We made it optional with a default of 'USD'. Old events continued to work, and new events included the currency. The projection that needed currency just used the default for legacy events.
Idempotency and Consistency Guarantees
Event Sourcing systems are event-driven, and events can be delivered multiple times (e.g., after a broker restart, network retry, or projection crash). Without idempotency handling, duplicate events will produce duplicate state changes — corrupting your read models.
Write-side idempotency: each command should be associated with a unique idempotency key (UUID). Before appending an event, check if an event with that key already exists. This prevents double-spending, duplicate orders, etc.
Read-side idempotency: projections must be idempotent. For upserts, use INSERT...ON CONFLICT UPDATE (UPSERT). For set-based updates, ensure the operation is deterministic (e.g., set account balance to the computed value, not increment).
Consistency: Event Sourcing gives you strong consistency within an aggregate (writes are atomic per aggregate via version check). Across aggregates, you have eventual consistency unless you use distributed transactions (which you should avoid). For critical cross-aggregate consistency, consider using sagas or process managers.
Here's a real scenario: we had a projection that used an incrementing counter instead of setting the absolute value. On replay, it doubled every event — took us two hours to find the bug because the numbers looked "close enough".
Idempotency key storage: keep them in a time-bounded store (e.g., Redis with 24h TTL). After that, the risk of duplicate delivery is negligible. Make the idempotency check atomic with the event append — ideally using a database unique constraint, not a check-then-act.
There's a subtlety most docs miss: idempotency keys need to survive the event store write failure scenario. If your app generates a key, checks it doesn't exist, and then the write fails — the key is still unused. A retry will use the same key and succeed. But if the write actually succeeded and only the response was lost, the idempotency check prevents the duplicate. That's the whole point, but it only works if your idempotency store is durable and checked atomically with the write.
Another nuance: idempotency keys for event handlers that process batches. If you replay a batch of events, each event should have its own idempotency check, not a batch-level key. Otherwise, a partial failure in the batch causes the whole batch to be skipped on retry, leading to lost events.
Practical advice: make your projection handlers idempotent by design — write them as if every event could be processed twice. That means using SET x = x (no increments), using INSERT ON CONFLICT, and logging any duplicate attempts for monitoring.
Also, consider using event sourcing frameworks that provide idempotent event handling out of the box, but understand the mechanism so you can debug when it fails. The framework can't fix incorrect business logic that assumes events are never replayed.
Additional: for financial systems, implement idempotency with a separate ledger table that records every processed event ID. This is more durable than a TTL-based cache.
The Missing Event Incident
- Snapshots are a performance optimisation, not a durable source of truth.
- Always verify consistency between event store and snapshot store on read.
- Synchronous replication is worth the latency trade-off for financial data.
- Never trust projection lag as a health indicator — validate data correctness separately.
- Alert on any snapshot version mismatch immediately — don't wait for a manual check.
- Back up the event store independently of snapshot backups. Snapshots can be rebuilt, events cannot.
- Test failover recovery with a full event replay in staging at least once per quarter.
- Don't assume the event store secondary is current — verify replication lag before failing over.
Key takeaways
Common mistakes to avoid
7 patternsUsing the snapshot store as source of truth
Not testing schema evolution on old events
Using incrementing counters in projections
Not monitoring per-projection lag separately
Using distributed transactions for cross-aggregate consistency
Assuming idempotency is automatic with at-least-once delivery
Not planning event store retention and archival
Interview Questions on This Topic
How does Event Sourcing guarantee consistency within an aggregate?
Frequently Asked Questions
That's Architecture. Mark it forged?
17 min read · try the examples if you haven't