Senior 16 min · March 06, 2026

DDD Aggregate Sizing — Why God Objects Kill Payment Systems

One oversized Order aggregate caused payment timeouts from lock escalation.

Naren · Founder
Plain-English first. Then code. Then the interview question.
Quick Answer
  • DDD structures software around business domains, not technical layers
  • Bounded Contexts define where a domain model applies; each context owns its vocabulary
  • Aggregates enforce transactional consistency; keep them under ~100 child entities
  • Entities have identity; Value Objects are defined by attributes and must be immutable
  • Performance: oversized aggregates cause lock contention and slow transactions
  • Production insight: sharing a database table across contexts couples deployments and kills autonomy
Plain-English First

Imagine a hospital. The billing department calls a patient a 'payer with an account balance.' The doctors call the same person a 'patient with a diagnosis and treatment plan.' Both groups are talking about the same human being, but each has its own vocabulary, rules, and paperwork, and that's totally fine. Domain-Driven Design says: stop trying to force everyone to share one giant definition. Instead, let each department own its model of the world, speak its own language, and sync up only at the boundaries where they truly need to. That's it. That's DDD.

Most software systems don't fail because of bad algorithms or slow databases. They fail because the code stops making sense — not to the compiler, but to the team building it. Business rules get buried in utility classes. A 'Customer' object ends up carrying seventy fields because six different teams piled their needs into it. Changing one thing breaks three others in ways no one predicted. This is the silent killer of large codebases, and it's exactly the problem Domain-Driven Design was built to solve.

DDD, coined by Eric Evans in his 2003 book 'Domain-Driven Design: Tackling Complexity in the Heart of Software,' is a philosophy and a set of patterns for structuring software around the actual business domain. It argues that the biggest source of complexity isn't technical — it's conceptual. When your code's vocabulary doesn't match the business's vocabulary, every conversation between a developer and a domain expert becomes a translation exercise. Bugs hide in those translations. DDD eliminates the translation layer by making the code speak the same language as the business.

By the end of this article you'll be able to identify Bounded Contexts in a real system, design Aggregates that enforce invariants without becoming god objects, use Value Objects to eliminate primitive obsession, and understand exactly where DDD adds value versus where it becomes overkill. You'll also see the production gotchas that only show up when you're six months into a real DDD implementation, the things Evans' book doesn't warn you about. Raw theory never prevented a production outage; applied DDD patterns do.

And that's the whole point: if your code doesn't hurt when you read it after a month, you're not modelling hard enough. DDD hurts at first, then it clicks.

What Is Domain-Driven Design?

DDD is about structuring software around the business domain: the problem space the software is supposed to solve. The core premise: your code's vocabulary should match the domain experts' vocabulary. When it does, far fewer requirements get lost in translation and the code becomes largely self-documenting. When it doesn't, every feature request becomes a translation exercise that introduces bugs.

Here's what most intro articles skip: DDD isn't just about code structure — it's about power dynamics. When developers own the language instead of the business, they make assumptions that cost money. DDD flips this: domain experts define the terms, developers encode them. That's a governance model, not a design pattern.

Here's the trap most teams hit: they read Eric Evans' book, get excited, and immediately start drawing aggregates on a whiteboard. They skip the hard part – sitting with domain experts and agreeing on terms. Without Ubiquitous Language, DDD is just a fancy way to overengineer your code. Start with the language workshop. Do it until your domain expert can read your unit tests and say 'yes, that's what I meant.'

A real example: I once joined a project where the team had been doing 'DDD' for six months. When I asked the product manager to review a test, she said, "I don't know what 'statusCode' means. We say 'orderStage.'" That's when you know the language is broken. We renamed the enum, and suddenly the tests became reviewable by non-developers. That's the whole point.

Another way to think about it: DDD is a language alignment tool. If you're not aligning, you're leaking abstractions.

io/thecodeforge/ddd/Example.java (Java)
package io.thecodeforge.ddd;

/**
 * TheCodeForge: Domain-Driven Design Basics example.
 * The business term 'orderStage' is used, not 'statusCode'.
 */
public class Example {
    // Named after the term the business uses; the stage values are illustrative.
    enum OrderStage { PLACED, PAID, SHIPPED }

    public static void main(String[] args) {
        OrderStage orderStage = OrderStage.PLACED;
        System.out.println("Order stage: " + orderStage);
    }
}
Forge Tip:
Type this code yourself rather than copy-pasting. The muscle memory of writing it will help it stick. And more importantly, map the terms in the code back to real business conversations — that's where the real learning happens.
Production Insight
Teams that start with DDD often skip the ubiquitous language step.
They jump straight to aggregate design without agreeing on terms with domain experts.
Rule: never model a single entity without first defining the language with the business.
Another trap: using 'Customer' across contexts without realising each context needs its own version. You'll get a god object with fields nobody needs.
Worse: you'll end up with a database table that serves six teams, and every migration becomes a multi-team negotiation.
Real failure: a fintech startup defined 'Account' globally. After six months, the billing team needed 'Account' to have a 'creditLimit' while the trading team needed 'marginRequirements'. Sharing the class broke both features. Splitting into BillingAccount and TradingAccount fixed it, but cost three weeks of refactoring.
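The split from that story can be sketched roughly as follows. The class names and the creditLimit and marginRequirement concepts come from the story above; the method names, the BigDecimal representation, and the invariants themselves are assumptions made for illustration, not the startup's actual code.

```java
import java.math.BigDecimal;

// Billing context: an account is something that can be charged up to a credit limit.
final class BillingAccount {
    private final String accountId;
    private final BigDecimal creditLimit;
    private BigDecimal outstandingBalance = BigDecimal.ZERO;

    BillingAccount(String accountId, BigDecimal creditLimit) {
        this.accountId = accountId;
        this.creditLimit = creditLimit;
    }

    // Billing invariant (assumed): the outstanding balance never exceeds the credit limit.
    void charge(BigDecimal amount) {
        BigDecimal newBalance = outstandingBalance.add(amount);
        if (newBalance.compareTo(creditLimit) > 0) {
            throw new IllegalStateException("Credit limit exceeded for " + accountId);
        }
        outstandingBalance = newBalance;
    }

    BigDecimal outstandingBalance() {
        return outstandingBalance;
    }
}

// Trading context: an account is something that must hold enough equity as margin.
final class TradingAccount {
    private final String accountId;
    private final BigDecimal marginRequirement; // fraction of position value, e.g. 0.25
    private final BigDecimal equity;

    TradingAccount(String accountId, BigDecimal marginRequirement, BigDecimal equity) {
        this.accountId = accountId;
        this.marginRequirement = marginRequirement;
        this.equity = equity;
    }

    String accountId() {
        return accountId;
    }

    // Trading invariant (assumed): equity must cover the margin required for the position.
    boolean canOpenPosition(BigDecimal positionValue) {
        BigDecimal requiredMargin = positionValue.multiply(marginRequirement);
        return equity.compareTo(requiredMargin) >= 0;
    }
}

class AccountSplitDemo {
    public static void main(String[] args) {
        BillingAccount billing = new BillingAccount("acc-1", new BigDecimal("500"));
        billing.charge(new BigDecimal("200"));
        System.out.println("Outstanding balance: " + billing.outstandingBalance());

        TradingAccount trading =
            new TradingAccount("acc-1", new BigDecimal("0.25"), new BigDecimal("1000"));
        System.out.println("Can open position: " + trading.canOpenPosition(new BigDecimal("3000")));
    }
}
```

The two classes share only the account ID. Each team can add fields and rules to its own class without touching the other.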
Key Takeaway
DDD is a communication tool first, a code structure second.
If your code doesn't match the business vocabulary, you don't have DDD.
Start with the language, then model.
When to apply DDD?
If: Simple CRUD app with no complex business rules
Use: Skip DDD; use transaction script pattern instead.
If: Complex business domain with multiple teams and changing rules
Use: Apply DDD with Bounded Contexts, Aggregates, and Ubiquitous Language.
If: Legacy system with tangled models, but teams already understand the domain
Use: Use DDD to refactor gradually. Start with one Bounded Context and add Anti-Corruption Layers.
If: Startup building an MVP, where speed matters more than model purity
Use: Postpone DDD. Get something working first, then refactor toward DDD as complexity grows.

DDD in Practice: A Real-World Example

Take an e-commerce system. The 'Product' concept means different things to different teams. Inventory cares about stock locations and reorder points. Marketing cares about descriptions, images, and tags. The checkout team cares about price and availability. These are three different models of the same real-world thing. DDD says: don't share one Product class across all three. Create three Bounded Contexts: Inventory Product, Marketing Product, and Checkout Product. Each has its own lifecycle, its own invariants, and its own persistence. Communication between contexts happens through events or data transfer objects (DTOs), never through a shared database table.

The hardest part? Getting leadership to accept that 'Product' will be stored in three different tables. Engineers often resist because it feels like duplication. It's not — it's decoupling. Duplication of data is cheaper than coupling of teams.

In practice, you'll find teams that run a shared Product table because it's 'faster.' It's not – it's cheaper in the short term, but it costs you team autonomy. Once two teams share a table, they share deployment windows, migration schedules, and outage scope. That's not a database decision; it's an organisational decision.

Here's a concrete production story: a company had a single 'Product' table with 120 columns. Marketing wanted to add an 'SEO description' field, but the DBA said no because it would slow queries for inventory. That's a governance problem, not a technical one. The fix was splitting the table, and the teams, into separate contexts. The inventory team's write throughput more than doubled after removing marketing columns from their table.

And the metric that convinced leadership: after splitting, inventory writes went from 500 ops/s to 1200 ops/s. Marketing got their SEO field in one sprint. Shared nothing wins.

io/thecodeforge/ddd/checkout/Product.java (Java)
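package io.thecodeforge.ddd.checkout;

import io.thecodeforge.ddd.shared.Money;

/**
 * Checkout context's view of a Product.
 * Only cares about price and availability.
 */
public class Product {
    private final String productId;
    private final Money price;
    private final StockLevel stock;

    public Product(String productId, Money price, StockLevel stock) {
        this.productId = productId;
        this.price = price;
        this.stock = stock;
    }

    public boolean canBePurchased(int quantity) {
        return quantity > 0 && stock.hasAtLeast(quantity);
    }

    public Money totalPrice(int quantity) {
        // Needs a multiply(int) method on the shared Money class.
        return price.multiply(quantity);
    }

    /** Checkout's own notion of stock: how many units can be sold right now. */
    public record StockLevel(int unitsAvailable) {
        public boolean hasAtLeast(int quantity) {
            return unitsAvailable >= quantity;
        }
    }
}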
Mental Model: Different Lenses
  • Inventory Product: stock locations, reorder points, bin numbers.
  • Marketing Product: description, images, tags, SEO metadata.
  • Checkout Product: price, availability, discount eligibility.
Production Insight
The biggest failure here is using a shared Product table across all three contexts.
When Marketing adds a new image field, Inventory must run a migration they don't need.
Rule: each context gets its own database schema and its own service.
Second insight: eventually consistent updates between contexts can cause temporary data mismatches — that's acceptable. Design your UI to handle stale data gracefully.
Also: watch out for teams that use the same message queue topic for cross-context events — that couples your infrastructure. Use separate topics per context.
Real story: an online retailer used a single 'products' table for inventory and checkout. When checkout added a 'reservation_lock' column, inventory queries started timing out because the table grew too wide. Splitting into two tables eliminated the contention. Checkout writes went from 50ms to 5ms.
Key Takeaway
A single real-world entity produces multiple domain models.
Each context owns its own model and its own data.
Never let two contexts share a table.

Bounded Contexts — The Boundary That Prevents Chaos

A Bounded Context is the explicit boundary where a particular domain model applies. Inside the boundary, the language and terms have a specific meaning. Outside, they may mean something else — and that's intentional.

For example, in an e-commerce system, the 'Product' concept differs between inventory management (tracking stock levels) and marketing (tracking tags and images). Inventory doesn't care about the product description; marketing doesn't care about warehouse bin locations. Forcing them to share one model produces the same kind of bloat as the 70-field 'Customer' object described earlier.

Implementing a Bounded Context means defining a distinct module, service, or package boundary. Within that boundary, use the Ubiquitous Language. Communication between contexts happens through events or APIs, never through shared databases.

Common pitfall: implementing anti-corruption layers too early or too late. Start with a lightweight translation layer (maybe a map) and only jump to full ACL when you see cross-context coupling hurting velocity.
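A lightweight translation layer of the "maybe a map" kind could look like the sketch below. Everything in it is hypothetical: the billing plan codes, the SupportTier enum, and the translator class are invented to show the shape of the idea, not taken from a real system.

```java
import java.util.Map;
import java.util.Optional;

// Support context's own vocabulary (hypothetical).
enum SupportTier { STANDARD, PRIORITY }

// Boundary translation: billing's plan codes are mapped to support's tiers.
final class BillingToSupportTranslator {
    // The plan codes are invented for this example.
    private static final Map<String, SupportTier> TIER_BY_BILLING_PLAN = Map.of(
        "FREE", SupportTier.STANDARD,
        "PRO", SupportTier.STANDARD,
        "ENTERPRISE", SupportTier.PRIORITY
    );

    private BillingToSupportTranslator() {}

    // Unknown or missing codes yield empty, so billing's raw terms never leak into support.
    static Optional<SupportTier> toSupportTier(String billingPlanCode) {
        if (billingPlanCode == null) {
            return Optional.empty(); // Map.of maps reject null keys, so guard first
        }
        return Optional.ofNullable(TIER_BY_BILLING_PLAN.get(billingPlanCode));
    }
}

class TranslationDemo {
    public static void main(String[] args) {
        System.out.println(BillingToSupportTranslator.toSupportTier("ENTERPRISE"));
        System.out.println(BillingToSupportTranslator.toSupportTier("LEGACY_PLAN"));
    }
}
```

When the mapping stops being a simple lookup (conditional rules, several source fields, versioned payloads), that is usually the point at which a full anti-corruption layer earns its cost.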

Identifying Bounded Contexts in an existing system is messy. Look for the boundary where a term changes meaning. For example, when the billing team says 'customer' and the support team says 'customer' – they mean different things. That's your context boundary. Also, if a change in one module causes a cascade of test failures in another module, that module is likely violating your context boundary.

One more clue: ask your domain experts to draw a diagram of their business flows. The gaps and overlaps in their drawing are your context boundaries. Don't draw it yourself — let them show you where the seams are.

A subtle sign: if you have a 'Customer' class in a package called 'shared', you almost certainly have a context boundary violation. Shared model classes are a red flag. Rename it to something context-specific, even if it's initially identical.

io/thecodeforge/ddd/inventory/Product.java (Java)
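package io.thecodeforge.ddd.inventory;

public class Product {
    private final String sku;
    // Only the Inventory context cares about warehouse location.
    private final String warehouseZone;
    private final int availableQuantity;
    private int reservedQuantity;

    public Product(String sku, String warehouseZone, int availableQuantity) {
        this.sku = sku;
        this.warehouseZone = warehouseZone;
        this.availableQuantity = availableQuantity;
    }

    public boolean isAvailable(int quantity) {
        return (availableQuantity - reservedQuantity) >= quantity;
    }

    public void reserve(int quantity) {
        // A non-positive reservation would silently release stock, so reject it.
        if (quantity <= 0) throw new IllegalArgumentException("Quantity must be positive");
        if (!isAvailable(quantity)) throw new InsufficientStockException(sku);
        reservedQuantity += quantity;
    }

    /** Thrown when a reservation asks for more units than are free. */
    public static class InsufficientStockException extends RuntimeException {
        public InsufficientStockException(String sku) {
            super("Insufficient stock for SKU " + sku);
        }
    }
}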
Mental Model: Teams and Dictionaries
  • A Bounded Context maps to the team that owns it.
  • Two teams using the same word may refer to different concepts.
  • Translation between contexts happens at the boundary via an anti-corruption layer.
  • A context's persistence can be any technology — SQL, NoSQL, even a file — as long as it's private.
Production Insight
The most common failure is using the same database table across contexts.
Shared tables couple transactions and make it impossible to evolve one context without breaking another.
Rule: each Bounded Context gets its own database schema — no exceptions.
But here's the subtle one: even using separate tables in the same database creates deployment coupling. True isolation means separate databases or at least separate schemas.
Another gotcha: shared logging infrastructure cuts across every context. Don't let it leak one context's internal data to the teams or tools of another.
Real incident: a logistics company shared a 'Shipment' table between tracking and billing. When billing added a 'tariff_code' column, tracking's queries slowed by 40%. They split into two tables; tracking returned to normal, billing got its tariff code. No migration coordination needed after that.
Key Takeaway
Bounded Contexts prevent model pollution.
Each context owns its language and its data store.
Never share a table between two contexts.
When to split a Bounded Context?
If: Two teams frequently argue over the meaning of a domain term
Use: Create separate contexts with anti-corruption layers
If: A single concept has different lifecycle rules in different parts of the system
Use: Split into contexts and use events to sync
If: Performance of a module degrades when another module loads its data
Use: Split contexts and assign each its own data store
If: A team deploys independently but is blocked by another team's release schedule
Use: Split contexts. Each context should be a separate deployable unit.

Aggregates — Consistency Boundaries That Enforce Invariants

An Aggregate is a cluster of domain objects that must be treated as a single unit for data changes. Each Aggregate has a root entity (the Aggregate Root) that controls access to the rest of the cluster. All invariants (business rules) must be satisfied when the aggregate is committed.

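For example, an Order must not accept new line items once it has been confirmed, and it cannot be confirmed while it is empty. These invariants are enforced inside the Order aggregate root. The outside world only touches the root, never its children directly. This guarantees transactional consistency without locking huge swaths of the database. A rule such as 'don't sell more than is in stock' spans the Order and Inventory models, so it is usually checked across aggregates (for example through a stock reservation) rather than inside Order alone.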

You reference an aggregate by its global identity, not by navigating through other objects. This keeps relationships clean and avoids cascade problems.
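As a minimal sketch of referencing by identity: the order line below stores a ProductId rather than a Product object, so loading an order never drags a product graph along with it. The record names mirror the types used in this article's Order example, but their exact shape here is an assumption.

```java
import java.util.Objects;

// Identity of a Product aggregate that lives in another context (shape assumed).
record ProductId(String value) {
    ProductId {
        Objects.requireNonNull(value, "value");
    }
}

// The line holds the product's ID only; there is no Product field to navigate through.
record OrderLine(ProductId productId, int quantity) {
    OrderLine {
        Objects.requireNonNull(productId, "productId");
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive");
        }
    }
}

class ReferenceByIdDemo {
    public static void main(String[] args) {
        OrderLine line = new OrderLine(new ProductId("sku-42"), 3);
        // To show product details, a caller looks the product up by this ID in its own context.
        System.out.println("Line references product " + line.productId().value());
    }
}
```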

Now the part nobody talks about: aggregate boundaries are often wrong the first time. You'll discover this when a seemingly simple business rule change forces a database migration across multiple services. That's the signal that your aggregate boundary wasn't correct.

The rule of thumb: if you can't fit the aggregate's state on a single post-it note during a whiteboard session, it's too big. Aggregates should be small enough that a business expert can reason about their invariants in one breath. If they need to say 'and then also...' your aggregate is too large.

Also consider: aggregates are not just about transactional boundaries; they also define ownership. If two teams need to change the same aggregate, you have an organisational mismatch. That's a Conway's Law problem, and no amount of code will fix it.

Performance nuance: loading a large aggregate from the database means loading all its children. If you have Order with 1000 LineItems, you're loading 1000 rows into memory every time you touch the order. That's a waste if you only need to check the total. Consider splitting into smaller aggregates or using a separate read model for queries.
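One way to picture the separate read model is a precomputed total per order, updated whenever a line item is added, so that reading the total never loads the line items. The sketch below keeps it in memory; the class name, the method names, and the idea of updating it from a line-item notification are assumptions for illustration.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

// Hypothetical read model: one running total per order, no line items stored.
final class OrderTotalsReadModel {
    private final Map<String, BigDecimal> totalByOrderId = new HashMap<>();

    // Called whenever the write side adds a line item to an order.
    void onLineItemAdded(String orderId, BigDecimal unitPrice, int quantity) {
        BigDecimal lineTotal = unitPrice.multiply(BigDecimal.valueOf(quantity));
        totalByOrderId.merge(orderId, lineTotal, BigDecimal::add);
    }

    // A single lookup, however many line items the order has.
    BigDecimal totalFor(String orderId) {
        return totalByOrderId.getOrDefault(orderId, BigDecimal.ZERO);
    }
}

class ReadModelDemo {
    public static void main(String[] args) {
        OrderTotalsReadModel totals = new OrderTotalsReadModel();
        for (int i = 0; i < 1000; i++) {
            totals.onLineItemAdded("order-1", new BigDecimal("2.50"), 1);
        }
        System.out.println("Total for order-1: " + totals.totalFor("order-1"));
    }
}
```

The trade-off is that the read model is updated separately from the aggregate, so a query can briefly see a total that lags behind the latest write.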

A practical heuristic: start with a small boundary and expand only when you get a concrete business rule that requires transactional consistency across those objects. Premature aggregation is as dangerous as premature optimization.

io/thecodeforge/ddd/order/Order.java (Java)
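package io.thecodeforge.ddd.order;

import io.thecodeforge.ddd.shared.Money;

import java.util.ArrayList;
import java.util.List;

public class Order {
    private final String orderId;
    private final List<OrderLine> lineItems = new ArrayList<>();
    private boolean confirmed;

    public Order(String orderId) {
        this.orderId = orderId;
    }

    // Invariant: a confirmed order is frozen.
    public void addItem(ProductId productId, Money unitPrice, int quantity) {
        if (confirmed) throw new IllegalStateException("Order already confirmed");
        if (quantity <= 0) throw new IllegalArgumentException("Quantity must be positive");
        lineItems.add(new OrderLine(productId, unitPrice, quantity));
    }

    public Money total() {
        // Money.ZERO is a USD amount, so this sum assumes USD line items.
        return lineItems.stream()
            .map(OrderLine::subtotal)
            .reduce(Money.ZERO, Money::add);
    }

    // Invariant: an empty order cannot be confirmed.
    public void confirm() {
        if (lineItems.isEmpty()) throw new IllegalStateException("Cannot confirm empty order");
        this.confirmed = true;
    }

    /** The Product aggregate is referenced by ID only, never by object. */
    public record ProductId(String value) {}

    /** Child of the aggregate; created only through Order.addItem. */
    public record OrderLine(ProductId productId, Money unitPrice, int quantity) {
        Money subtotal() {
            // Needs a multiply(int) method on the shared Money class.
            return unitPrice.multiply(quantity);
        }
    }
}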
Aggregate Size Trap
An aggregate that grows too large becomes a performance bottleneck. Each transaction loads the entire aggregate into memory. If your aggregate routinely contains thousands of child entities, you've drawn the boundary too wide. Rethink – maybe those children don't all need transactional consistency.
Production Insight
Large aggregates cause long-running transactions and lock escalation.
Always keep aggregates small — aim for fewer than 100 child objects on average.
If you must update a large collection, consider splitting into sub-aggregates with eventual consistency.
Another trap: using the repository pattern to fetch aggregates but then loading additional unrelated data — that defeats the purpose. Keep repositories aggregate-root specific.
Also: watch out for aggregates that are always read but rarely written. Those are candidates for splitting: reads can be served by a separate read model without locking.
Real failure: a payment provider had an Order aggregate with 500 line items. Each confirmation loaded all 500 into memory. After switching to a lightweight read model for totals, transaction latency dropped from 2s to 50ms.
Key Takeaway
Aggregate boundaries define transactional consistency.
Keep aggregates small to avoid lock contention.
Always access children only through the aggregate root.
When to split an Aggregate?
If: A business rule requires checking a condition across many child entities (e.g., total weight of all items)
Use: Keep them in one aggregate if the check must be transactional; otherwise split.
If: The aggregate has more than 100 child entities on average in production
Use: Split: move child entities into separate aggregates and use eventual consistency for invariants.
If: Two parts of the aggregate are updated by different teams
Use: Split: ownership boundaries should align with aggregate boundaries.
If: A change to the aggregate root often happens without touching its children
Use: Consider splitting: the children might be a separate aggregate.

Entities vs Value Objects — Identity Matters

Entities have a unique identity that persists through time. For example, a 'User' entity is the same user even if they change their email or password. You compare entities by ID, not by field values.

Value Objects, on the other hand, are defined entirely by their attributes. A 'Money' object with amount 100 and currency 'USD' is equal to another Money with the same values. Two order lines with identical product and quantity can be swapped — they have no separate identity.

The rule: if you care about 'who' it is, it's an Entity. If you care about 'what' it is, it's a Value Object. Value Objects should be immutable and side-effect-free.
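The difference shows up most clearly in equals. In the sketch below, User is an entity compared by ID alone, while Address and City are records, which Java compares component by component. The class and field names are illustrative assumptions.

```java
import java.util.Objects;

// Entity: equality is based on identity only, so changing the email keeps it the same user.
final class User {
    private final String userId;
    private String email;

    User(String userId, String email) {
        this.userId = Objects.requireNonNull(userId, "userId");
        this.email = Objects.requireNonNull(email, "email");
    }

    void changeEmail(String newEmail) {
        this.email = Objects.requireNonNull(newEmail, "newEmail");
    }

    String email() {
        return email;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof User other && userId.equals(other.userId);
    }

    @Override
    public int hashCode() {
        return userId.hashCode();
    }
}

// Value Objects: no ID, no setters; equality covers every component.
record City(String name) {}

record Address(String street, City city) {}

class IdentityDemo {
    public static void main(String[] args) {
        User before = new User("u-1", "old@example.com");
        User after = new User("u-1", "new@example.com");
        System.out.println("Same user: " + before.equals(after));

        Address a = new Address("1 Main St", new City("Leeds"));
        Address b = new Address("1 Main St", new City("Leeds"));
        System.out.println("Same address: " + a.equals(b));
    }
}
```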

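Production pitfall: performance. If you create a Value Object in a hot loop (e.g., Money inside a large stream operation), object allocation can become a GC problem. Profile before optimising: Java records cut boilerplate but are still ordinary heap objects, so they don't remove the allocation by themselves.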

One hidden trap: Value Objects that reference other Value Objects. If an Address contains a City, and City is a Value Object, then Address becomes composed of values. That's fine – but don't give City an ID. The moment you add an ID, City becomes an Entity and the composition semantics change. Stick to 'is-a-value' all the way down.

Another common mistake: using primitive types (string, int) instead of Value Objects to represent domain concepts. This is called primitive obsession. A Price is not a double; it's a Money with currency. A PhoneNumber is not a string; it's a structured value with formatting rules. DDD says: wrap every primitive that has business meaning in a Value Object. The code will tell you what it means.
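Here is a hedged sketch of wrapping a primitive: a PhoneNumber record that normalises and validates its input once, at construction. The validation pattern is an assumption loosely modelled on international-style numbers (a plus sign followed by 7 to 15 digits); a real system would follow whatever rules its domain experts specify.

```java
// Value Object replacing a raw String; invalid numbers cannot exist as instances.
record PhoneNumber(String value) {
    PhoneNumber {
        // Assumed rule: '+' followed by 7 to 15 digits, first digit non-zero.
        if (value == null || !value.matches("\\+[1-9][0-9]{6,14}")) {
            throw new IllegalArgumentException("Not a valid phone number: " + value);
        }
    }

    // Accepts common human formatting and stores a single normalised form.
    static PhoneNumber parse(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("Phone number is required");
        }
        return new PhoneNumber(raw.replaceAll("[\\s()\\-.]", ""));
    }
}

class PhoneNumberDemo {
    public static void main(String[] args) {
        PhoneNumber a = PhoneNumber.parse("+1 (415) 555-0100");
        PhoneNumber b = PhoneNumber.parse("+14155550100");
        System.out.println("Normalised: " + a.value());
        System.out.println("Equal regardless of formatting: " + a.equals(b));
    }
}
```

Once the type exists, a method that takes a PhoneNumber no longer needs its own format checks, and a stray email address can't be passed where a phone number is expected.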

I once saw a codebase where 'Amount' was a BigDecimal everywhere. Every method that dealt with money had to check the currency manually. After introducing a Money Value Object, the nullability checks and currency mismatch bugs disappeared. That's the power of making the type system work for you.

Another heuristic: if you can replace the object with a literal in a unit test and the test still makes sense, it's likely a Value Object.

io/thecodeforge/ddd/shared/Money.java (Java)
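package io.thecodeforge.ddd.shared;

import java.math.BigDecimal;
import java.util.Objects;

public final class Money {
    // Zero is currency-specific; this constant only works as the identity for USD sums.
    public static final Money ZERO = new Money(BigDecimal.ZERO, "USD");

    private final BigDecimal amount;
    private final String currency;

    public Money(BigDecimal amount, String currency) {
        this.amount = Objects.requireNonNull(amount, "amount");
        this.currency = Objects.requireNonNull(currency, "currency");
    }

    public Money add(Money other) {
        if (!this.currency.equals(other.currency)) throw new IllegalArgumentException("Currency mismatch");
        return new Money(this.amount.add(other.amount), this.currency);
    }

    // Returns a new instance; the receiver is never modified.
    public Money multiply(int factor) {
        return new Money(this.amount.multiply(BigDecimal.valueOf(factor)), this.currency);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Money)) return false;
        Money m = (Money) o;
        // compareTo ignores scale, so 100 and 100.00 are equal; BigDecimal.equals would say no.
        return amount.compareTo(m.amount) == 0 && currency.equals(m.currency);
    }

    @Override
    public int hashCode() {
        // stripTrailingZeros keeps hashCode consistent with the scale-insensitive equals.
        return amount.stripTrailingZeros().hashCode() * 31 + currency.hashCode();
    }
}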
Mental Model: Person vs Cash
  • Entities have a thread of identity — they change over time.
  • Value Objects are snapshots — they are equal if all attributes match.
  • Value Objects should be immutable to avoid side effects.
  • Never give a Value Object an ID — it violates the definition and causes identity confusion.
Production Insight
Treating a Value Object as an Entity (giving it an ID) bloats the model and leads to identity confusion.
A common mistake is adding an auto-generated ID to Money — now you can't tell if $10 today is the 'same' $10 tomorrow.
Rule: if it has no lifecycle, it's a Value Object. Don't give it an ID.
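Performance insight: Value Objects are created and discarded frequently, so allocation-heavy code paths can add GC pressure. In Java, records make them cheaper to write, not cheaper to allocate; languages with true value types (such as C# structs) can avoid the heap allocation. Measure before optimising.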
Also: watch out for Value Objects that contain Entity references — that breaks the semantics. A Value Object should only contain other Value Objects or primitives.
Real story: a trading system used an 'Amount' class with an ID field. Developers started comparing Amounts by ID, leading to two orders with $100 having different ID and being considered different. Removing the ID and switching to Value Object semantics fixed the bug and uncovered three other comparison bugs.
Key Takeaway
Entities have identity; Value Objects have attributes.
Compare entities by ID, value objects by fields.
Make value objects immutable to avoid tracking changes.
When to pick Entity vs Value Object?
If: You need to track changes over time and preserve history
Use: Make it an Entity with a unique ID.
If: Two objects with same values are interchangeable
Use: Make it a Value Object, ensure immutability.
If: The object has a lifecycle with state transitions (e.g., Order -> Shipped, Delivered)
Use: Entity. Identity matters across states.
If: The object is a simple property bag with no independent lifecycle
Use: Value Object, e.g., Address, Money, DateRange.

Ubiquitous Language — The One Rule That Makes DDD Work

Ubiquitous Language is the practice of using the same vocabulary in code, conversations, documentation, and domain experts' speech. When a domain expert says 'order is confirmed', the code should have an Order class with a confirm() method — not a 'status update' to a database column named 'flag_34'.

This sounds trivial, but in practice it's the hardest rule to follow. Teams slip into technical jargon or business shortcuts because they're faster in the moment. The result: bugs, misunderstandings, and a growing gap between what the business wants and what the system does.

The commitment is: whenever you discover a term that isn't in the Ubiquitous Language, either introduce it with the team's agreement or rename it. This is a continuous discipline, not a one-time exercise.

Hard truth: most teams fail at Ubiquitous Language within the first six months. The solution isn't a glossary doc — it's pairing developers with domain experts during refinement sessions. If you're not sitting next to the business when you write the code, your language will drift.

The best indicator of healthy Ubiquitous Language: can a product manager walk through the codebase and nod along? If they can't, your language is broken. Schedule regular 'language reviews' where the team reads through the domain model with a domain expert and flags terms that don't match.

One more thing: Ubiquitous Language applies to everything — APIs, event names, database column names. If you have a column called 'cust_status_cd', that's technical debt. Rename it to 'customer_status'. Every new developer will thank you.

A practical exercise: take your most recent user story. Write the acceptance criteria using only domain terms. Then go to your code and see if those terms exist as types, methods, or properties. If not, you have a gap.

And the final test: if you can't onboard a business analyst in two days to write acceptance tests, your language is a barrier.

io/thecodeforge/ddd/language/Order.java (Java)
package io.thecodeforge.ddd.language;

import java.time.Instant;

/**
 * Ubiquitous Language: 'order status' not 'flag_34'.
 */
public class Order {
    private String orderId;
    private OrderStatus status = OrderStatus.PENDING;  // matches business term
    private Instant confirmedAt;

    public void confirm() {
        // Only a pending order can be confirmed; this also rejects shipped or cancelled orders.
        if (status != OrderStatus.PENDING) throw new IllegalStateException("Only a pending order can be confirmed");
        this.status = OrderStatus.CONFIRMED;
        this.confirmedAt = Instant.now();
    }

    public enum OrderStatus {
        PENDING, CONFIRMED, SHIPPED, DELIVERED, CANCELLED
    }
}
Language Rot
The biggest sign of Ubiquitous Language decay is seeing field names like 'statusCode', 'typeFlag', or 'processedInd' in domain entities. These are technical artifacts leaking into the domain model. Rename them to match business terms.
Production Insight
When Ubiquitous Language degrades, code reviews become translation exercises.
New developers spend weeks mapping database columns to domain concepts.
Rule: if a business term doesn't appear in at least one class or method name, fix it immediately.
Another sign: your domain expert can't read your code. If they can't follow a scenario in the unit tests, your language is broken.
Also: be careful with translations. If your company operates in multiple languages, Ubiquitous Language should be in the primary business language (usually English for tech), but you may need to maintain a translation map for international teams.
Real incident: a healthcare startup used 'member' in code but the business said 'patient'. Developers thought they were synonyms. They weren't — 'member' had different insurance rules. Renaming the class from Member to Patient took two weeks but fixed four open production bugs related to incorrect insurance validation.
Key Takeaway
Ubiquitous Language bridges business and code.
Every business term must have a corresponding code symbol.
If you can't find the term in the code, the language is broken.
When to enforce Ubiquitous Language?
If: You're writing a new domain class or method
Use: Name it using the business term, not a technical abbreviation.
If: You find a column named 'flag_34'
Use: Rename it to the domain meaning. Write a migration.
If: A domain expert cannot understand your unit test
Use: Refactor test to use business language. Add a glossary.
If: Two teams use different terms for the same concept
Use: Either align terms or create separate contexts with translation.

Domain Events — How Contexts Communicate Without Coupling

Domain Events are the mechanism Bounded Contexts use to communicate asynchronously. When something important happens in one context (e.g., OrderConfirmed), it publishes an event. Other contexts subscribe and react. This keeps each context independent while still synchronizing state.

A well-designed Domain Event is immutable, includes the aggregate ID and a timestamp, and carries only the data the subscribers need — nothing more. The publishing context does not know who subscribes.

Production hazard: events that grow too large. If you put the entire order object into an OrderConfirmed event, you've coupled the subscriber to the publisher's internal structure. Keep events lean — use IDs and let subscribers fetch the rest via API if needed.

Versioning Domain Events is the part everyone forgets. Once you publish an event, you can't change its schema without breaking subscribers. Use Avro or Protobuf with schema registry, or at minimum, include a version field in the event and never remove fields – only add optional ones.
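The "version field plus optional additions" rule can be sketched as a tolerant reader: a consumer that accepts both the old and the new payload shape. In the example below, the event name, the couponCode field, and the Map-based payload are all assumptions standing in for whatever serialisation format is really in use.

```java
import java.util.Map;
import java.util.Optional;

// Version 2 of a hypothetical event: 'couponCode' was added as an optional field.
record OrderConfirmedV2(int schemaVersion, String orderId, Optional<String> couponCode) {

    // Tolerant reader: works for v1 payloads (no couponCode) and v2 payloads alike.
    static OrderConfirmedV2 fromPayload(Map<String, Object> payload) {
        // Payloads written before the version field existed are treated as version 1.
        Integer version = (Integer) payload.getOrDefault("schemaVersion", 1);
        String orderId = (String) payload.get("orderId");
        if (orderId == null) {
            throw new IllegalArgumentException("orderId required");
        }
        Optional<String> couponCode = Optional.ofNullable((String) payload.get("couponCode"));
        return new OrderConfirmedV2(version, orderId, couponCode);
    }
}

class EventVersioningDemo {
    public static void main(String[] args) {
        Map<String, Object> v1 = Map.of("orderId", "o-1");
        Map<String, Object> v2 = Map.of("schemaVersion", 2, "orderId", "o-2", "couponCode", "SPRING");

        System.out.println(OrderConfirmedV2.fromPayload(v1));
        System.out.println(OrderConfirmedV2.fromPayload(v2));
    }
}
```

Because the new field is optional and nothing was removed, publishers can be upgraded before or after consumers without breaking either side.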

Another common mistake: publishing events before the transaction commits. If the transaction later rolls back, you've sent an event that never happened. Always publish events after the transaction is committed — use an outbox pattern if necessary.

Real production story: a team was using Domain Events to sync order data between contexts. They published the event before the transaction committed. A network timeout caused the transaction to roll back, but the event had already been consumed by the shipping context. The shipping context kicked off a fulfillment process for an order that didn't exist in the system. Recovering from that was a nightmare. Outbox pattern would have prevented it.
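The core of the outbox pattern fits in a few lines: the event is written to an outbox table in the same transaction as the state change, and a separate relay publishes it only after the commit. The sketch below simulates this in memory. InMemoryDatabase, OrderService, and OutboxRelay are hypothetical stand-ins; a real implementation would use database transactions and a message broker.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory stand-in for a database with an 'orders' table and an 'outbox' table.
final class InMemoryDatabase {
    final List<String> orders = new ArrayList<>();
    final List<String> outbox = new ArrayList<>();

    // Snapshots both tables and restores them if the work throws, imitating rollback.
    void inTransaction(Runnable work) {
        List<String> ordersBefore = new ArrayList<>(orders);
        List<String> outboxBefore = new ArrayList<>(outbox);
        try {
            work.run();
        } catch (RuntimeException e) {
            orders.clear();
            orders.addAll(ordersBefore);
            outbox.clear();
            outbox.addAll(outboxBefore);
            throw e;
        }
    }
}

final class OrderService {
    private final InMemoryDatabase db;

    OrderService(InMemoryDatabase db) {
        this.db = db;
    }

    // The order row and the outbox row are written in one transaction: both or neither.
    void confirm(String orderId, boolean failBeforeCommit) {
        db.inTransaction(() -> {
            db.orders.add(orderId);
            db.outbox.add("OrderConfirmed:" + orderId);
            if (failBeforeCommit) {
                throw new IllegalStateException("simulated timeout before commit");
            }
        });
    }
}

// Runs separately from the request: drains committed outbox rows and publishes them.
final class OutboxRelay {
    private final InMemoryDatabase db;
    final List<String> published = new ArrayList<>();

    OutboxRelay(InMemoryDatabase db) {
        this.db = db;
    }

    void publishPending() {
        published.addAll(db.outbox);
        db.outbox.clear();
    }
}

class OutboxDemo {
    public static void main(String[] args) {
        InMemoryDatabase db = new InMemoryDatabase();
        OrderService service = new OrderService(db);
        OutboxRelay relay = new OutboxRelay(db);

        service.confirm("o-1", false);
        try {
            service.confirm("o-2", true);
        } catch (IllegalStateException e) {
            System.out.println("Rolled back: " + e.getMessage());
        }

        relay.publishPending();
        System.out.println("Published: " + relay.published);
    }
}
```

The rolled-back order never reaches the relay, which is exactly what the shipping context in the story needed. A real relay can crash between publishing and marking a row as sent, so consumers should still expect the occasional duplicate.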

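A subtle pitfall: using the same event bus for domain events and integration events. Strictly speaking, domain events stay within a context and integration events cross contexts (the cross-context events in this section are integration events in that sense). They need different guarantees, and mixing them on one bus couples your infrastructure.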

io/thecodeforge/ddd/shared/events/OrderConfirmed.java · JAVA
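package io.thecodeforge.ddd.shared.events;

import io.thecodeforge.ddd.shared.Money;
import java.time.Instant;

// Immutable event: a record has no setters, so subscribers cannot mutate it.
public record OrderConfirmed(
    String eventId,      // unique per event; consumers use it to deduplicate
    String orderId,      // ID of the aggregate that raised the event
    Instant occurredOn,
    Money totalAmount
) {
    public OrderConfirmed {
        if (eventId == null || eventId.isBlank()) throw new IllegalArgumentException("eventId required");
        if (orderId == null || orderId.isBlank()) throw new IllegalArgumentException("orderId required");
        if (occurredOn == null) throw new IllegalArgumentException("occurredOn required");
    }
}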
Event Design Rule
Keep Domain Events small — only include the aggregate ID and data that external contexts absolutely need. If a subscriber needs more data, they should query via an API or a read model, not receive it in the event payload.
Production Insight
Events without idempotency keys cause duplicate processing on retries.
Each event must carry a unique eventId so consumers can deduplicate.
Rule: always assume a subscriber may process the same event twice.
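Also: choose your event bus carefully. Kafka preserves ordering only within a partition and retains events for replay; RabbitMQ typically delivers with lower latency, but ordering can be lost when messages are requeued or when several consumers share a queue. Know your trade-offs.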
Another trap: using the same topic for multiple event types. Each event type should have its own topic or subject to avoid schema evolution issues.
Real failure: an e-commerce platform used a single 'order_events' topic for OrderPlaced, OrderConfirmed, and OrderShipped. When they added OrderCancelled, schema registry validation broke because consumers expected one schema per topic. Splitting into separate topics fixed it, but took a weekend migration.
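A sketch of the deduplication rule from this box, assuming the consumer can remember which `eventId` values it has already handled. The in-memory set stands in for a deduplication table, and `IdempotentConsumer` is a hypothetical name.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

// Wraps a handler so that each eventId is processed at most once.
// A real deduplication table would be written in the same transaction
// as the handler's side effects.
class IdempotentConsumer {
    private final Set<String> processedEventIds = new HashSet<>();
    private final Consumer<String> handler;

    IdempotentConsumer(Consumer<String> handler) {
        this.handler = handler;
    }

    // Returns true if the event was handled, false if it was a duplicate delivery.
    boolean onEvent(String eventId, String orderId) {
        if (processedEventIds.contains(eventId)) {
            return false;
        }
        handler.accept(orderId);
        processedEventIds.add(eventId); // recorded only after the handler succeeds
        return true;
    }
}
```

Recording the ID only after the handler succeeds means a failed attempt is retried rather than skipped, which matches the "assume it may arrive twice" rule.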
Key Takeaway
Domain Events enable decoupled integration between contexts.
Keep events lean with IDs and essential data.
Idempotency is non-negotiable — events can be redelivered.
When to use Domain Events?
If: You need to notify other contexts about a change
Use: Publish a Domain Event after the transaction commits.
If: Multiple subscribers react to the same event
Use: Use event-driven approach; ensure idempotency.
If: Event schema needs to evolve over time
Use: Use schema registry, version the event, and never remove fields.
If: Event must be published reliably
Use: Implement outbox pattern to publish after commit.

DDD and Microservices: Mapping Contexts to Services

One of the most common questions when adopting DDD is: how do Bounded Contexts map to microservices? The answer: they often align, but they're not the same thing. A Bounded Context is a conceptual boundary — it defines where a particular model applies. A microservice is a deployment unit — it defines what runs independently.

In many teams, each Bounded Context becomes one or more microservices. But you can also have multiple contexts inside a single service, especially in a modular monolith. The key is that contexts remain isolated in code and data even if they deploy together.

The danger is over-splitting: creating a microservice for every aggregate without considering operational cost. You end up with dozens of services, distributed transactions, and complex orchestration. Start with coarse context boundaries and split only when team autonomy or scaling demands it.

Common pattern: use a 'shared kernel' between contexts that are closely related, but this often turns into a shared mess. Prefer anti-corruption layers and published language via events.

When you do split into microservices, remember that network boundaries are real. You lose the ability to enforce invariants transactionally. You'll need sagas or process managers to maintain consistency. Don't take that lightly.
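As a rough illustration of what a saga replaces the transaction with, the sketch below runs a list of steps and undoes the completed ones in reverse order when a later step fails. It is an in-process simplification: `SagaStep` and `Saga` are hypothetical names, and in a real system each step would be a call or message to another service.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// One saga step: a local action in some service plus the action that undoes it.
record SagaStep(String name, Runnable action, Runnable compensation) {}

class Saga {
    // Runs steps in order. If a step throws, the steps that already completed
    // are compensated in reverse order. Returns true if every step succeeded.
    static boolean run(List<SagaStep> steps) {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            try {
                step.action().run();
                completed.push(step);
            } catch (RuntimeException failure) {
                while (!completed.isEmpty()) {
                    completed.pop().compensation().run();
                }
                return false;
            }
        }
        return true;
    }
}
```

Compensation is not a rollback: other services may have observed the intermediate state before it was undone. That is the cost you accept when an invariant crosses a network boundary.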

A pragmatic rule: if you have fewer than 10 developers, don't split into microservices based on DDD boundaries alone. Start as a modular monolith with clear package boundaries. Split when the team grows or when your CI pipeline becomes a bottleneck.

A good litmus test: if deploying a change to one context requires QA sign-off from another team, you've got a coupling problem that splitting won't fix on its own.

io/thecodeforge/ddd/shared/ProductTranslator.java · JAVA
package io.thecodeforge.ddd.shared;

// Example anti-corruption layer translating between Inventory Product and Checkout Product.
// Checkout receives only the fields it needs; the rest of the Inventory model stays out.
public final class ProductTranslator {
    private ProductTranslator() {} // static helper, not meant to be instantiated

    public static CheckoutProduct toCheckoutProduct(InventoryProduct inventoryProduct, Money price) {
        return new CheckoutProduct(
            inventoryProduct.sku(),
            price,
            StockLevel.from(inventoryProduct.availableQuantity())
        );
    }
}
Microservice Over-splitting
Don't create a separate microservice for every aggregate. The operational overhead of many small services often outweighs the benefits. Start with coarse-grained services aligned to Bounded Contexts, and split only when you have clear team boundaries or performance needs.
Production Insight
Splitting a context into multiple microservices introduces network latency and eventual consistency.
You lose the ability to enforce invariants transactionally across those services.
Rule: keep an aggregate's transaction boundary within one service; use sagas for cross-service consistency.
Another gotcha: distributed tracing becomes essential. Without it, debugging a failed saga across five services is a nightmare. Invest in trace IDs from day one.
Real story: a team split their Order context into OrderService, LineItemService, and PaymentService. A single order creation now involved three HTTP calls plus a saga. Latency went from 50ms to 500ms. They merged back into a single service; latency dropped to 60ms and the code was simpler.
Key Takeaway
Bounded Contexts are conceptual; microservices are deployment units.
They align but are not identical. Start coarse, split based on team needs.
Over-splitting kills velocity – measure operational cost before cutting.
When to separate a context into a standalone service?
If: A context is owned by a different team with independent release cadence
Use: Split into a separate service with its own deployment pipeline.
If: A context has different scaling requirements (e.g., high throughput reads vs low writes)
Use: Consider separate services with independent scaling.
If: Changes to the context happen infrequently but the rest of the system deploys daily
Use: Keep as a separate module within a monolith; only split if the change cadence mismatch causes friction.
If: The context needs its own data store technology (e.g., graph DB vs relational)
Use: Split into separate service – mixing data stores increases complexity.

Anti-Corruption Layer — Protecting Your Domain from External Models

An Anti-Corruption Layer (ACL) is a protective boundary that translates between two Bounded Contexts. It prevents one context's model from leaking into and corrupting another's. This is especially important when integrating with legacy systems or third-party APIs where you can't control the model.

The ACL typically consists of a set of translators, adapters, and facade services. It maps external concepts to your internal Ubiquitous Language. For example, a legacy CRM's 'Account' object might map to your 'Customer' and 'Contract' aggregates.

Don't overbuild your ACL. Start with simple mapping functions. If you need to handle complex transformations, consider using a separate anti-corruption service. The key is that changes to the external system's model only affect the ACL, not your core domain.

Production gotcha: people often forget to version the ACL's translations. When the external model changes, the ACL must be updated. Without versioning, you'll get silent data corruption. Test ACL mappings with contract tests.

Another important point: the ACL belongs to the consuming context, not the provider. The team that consumes the external data owns the translation. That way they control when and how to update it.

Real-world example: a fintech company integrated with a legacy core banking system. The legacy system had an 'account status' with the values 'active', 'dormant', and 'closed'. The company's domain model had a 'CustomerStatus' with 'Active', 'Inactive', and 'Suspended'. The ACL translated one to the other. When the legacy system added 'suspended' as a status, the ACL had no mapping for it and mis-translated it silently until a customer complained that they couldn't trade. The fix was a contract test on the ACL that warned when new statuses appeared.
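A contract check of that kind can be very small. The sketch below assumes the consuming team can obtain the list of statuses the provider currently emits (from a sandbox, a schema, or sampled data); `LegacyStatusContract` and `unmappedStatuses` are hypothetical names.

```java
import java.util.Set;
import java.util.TreeSet;

// Contract check owned by the consuming context: compares the statuses the
// provider currently exposes with the ones the ACL knows how to translate.
class LegacyStatusContract {
    // Statuses the ACL maps today, taken from the example above.
    static final Set<String> MAPPED_STATUSES = Set.of("active", "dormant", "closed");

    // Returns the statuses the provider sends that the ACL cannot translate yet.
    static Set<String> unmappedStatuses(Set<String> providerStatuses) {
        Set<String> unmapped = new TreeSet<>(providerStatuses);
        unmapped.removeAll(MAPPED_STATUSES);
        return unmapped;
    }
}
```

Run it in CI against the provider's current output and fail the build, or raise an alert, whenever the returned set is not empty. With the new 'suspended' status it would have returned that value on the first run after the legacy change.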

A recurring pattern: teams put the ACL in the provider's codebase. Wrong. The consumer must own it, because the consumer decides what the domain model looks like.

io/thecodeforge/ddd/acl/LegacyCustomerTranslator.java · JAVA
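package io.thecodeforge.ddd.acl;

import io.thecodeforge.ddd.shared.Money;
import java.util.Locale;

public final class LegacyCustomerTranslator {
    private LegacyCustomerTranslator() {}

    public static Customer toCustomer(LegacyAccount legacy) {
        return new Customer(
            legacy.getAccountId(),
            legacy.getPrimaryContactName(),
            Money.of(legacy.getBalanceAmount(), legacy.getBalanceCurrency()),
            toStatus(legacy.getStatus())
        );
    }

    // Explicit mapping: a missing or unknown legacy status fails loudly
    // instead of silently defaulting to Inactive.
    private static CustomerStatus toStatus(String legacyStatus) {
        if (legacyStatus == null) throw new IllegalArgumentException("legacy status required");
        return switch (legacyStatus.toLowerCase(Locale.ROOT)) {
            case "active" -> CustomerStatus.Active;
            case "dormant", "closed" -> CustomerStatus.Inactive;
            default -> throw new IllegalArgumentException("Unmapped legacy status: " + legacyStatus);
        };
    }
}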
ACL Design Principle
The ACL belongs to the consuming context, not the provider. That way the consuming team controls when and how to update translations. Never let an external team dictate your domain model.
Production Insight
ACLs that are too thin let corruption through.
ACLs that are too thick become a maintenance burden.
Rule: translate only what you need — don't expose the entire external model.
Another failure: relying on runtime reflection for ACL translation — it makes debugging a nightmare. Use explicit mappers.
Also: don't forget to log ACL translation failures. A silent mis-translation can corrupt data for months before anyone notices. Add alerts for mapping errors.
Real incident: a retailer used reflection to map a legacy ERP's product fields to their domain. When the ERP added a 'weight' field, the reflective mapping silently picked it up and populated a 'weight' field that the domain model had never deliberately mapped. The bug went undetected for three months until a shipping cost calculation used the wrong weight. They rewrote the ACL with explicit mapping and added contract tests.
Key Takeaway
Anti-Corruption Layers protect your domain from external models.
Translations should be explicit, versioned, and tested.
Own the ACL from the consuming side.
When to use an Anti-Corruption Layer?
If: Integrating with a legacy system whose model you can't change
Use: Implement ACL to translate to your model.
If: Using a third-party API with a different vocabulary
Use: Create ACL to isolate your domain from API changes.
If: Two internal contexts need to communicate but have different models
Use: Use ACL on the consuming side to translate.
If: External system changes frequently and you want to minimise impact
Use: ACL with contract tests to catch changes early.

Context Mapping — Visualizing Relationships Between Bounded Contexts

Context Mapping is the practice of documenting the relationships between Bounded Contexts. It's an essential tool for understanding integration points, shared kernels, and translation requirements.

Common relationship patterns include:
  • Partnership: two contexts cooperate on a shared goal
  • Shared Kernel: a small shared subset of the model (risky)
  • Customer-Supplier: an upstream context (supplier) provides data to a downstream context (customer), whose needs feed into the supplier's planning
  • Conformist: one context adopts the other's model without translation
  • Anti-Corruption Layer: protects the downstream context
  • Open Host Service: one context exposes a stable API for others
  • Published Language: both sides agree on a common interchange format

Draw a context map on a whiteboard during architecture reviews. It'll surface hidden dependencies. The map should be living documentation — update it whenever you change integration patterns.

Production insight: most teams skip context mapping until they have a broken integration. By then it's too late. Start the map early. Also, avoid shared kernels unless you have a dedicated team to manage them — they rot fast.

One more tip: color-code your context map by team ownership. It makes it immediately obvious where one team's changes can break another. That visual feedback is worth more than a hundred wiki pages.

A practical approach: use a tool like Miro or Structurizr to create a living context map. Link it to your code repositories. When a developer creates a new integration, they should update the map. Make it part of the definition of done.

A real example: a bank had 15 Bounded Contexts but no map. During an upgrade, the Payments team changed their API contract and seven downstream contexts broke silently. After creating a map, they discovered they had five undocumented conformist relationships. They invested in ACLs for each, and subsequent upgrades went smoothly.

io/thecodeforge/ddd/mapping/context-map.json · JSON
{
  "contexts": [
    {
      "name": "Ordering",
      "team": "Checkout",
      "relationships": [
        {
          "type": "Customer-Supplier",
          "target": "Inventory",
          "description": "Ordering queries stock availability"
        },
        {
          "type": "Partnership",
          "target": "Billing",
          "description": "Coordination on payment events"
        }
      ]
    }
  ]
}
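Once the map exists as data, it can answer the question that hurt the bank: "who breaks if this context changes?" The sketch below assumes the JSON has already been parsed into simple records. `Relationship`, `ContextMap`, `affectedBy`, and `ofType` are hypothetical names, and the "Shipping" context in the usage note is made up for the example.

```java
import java.util.List;
import java.util.stream.Collectors;

// One edge of the context map: `source` declares a relationship to `target`.
record Relationship(String source, String type, String target) {}

class ContextMap {
    private final List<Relationship> relationships;

    ContextMap(List<Relationship> relationships) {
        this.relationships = List.copyOf(relationships);
    }

    // Contexts that declare a relationship to `changedContext`; review these
    // before changing its API or events.
    List<String> affectedBy(String changedContext) {
        return relationships.stream()
            .filter(r -> r.target().equals(changedContext))
            .map(Relationship::source)
            .distinct()
            .sorted()
            .collect(Collectors.toList());
    }

    // All relationships of one pattern, e.g. every "Conformist" link.
    List<Relationship> ofType(String type) {
        return relationships.stream()
            .filter(r -> r.type().equals(type))
            .collect(Collectors.toList());
    }
}
```

With the two Ordering relationships from the JSON plus a hypothetical `Shipping` → `Inventory` conformist link, `affectedBy("Inventory")` lists Ordering and Shipping, and `ofType("Conformist")` surfaces the link that has no translation layer.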
Production Insight
Teams skip context mapping until something breaks.
Without a map, you don't know which integrations need ACLs.
Rule: include context mapping in architecture reviews every quarter.
Also: a shared kernel (small shared model) is tempting but becomes a magnet for technical debt. Avoid unless you have a dedicated team.
Real story: a bank discovered 5 conformist relationships only after a failed upgrade. Mapping should be proactive, not reactive.
Key Takeaway
Context mapping visualizes hidden dependencies.
Update it when integration patterns change.
Avoid shared kernels without dedicated ownership.
● Production incident · POST-MORTEM · severity: high

The God Aggregate That Took Down Payment Processing

Symptom
Payment transactions started timing out after an update added bulk order imports. Database locks escalated, and the entire order-processing pipeline stalled.
Assumption
The team assumed one Aggregate per order — it worked fine for years. They didn't consider the boundary could grow unbounded.
Root cause
The Order aggregate included all line items, discounts, and shipping details in a single transactional boundary. When bulk imports pushed line-item counts far above what the design had assumed, every change loaded and locked the whole aggregate, causing long-lived transactions and lock contention.
Fix
Split the Order aggregate. Moved line items into a separate OrderLine aggregate with its own repository. Used eventual consistency for total recalculation.
Key lesson
  • Aggregate boundaries must be sized by transactional consistency, not by data hierarchy.
  • If an aggregate holds more than ~100 entities on average, redesign it.
  • Test aggregate load under realistic data volumes before going to production.
  • When you split, invest in a dedicated read model for queries that need the full picture.
  • Always monitor transaction duration per aggregate — if it grows over time, your boundary is leaking.
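The shape of the fix can be sketched in a few lines. This is an illustration under assumptions, not the team's actual code: `OrderLineAdded` and `OrderTotals` are hypothetical names, amounts are plain cents, and the read model lives in memory.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Event raised after an OrderLine aggregate is saved in its own small transaction.
record OrderLineAdded(String eventId, String orderId, long amountCents) {}

// Each line is its own aggregate, so adding one never loads or locks its siblings.
record OrderLine(String lineId, String orderId, long amountCents) {
    OrderLineAdded addedEvent(String eventId) {
        return new OrderLineAdded(eventId, orderId, amountCents);
    }
}

// Read model for order totals. It is updated from events, so it can lag
// behind the write side for a short time (eventual consistency).
class OrderTotals {
    private final Map<String, Long> totalsByOrder = new HashMap<>();
    private final Set<String> seenEventIds = new HashSet<>();

    void on(OrderLineAdded event) {
        if (!seenEventIds.add(event.eventId())) {
            return; // duplicate delivery: ignore
        }
        totalsByOrder.merge(event.orderId(), event.amountCents(), Long::sum);
    }

    long totalFor(String orderId) {
        return totalsByOrder.getOrDefault(orderId, 0L);
    }
}
```

A bulk import of thousands of lines now becomes thousands of short, independent transactions instead of one long transaction holding a lock on the whole order. The trade-off is that the total can be briefly stale, so any rule that needs an exact total at write time has to be handled separately.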
Production debug guide
Symptom → Action guide for the most common DDD implementation failures (6 entries)
Symptom · 01
Changing one domain model breaks unrelated features
Fix
Search for shared repositories that serve multiple Bounded Contexts. Each context should have its own repository interface and implementation. Run a dependency graph to find cross-context imports.
Symptom · 02
Transactions are slow and escalate to database deadlocks
Fix
Check if a single Aggregate is loading too many child entities. Thread dump analysis often shows long-running transactions in a single aggregate root method. Use database monitoring to spot the locked rows.
Symptom · 03
Business logic is duplicated across services
Fix
Verify that each team owns its own domain model. Duplicated business rules often indicate missing Bounded Context boundaries or shared libraries bleeding across contexts. Look for AbstractBase classes used by multiple services.
Symptom · 04
API responses include fields that don't make sense to the caller
Fix
Review whether the API exposes the internal domain model directly. Each Bounded Context should expose its own DTOs via an anti-corruption layer. Check the response schema — if it matches your entity exactly, you're leaking internals.
Symptom · 05
Event handler triggers duplicated data updates or infinite loops
Fix
Check if domain events are being consumed by multiple contexts without idempotency keys. Use event IDs and a deduplication table. Also verify your event bus does not redeliver without deduplication.
Symptom · 06
A single class has fields from multiple business domains with different lifecycles
Fix
Identify the bounded contexts each field truly belongs to. Create separate aggregates per context and use events to sync data when needed.
★ Quick DDD Smell Detection
Spot aggregate boundary violations and context confusion fast
A single class called 'Order' has 50+ fields
Immediate action
Identify which fields belong to different bounded contexts (e.g., billing vs shipping).
Commands
git log --follow Order.java | head
grep -r 'Order\.' src/ | cut -d: -f1 | sort -u
Fix now
Extract a new aggregate (e.g., ShippingOrder) and define an anti-corruption layer.
Changes to a domain model require database migrations across 5 services
Immediate action
Audit which services share the same table or schema.
Commands
find . -name '*Repository.java' -exec grep -l 'class Order' {} \;
docker compose logs --tail 100 | grep -i 'deadlock'
Fix now
Introduce an event-driven integration between contexts: publish DomainEvents instead of sharing tables.
A service is down, but parts of the system still try to access it and fail silently
Immediate action
Check if the service is a shared dependency across contexts. If so, you need a circuit breaker or an async event bridge.
Commands
kubectl get pods -l app=order-service --field-selector=status.phase=Running
curl -m 5 http://order-service/health || echo 'DOWN'
Fix now
Implement circuit breaker pattern per context. Each context should degrade gracefully when upstream contexts are unavailable.
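For orientation, a deliberately minimal circuit breaker looks like the sketch below. It only counts consecutive failures and has no half-open state or cool-down timer, both of which a production breaker (or a library such as Resilience4j) would add. The class and method names are hypothetical.

```java
import java.util.function.Supplier;

// Minimal circuit breaker: after `failureThreshold` consecutive failures the
// circuit opens and calls return the fallback without touching the upstream.
class CircuitBreaker {
    private final int failureThreshold;
    private int consecutiveFailures = 0;

    CircuitBreaker(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    boolean isOpen() {
        return consecutiveFailures >= failureThreshold;
    }

    <T> T call(Supplier<T> upstream, T fallback) {
        if (isOpen()) {
            return fallback; // degrade gracefully, skip the upstream call
        }
        try {
            T result = upstream.get();
            consecutiveFailures = 0;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            return fallback;
        }
    }
}
```

Keep one breaker per upstream context so that an outage in one dependency does not trip calls to the others.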
New developers spend weeks mapping database columns to domain concepts
Immediate action
Run a column-to-concept audit. If columns like 'cust_status_cd' exist, your ubiquitous language has rotted.
Commands
SELECT column_name FROM information_schema.columns WHERE table_name='customer';
grep -r 'status_cd\|type_flag' src/main/java/io/thecodeforge/
Fix now
Rename columns and fields to business terms. Write a migration script to update both DB and code in the same deployment.
Two teams argue over the meaning of a domain term
Immediate action
Schedule a language workshop with domain experts and developers from both teams.
Commands
grep -rn 'Customer' src/ | awk -F: '{print $1}' | sort -u
find . -name 'Customer.java'
Fix now
Define distinct Bounded Contexts with separate Customer models. Add anti-corruption layers at integration points.
DDD Building Blocks Comparison
Concept | Has Identity | Mutable | Tied to a lifecycle | Example
Entity | Yes | Typically yes | Yes | User, Order, Customer
Value Object | No | No (immutable) | No | Money, Address, DateRange
Aggregate | Yes (root entity) | Yes | Yes | Order aggregate (root + line items)
Domain Event | May have event ID | No | No (record of occurrence) | OrderConfirmed, PaymentReceived
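The first two rows of the comparison translate directly into how equality is implemented. The sketch below uses simplified stand-ins for `Address` and `User` from the Example column; the fields are assumptions made for illustration.

```java
import java.util.Objects;

// Value Object: a record is immutable and compares by its attributes.
record Address(String street, String city) {}

// Entity: identity decides equality; attributes may change over its lifecycle.
class User {
    private final String id;
    private String name;

    User(String id, String name) {
        this.id = Objects.requireNonNull(id, "id required");
        this.name = name;
    }

    void rename(String newName) {
        this.name = newName;
    }

    String name() {
        return name;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof User u && u.id.equals(id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }
}
```

Two `Address` values with the same street and city are interchangeable, while two `User` objects are the same user only if their IDs match, even after one of them has been renamed.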

Key takeaways

1. DDD is a communication tool first: align code vocabulary with business language before drawing aggregates.
2. Bounded Contexts are the core organizational unit; each context owns its own model, language, and data store.
3. Aggregates enforce transactional consistency: keep them small (under 100 child entities) to avoid lock contention.
4. Entities have identity; Value Objects are defined by attributes and must be immutable.
5. Domain Events enable decoupled integration between contexts: always include idempotency keys and version schemas.
6. Anti-Corruption Layers protect your domain from external models; test translations with contract tests.

Common mistakes to avoid

5 patterns
×

Sharing a single database table across multiple Bounded Contexts

Symptom
Changes to a table schema require coordinated releases across teams; query performance degrades as columns are added for different contexts.
Fix
Split the table into context-specific tables. Each context owns its schema. Use events or APIs to sync data between them.
×

Making Aggregates too large (including hundreds of child entities)

Symptom
Long-running transactions, lock contention, and slow reads. Each aggregate load pulls in thousands of rows even when only a few are needed.
Fix
Split the aggregate into smaller aggregates. Use eventual consistency for invariants that don't need immediate transactional guarantees. Limit aggregate size to ~100 child entities.
×

Skipping the Ubiquitous Language step and jumping straight to aggregate design

Symptom
Domain experts cannot review code or tests; developers and business use different terms leading to misimplementation of features.
Fix
Hold language workshops with domain experts before writing any code. Define a glossary and enforce it in code reviews. Rename any technical term that does not match business language.
×

Forgetting to version Domain Events and not using idempotency keys

Symptom
Consumer services process duplicate events after retries; schema changes break downstream consumers silently.
Fix
Include a unique event ID in every event. Use an outbox pattern to publish after transaction commit. Version events with a schema registry and never remove fields.
×

Using the same class model (Entity) across contexts instead of separate Value Objects or DTOs

Symptom
Unnecessary fields in API responses; changes to one context's logic force recompilation or migration in another context.
Fix
Define separate models per Bounded Context, even if they represent the same real-world entity. Use Anti-Corruption Layers to translate between contexts.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · JUNIOR · What is a Bounded Context and why is it important in DDD?
Q02 · SENIOR · Explain the difference between an Entity and a Value Object with an exam...
Q03 · SENIOR · How do you handle eventual consistency when splitting an aggregate?
Q04 · SENIOR · Describe a production incident where incorrect aggregate boundary caused...
Q05 · SENIOR · What is the role of an Anti-Corruption Layer and when should you impleme...
Q06 · SENIOR · How do you identify Bounded Contexts in a legacy monolith?
Q01 of 06 · JUNIOR

What is a Bounded Context and why is it important in DDD?

ANSWER
A Bounded Context is an explicit boundary where a particular domain model applies. Inside, the language and terms have specific meanings; outside, they may differ. It prevents model pollution by ensuring each context owns its own vocabulary and data store. Without it, teams end up sharing huge models that can't evolve independently.
FAQ · 5 QUESTIONS

Frequently Asked Questions

01. When should I NOT use DDD?
02. How do I identify Bounded Contexts in a legacy system?
03. What is the difference between Shared Kernel and Anti-Corruption Layer?
04. Can DDD work in a monolithic architecture?
05. What is the outbox pattern and why is it needed for Domain Events?