DDD Aggregate Sizing — Why God Objects Kill Payment Systems
One oversized Order aggregate caused payment timeouts from lock escalation.
- DDD structures software around business domains, not technical layers
- Bounded Contexts define where a domain model applies; each context owns its vocabulary
- Aggregates enforce transactional consistency; keep them under ~100 child entities
- Entities have identity; Value Objects are defined by attributes and must be immutable
- Performance: oversized aggregates cause lock contention and slow transactions
- Production insight: sharing a database table across contexts couples deployments and kills autonomy
Imagine a hospital. The billing department calls a patient a 'payer with an account balance.' The doctors call the same person a 'patient with a diagnosis and treatment plan.' Both groups are talking about the same human being, but they each have their own vocabulary, rules, and paperwork, and that's totally fine. Domain-Driven Design says: stop trying to force everyone to share one giant definition. Instead, let each department own its own model of the world, speak its own language, and only sync up at the boundaries where they truly need to. That's it. That's DDD.
Most software systems don't fail because of bad algorithms or slow databases. They fail because the code stops making sense — not to the compiler, but to the team building it. Business rules get buried in utility classes. A 'Customer' object ends up carrying seventy fields because six different teams piled their needs into it. Changing one thing breaks three others in ways no one predicted. This is the silent killer of large codebases, and it's exactly the problem Domain-Driven Design was built to solve.
DDD, coined by Eric Evans in his 2003 book 'Domain-Driven Design: Tackling Complexity in the Heart of Software,' is a philosophy and a set of patterns for structuring software around the actual business domain. It argues that the biggest source of complexity isn't technical — it's conceptual. When your code's vocabulary doesn't match the business's vocabulary, every conversation between a developer and a domain expert becomes a translation exercise. Bugs hide in those translations. DDD eliminates the translation layer by making the code speak the same language as the business.
By the end of this article you'll be able to identify Bounded Contexts in a real system, design Aggregates that enforce invariants without becoming god objects, use Value Objects to eliminate primitive obsession, and understand exactly where DDD adds value versus where it becomes overkill. You'll also see the production gotchas that only show up when you're six months into a real DDD implementation, the things Evans' book doesn't warn you about. Raw theory never prevented a production outage; applied DDD patterns do.
And that's the whole point: if your code doesn't hurt when you read it after a month, you're not modelling hard enough. DDD hurts at first, then it clicks.
What Is Domain-Driven Design?
Rather than starting with a dry definition, let's look at why DDD exists. DDD is about structuring software around the business domain: the problem space the software is supposed to solve. The core premise: your code's vocabulary should match the domain experts' vocabulary. When it does, far fewer communication errors occur and the code largely documents itself. When it doesn't, every feature request becomes a translation exercise that introduces bugs.
Here's what most intro articles skip: DDD isn't just about code structure — it's about power dynamics. When developers own the language instead of the business, they make assumptions that cost money. DDD flips this: domain experts define the terms, developers encode them. That's a governance model, not a design pattern.
Here's the trap most teams hit: they read Eric Evans' book, get excited, and immediately start drawing aggregates on a whiteboard. They skip the hard part – sitting with domain experts and agreeing on terms. Without Ubiquitous Language, DDD is just a fancy way to overengineer your code. Start with the language workshop. Do it until your domain expert can read your unit tests and say 'yes, that's what I meant.'
A real example: I once joined a project where the team had been doing 'DDD' for six months. When I asked the product manager to review a test, she said, "I don't know what 'statusCode' means; we say 'orderStage'." That's when you know the language is broken. We renamed the enum, and suddenly the tests became reviewable by non-developers. That's the whole point.
Another way to think about it: DDD is a language alignment tool. If you're not aligning, you're leaking abstractions.
DDD in Practice: A Real-World Example
Take an e-commerce system. The 'Product' concept means different things to different teams. Inventory cares about stock locations and reorder points. Marketing cares about descriptions, images, and tags. The checkout team cares about price and availability. These are three different models of the same real-world thing. DDD says: don't share one Product class across all three. Create three Bounded Contexts: Inventory Product, Marketing Product, and Checkout Product. Each has its own lifecycle, its own invariants, and its own persistence. Communication between contexts happens through events or data transfer objects (DTOs), never through a shared database table.
The hardest part? Getting leadership to accept that 'Product' will be stored in three different tables. Engineers often resist because it feels like duplication. It's not — it's decoupling. Duplication of data is cheaper than coupling of teams.
In practice, you'll find teams that run a shared Product table because it's 'faster.' It's not – it's cheaper in the short term, but it costs you team autonomy. Once two teams share a table, they share deployment windows, migration schedules, and outage scope. That's not a database decision; it's an organisational decision.
Here's a concrete production story: a company had a single 'Product' table with 120 columns. Marketing wanted to add an 'SEO description' field, but the DBA said no because it would slow queries for inventory. That's a governance problem, not a technical one. The fix was splitting the table, and the teams, into separate contexts. The inventory team's write throughput more than doubled after the marketing columns were removed from their table.
And the metric that convinced leadership: after splitting, inventory writes went from 500 ops/s to 1200 ops/s. Marketing got their SEO field in one sprint. Shared nothing wins.
- Inventory Product: stock locations, reorder points, bin numbers.
- Marketing Product: description, images, tags, SEO metadata.
- Checkout Product: price, availability, discount eligibility.
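The three contexts listed above can be sketched as three unrelated classes plus one small event. This is a minimal illustration, not a prescribed layout: the field names follow the list, while `StockLevelChanged` and `apply_stock_level` are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from decimal import Decimal


# Inventory context: cares about where stock lives and when to reorder.
@dataclass
class InventoryProduct:
    sku: str
    bin_number: str
    on_hand: int
    reorder_point: int

    def needs_reorder(self) -> bool:
        return self.on_hand <= self.reorder_point


# Marketing context: cares about how the product is presented.
@dataclass
class MarketingProduct:
    sku: str
    description: str
    tags: tuple = ()
    seo_description: str = ""


# Checkout context: cares about price and whether the product can be sold now.
@dataclass
class CheckoutProduct:
    sku: str
    price: Decimal
    available: bool


# Hypothetical event published by Inventory. It is the only thing
# Checkout ever sees from that context.
@dataclass(frozen=True)
class StockLevelChanged:
    sku: str
    on_hand: int


def apply_stock_level(product: CheckoutProduct, event: StockLevelChanged) -> CheckoutProduct:
    """Checkout updates its own model from the event, never from Inventory's table."""
    if event.sku != product.sku:
        return product
    return CheckoutProduct(product.sku, product.price, available=event.on_hand > 0)
```

The only thing the three classes share is the `sku` value. Each context can add or drop fields without asking the other two.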
Bounded Contexts — The Boundary That Prevents Chaos
A Bounded Context is the explicit boundary where a particular domain model applies. Inside the boundary, the language and terms have a specific meaning. Outside, they may mean something else — and that's intentional.
For example, in an e-commerce system, the 'Product' concept differs between inventory management (tracking stock levels) and marketing (tracking tags and images). Inventory doesn't care about the product description; marketing doesn't care about warehouse bin locations. Forcing them to share one model produces a bloated 'Product' class, the same failure as the seventy-field 'Customer' object described earlier.
Implementing a Bounded Context means defining a distinct module, service, or package boundary. Within that boundary, use the Ubiquitous Language. Communication between contexts happens through events or APIs, never through shared databases.
Common pitfall: implementing anti-corruption layers too early or too late. Start with a lightweight translation layer (maybe a map) and only jump to full ACL when you see cross-context coupling hurting velocity.
Identifying Bounded Contexts in an existing system is messy. Look for the boundary where a term changes meaning. For example, when the billing team says 'customer' and the support team says 'customer', they mean different things. That's your context boundary. Also, if a change in one module causes a cascade of test failures in another, the two modules are probably sharing a model across a boundary that should separate them.
One more clue: ask your domain experts to draw a diagram of their business flows. The gaps and overlaps in their drawing are your context boundaries. Don't draw it yourself — let them show you where the seams are.
A subtle sign: if you have a 'Customer' class in a package called 'shared', you almost certainly have a context boundary violation. Shared model classes are a red flag. Rename it to something context-specific, even if it's initially identical.
- A Bounded Context maps to the team that owns it.
- Two teams using the same word may refer to different concepts.
- Translation between contexts happens at the boundary via an anti-corruption layer.
- A context's persistence can be any technology — SQL, NoSQL, even a file — as long as it's private.
Aggregates — Consistency Boundaries That Enforce Invariants
An Aggregate is a cluster of domain objects that must be treated as a single unit for data changes. Each Aggregate has a root entity (the Aggregate Root) that controls access to the rest of the cluster. All invariants (business rules) must be satisfied when the aggregate is committed.
For example, an Order should not allow more items than in stock. This invariant is enforced inside the Order aggregate root. The outside world only touches the root — never its children directly. This guarantees transactional consistency without locking huge swaths of the database.
You reference an aggregate by its global identity, not by navigating through other objects. This keeps relationships clean and avoids cascade problems.
Now the part nobody talks about: aggregate boundaries are often wrong the first time. You'll discover this when a seemingly simple business rule change forces a database migration across multiple services. That's the signal that your aggregate boundary wasn't correct.
The rule of thumb: if you can't fit the aggregate's state on a single post-it note during a whiteboard session, it's too big. Aggregates should be small enough that a business expert can reason about their invariants in one breath. If they need to say 'and then also...' your aggregate is too large.
Also consider: aggregates are not just about transactional boundaries; they also define ownership. If two teams need to change the same aggregate, you have an organisational mismatch. That's a Conway's Law problem, and no amount of code will fix it.
Performance nuance: loading a large aggregate from the database means loading all its children. If you have Order with 1000 LineItems, you're loading 1000 rows into memory every time you touch the order. That's a waste if you only need to check the total. Consider splitting into smaller aggregates or using a separate read model for queries.
A practical heuristic: start with a small boundary and expand only when you get a concrete business rule that requires transactional consistency across those objects. Premature aggregation is as dangerous as premature optimization.
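A small aggregate along these lines might look like the sketch below. The invariants it enforces (a line cap, positive quantities, no edits after confirmation) are assumptions chosen for illustration, and `max_lines` and `OrderError` are hypothetical names. What matters is the shape: children are reachable only through the root, and other aggregates are referenced by ID.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class OrderLine:
    product_id: str      # reference to another aggregate by ID only
    quantity: int
    unit_price: Decimal


class OrderError(Exception):
    """Raised when a change would break an Order invariant."""


@dataclass
class Order:
    """Aggregate root: every change to the lines goes through this class."""
    order_id: str
    customer_id: str     # ID reference, not a Customer object
    max_lines: int = 50  # hypothetical business limit
    confirmed: bool = False
    _lines: list = field(default_factory=list)

    def add_line(self, product_id: str, quantity: int, unit_price: Decimal) -> None:
        if self.confirmed:
            raise OrderError("cannot change a confirmed order")
        if quantity <= 0:
            raise OrderError("quantity must be positive")
        if len(self._lines) >= self.max_lines:
            raise OrderError("order has too many lines")
        self._lines.append(OrderLine(product_id, quantity, unit_price))

    def confirm(self) -> None:
        if not self._lines:
            raise OrderError("cannot confirm an empty order")
        self.confirmed = True

    @property
    def lines(self) -> tuple:
        # Read-only copy, so callers cannot mutate children directly.
        return tuple(self._lines)

    def total(self) -> Decimal:
        return sum((line.quantity * line.unit_price for line in self._lines), Decimal("0"))
```

Because every mutation passes through `add_line` and `confirm`, the rules are checked in one place and the whole aggregate can be saved in a single transaction.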
Entities vs Value Objects — Identity Matters
Entities have a unique identity that persists through time. For example, a 'User' entity is the same user even if they change their email or password. You compare entities by ID, not by field values.
Value Objects, on the other hand, are defined entirely by their attributes. A 'Money' object with amount 100 and currency 'USD' is equal to another Money with the same values. Two order lines with identical product and quantity can be swapped — they have no separate identity.
The rule: if you care about 'who' it is, it's an Entity. If you care about 'what' it is, it's a Value Object. Value Objects should be immutable and side-effect-free.
Production pitfall: performance. If you create a Value Object in a hot loop (e.g., Money inside a large stream operation), object allocation can become a GC problem. Consider using records or value types.
One hidden trap: Value Objects that reference other Value Objects. If an Address contains a City, and City is a Value Object, then Address becomes composed of values. That's fine – but don't give City an ID. The moment you add an ID, City becomes an Entity and the composition semantics change. Stick to 'is-a-value' all the way down.
Another common mistake: using primitive types (string, int) instead of Value Objects to represent domain concepts. This is called primitive obsession. A Price is not a double; it's a Money with currency. A PhoneNumber is not a string; it's a structured value with formatting rules. DDD says: wrap every primitive that has business meaning in a Value Object. The code will tell you what it means.
I once saw a codebase where 'Amount' was a BigDecimal everywhere. Every method that dealt with money had to check the currency manually. After introducing a Money Value Object, the nullability checks and currency mismatch bugs disappeared. That's the power of making the type system work for you.
Another heuristic: if you can replace the object with a literal in a unit test and the test still makes sense, it's likely a Value Object.
- Entities have a thread of identity — they change over time.
- Value Objects are snapshots — they are equal if all attributes match.
- Value Objects should be immutable to avoid side effects.
- Never give a Value Object an ID — it violates the definition and causes identity confusion.
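The difference in equality semantics can be shown in a few lines. This is a sketch assuming Python dataclasses as the vehicle; `Money` and `User` mirror the examples in this section, and the currency check in `__add__` is one possible way to catch the mismatch bugs mentioned above.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    """Value Object: immutable, compared by amount and currency."""
    amount: Decimal
    currency: str

    def __add__(self, other: "Money") -> "Money":
        if not isinstance(other, Money):
            return NotImplemented
        if other.currency != self.currency:
            raise ValueError(f"cannot add {other.currency} to {self.currency}")
        # Returns a new value instead of mutating either operand.
        return Money(self.amount + other.amount, self.currency)


class User:
    """Entity: compared by identity, even when attributes change."""

    def __init__(self, user_id: str, email: str) -> None:
        self.user_id = user_id
        self.email = email

    def __eq__(self, other: object) -> bool:
        return isinstance(other, User) and other.user_id == self.user_id

    def __hash__(self) -> int:
        return hash(self.user_id)
```

Two `Money` instances with the same amount and currency are interchangeable, while two `User` instances are the same user only if their IDs match, whatever their emails say.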
Ubiquitous Language — The One Rule That Makes DDD Work
Ubiquitous Language is the practice of using the same vocabulary in code, conversations, documentation, and domain experts' speech. When a domain expert says 'order is confirmed', the code should have an Order class with a confirm() method — not a 'status update' to a database column named 'flag_34'.
This sounds trivial, but in practice it's the hardest rule to follow. Teams slip into technical jargon or business shortcuts because they're faster in the moment. The result: bugs, misunderstandings, and a growing gap between what the business wants and what the system does.
The commitment is: whenever you discover a term that isn't in the Ubiquitous Language, either introduce it with the team's agreement or rename it. This is a continuous discipline, not a one-time exercise.
Hard truth: most teams fail at Ubiquitous Language within the first six months. The solution isn't a glossary doc — it's pairing developers with domain experts during refinement sessions. If you're not sitting next to the business when you write the code, your language will drift.
The best indicator of healthy Ubiquitous Language: can a product manager walk through the codebase and nod along? If they can't, your language is broken. Schedule regular 'language reviews' where the team reads through the domain model with a domain expert and flags terms that don't match.
One more thing: Ubiquitous Language applies to everything — APIs, event names, database column names. If you have a column called 'cust_status_cd', that's technical debt. Rename it to 'customer_status'. Every new developer will thank you.
A practical exercise: take your most recent user story. Write the acceptance criteria using only domain terms. Then go to your code and see if those terms exist as types, methods, or properties. If not, you have a gap.
And the final test: if you can't onboard a business analyst in two days to write acceptance tests, your language is a barrier.
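As a rough illustration of the `confirm()` example from this section: the class below uses the expert's words ('order stage', 'confirm') as type and method names. The stage values `DRAFT` and `SHIPPED` are assumptions added for the sketch.

```python
from enum import Enum


class OrderStage(Enum):
    DRAFT = "draft"          # hypothetical stage names
    CONFIRMED = "confirmed"
    SHIPPED = "shipped"


class Order:
    def __init__(self, order_id: str) -> None:
        self.order_id = order_id
        self.stage = OrderStage.DRAFT

    def confirm(self) -> None:
        """'Order is confirmed' in the expert's words becomes confirm() in code."""
        if self.stage is not OrderStage.DRAFT:
            raise ValueError(f"only a draft order can be confirmed, not {self.stage.value}")
        self.stage = OrderStage.CONFIRMED
```

A test that reads `order.confirm()` followed by a check on `order.stage` is something a product manager can review without a glossary.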
Domain Events — How Contexts Communicate Without Coupling
Domain Events are the mechanism Bounded Contexts use to communicate asynchronously. When something important happens in one context (e.g., OrderConfirmed), it publishes an event. Other contexts subscribe and react. This keeps each context independent while still synchronizing state.
A well-designed Domain Event is immutable, includes the aggregate ID and a timestamp, and carries only the data the subscribers need — nothing more. The publishing context does not know who subscribes.
Production hazard: events that grow too large. If you put the entire order object into an OrderConfirmed event, you've coupled the subscriber to the publisher's internal structure. Keep events lean — use IDs and let subscribers fetch the rest via API if needed.
Versioning Domain Events is the part everyone forgets. Once you publish an event, any incompatible change to its schema (removing or renaming a field, changing a type) breaks subscribers. Use Avro or Protobuf with a schema registry, or at minimum include a version field in the event and never remove fields; only add optional ones.
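A minimal sketch of the 'version field plus optional additions' approach, using plain JSON rather than Avro or Protobuf. The `currency` field and the version numbers are hypothetical; the point is that an old payload still parses and an unknown future field is ignored instead of crashing the subscriber.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class OrderConfirmed:
    order_id: str
    occurred_at: str                 # ISO-8601 timestamp
    version: int = 2
    # Hypothetical field added in version 2; optional so version 1 payloads still parse.
    currency: Optional[str] = None


_KNOWN_FIELDS = {"order_id", "occurred_at", "version", "currency"}


def to_json(event: OrderConfirmed) -> str:
    return json.dumps(asdict(event))


def from_json(payload: str) -> OrderConfirmed:
    data = json.loads(payload)
    # Tolerant reader: drop fields this subscriber does not know about.
    return OrderConfirmed(**{k: v for k, v in data.items() if k in _KNOWN_FIELDS})
```

This only covers additive changes. A rename or a type change still needs a new event type or a coordinated migration.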
Another common mistake: publishing events before the transaction commits. If the transaction later rolls back, you've sent an event that never happened. Always publish events after the transaction is committed — use an outbox pattern if necessary.
Real production story: a team was using Domain Events to sync order data between contexts. They published the event before the transaction committed. A network timeout caused the transaction to roll back, but the event had already been consumed by the shipping context. The shipping context kicked off a fulfillment process for an order that didn't exist in the system. Recovering from that was a nightmare. Outbox pattern would have prevented it.
A subtle pitfall: using the same event bus for domain events and integration events. They have different guarantees — domain events are within a context, integration events cross contexts. Mixing them couples your infrastructure.
DDD and Microservices: Mapping Contexts to Services
One of the most common questions when adopting DDD is: how do Bounded Contexts map to microservices? The answer: they often align, but they're not the same thing. A Bounded Context is a conceptual boundary — it defines where a particular model applies. A microservice is a deployment unit — it defines what runs independently.
In many teams, each Bounded Context becomes one or more microservices. But you can also have multiple contexts inside a single service, especially in a modular monolith. The key is that contexts remain isolated in code and data even if they deploy together.
The danger is over-splitting: creating a microservice for every aggregate without considering operational cost. You end up with dozens of services, distributed transactions, and complex orchestration. Start with coarse context boundaries and split only when team autonomy or scaling demands it.
Common pattern: use a 'shared kernel' between contexts that are closely related, but this often turns into a shared mess. Prefer anti-corruption layers and published language via events.
When you do split into microservices, remember that network boundaries are real. You lose the ability to enforce invariants transactionally. You'll need sagas or process managers to maintain consistency. Don't take that lightly.
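To make the saga idea concrete, here is a deliberately simplified in-process sketch. Each step pairs an action with a compensation, and a failure triggers the compensations of the completed steps in reverse order. `run_saga` and the step names are hypothetical; a production saga would also persist its progress and deal with retries and timeouts.

```python
from typing import Callable, List, Tuple

# (name, action, compensation)
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> List[str]:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    log: List[str] = []
    done: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"{name}: failed")
            for prev_name, _, compensate in reversed(done):
                compensate()
                log.append(f"{prev_name}: compensated")
            return log
        done.append(step)
        log.append(f"{name}: ok")
    return log
```

Note what this gives up compared with a single transaction: between the failure and the compensation, other parts of the system can observe the intermediate state.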
A pragmatic rule: if you have fewer than 10 developers, don't split into microservices based on DDD boundaries alone. Start as a modular monolith with clear package boundaries. Split when the team grows or when your CI pipeline becomes a bottleneck.
A good litmus test: if deploying a change to one context requires QA sign-off from another team, you've got a coupling problem that splitting won't fix on its own.
Anti-Corruption Layer — Protecting Your Domain from External Models
An Anti-Corruption Layer (ACL) is a protective boundary that translates between two Bounded Contexts. It prevents one context's model from leaking into and corrupting another's. This is especially important when integrating with legacy systems or third-party APIs where you can't control the model.
The ACL typically consists of a set of translators, adapters, and facade services. It maps external concepts to your internal Ubiquitous Language. For example, a legacy CRM's 'Account' object might map to your 'Customer' and 'Contract' aggregates.
Don't overbuild your ACL. Start with simple mapping functions. If you need to handle complex transformations, consider using a separate anti-corruption service. The key is that changes to the external system's model only affect the ACL, not your core domain.
Production gotcha: people often forget to version the ACL's translations. When the external model changes, the ACL must be updated. Without versioning, you'll get silent data corruption. Test ACL mappings with contract tests.
Another important point: the ACL belongs to the consuming context, not the provider. The team that consumes the external data owns the translation. That way they control when and how to update it.
Real-world example: a fintech company integrated with a legacy core banking system. The legacy system had a concept of 'account status' with values 'active', 'dormant', 'closed'. Our domain model had 'CustomerStatus' with 'Active', 'Inactive', 'Suspended'. The ACL translated one to the other. When the legacy system added 'suspended' as a status, the ACL broke silently until a customer complained they couldn't trade. The fix was adding a contract test on the ACL that warned when new statuses appeared.
A recurring pattern: teams put the ACL in the provider's codebase. Wrong. The consumer must own it, because the consumer decides what the domain model looks like.
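The status translation from the fintech example could be sketched like this. The mapping itself is an assumption (which legacy value maps to which domain status is a business decision), and `translate_status` and `unmapped_statuses` are hypothetical names. The design choice worth copying is that an unknown legacy value raises instead of being silently defaulted.

```python
from enum import Enum


class CustomerStatus(Enum):
    ACTIVE = "Active"
    INACTIVE = "Inactive"
    SUSPENDED = "Suspended"


class UnknownLegacyStatus(ValueError):
    """Raised when the legacy system sends a status the ACL has no mapping for."""


# Hypothetical mapping, owned by the consuming context.
_LEGACY_TO_DOMAIN = {
    "active": CustomerStatus.ACTIVE,
    "dormant": CustomerStatus.INACTIVE,
    "closed": CustomerStatus.INACTIVE,
}


def translate_status(legacy_status: str) -> CustomerStatus:
    try:
        return _LEGACY_TO_DOMAIN[legacy_status.strip().lower()]
    except KeyError:
        raise UnknownLegacyStatus(
            f"unmapped legacy account status: {legacy_status!r}"
        ) from None


def unmapped_statuses(observed: set) -> set:
    """Contract-test helper: statuses seen at the provider but not mapped here."""
    return {s for s in observed if s.strip().lower() not in _LEGACY_TO_DOMAIN}
```

A contract test can feed `unmapped_statuses` the set of values the legacy system currently reports and fail the build when the result is non-empty.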
Context Mapping — Visualizing Relationships Between Bounded Contexts
Context Mapping is the practice of documenting the relationships between Bounded Contexts. It's an essential tool for understanding integration points, shared kernels, and translation requirements.
- Partnership: two contexts cooperate on a shared goal
- Shared Kernel: a small shared subset of the model (risky)
- Customer-Supplier: one context provides data to another
- Conformist: one context adopts the other's model without translation
- Anti-Corruption Layer: protects the downstream context
- Open Host Service: one context exposes a stable API for others
- Published Language: both sides agree on a common interchange format
Draw a context map on a whiteboard during architecture reviews. It'll surface hidden dependencies. The map should be living documentation — update it whenever you change integration patterns.
Production insight: most teams skip context mapping until they have a broken integration. By then it's too late. Start the map early. Also, avoid shared kernels unless you have a dedicated team to manage them — they rot fast.
One more tip: color-code your context map by team ownership. It makes it immediately obvious where one team's changes can break another. That visual feedback is worth more than a hundred wiki pages.
A practical approach: use a tool like Miro or Structurizr to create a living context map. Link it to your code repositories. When a developer creates a new integration, they should update the map. Make it part of the definition of done.
A real example: a bank had 15 Bounded Contexts but no map. During an upgrade, the Payments team changed their API contract and seven downstream contexts broke silently. After creating a map, they discovered they had five undocumented conformist relationships. They invested in ACLs for each, and subsequent upgrades went smoothly.
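A context map does not have to live only on a whiteboard. One option, sketched here under the assumption that relationships are recorded as plain data, is to keep the map in the repository and query it. `Relationship`, `impacted_by`, and `conformists` are hypothetical names, and the context names in the usage note are made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    upstream: str
    downstream: str
    pattern: str  # e.g. "conformist", "acl", "open-host"


def impacted_by(context: str, relationships: list) -> set:
    """All contexts that directly or transitively depend on `context`."""
    downstream_of = defaultdict(set)
    for r in relationships:
        downstream_of[r.upstream].add(r.downstream)
    seen, stack = set(), [context]
    while stack:
        for nxt in downstream_of[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    seen.discard(context)  # a cycle should not report the context itself
    return seen


def conformists(relationships: list) -> list:
    """Relationships with no translation layer, the riskiest ones to change under."""
    return [r for r in relationships if r.pattern == "conformist"]
```

With entries such as `Relationship("Payments", "Billing", "conformist")`, a call like `impacted_by("Payments", relationships)` lists every context that an API change in Payments could break.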
The God Aggregate That Took Down Payment Processing
- Aggregate boundaries must be sized by transactional consistency, not by data hierarchy.
- If an aggregate holds more than ~100 entities on average, redesign it.
- Test aggregate load under realistic data volumes before going to production.
- When you split, invest in a dedicated read model for queries that need the full picture.
- Always monitor transaction duration per aggregate — if it grows over time, your boundary is leaking.
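The last point, monitoring transaction duration per aggregate, can be approximated with a small timing helper. This is a sketch under stated assumptions: `TransactionTimer` is a hypothetical in-memory recorder, and a real setup would more likely export these numbers to whatever metrics system is already in place.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from statistics import mean


class TransactionTimer:
    def __init__(self, clock=time.perf_counter) -> None:
        self._clock = clock              # injectable so tests can use a fake clock
        self._durations = defaultdict(list)

    @contextmanager
    def measure(self, aggregate_type: str):
        start = self._clock()
        try:
            yield
        finally:
            # Record the duration even if the transaction raised.
            self._durations[aggregate_type].append(self._clock() - start)

    def slow_aggregates(self, threshold_seconds: float) -> dict:
        """Aggregate types whose mean transaction time exceeds the threshold."""
        return {
            name: mean(values)
            for name, values in self._durations.items()
            if mean(values) > threshold_seconds
        }
```

Wrapping each load-modify-save cycle in `with timer.measure("Order"):` gives a per-aggregate trend line. A mean that creeps upward as data grows is the early warning described above.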
Common mistakes to avoid
- Sharing a single database table across multiple Bounded Contexts
- Making Aggregates too large (including hundreds of child entities)
- Skipping the Ubiquitous Language step and jumping straight to aggregate design
- Forgetting to version Domain Events and not using idempotency keys
- Using the same class model (Entity) across contexts instead of separate Value Objects or DTOs
Interview Questions on This Topic
What is a Bounded Context and why is it important in DDD?