Advanced 21 min · March 05, 2026

MongoDB Aggregation Pipeline

MongoDB Agg: $lookup Without Index - 25B Comparisons

Q: What is MongoDB Aggregation Pipeline in simple terms?

MongoDB Aggregation Pipeline is a fundamental concept in Database. Think of it as a tool — once you understand its purpose, you'll reach for it constantly.

Q: How do I avoid the 100MB memory limit in aggregation?

Use allowDiskUse true to spill to disk, but better: optimize the pipeline. Put $match and $project early to reduce document size. Use indexes for $match and $sort. Consider breaking a large pipeline into smaller ones if possible.

Q: Why is my $lookup so slow and how do I fix it?

The most common cause is a missing index on the foreign key field. Create an index on the field used in foreignField. Also check if you can reduce the number of input documents by adding $match before $lookup. Use explain to see if the stage uses COLLSCAN or IXSCAN.

Q: Why does my aggregation return no documents after $lookup and $unwind?

If the foreign field is missing, $lookup returns an empty array. $unwind by default removes documents with empty arrays. Use preserveNullAndEmptyArrays: true to keep the original document.

Q: Can I use $facet to run aggregations in parallel?

Yes, $facet runs sub-pipelines on the same set of input documents. However, it blocks until all sub-pipelines complete. If one is much slower, consider separate aggregation calls to return partial results faster.

Q: How do I check if my aggregation used an index?

Run explain('executionStats') and look for 'IXSCAN' in the inputStage. If you see 'COLLSCAN', no index was used. For $lookup, check the foreign collection's index usage inside the $lookup stage.

An unindexed $lookup caused 25 billion comparisons and a 45-minute timeout in production.

Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Written from production experience, not tutorials.

✓ Production

production tested

July 19, 2026

last updated

2,466

articles · all by Naren

Before you start⏱ 30 min

✓Deep production experience
✓Understanding of internals and trade-offs
✓Experience debugging complex systems

● Production Incident 🔎 Debug Guide ⚙ Triage Commands

⚡Quick Answer

MongoDB Aggregation Pipeline processes documents through a series of stages, each transforming the data stream.
Core stages: $match (filter), $group (aggregate), $sort (order), $project (reshape), $lookup (join), $unwind (flatten arrays).
Stage order matters: early $match reduces document count before expensive operations.
Performance: indexes used automatically for $match and $sort early in pipeline; late-stage operations may require allowDiskUse() to exceed 100MB memory limit.
Production insight: $lookup without an index on the foreign field triggers a full collection scan per input document — can kill query performance.
Biggest mistake: assuming the pipeline executes in the order you type it — MongoDB optimises, but $sort before $match blocks index usage.

✦ Definition~90s read

What is MongoDB Aggregation Pipeline?

★

Imagine you work in a huge warehouse full of cardboard boxes.

That's the key insight: you describe the transformation steps, and MongoDB's query planner figures out the execution order, index usage, and resource allocation.

But that streaming nature also creates pitfalls: blocking stages ($group, $sort) must buffer everything before output, which is where the 100MB memory limit bites you.

Here's a concrete example: you want the top 10 customers by total order value this year. A find() can't do that — you need to filter, join, group, sort, and limit. The pipeline does all that in one go.

Before the pipeline existed, teams often relied on application-side aggregation: pull all orders into memory, filter, group, sort. That worked for small datasets but broke at production scale. The pipeline moves the computation to the database, saving network bandwidth and leveraging indexes. That's the difference between a 2-second dashboard and a 45-minute timeout.

One thing most engineers miss: the pipeline optimizer can reorder stages, but only within strict rules. For example, $match can be pushed before $project, but not through $lookup. This means your mental model of stage order must match the optimizer's limitations — or you'll write a pipeline that looks optimised but forces full collection scans.

Here's how you can verify this yourself: run explain() on a pipeline with $match after $lookup. You'll see the $match stays put — it won't push down.

Another nuance: the optimizer can also merge consecutive $match stages and push $project filters into earlier stages. But it cannot reorder stages across a $lookup boundary because it doesn't know the foreign collection's data distribution. That's why you need to manually push filters into $lookup's sub-pipeline.

Plain-English First

Imagine you work in a huge warehouse full of cardboard boxes. You need to find all boxes heavier than 10kg, group them by color, and count how many of each color you have. Instead of opening every box yourself, you set up an assembly line: the first worker filters heavy boxes, the second sorts by color, and the third counts. That assembly line is MongoDB's Aggregation Pipeline — each stage hands its output to the next, transforming your raw data into the exact answer you need.

⚙ Browser compatibility

Latest versions — ✓ supported

Chrome	Firefox	Safari	Edge
✓	✓	✓	✓

Most applications don't just store data — they ask questions of it. 'Which users spent the most last month?', 'How many orders shipped per region per day?', 'What's the 90th-percentile response time across our API endpoints?' These are not simple find() queries. They require filtering, reshaping, grouping, and computing across potentially millions of documents. That's the job MongoDB's Aggregation Pipeline was built for, and in high-traffic production systems it's the difference between a dashboard that loads in 40ms and one that times out.

Before the Aggregation Pipeline existed (pre-MongoDB 2.2), developers were stuck with MapReduce — a JavaScript-evaluated, single-threaded monster that was slow, opaque, and painful to debug. The Pipeline replaced it with a declarative, composable model: you describe what you want, stage by stage, and MongoDB's query planner figures out how to execute it efficiently. That mental shift — from imperative to declarative — is what makes the Pipeline feel elegant once it clicks.

By the end of this article you'll be able to write multi-stage pipelines that actually use indexes, avoid the dreaded 100MB in-memory sort limit, spot the three most common production mistakes before they bite you in a code review, and explain to an interviewer how the execution engine decides whether to push a $match before a $lookup.

What is MongoDB Aggregation Pipeline?

MongoDB Aggregation Pipeline is a core concept in Database. Instead of a dry definition, think of it as a data processing assembly line. Each stage receives a stream of documents, transforms them, and passes the result to the next stage. It's declarative — you say what you want, not how to do it. That's the key insight: you describe the transformation steps, and MongoDB's query planner figures out the execution order, index usage, and resource allocation.

The pipeline was introduced in MongoDB 2.2 to replace MapReduce, which ran JavaScript in a single thread and couldn't leverage indexes. The pipeline runs in C++ (native speed), can use indexes for $match and $sort, and streams results through stages to keep memory low. But that streaming nature also creates pitfalls: blocking stages ($group, $sort) must buffer everything before output, which is where the 100MB memory limit bites you.

Here's how you can verify this yourself: run explain() on a pipeline with $match after $lookup. You'll see the $match stays put — it won't push down.

pipeline-intro.jsJAVASCRIPT

// Top 10 customers by total order value in 2026
db.orders.aggregate([
  { $match: { orderDate: { $gte: ISODate('2026-01-01') } } },
  { $lookup: {
      from: 'customers',
      localField: 'customerId',
      foreignField: '_id',
      as: 'customer'
  }},
  { $unwind: '$customer' },
  { $group: { _id: '$customer.email', totalSpent: { $sum: '$total' } } },
  { $sort: { totalSpent: -1 } },
  { $limit: 10 }
]);

// Check optimization — see if $match stays first
db.orders.explain('queryPlanner').aggregate([...]);

Try it live

🔥Forge Tip:

Type this code yourself rather than copy-pasting. The muscle memory of writing it will help it stick.

📊 Production Insight

A pipeline that works on 10k documents may fail on 10M.

Always test at scale — use explain() to see stage boundaries.

The streaming model breaks when a blocking stage holds all documents.

Real fix: push $match early and use allowDiskUse only as a last resort.

🎯 Key Takeaway

Pipeline is declarative — describe what, not how.

Blocking stages ($group, $sort) consume memory — plan for it.

Streaming doesn't mean no memory limit; it means per-stage.

When to use aggregation vs find()

IfNeed to transform data (group, sort, project) across documents

→

UseUse aggregation pipeline — find() can't reshape or group.

IfSimple filter with field selection, no joins

→

UseUse find() — it's faster and uses indexes more flexibly.

IfNeed to join multiple collections

→

UseUse aggregation with $lookup — but ensure foreign index exists.

thecodeforge.io

Mongodb Aggregation Pipeline

Stage Execution Order and Document Flow

Understanding the physical execution order of an aggregation pipeline is crucial for debugging performance. The logical order you write (e.g., $match then $group) may be reordered by the optimizer, but the actual engine follows a strict streaming model: each stage processes documents one at a time and passes them to the next. However, stages that require all input before producing output (blocking stages) break the stream. The diagram below illustrates a typical pipeline flow showing which stages stream and which block.

The key insight: $match and $project are streaming — they process one doc at a time. $group, $sort, and $bucket are blocking — they accumulate all input before emitting any output. $lookup is hybrid: it streams the local documents but performs indexed lookups per document (if indexed) or full scans (if not). $unwind is streaming but can increase document count dramatically.

When you see a slow pipeline, trace the flow: identify the first blocking stage. That's where the memory and time bottleneck lives. Always put a $match before a blocking stage to reduce its input size.

📊 Production Insight

In one production incident, a pipeline had $lookup before $group — but $group was blocking 2 million docs. Adding a $match before $lookup cut input by 70%, dropping memory usage below 100MB. The fix was as simple as adding a $match on a date range.

🎯 Key Takeaway

Blocking stages are the performance chokepoints. Place streaming stages before them to minimize input size.

Aggregation Pipeline Stage Execution Flow

Core Stages: $match, $group, $sort, $project, $lookup, $unwind

The pipeline is only as good as its stages. Here's the real behaviour of the most used ones:

$match: Filters documents. Always put it first — it reduces the document count before any expensive operation. Importantly, $match can use indexes if it appears at the start, but once you've had a $project or $unwind, the engine can't use indexes for later $match stages.

$group: Groups documents by a key and applies accumulator expressions (sum, avg, etc.). This stage buffers all documents in memory, so if your group key is high-cardinality, you may hit the 100MB limit.

$sort: Sorts documents. If used early, it can use an index. If used after $group or $project, it's a blocking operation — it must see all input before emitting output.

$project: Reshapes documents. Use it to exclude fields you don't need downstream. This reduces memory pressure.

$lookup: Performs a left outer join with another collection. Slowest stage by far — always ensure the foreign field is indexed. You can also pass a pipeline to $lookup to filter before joining.

$unwind: Deconstructs an array field. Each array element becomes a separate document. This can explode cardinality — one document with a 1000-element array becomes 1000 documents. Use with caution.

Order matters more than you think. MongoDB's optimizer can reorder some stages, but not all. A $match after $lookup won't push down into the foreign collection. Put $match before $lookup if possible.

Let's look at a real-world example: a pipeline that totals sales per region, but we also want to include region details from a lookup.

One pattern senior engineers use: pass a pipeline to $lookup with $match to filter joined documents before the join completes. This reduces the number of documents returned and avoids unnecessary $unwind multiplication. For example, if you only need active customers, add a $match in the $lookup pipeline.

Another common trap: using $addFields to compute temporary fields that are later used in $group. Every $addFields re-materialises the document, costing CPU. Instead, compute derived fields inside $group using $sum with expressions or use $project to trim fields before $group.

I once saw a team add a $addFields to compute a discount after a $lookup — it doubled the pipeline execution time. Moving the computation into $group fixed it.

There's also a subtlety with $project: removing fields early reduces memory in later $group and $sort stages. Use $project to drop all fields except the group key and accumulator. You can even use $project to rename fields to avoid $addFields entirely.

pipeline-example.jsJAVASCRIPT

// Example: Get total sales per customer in 2026
db.orders.aggregate([
  { $match: { orderDate: { $gte: ISODate('2026-01-01'), $lt: ISODate('2027-01-01') } } },
  { $lookup: {
      from: 'customers',
      localField: 'customerId',
      foreignField: '_id',
      as: 'customer'
  }},
  { $unwind: '$customer' },  // one doc per order+customer
  { $group: {
      _id: '$customer.email',
      totalSpent: { $sum: '$total' }
  }},
  { $sort: { totalSpent: -1 } },
  { $limit: 10 }
]);

// Optimised $lookup with pipeline
db.orders.aggregate([
  { $match: { orderDate: { $gte: ISODate('2026-01-01') } } },
  { $lookup: {
      from: 'customers',
      let: { custId: '$customerId' },
      pipeline: [
        { $match: { $expr: { $eq: ['$_id', '$$custId'] } } },
        { $match: { active: true } } // only active customers
      ],
      as: 'customer'
  }},
  { $unwind: { path: '$customer', preserveNullAndEmptyArrays: true } },
  { $group: { _id: '$customer.email', totalSpent: { $sum: '$total' } } },
  { $sort: { totalSpent: -1 } }
]);

Try it live

⚠ Common trap

Placing $match after $unwind does NOT filter the original array elements — it filters the exploded documents. To filter array elements before exploding, use $filter inside $project or $addFields.

📊 Production Insight

A $lookup without an index on the foreign field becomes a nested loop join — O(n*m). For 10k orders and 100k customers, that's 1 billion comparisons.

Always index foreign fields before deploying a $lookup.

Even with an index, $lookup adds network round-trips; reduce input with early $match.

Debug command: db.orders.explain('executionStats').aggregate([...]) — check stage for 'COLLSCAN'.

🎯 Key Takeaway

Stage order determines index usage and memory consumption.

Early $match is the single most impactful performance lever.

Use explain() to verify stage execution — assumptions are expensive.

Real world: an index on foreign field cut a 45-minute job to 3 seconds.

Stage order decision

IfNeed to reduce data before any join

→

UsePlace $match as stage 1, then $lookup. Reduces foreign key lookups.

IfNeed to sort on a field after transformation

→

UseConsider using $sort before $project if possible to use index, otherwise expect in-memory sort.

IfNeed to filter after $lookup

→

UseMove filter into $lookup's pipeline parameter — engine won't push it down automatically.

thecodeforge.io

Mongodb Aggregation Pipeline

Blocking vs Streaming Stage Performance Table

One of the most important distinctions in pipeline performance is whether a stage is streaming or blocking. A streaming stage processes documents one at a time and never materialises the entire set in memory. A blocking stage must accumulate all input documents before it can emit any output — which consumes memory and adds latency.

The table below summarises the behaviour of common stages:

Stage	Type	Memory Usage	Notes
$match	Streaming	Low (filters one doc at a time)	Can use indexes if at pipeline start
$project	Streaming	Low (reshapes one doc)	Use to trim fields before blocking stages
$addFields	Streaming	Low	Still re-materialises each doc; prefer inline in $group
$lookup	Hybrid (varies)	Moderate-high	Streaming on local docs; blocking on foreign collection if unindexed
$unwind	Streaming	Low (but multiplies docs)	Each input doc can produce many output docs
$group	Blocking	High (can exceed 100MB)	Buffers all documents with the same group key
$sort	Blocking	High (full in-memory sort)	Exception: $sort + $limit uses top-k (memory ~k elements)
$bucket	Blocking	High	Divides into buckets; memory grows with number of buckets
$facet	Blocking	Multiple sub-pipelines	Each sub-pipeline can use up to 100MB independently
$limit	Streaming	Low	Caps number of documents
$skip	Streaming	Low	Skips first N docs
$sample	Streaming / Blocking	Varies	If size < 5% of collection, uses random cursor (streaming); otherwise sorts on _id (blocking)

Production tip: Always place streaming stages before blocking stages. The early $match and $project reduce the avalanche of documents hitting a $group or $sort. If you must use a blocking stage, remember that $sort + $limit is an exception: it keeps only the top K elements in memory, making it effectively streaming for the sorted window.

💡Memory Budget in Practice

Each blocking stage gets its own 100MB memory allowance. If you have a $group then a $sort, that's 200MB potential usage before allowDiskUse kicks in. Use explain('executionStats') to see if any stage reported 'usedDisk: true'.

📊 Production Insight

When I migrated a pipeline from a $group + $sort to $sort + $limit + $group (with a different aggregation), we reduced memory usage from 300MB to 40MB. The $sort + $limit used top-k and never materialised the full dataset.

🎯 Key Takeaway

Streaming stages are your friends — keep them early. Blocking stages should be as late as possible, with inputs trimmed by earlier streaming stages.

Index Usage and the 100MB Memory Limit

MongoDB's aggregation engine can leverage indexes for $match, $sort, and $geoNear — but only if those stages appear early in the pipeline, before any stage that transforms the document shape.

The 100MB memory limit applies to blocking stages: $group, $sort, $bucket, $facet, $lookup (in-memory side). When these stages exceed 100MB, the engine throws an error unless you pass { allowDiskUse: true }. Disk spill is slower but lets you process larger-than-memory datasets.

Here's the trick: the memory limit is per-stage, not per-pipeline. A $group that processes 90MB and a $sort that processes 80MB will both fit, but a $group processing 120MB will fail. Use explain to see memory consumption.

What about indexes in $lookup? The foreign collection's index is used when you specify an exact field match. If you pass a pipeline, the engine tries to use indexes for $match within that sub-pipeline.

Also, $match within $lookup's pipeline does NOT use indexes on the local collection — only the foreign collection's indexes matter there.

Let's see how to check memory usage and enable disk use with code.

Note that $bucket and $facet can also hit the memory limit. Each sub-pipeline in $facet gets its own memory allowance, but if one sub-pipeline spills to disk, it delays the whole $facet.

One edge case: $sort after $limit. If you sort then limit, MongoDB uses a top-k sort that keeps only k elements in memory. This is a huge win — but it only works if $sort is before $limit with no intervening $group. If you need to sort after grouping, you're back to full in-memory sort.

I've seen a pipeline fail on a 200MB $group with a perfectly healthy $sort. The fix wasn't allowDiskUse — it was adding a $match to cut input by 60%.

Another important detail: the 100MB limit is configurable with the internalQueryMaxBlockingSortMemoryUsageBytes parameter (though not recommended to change in production). But the real lever is $project — reducing each document's size before $group means more documents fit in 100MB. A 20% reduction in document size can be the difference between spilling and not spilling.

explain-memory.jsJAVASCRIPT

// Check execution stats and memory usage
db.orders.aggregate([
  { $match: { status: 'completed' } },
  { $group: { _id: '$region', total: { $sum: '$amount' } } },
  { $sort: { total: -1 } }
]).explain('executionStats');

// Enable disk use for large sorts
db.orders.aggregate(
  [ ... ],
  { allowDiskUse: true }
);

// Check if disk was used in executionStats: look for 'usedDisk' field
// On MongoDB 4.4+, also run db.collection.aggregate([...], { allowDiskUse: true }).explain('executionStats') to see spill.

Try it live

Mental Model

The Pipeline as a Waterfall

Think of each stage as a pool in a waterfall. The first pool (stage) can be quite small if you filter early — but later pools must handle all the water from upstream.

Early $match reduces the volume entering the entire pipeline.
Indexes only work before the first document-transforming stage ($project, $unwind, $addFields).
Memory limit is per-pool — one large $group can blow it even if other stages are small.
allowDiskUse lets the overflow spill into a reservoir (disk), but slows the flow.
explain() shows you the water level at each stage — the stage with the highest 'docs examined' is your bottleneck.

📊 Production Insight

We once saw a $sort before a $match blow up the memory limit despite a small result set. The $sort processed the entire collection because the index could not be used after an earlier $addFields.

Fixed by moving $match to position 1 and $sort to position 2.

Rely on explain(), not intuition.

Add monitoring: if you see 'usedDisk:true' in explain, your query needs optimisation.

🎯 Key Takeaway

Put $match and $sort before any stage that changes document shape.

Memory limit is per blocking stage — check explain for 'usedDisk'.

allowDiskUse is a band-aid; proper stage order is the cure.

When to use allowDiskUse?

IfSingle $group/sort exceeding 100MB but pipeline is otherwise efficient

→

UseUse allowDiskUse: true. Monitor disk I/O — it's slower but works.

IfMultiple stages each near limit, or overall pipeline too slow

→

UseRestructure: push $match and $project earlier, use indexes, break pipeline into separate aggregations.

If$bucket or $facet hits memory limit

→

UseSet allowDiskUse: true, but also consider reducing the number of buckets or sub-pipelines.

Pipeline Optimization and Execution Engine Internals

MongoDB's query planner does more than you think. When it sees a pipeline, it applies a set of transformations:

$match coalescence: Multiple $match stages are merged into one. This matters because it reduces overhead.
$project removal: If a $project only removes fields and is followed by another $project, the engine may combine them.
$sort + $limit: When you have a $sort followed by $limit, the engine uses a top-k sort algorithm — it only keeps the k best elements in memory. This drastically reduces memory for large sorts.
$lookup optimization: For simple equality lookups, the engine uses a hash join internally. For more complex pipelines, it falls back to nested loop.

But not all optimizations happen automatically. For example, $match after $lookup does not push into the $lookup pipeline. You must manually move that filter inside the $lookup's pipeline parameter.

The execution model is document-stream based. Each stage outputs documents one by one to the next stage. Blocking stages (like $group) must buffer all documents before emitting. This is why a $group after a $project that reduces document size is faster than a $group on original docs.

Here's how to check if optimization occurred.

Another optimization: when using $lookup with a pipeline, the engine can sometimes use indexes on the foreign collection for $match stages inside that pipeline. But it won't reorder stages inside the sub-pipeline. You can force a plan using $hint if the planner picks a suboptimal one.

A production trick: if you see multiple $match stages in your explain output and they haven't been coalesced, check if there's an intervening $addFields or $project that blocks coalescence. Sometimes adding a $project can actually prevent optimization.

I've debugged a pipeline where a $project between two $match stages prevented coalescence. Removing the first $project and using $unset to drop fields solved it.

Additionally, the optimizer can also remove $project stages that do nothing — for example, if you $project a field that's already the only field in the document. But it won't remove a $project that adds computed fields. That's another reason to avoid unnecessary $addFields.

optimization-check.jsJAVASCRIPT

// Check if optimization occurred: look for 'optimizedPipeline'
db.collection.aggregate([...]).explain('queryPlanner');

// If you see 'stages' with different order, MongoDB optimized it.
// Example: two $match stages become one.

// Push $match into $lookup manually:
db.orders.aggregate([
  { $lookup: {
      from: 'products',
      let: { prodId: '$productId' },
      pipeline: [
        { $match: { $expr: { $eq: ['$_id', '$$prodId'] } } },
        { $match: { active: true } }  // pushed into lookup
      ],
      as: 'product'
  }}
]);

// Force a query plan with $hint on foreign collection (rare)
db.products.createIndex({ active: 1 });
db.orders.aggregate([
  { $lookup: {
      from: 'products',
      let: { prodId: '$productId' },
      pipeline: [
        { $match: { $expr: { $eq: ['$_id', '$$prodId'] }, active: true } }
      ],
      as: 'product'
  }}
]);

// To see if $sort+$limit used top-k sort, check explain for 'optimizedPipeline' or 'sort' stage with 'limit'.

Try it live

🔥Optimizer limitations

The optimizer cannot push $match through $lookup because it doesn't know the foreign collection's data distribution. You have to do it manually.

📊 Production Insight

The top-k sort optimization saved us 6GB of RAM on a pipeline that sorted all orders then took top 100. Without the optimization, the $sort would have failed memory limit.

Always check explain output to see if optimization kicked in.

If you don't see 'optimizedPipeline', the engine didn't change anything.

Pro tip: use $hint to force a plan when needed — but test against production data.

🎯 Key Takeaway

MongoDB optimizes $match merging and $sort+$limit automatically.

$match after $lookup is NOT optimized — you have to push it manually.

Always verify with explain('queryPlanner').

The optimizer won't save you from bad stage ordering.

When Does Optimization Help?

IfMultiple $match stages in sequence

→

UseEngine merges them into one — no performance gain needed.

IfOne $project removes fields, another adds fields

→

UseEngine may combine — check explain for 'optimizedPipeline'.

If$sort followed by $limit

→

UseEngine uses top-k sort — huge memory savings.

If$match appears after $lookup

→

UseEngine does NOT push it inside — must manually move into lookup pipeline.

Aggregation Optimization Ruleset Checklist

Here is a concise checklist of optimization rules that you can apply when writing or reviewing an aggregation pipeline. These rules are derived from production experience and the MongoDB documentation. Apply them in order:

Rule 1 — Push $match as early as possible. Ideally the first stage. This reduces the number of documents flowing into the pipeline. Use explain to confirm that the $match uses an index.
Rule 2 — Precede blocking stages with $match and $project. Before any $group, $sort, $bucket, or $facet, add a $project to remove unnecessary fields, reducing document size and memory pressure.
Rule 3 — Always index foreign fields used in $lookup. Without an index, $lookup does a full collection scan per local document. Create a supporting index on the foreign collection's join field.
Rule 4 — Use $lookup with a sub‑pipeline instead of $unwind on the full result. Filters inside the sub-pipeline reduce the number of joined documents before $unwind multiplies them.
Rule 5 — Replace $addFields with inline calculations in $group accumulators. Instead of computing a field with $addFields then using it in $sum, compute the expression directly inside the accumulator.
Rule 6 — Exploit the $sort + $limit top-k optimization. When you need only the top N sorted results, always place $limit immediately after $sort. This keeps only N elements in memory.
Rule 7 — Avoid $facet if one sub-pipeline is significantly slower than others. $facet blocks until the slowest sub-pipeline finishes. Use separate aggregations and merge in application code.
Rule 8 — Test with production-scale data. A pipeline that works on 1M documents may fail on 100M. Use explain('executionStats') and check 'docsExamined' vs 'nReturned'.
Rule 9 — Set maxTimeMS on every aggregation in production. This prevents a runaway pipeline from hanging the application. Use a timeout that the use case can tolerate.
Rule 10 — Monitor $lookup performance via $currentOp and profiler. Look for operations with high 'docsExamined' relative to 'nReturned'.

🔥Quick Review Template

Before deploying a new aggregation pipeline, run through this checklist. If you can't justify one of the rules being skipped, revisit the pipeline design.

📊 Production Insight

We embedded this checklist in our code review process. In the first three months, it caught 12 pipelines that would have failed in production — all due to missing $match or unindexed $lookup. The cost of a review is 10 minutes; the cost of a production incident is hours.

🎯 Key Takeaway

Apply these 10 rules as a mental checklist when writing any aggregation pipeline. They convert implicit knowledge into explicit guardrails.

Debugging and Profiling Aggregations in Production

You can't fix what you can't see. MongoDB gives you four tools to inspect aggregation performance:

explain(): Use with 'executionStats' to see the number of documents examined, returned, and stage-level timings. This is your single most important diagnostic.
Database Profiler: Set profiling level to 2 for slow operations. Look for aggregations with high 'docsExamined' vs 'nReturned' ratio.
$planCacheStats: If a query shape has multiple plans, the cache holds the winning plan. You can clear it to force re-optimization.
setProfilingLevel and system.profile: Log all operations taking longer than a threshold. Analyze the profile collection for pattern detection.

Real trick: When debugging, start with a pipeline that has only the first stage, add stages one by one, running explain at each step. This isolates which stage is the culprit. Never debug a 10-stage pipeline as a whole.

Here's a practical example of enabling profiling and querying slow aggregations.

You can also use $indexStats on the foreign collection to verify indexes are being used by your $lookup.

An additional pattern: if you're seeing intermittent slow aggregations, check if the plan cache is holding a stale plan after data growth. Clear it with getPlanCache().clear() and force re-optimization.

I had a situation where a $lookup was fast for weeks, then suddenly took 10x longer. The index was there, but the plan had changed. Clearing the cache restored performance.

Profiling can also reveal aggregations that return far more documents than expected — often a sign of an accidental $unwind multiplication. Check the nReturned vs totalDocsExamined ratio in explain. If nReturned is orders of magnitude larger than the input count, you've got an explosion.

profiling-commands.jsJAVASCRIPT

// Enable profiling for slow operations (>100ms)
db.setProfilingLevel(1, { slowms: 100 });

// View recent slow aggregation queries
db.system.profile.find({
  op: 'command',
  'command.aggregate': { $exists: true },
  millis: { $gt: 500 }
}).sort({ ts: -1 }).limit(10).pretty();

// Clear plan cache for a collection
db.collection.getPlanCache().clear();

// Use plan cache stats to see cached plans
db.collection.getPlanCache().listQueryShapes();

// Check index usage for $lookup foreign collection (in mongosh)
db.customers.aggregate([ { $indexStats: {} } ]);

// Test isolation: incrementally add stages to pinpoint bottleneck
db.orders.aggregate(pipeline.slice(0,1)).explain('executionStats');
db.orders.aggregate(pipeline.slice(0,2)).explain('executionStats');

Try it live

⚠ Profiling overhead in production

Setting profiling level to 2 logs every operation — can cause significant performance degradation. Use level 1 with appropriate slowms threshold. Or better, turn on profiling only on a secondary replica if you have one.

📊 Production Insight

In one incident, a $lookup performed well in QA but terribly in prod.

QA had 1M docs, prod had 50M — different query plans.

Always test with production-scale data volumes.

Use plan cache clearing as a first step when performance degrades unexpectedly.

🎯 Key Takeaway

explain('executionStats') is your first debugging step.

Add stages incrementally to find the bottleneck.

Profile in production with level 1 and a sensible slowms.

When to use each profiling tool

IfSingle aggregation is slow, want to see stage details

→

UseUse explain('executionStats').

IfNeed to find slow aggregations across the system

→

UseEnable profiler at level 1 with slowms threshold and query system.profile.

IfQuery plan changed after index creation

→

UseClear plan cache with getPlanCache().clear() to force re-optimization.

Explain Plan Walkthrough for Aggregation Queries

The explain() method provides three verbosity modes: queryPlanner, executionStats, and allPlansExecution. For aggregation debugging, executionStats is the most useful. It shows the winning plan's performance details: how many documents were examined at each stage, how long each stage took, and whether indexes were used.

Let's walk through a real explain output for a simple pipeline.

First, run the pipeline with explain('executionStats'). The output contains a stages array. Each element represents a stage in the execution tree. Look for: - stage: the stage name (e.g., 'COLLSCAN', 'IXSCAN', 'GROUP', 'SORT') - nReturned: number of documents output by that stage - totalDocsExamined: number of documents the stage read from disk or index - executionTimeMillisEstimate: time spent in that stage - inputStage: nested details for dependent stages

Key metrics to inspect:

Ratio totalDocsExamined / nReturned: A high ratio indicates a stage is scanning many documents to produce few results. For a $match, this means no index was used. For a $lookup, it means the foreign field is not indexed.

usedDisk: If true, the stage spilled to disk. This is a red flag — restructure the pipeline to avoid it.

IXSCAN vs COLLSCAN: IXSCAN means an index was used; COLLSCAN means a full collection scan.

Example walkthrough:

Consider the pipeline: [{$match: {status: 'active'}}, {$group: {_id: '$category', count: {$sum:1}}}]

The explain output might show: `` stages: [ { stage: 'GROUP', nReturned: 50, totalDocsExamined: 20000, ... }, { stage: 'COLLSCAN', nReturned: 20000, totalDocsExamined: 20000, filter: {status: 'active'} } ] ` Here we see GROUP examined 20,000 documents (all input) and returned only 50 groups. But the collection scan (COLLSCAN) examined all documents in the collection, not just those with status: 'active'. That's because the filter was applied inside the scan, but an index on status would have reduced totalDocsExamined to only matching documents. With an index, the scan becomes IXSCAN on status, and totalDocsExamined` would be much lower.

Production debugging sequence: 1. Run explain('executionStats') on the full pipeline. 2. Identify the stage with the highest totalDocsExamined. 3. If it's a COLLSCAN, create an index for the filter used in that stage. 4. If it's a $group or $sort and usedDisk: true, add a $match before it to reduce input. 5. Re-run explain and verify improvements.

Here's a concrete example of reading explain output for a $lookup pipeline.

explain-walkthrough.jsJAVASCRIPT

// Pipeline to debug
db.orders.aggregate([
  { $match: { status: 'completed' } },
  { $lookup: {
      from: 'customers',
      localField: 'customerId',
      foreignField: '_id',
      as: 'customer'
  }}
]).explain('executionStats');

// Example output interpretation:
// "stages" : [
//   {
//     "$lookup" : {
//       "stage" : "$lookup",
//       "nReturned" : 10000,
//       "executionTimeMillisEstimate" : 4500,
//       "inputStage" : {
//         "stage" : "$match",
//         "nReturned" : 10000,
//         "totalDocsExamined" : 10000,
//         "inputStage" : {
//           "stage" : "IXSCAN",
//           "indexName" : "status_1",
//           "keyPattern" : { "status" : 1 },
//           "totalDocsExamined" : 10000
//         }
//       },
//       "foreignCollection" : {
//         "stage" : "COLLSCAN",
//         "totalDocsExamined" : 500000  // BAD: no index on _id
//       }
//     }
//   }
// ]

// Here the $lookup scanned the foreign collection 500k times per local document.
// Creating an index on customers._id (the _id field already has unique index, but if foreignField is different, create index)
// will change COLLSCAN to IXSCAN.

db.customers.createIndex({ _id: 1 }); // _id is already indexed; this example assumes foreignField is something else
// For a real fix: ensure foreignField has an index.

// After indexing, re-run explain to see IXSCAN in the foreign collection section.

Try it live

💡Explain reading shortcut

In large output, search for 'COLLSCAN' — each one represents a full table scan. Your goal is to eliminate all COLLSCANs from the plan, replacing them with 'IXSCAN' by adding appropriate indexes.

📊 Production Insight

During a post‑mortem, we found that a pipeline had two COLLSCANs: one on the local collection (because $match was after $lookup) and one on the foreign collection (no index). Adding an index on the foreign field and moving $match to position 1 eliminated both scans. Runtime dropped from 8 minutes to 12 seconds.

🎯 Key Takeaway

Regularly run explain('executionStats') on every aggregation you deploy. Look for COLLSCAN and high totalDocsExamined. Fix those before they become incidents.

Common Pitfalls and How to Avoid Them

Even senior engineers make these mistakes. Here are the three that bite hardest:

1. Assuming $lookup is cheap. It's not. Even with indexed foreign fields, $lookup adds a round-trip to the foreign collection per input document (paged, but still expensive). Minimize input to $lookup with early $match.

2. Forgetting that $unwind multiplies documents. A single document with an array of 10k elements becomes 10k documents downstream. Later $group or $sort will process 10kx more. Use $unwind with careful filtering or consider $lookup pipeline to pre-filter.

3. Not handling null or missing foreign keys in $lookup. If the foreign field is missing, $lookup returns an empty array. $unwind then removes that document entirely by default (preserveNullAndEmptyArrays: false). This silently drops documents with no match. Always check your join semantics.

4. Using $facet for simple parallel streams. $facet runs multiple sub-pipelines on the same input. If one sub-pipeline is expensive, it blocks the others. Often two separate aggregations are faster than one $facet.

5. Overusing $addFields to add computed fields early. Every $addFields triggers a document re-materialization, which can hurt performance. Use $project to only keep needed fields.

Here's an example of the silent document loss and how to fix it.

Another pitfall: using $group on high-cardinality fields without considering memory. If you group by a field with millions of unique values, $group will create millions of buckets in memory, easily hitting the 100MB limit. In such cases, consider using $bucket to group into ranges.

In one code review, I found a pipeline that did $group on user IDs for an event table with 10 million users. The fix: $bucket by user ID ranges, then aggregate.

6. Relying on allowDiskUse as a crutch. Spilling to disk is 10-100x slower than in-memory processing. A pipeline that relies on allowDiskUse needs restructuring first.

7. Ignoring the $lookup pipeline's sub-pipeline optimizations. The sub-pipeline inside $lookup can be a full aggregation pipeline — you can $sort, $limit, and $match inside it to control the joined data volume.

pitfalls-examples.jsJAVASCRIPT

// Pitfall 3: Silent document loss with $lookup + $unwind
db.orders.aggregate([
  { $lookup: {
      from: 'customers',
      localField: 'customerId',
      foreignField: '_id',
      as: 'customer'
  }},
  // If customerId doesn't exist in customers, customer is []
  // $unwind without preserveNullAndEmptyArrays removes the order!
  { $unwind: '$customer' }  // orders without customers vanish
]);

// Fix: use preserveNullAndEmptyArrays: true or handle missing
db.orders.aggregate([
  { $lookup: { ... } },
  { $unwind: { path: '$customer', preserveNullAndEmptyArrays: true } }
]);

// Pitfall: $group on high-cardinality field causing memory issues
db.events.aggregate([
  { $group: { _id: '$userId', count: { $sum: 1 } } }  // if 10M users, 10M buckets
]); // likely to hit memory limit
// Fix: use $bucket to group into ranges
db.events.aggregate([
  { $bucket: { groupBy: '$userId', boundaries: [0, 1000, 10000, 100000], default: 'Other', output: { count: { $sum: 1 } } } }
]);

// Pitfall 5: overusing $addFields
db.orders.aggregate([
  { $addFields: { totalWithDiscount: { $multiply: ['$total', 0.9] } } },
  { $group: { _id: null, total: { $sum: '$totalWithDiscount' } } }
]);
// Better: inline computation in $group { $sum: { $multiply: ['$total', 0.9] } }

Try it live

💡Memory for $facet

Each sub-pipeline in $facet can use up to 100MB separately, but the overall pipeline may still hit limits. Use explain to track memory per sub-pipeline.

📊 Production Insight

A $facet with three sub-pipelines: total sales (8s), others (100ms).

All waited for the slowest — splitting didn't help wall-clock but improved perceived speed.

Moral: $facet hides parallelism — return partial results separately.

Real fix: run two aggregations in parallel and merge results in application code.

🎯 Key Takeaway

$lookup is expensive — reduce input before it.

$unwind can silently drop docs — check preserveNullAndEmptyArrays.

$facet blocks until all sub-pipelines finish — split for partial results.

High-cardinality $group needs $bucket or allowDiskUse.

When to Use $facet vs. Separate Aggregations

IfSub-pipelines share the same initial $match and are fast (<10ms each)

→

UseUse $facet — saves one full collection scan.

IfOne sub-pipeline is slow (sort/group on large data)

→

UseSplit into two separate aggregations — $facet will block all until the slowest completes.

IfSub-pipelines require different initial filters

→

UseAlways split — $facet can only have one initial input.

Aggregation in Sharded Clusters

When your collection is sharded, aggregation pipelines run across multiple shards. The mongos router merges results from each shard. This adds complexity and performance considerations that are easy to overlook.

First, $lookup across shards: if the local collection is sharded but the foreign collection is not, the foreign collection is scanned on every shard. If both are sharded, MongoDB can optimise by targeting only shards that contain matching documents, but only if the local field used in $lookup is the shard key. If not, the $lookup broadcasts to all shards.

Second, $merge and $out are your friends for large pipelines. They write the results to a collection, avoiding the memory limit and allowing incremental processing. Use $merge to periodically persist intermediate results in multi-step ETL workflows.

Third, the order of stages crossing shard boundaries matters. $match and $project are pushed down to each shard (if they don't depend on merged data). $group and $sort are split into two phases: one per shard, then a merge on mongos. This can lead to duplicate scanning if not careful.

A common production mistake: running a large $group across shards without pre-filtering with $match. Each shard processes its entire data set, then mongos merges and re-groups. You can often push a condition down by adding a $match stage that uses the shard key.

Let's see an example that uses $merge to store intermediate results and avoid memory pressure.

Another critical point: the merge stage ($merge) runs on the primary shard of the output collection. If the output collection is also sharded, the merge stage's performance depends on that shard's capacity. For very large outputs, consider using $out with a sharded output collection strategy.

Also, $sort after $group in a sharded pipeline: each shard sorts its own partial results, then mongos does a merge sort. For top-k queries, you can use $sort + $limit on each shard to reduce data sent to mongos. If you omit $limit, all partial results are sent — potentially huge network payloads.

sharded-aggregation.jsJAVASCRIPT

// Aggregation over sharded orders collection with $merge
db.orders.aggregate([
  { $match: { shardKey: { $gte: 1000, $lt: 2000 } } },  // pushed to shard
  { $group: { _id: '$region', total: { $sum: '$amount' } } },
  { $sort: { total: -1 } },
  { $merge: { into: 'regional_totals', on: '_id', whenMatched: 'replace', whenNotMatched: 'insert' } }
]);

// After merge, you can query the output collection
// db.regional_totals.find().sort({total:-1});

// Alternative with $out (replaces collection)
db.orders.aggregate([
  { $match: { year: 2026 } },
  { $group: { _id: '$product', totalSales: { $sum: '$amount' } } },
  { $out: 'product_totals_2026' }
]);

// Avoid $lookup across shards if possible. Use denormalization or separate pipelines.

Try it live

⚠ Sharded $lookup broadcast

If the local field in $lookup is not the shard key, the pipeline broadcasts to every shard. For high-volume systems, this can overwhelm the network. Prefer to denormalize or perform the lookup after merging on mongos.

📊 Production Insight

A $group across shards without early $match caused a 30-minute aggregation.

Each shard scanned its entire partition, then mongos merged—duplicating work.

Added $match on shard key to limit input per shard; runtime dropped to 4 minutes.

Real fix: always push $match with shard key before $group in sharded environments.

🎯 Key Takeaway

$lookup across shards is expensive — avoid unless necessary.

$merge writes results to collection, reduces memory pressure.

Test aggregation performance with realistic shard distribution.

Push $match with shard key early to limit per-shard scanning.

Sharded aggregation strategy

IfLocal collection is sharded; need $lookup

→

UseEnsure foreign field matches shard key or expect broadcast. Consider denormalization.

IfLarge dataset with multiple $group stages

→

UseUse $merge to store intermediate results, reducing memory pressure and allowing incremental processing.

IfNeed to sort aggregated results across shards

→

UseUse $sort after $group; mongos will merge sorted results. Add $limit before $sort if possible.

Expressions and Operators: The Engine That Makes Pipelines Powerful

Most devs treat stages like $group and $project as black boxes. That's fine until you need to transform data mid-stream. Expressions and operators are what let you compute, reshape, and filter inside any stage. You don't call a function—you pass an expression object that the pipeline evaluates per document.

Operators fall into three buckets: arithmetic ($add, $multiply), date ($year, $dateToString), and conditional ($cond, $ifNull). The bottleneck is always understanding that expressions are lazy—they resolve when the stage executes, not when you type them. That matters when you're chaining $lookup results across collections.

Production reality: your most expensive pipeline stage might not be $lookup. It's a $addFields that recalculates a date field on 10 million documents because someone used $toDate on a string instead of storing native dates. Operators don't optimize themselves. If you can push date arithmetic into a $match stage with an index, do it. Every expression costs CPU cycles. Profile first, optimize second.

ExpressionBasedRevenueCalculation.sqlSQL

// io.thecodeforge — database tutorial

// Real scenario: compute daily revenue from order items with conditional discount
var pipeline = [
  { $match: { orderDate: { $gte: ISODate("2025-01-01") } } }, // index on orderDate
  { $unwind: "$items" },
  {
    $addFields: {
      // Use $cond to apply discount only if promoCode exists
      effectiveUnitPrice: {
        $cond: {
          if: { $ne: ["$items.promoCode", null] },
          then: { $multiply: ["$items.unitPrice", 0.9] },  // 10% off
          else: "$items.unitPrice"
        }
      },
      // Convert ISO date to YYYY-MM-DD string for grouping
      dayStr: { $dateToString: { format: "%Y-%m-%d", date: "$orderDate" } }
    }
  },
  {
    $group: {
      _id: "$dayStr",
      totalRevenue: { $sum: { $multiply: ["$effectiveUnitPrice", "$items.quantity"] } },
      orderCount: { $sum: 1 }
    }
  },
  { $sort: { _id: 1 } },
  { $project: { _id: 0, date: "$_id", totalRevenue: 1, orderCount: 1 } }
];

db.orders.aggregate(pipeline);

Output

{ "date": "2025-01-01", "totalRevenue": 45920.50, "orderCount": 1234 }

{ "date": "2025-01-02", "totalRevenue": 38215.30, "orderCount": 1098 }

⚠ Production Trap:

Don't use $addFields to transform fields you could filter out earlier with $match. Operators run on every surviving document—reduce the set first.

🎯 Key Takeaway

Expressions are per-document functions. Use them to compute once, not to filter after.

Update Documents Using an Aggregation Pipeline — No, Really, It Works

You already know db.collection.find() and db.collection.update() are separate. But since MongoDB 4.2, you can run an aggregation pipeline inside an update operation. This lets you transform documents based on their own data without pulling them into your app and writing back. Why do this? Atomicity. One command, one lock on the document, zero race conditions.

The catch: you're limited to stages that don't break document structure. No $unwind, no $group. You need $set, $unset, $addFields, $project (to exclude fields), and conditionals. Think $lookup? Nope—same restriction. This is for in-document transformations, not cross-collection magic.

Production use cases: adding computed fields for caching (avoid expensive reads later), normalizing nested arrays, or coalescing null fields to defaults. You also sidestep the dreaded findAndModify round-trip for bulk updates. I've seen teams cut an hour-long batch job to 90 seconds by switching to this pattern. Just remember—validate on a backup first. You can't undo a bad aggregation update across 500K docs.

AggregationBasedDocumentUpdate.sqlSQL

// io.thecodeforge — database tutorial

// Add a 'shippingCost' field based on order total, atomic update
var updatePipeline = [
  {
    $set: {
      // $switch evaluates first match; no need for nested $cond
      shippingTier: {
        $switch: {
          branches: [
            { case: { $lte: ["$totalAmount", 25] }, then: "standard" },
            { case: { $lte: ["$totalAmount", 100] }, then: "expedited" }
          ],
          default: "priority"
        }
      },
      // Flat shipping cost based on tier, computed here not in application code
      shippingCost: {
        $switch: {
          branches: [
            { case: { $eq: ["$shippingTier", "standard"] }, then: 4.99 },
            { case: { $eq: ["$shippingTier", "expedited"] }, then: 9.99 }
          ],
          default: 19.99
        }
      },
      // Unset the raw address fields now that we have structured data
      tempAddress: "$$REMOVE"
    }
  }
];

db.orders.updateMany(
  { status: "pending", fulfillmentWarehouse: { $exists: false } },
  updatePipeline
);

Output

{ "acknowledged": true, "matchedCount": 15000, "modifiedCount": 14987 }

💡Senior Shortcut:

Use $unset inside $set with $$REMOVE to delete fields in the same pass. Two birds, one aggregation stage, zero additional I/O.

🎯 Key Takeaway

Aggregation pipelines in update commands give you atomic, server-side transformation. No round trips, no app logic, no race conditions.

Stage Typology and Functional Categories

Every pipeline stage falls into one of three functional categories that dictate how it processes documents and interacts with the engine. Source stages, like $match and $sort, reduce or reorder the document set early, often leveraging indexes to skip full collection scans. Transformation stages—$project, $addFields, $unwind—reshape each document without altering cardinality. Aggregation stages, led by $group, $bucket, and $facet, collapse multiple documents into computed results, typically requiring all data in memory. Distinguishing these categories matters because source stages should run as early as possible to limit document flow, transformation stages are memory-cheap but can explode cardinality (think $unwind on arrays), and aggregation stages trigger the 100MB memory limit. Prioritize source stages over aggregation stages for performance. Split one large aggregation stage into multiple fine-grained stages to improve index usage and reduce spillover to disk.

StageCategories.sqlSQL

// io.thecodeforge — database tutorial

// Source stage: $match uses index, reduces docs early
db.orders.aggregate([
  { $match: { status: "shipped" } },
  // Transformation stage: reshape without adding docs
  { $addFields: { total: { $multiply: ["$price", "$qty"] } } },
  // Aggregation stage: collapses into computed results
  { $group: { _id: null, totalRevenue: { $sum: "$total" } } }
])

Output

{ "_id": null, "totalRevenue": 152340.50 }

⚠ Production Trap:

Running $unwind on a large array before $match multiplies documents first, blowing up memory. Always filter source before you unwind.

🎯 Key Takeaway

Run source stages first, then transformations, then aggregations—order directly impacts performance.

When you need multiple independent aggregations on the same input document set—like sales totals per region, per product category, and per time bucket—a single $facet stage replaces several separate queries. $facet accepts a document of named sub-pipelines. Each sub-pipeline operates on the same input stream, but they run in parallel within the same stage. This eliminates redundant collection scans and reduces network round trips. Yet $facet forces every sub-pipeline to share the memory limit (default 100MB). If one sub-pipeline is a $sort + $group behemoth, it can starve the others and spill to disk. Use $facet when sub-pipelines are lightweight and balanced. For heavy aggregations, prefer separate pipeline executions with indexed $match filters. Never nest $facet inside $facet—MongoDB forbids it and throws a parse error. The output is a single document with keys matching your sub-pipeline names, each containing an array of results.

FacetAnalytics.sqlSQL

// io.thecodeforge — database tutorial

db.sales.aggregate([
  { $match: { date: { $gte: ISODate("2024-01-01") } } },
  {
    $facet: {
      byRegion: [
        { $group: { _id: "$region", revenue: { $sum: "$amount" } } },
        { $sort: { revenue: -1 } }
      ],
      byCategory: [
        { $group: { _id: "$category", count: { $sum: 1 } } }
      ],
      timeline: [
        { $group: {
          _id: { $dateToString: { format: "%Y-%m", date: "$date" } },
          total: { $sum: "$amount" }
        }},
        { $sort: { _id: 1 } }
      ]
    }
  }
])

Output

{

"byRegion": [ { "_id": "EMEA", "revenue": 55000 } ],

"byCategory": [ { "_id": "Electronics", "count": 320 } ],

"timeline": [ { "_id": "2024-01", "total": 12000 } ]

}

⚠ Production Trap:

One heavy sub-pipeline inside $facet (like $group with $sort) can exceed the 100MB memory limit, causing all sub-pipelines to fail or spill to disk.

🎯 Key Takeaway

Use $facet for parallel, lightweight aggregations; separate heavy ones into distinct pipelines to avoid memory starvation.

What Is the MongoDB Aggregation Pipeline?

The MongoDB Aggregation Pipeline is a framework for processing data through a sequence of stages, each transforming documents as they pass through. Think of it as an assembly line: raw documents enter at the first stage, get filtered, grouped, sorted, reshaped, or computed, and the final stage outputs the result. Each stage receives input from the previous stage, applies an operation (like $match, $group, or $project), and passes the transformed documents forward. This makes the pipeline radically different from simple find queries — it enables multi-step data processing, joins across collections via $lookup, computed fields, window functions, and entire ETL workflows inside the database. Introduced in MongoDB 2.2 and heavily optimized since, the pipeline runs inside the mongod process, leveraging indexes and parallelism when possible. It is the primary tool for analytics, reporting, data cleaning, and complex transformations, replacing map-reduce in most modern MongoDB deployments. Understanding the pipeline's sequential nature is crucial: stage order directly impacts performance, memory usage, and correctness. A misordered pipeline can explode in memory or return wrong counts, making mastery of its mechanics essential for production-grade systems.

Basics.sqlSQL

// io.thecodeforge — database tutorial
// Minimal pipeline example: filter, then group by status
use("sales");
db.orders.aggregate([
  { $match: { date: { $gte: ISODate("2024-01-01") } } },
  { $group: { _id: "$status", totalAmount: { $sum: "$amount" } } }
]);

Output

{ _id: 'shipped', totalAmount: 54200 }

{ _id: 'pending', totalAmount: 12350 }

⚠ Production Trap:

Placing $match late in the pipeline forces every stage to process all documents, exploding memory and CPU. Always filter early.

🎯 Key Takeaway

Move $match and $limit to the earliest possible stage to reduce document throughput and avoid memory blowouts.

Why Not Just Use find() or mapReduce?

While find() handles simple filtering and projection, it cannot perform multi-stage transformations, groupings, or joins without application-side logic. mapReduce, though powerful, suffers from single-threaded JavaScript execution, poor index utilization, and no streaming output — it materializes intermediate collections, crippling real-time performance. The aggregation pipeline attacks these limitations head-on. It runs natively in C++, uses indexes for $match and $sort, streams documents between stages without writing to disk (unless spill-to-disk is triggered), and supports compound operations like $lookup for foreign joins, $bucket for histogram creation, and $facet for parallel sub-pipelines. Execution is declarative: you describe what you want, and the optimizer reorders stages, chooses index-accelerated plans, and parallelizes across shards. For example, a $group stage can compute sums, averages, and counts in a single pass, whereas equivalent find() logic would require multiple round trips or mapReduce overhead. When you need to transform data beyond column selection — like pivoting, joining, or time-series windowing — the aggregation pipeline is the only native path that scales without external processing. It is also the foundation for MongoDB's change streams and Atlas Search, making it the unifying data processing primitive.

Comparison.sqlSQL

// io.thecodeforge — database tutorial
// Inefficient: multiple queries vs. one pipeline
db.sales.find({ region: "EU" });
db.sales.aggregate([
  { $match: { region: "EU" } },
  { $group: { _id: "$product" } }
]);

Output

Pipeline: 1 round trip, index used, result: 5 groups

🔥Why It Matters:

mapReduce is deprecated from MongoDB 5.0. The aggregation pipeline is the only supported path for complex transformations going forward.

🎯 Key Takeaway

Replace mapReduce with aggregation pipeline for all new development; it's faster, indexed, and the only supported transformation tool in modern MongoDB.

Aggregation Pipeline Optimization: Index Usage and Memory

Indexes play a critical role in optimizing MongoDB aggregation pipelines, particularly for stages like $match, $sort, and $group. When a $match stage appears early in the pipeline, MongoDB can use an index to quickly filter documents, reducing the number of documents passed to subsequent stages. Similarly, $sort can leverage an index to avoid an in-memory sort if the sort key matches an index. However, not all stages benefit equally: $group and $lookup often require scanning all matching documents, and without proper indexing, they may trigger the 100MB memory limit for blocking stages. To avoid spill-to-disk, ensure that $match and $sort use indexes, and consider adding indexes on fields used in $lookup local and foreign keys. For example, in SQL, an index on the join column speeds up JOIN operations; similarly, in MongoDB, an index on the foreign field in the 'from' collection accelerates $lookup. Use explain() to verify index usage: if 'IXSCAN' appears, the index is used; if 'COLLSCAN', it is not. Memory limits apply per stage; if a stage exceeds 100MB, MongoDB writes temporary data to disk, which can degrade performance. To mitigate, increase allowDiskUse: true, but prefer indexing to avoid spills. Production tip: monitor aggregation memory usage with db.currentOp() and set memory limits per query using maxTimeMS and allowDiskUse.

index_usage_example.jsJAVASCRIPT

// Create index on 'status' and 'createdAt' for $match and $sort
db.orders.createIndex({ status: 1, createdAt: -1 });

// Aggregation pipeline using index
db.orders.aggregate([
  { $match: { status: "shipped" } },
  { $sort: { createdAt: -1 } },
  { $group: { _id: "$customerId", total: { $sum: "$amount" } } }
]);

// Check index usage with explain
db.orders.explain("executionStats").aggregate([
  { $match: { status: "shipped" } },
  { $sort: { createdAt: -1 } },
  { $group: { _id: "$customerId", total: { $sum: "$amount" } } }
]);

Try it live

💡Index Early, Index Often

📊 Production Insight

In production, regularly review slow aggregation queries with explain() and add indexes based on the most selective fields used in early pipeline stages.

🎯 Key Takeaway

Indexes on $match and $sort fields drastically reduce pipeline processing time and memory usage, preventing spills to disk.

$lookup Performance: Correlated vs Uncorrelated Subqueries

In MongoDB, $lookup performs a left outer join with another collection. There are two execution strategies: correlated and uncorrelated subqueries. An uncorrelated $lookup (without a pipeline) scans the entire 'from' collection for each input document, leading to O(nm) comparisons—potentially billions. A correlated $lookup (using a pipeline with $match and let variables) can use indexes on the foreign collection, reducing comparisons. For example, in SQL, a correlated subquery like 'SELECT FROM orders WHERE customer_id IN (SELECT id FROM customers WHERE status = 'active')' can be optimized with an index. Similarly, in MongoDB, use a pipeline in $lookup to filter documents before joining. The performance difference is stark: uncorrelated $lookup on a 10K/10K collection may do 100M comparisons, while a correlated version with an index on the foreign key may do only 10K index lookups. To implement, use the 'let' option to define local variables and a pipeline with $match using $expr. Always create an index on the foreign field used in the $match. Example: SQL 'SELECT * FROM orders JOIN customers ON orders.cust_id = customers.id WHERE customers.region = 'US'' becomes a correlated $lookup with a pipeline filtering by region. Production insight: monitor $lookup stages in slow query logs; if you see 'COLLSCAN' on the foreign collection, switch to a correlated pipeline with an index.

correlated_lookup.jsJAVASCRIPT

// Uncorrelated (slow) - scans entire 'customers' for each order
db.orders.aggregate([
  { $lookup: {
    from: "customers",
    localField: "cust_id",
    foreignField: "_id",
    as: "customer"
  }}
]);

// Correlated (fast) - uses index on customers._id and filters
db.orders.aggregate([
  { $lookup: {
    from: "customers",
    let: { cust_id: "$cust_id" },
    pipeline: [
      { $match: { $expr: { $eq: ["$_id", "$$cust_id"] } } },
      { $project: { name: 1, email: 1 } }
    ],
    as: "customer"
  }}
]);

// Create index on foreign key
db.customers.createIndex({ _id: 1 });

Try it live

⚠ Uncorrelated $lookup Can Be a Performance Killer

📊 Production Insight

In production, set up alerts for aggregation queries with high execution times and use the profiler to identify uncorrelated $lookup stages that need optimization.

🎯 Key Takeaway

Correlated $lookup with a pipeline and index on the foreign field reduces comparisons from O(n*m) to O(n log m), dramatically improving performance.

MongoDB Aggregation vs SQL: Conceptual Mapping

Understanding the conceptual mapping between MongoDB aggregation stages and SQL clauses helps developers transition between the two. The MongoDB aggregation pipeline processes documents sequentially, similar to SQL's SELECT statement with clauses executed in order: FROM -> WHERE -> GROUP BY -> HAVING -> SELECT -> ORDER BY -> LIMIT. In MongoDB, $match corresponds to SQL's WHERE, $group to GROUP BY, $sort to ORDER BY, $project to SELECT, $limit to LIMIT, and $skip to OFFSET. For example, SQL 'SELECT department, COUNT(*) AS count FROM employees WHERE status = 'active' GROUP BY department HAVING count > 5 ORDER BY count DESC' maps to MongoDB: {$match: {status: 'active'}}, {$group: {_id: '$department', count: {$sum: 1}}}, {$match: {count: {$gt: 5}}}, {$sort: {count: -1}}. Joins: SQL JOIN becomes $lookup (left outer join) or $unionWith (union). Subqueries: SQL subqueries in WHERE or FROM can be expressed using $lookup with pipelines or $facet. Set operations: SQL UNION ALL maps to $unionWith; INTERSECT and EXCEPT require custom logic using $group and $match. Window functions: SQL ROW_NUMBER, RANK, etc., map to $setWindowFields (MongoDB 5.0+). Aggregation expressions like $sum, $avg, $min, $max directly correspond to SQL aggregate functions. Production insight: when migrating from SQL, start by writing the SQL query, then translate stage by stage, ensuring indexes on fields used in $match and $lookup.

sql_to_mongo.sqlSQL

-- SQL query: active employees per department with count > 5
SELECT department, COUNT(*) AS count
FROM employees
WHERE status = 'active'
GROUP BY department
HAVING COUNT(*) > 5
ORDER BY count DESC;

-- Equivalent MongoDB aggregation pipeline
db.employees.aggregate([
  { $match: { status: "active" } },
  { $group: { _id: "$department", count: { $sum: 1 } } },
  { $match: { count: { $gt: 5 } } },
  { $sort: { count: -1 } }
]);

🔥Pipeline Order Matters

📊 Production Insight

When migrating SQL workloads, use explain() to verify that the MongoDB pipeline uses indexes effectively, especially for $match and $sort stages.

🎯 Key Takeaway

MongoDB aggregation stages map directly to SQL clauses: $match=WHERE, $group=GROUP BY, $sort=ORDER BY, $project=SELECT, $lookup=JOIN.

● Production incidentPOST-MORTEMseverity: high

The Unbounded $lookup That Brought Down Reporting

Symptom

Aggregation pipeline taking >45 minutes, eventually hitting 60-minute timeout. Dashboard showing stale data.

Assumption

The foreign collection was small enough that an index wasn't necessary.

Root cause

$lookup without index on the foreign field caused a full collection scan for every input document. With 50k input docs and 500k foreign docs, that's 25 billion comparisons.

Fix

Created an index on the foreign key field used in $lookup. Also added $limit at pipeline start to cap processing to last 30 days of data.

Key lesson

Always index foreign key fields used in $lookup.
Use explain() to verify index usage before deploying.
Set realistic timeouts with maxTimeMS() to fail fast.
Profile long-running aggregations in staging with production-like data volume.

Production debug guideSymptom → Action grid for common aggregation performance issues5 entries

Symptom · 01

Aggregation takes seconds or minutes to complete

→

Fix

Run explain('executionStats') to identify stage with most docs processed. Look for COLLSCAN in inputStage.

Symptom · 02

Aggregation fails with 'Exceeded memory limit' error

→

Fix

Add { allowDiskUse: true } to enable disk-based sorting and grouping. Consider adding $match early to reduce data volume.

Symptom · 03

$lookup is slow

→

Fix

Check if foreign field has an index. Verify $lookup pipeline uses indexed fields. Use indexed local and foreign fields.

Symptom · 04

Aggregation returns wrong results or duplicates

→

Fix

Test each stage in isolation by removing subsequent stages. Check for $unwind causing document multiplication.

Symptom · 05

$facet sub-pipeline blocks others

→

Fix

Run explain() on each sub-pipeline independently. If one sub-pipeline is much slower, split into separate aggregations.

★ Aggregation Pipeline Quick Debug CheatsheetFive common aggregation failures and the exact commands to diagnose them.

Pipeline times out or hangs−

Immediate action

Terminate with db.killOp(opid). Get opid from currentOp().

Commands

db.currentOp({ 'command.aggregate': 'collectionName' })

db.collection.aggregate(pipeline).maxTimeMS(10000).explain('executionStats')

Fix now

Add maxTimeMS() to pipeline. Add index on foreign key for $lookup. Reduce input with earlier $match.

Memory limit exceeded error+

Aggregation returns multiple identical documents+

$lookup returns unexpected documents+

$unwind causes document loss with null arrays+

Aggregation Pipeline vs. find() vs. MapReduce

Feature	Aggregation Pipeline	find()	MapReduce (deprecated)
Data transformation	Rich stages for reshape, group, join	Only projection and simple field selection	JavaScript code, full flexibility
Index usage	Only at start of pipeline	Full index support	Requires manual index management
Performance	Optimized C++ engine, streaming	Most efficient for simple queries	Single-threaded JavaScript, slow
Memory limit	100MB per stage (configurable with allowDiskUse)	No effective limit (cursor based)	As per JavaScript heap
Use case	Complex analytics, joins, aggregations	Simple CRUD, filtering with logic	Legacy systems only

⚙ Quick Reference

17 commands from this guide

File	Command / Code	Purpose
pipeline-intro.js	db.orders.aggregate([	What is MongoDB Aggregation Pipeline?
pipeline-example.js	db.orders.aggregate([	Core Stages
explain-memory.js	db.orders.aggregate([	Index Usage and the 100MB Memory Limit
optimization-check.js	db.collection.aggregate([...]).explain('queryPlanner');	Pipeline Optimization and Execution Engine Internals
profiling-commands.js	db.setProfilingLevel(1, { slowms: 100 });	Debugging and Profiling Aggregations in Production
explain-walkthrough.js	db.orders.aggregate([	Explain Plan Walkthrough for Aggregation Queries
pitfalls-examples.js	db.orders.aggregate([	Common Pitfalls and How to Avoid Them
sharded-aggregation.js	db.orders.aggregate([	Aggregation in Sharded Clusters
ExpressionBasedRevenueCalculation.sql	var pipeline = [	Expressions and Operators
AggregationBasedDocumentUpdate.sql	var updatePipeline = [	Update Documents Using an Aggregation Pipeline
StageCategories.sql	db.orders.aggregate([	Stage Typology and Functional Categories
FacetAnalytics.sql	db.sales.aggregate([	Multidimensional Analytics with $facet
Basics.sql	use("sales");	What Is the MongoDB Aggregation Pipeline?
Comparison.sql	db.sales.find({ region: "EU" });	Why Not Just Use find() or mapReduce?
index_usage_example.js	db.orders.createIndex({ status: 1, createdAt: -1 });	Aggregation Pipeline Optimization
correlated_lookup.js	db.orders.aggregate([	$lookup Performance
sql_to_mongo.sql	SELECT department, COUNT(*) AS count	MongoDB Aggregation vs SQL

Key takeaways

You now understand what MongoDB Aggregation Pipeline is and why it exists.

You've seen it working in a real runnable example.

Practice daily

the forge only works when it's hot 🔥

Stage order is decisive

early $match before any transformation.

Always index foreign fields in $lookup

or pay O(n*m).

Use explain('executionStats') on every pipeline before deploying.

100MB memory limit is per blocking stage, not per pipeline.

INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR

How does MongoDB optimize an aggregation pipeline? Give an example of an...

Q02SENIOR

Explain the 100MB memory limit in aggregation. How can you work around i...

Q03SENIOR

What happens when you $unwind an array with null? How do you prevent los...

Q04SENIOR

How does $lookup perform when the joined field is not unique? What are t...

Q05SENIOR

What is the difference between $lookup with a pipeline and a simple loca...

Q06SENIOR

How does the query planner handle $facet? What are the memory implicatio...

Q01 of 06SENIOR

How does MongoDB optimize an aggregation pipeline? Give an example of an automatic optimization and one that must be done manually.

ANSWER

Automatic: $match coalescence — multiple $match stages are merged into one, reducing stage overhead. Also $sort + $limit triggers a top-k sort algorithm. Manual: $match after $lookup is not pushed inside the lookup pipeline; you must move that filter into the $lookup's pipeline parameter explicitly.

FAQ · 6 QUESTIONS

Frequently Asked Questions

What is MongoDB Aggregation Pipeline in simple terms?

How do I avoid the 100MB memory limit in aggregation?

Why is my $lookup so slow and how do I fix it?

Why does my aggregation return no documents after $lookup and $unwind?

Can I use $facet to run aggregations in parallel?

How do I check if my aggregation used an index?

Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Written from production experience, not tutorials.

✓ Verified

production tested

July 19, 2026

last updated

2,466

articles · all by Naren

🔥

That's NoSQL. Mark it forged?

21 min read · try the examples if you haven't

MongoDB Agg: $lookup Without Index - 25B Comparisons

What is MongoDB Aggregation Pipeline?

Stage Execution Order and Document Flow

Core Stages: $match, $group, $sort, $project, $lookup, $unwind

Blocking vs Streaming Stage Performance Table

Index Usage and the 100MB Memory Limit

Pipeline Optimization and Execution Engine Internals

Aggregation Optimization Ruleset Checklist

Debugging and Profiling Aggregations in Production

Explain Plan Walkthrough for Aggregation Queries

Common Pitfalls and How to Avoid Them

Aggregation in Sharded Clusters

Expressions and Operators: The Engine That Makes Pipelines Powerful

Update Documents Using an Aggregation Pipeline — No, Really, It Works

Stage Typology and Functional Categories

Multidimensional Analytics with $facet

What Is the MongoDB Aggregation Pipeline?

Why Not Just Use find() or mapReduce?

Aggregation Pipeline Optimization: Index Usage and Memory

$lookup Performance: Correlated vs Uncorrelated Subqueries

MongoDB Aggregation vs SQL: Conceptual Mapping

The Unbounded $lookup That Brought Down Reporting

Key takeaways

Interview Questions on This Topic

Frequently Asked Questions

That's NoSQL. Mark it forged?