MongoDB Agg: $lookup Without Index - 25B Comparisons
An unindexed $lookup caused 25 billion comparisons and a 45-minute timeout in production.
- MongoDB Aggregation Pipeline processes documents through a series of stages, each transforming the data stream.
- Core stages: $match (filter), $group (aggregate), $sort (order), $project (reshape), $lookup (join), $unwind (flatten arrays).
- Stage order matters: early $match reduces document count before expensive operations.
- Performance: indexes used automatically for $match and $sort early in pipeline; late-stage operations may require allowDiskUse() to exceed 100MB memory limit.
- Production insight: $lookup without an index on the foreign field triggers a full collection scan per input document — can kill query performance.
- Biggest mistake: assuming the pipeline executes in the order you type it — MongoDB optimises, but $sort before $match blocks index usage.
Imagine you work in a huge warehouse full of cardboard boxes. You need to find all boxes heavier than 10kg, group them by color, and count how many of each color you have. Instead of opening every box yourself, you set up an assembly line: the first worker filters heavy boxes, the second sorts by color, and the third counts. That assembly line is MongoDB's Aggregation Pipeline — each stage hands its output to the next, transforming your raw data into the exact answer you need.
Most applications don't just store data — they ask questions of it. 'Which users spent the most last month?', 'How many orders shipped per region per day?', 'What's the 90th-percentile response time across our API endpoints?' These are not simple find() queries. They require filtering, reshaping, grouping, and computing across potentially millions of documents. That's the job MongoDB's Aggregation Pipeline was built for, and in high-traffic production systems it's the difference between a dashboard that loads in 40ms and one that times out.
Before the Aggregation Pipeline existed (pre-MongoDB 2.2), developers were stuck with MapReduce — a JavaScript-evaluated, single-threaded monster that was slow, opaque, and painful to debug. The Pipeline replaced it with a declarative, composable model: you describe what you want, stage by stage, and MongoDB's query planner figures out how to execute it efficiently. That mental shift — from imperative to declarative — is what makes the Pipeline feel elegant once it clicks.
By the end of this article you'll be able to write multi-stage pipelines that actually use indexes, avoid the dreaded 100MB in-memory sort limit, spot the three most common production mistakes before they bite you in a code review, and explain to an interviewer how the execution engine decides whether to push a $match before a $lookup.
What is MongoDB Aggregation Pipeline?
MongoDB Aggregation Pipeline is a core concept in Database. Instead of a dry definition, think of it as a data processing assembly line. Each stage receives a stream of documents, transforms them, and passes the result to the next stage. It's declarative — you say what you want, not how to do it. That's the key insight: you describe the transformation steps, and MongoDB's query planner figures out the execution order, index usage, and resource allocation.
The pipeline was introduced in MongoDB 2.2 to replace MapReduce, which ran JavaScript in a single thread and couldn't leverage indexes. The pipeline runs in C++ (native speed), can use indexes for $match and $sort, and streams results through stages to keep memory low. But that streaming nature also creates pitfalls: blocking stages ($group, $sort) must buffer everything before output, which is where the 100MB memory limit bites you.
Here's a concrete example: you want the top 10 customers by total order value this year. A find() can't do that — you need to filter, join, group, sort, and limit. The pipeline does all that in one go.
Before the pipeline existed, teams often relied on application-side aggregation: pull all orders into memory, filter, group, sort. That worked for small datasets but broke at production scale. The pipeline moves the computation to the database, saving network bandwidth and leveraging indexes. That's the difference between a 2-second dashboard and a 45-minute timeout.
One thing most engineers miss: the pipeline optimizer can reorder stages, but only within strict rules. For example, $match can be pushed before $project, but not through $lookup. This means your mental model of stage order must match the optimizer's limitations — or you'll write a pipeline that looks optimised but forces full collection scans.
Here's how you can verify this yourself: run explain() on a pipeline with $match after $lookup. You'll see the $match stays put — it won't push down.
Another nuance: the optimizer can also merge consecutive $match stages and push $project filters into earlier stages. But it cannot reorder stages across a $lookup boundary because it doesn't know the foreign collection's data distribution. That's why you need to manually push filters into $lookup's sub-pipeline.
explain() to see stage boundaries.find() can't reshape or group.find() — it's faster and uses indexes more flexibly.Core Stages: $match, $group, $sort, $project, $lookup, $unwind
The pipeline is only as good as its stages. Here's the real behaviour of the most used ones:
- $match: Filters documents. Always put it first — it reduces the document count before any expensive operation. Importantly, $match can use indexes if it appears at the start, but once you've had a $project or $unwind, the engine can't use indexes for later $match stages.
- $group: Groups documents by a key and applies accumulator expressions (sum, avg, etc.). This stage buffers all documents in memory, so if your group key is high-cardinality, you may hit the 100MB limit.
- $sort: Sorts documents. If used early, it can use an index. If used after $group or $project, it's a blocking operation — it must see all input before emitting output.
- $project: Reshapes documents. Use it to exclude fields you don't need downstream. This reduces memory pressure.
- $lookup: Performs a left outer join with another collection. Slowest stage by far — always ensure the foreign field is indexed. You can also pass a pipeline to $lookup to filter before joining.
- $unwind: Deconstructs an array field. Each array element becomes a separate document. This can explode cardinality — one document with a 1000-element array becomes 1000 documents. Use with caution.
Order matters more than you think. MongoDB's optimizer can reorder some stages, but not all. A $match after $lookup won't push down into the foreign collection. Put $match before $lookup if possible.
Let's look at a real-world example: a pipeline that totals sales per region, but we also want to include region details from a lookup.
One pattern senior engineers use: pass a pipeline to $lookup with $match to filter joined documents before the join completes. This reduces the number of documents returned and avoids unnecessary $unwind multiplication. For example, if you only need active customers, add a $match in the $lookup pipeline.
Another common trap: using $addFields to compute temporary fields that are later used in $group. Every $addFields re-materialises the document, costing CPU. Instead, compute derived fields inside $group using $sum with expressions or use $project to trim fields before $group.
I once saw a team add a $addFields to compute a discount after a $lookup — it doubled the pipeline execution time. Moving the computation into $group fixed it.
There's also a subtlety with $project: removing fields early reduces memory in later $group and $sort stages. Use $project to drop all fields except the group key and accumulator. You can even use $project to rename fields to avoid $addFields entirely.
explain() to verify stage execution — assumptions are expensive.Index Usage and the 100MB Memory Limit
MongoDB's aggregation engine can leverage indexes for $match, $sort, and $geoNear — but only if those stages appear early in the pipeline, before any stage that transforms the document shape.
The 100MB memory limit applies to blocking stages: $group, $sort, $bucket, $facet, $lookup (in-memory side). When these stages exceed 100MB, the engine throws an error unless you pass { allowDiskUse: true }. Disk spill is slower but lets you process larger-than-memory datasets.
Here's the trick: the memory limit is per-stage, not per-pipeline. A $group that processes 90MB and a $sort that processes 80MB will both fit, but a $group processing 120MB will fail. Use explain to see memory consumption.
What about indexes in $lookup? The foreign collection's index is used when you specify an exact field match. If you pass a pipeline, the engine tries to use indexes for $match within that sub-pipeline.
Also, $match within $lookup's pipeline does NOT use indexes on the local collection — only the foreign collection's indexes matter there.
Let's see how to check memory usage and enable disk use with code.
Note that $bucket and $facet can also hit the memory limit. Each sub-pipeline in $facet gets its own memory allowance, but if one sub-pipeline spills to disk, it delays the whole $facet.
One edge case: $sort after $limit. If you sort then limit, MongoDB uses a top-k sort that keeps only k elements in memory. This is a huge win — but it only works if $sort is before $limit with no intervening $group. If you need to sort after grouping, you're back to full in-memory sort.
I've seen a pipeline fail on a 200MB $group with a perfectly healthy $sort. The fix wasn't allowDiskUse — it was adding a $match to cut input by 60%.
Another important detail: the 100MB limit is configurable with the internalQueryMaxBlockingSortMemoryUsageBytes parameter (though not recommended to change in production). But the real lever is $project — reducing each document's size before $group means more documents fit in 100MB. A 20% reduction in document size can be the difference between spilling and not spilling.
- Early $match reduces the volume entering the entire pipeline.
- Indexes only work before the first document-transforming stage ($project, $unwind, $addFields).
- Memory limit is per-pool — one large $group can blow it even if other stages are small.
- allowDiskUse lets the overflow spill into a reservoir (disk), but slows the flow.
- explain() shows you the water level at each stage — the stage with the highest 'docs examined' is your bottleneck.
explain(), not intuition.Pipeline Optimization and Execution Engine Internals
MongoDB's query planner does more than you think. When it sees a pipeline, it applies a set of transformations:
- $match coalescence: Multiple $match stages are merged into one. This matters because it reduces overhead.
- $project removal: If a $project only removes fields and is followed by another $project, the engine may combine them.
- $sort + $limit: When you have a $sort followed by $limit, the engine uses a top-k sort algorithm — it only keeps the k best elements in memory. This drastically reduces memory for large sorts.
- $lookup optimization: For simple equality lookups, the engine uses a hash join internally. For more complex pipelines, it falls back to nested loop.
But not all optimizations happen automatically. For example, $match after $lookup does not push into the $lookup pipeline. You must manually move that filter inside the $lookup's pipeline parameter.
The execution model is document-stream based. Each stage outputs documents one by one to the next stage. Blocking stages (like $group) must buffer all documents before emitting. This is why a $group after a $project that reduces document size is faster than a $group on original docs.
Here's how to check if optimization occurred.
Another optimization: when using $lookup with a pipeline, the engine can sometimes use indexes on the foreign collection for $match stages inside that pipeline. But it won't reorder stages inside the sub-pipeline. You can force a plan using $hint if the planner picks a suboptimal one.
A production trick: if you see multiple $match stages in your explain output and they haven't been coalesced, check if there's an intervening $addFields or $project that blocks coalescence. Sometimes adding a $project can actually prevent optimization.
I've debugged a pipeline where a $project between two $match stages prevented coalescence. Removing the first $project and using $unset to drop fields solved it.
Additionally, the optimizer can also remove $project stages that do nothing — for example, if you $project a field that's already the only field in the document. But it won't remove a $project that adds computed fields. That's another reason to avoid unnecessary $addFields.
Debugging and Profiling Aggregations in Production
You can't fix what you can't see. MongoDB gives you four tools to inspect aggregation performance:
explain(): Use with 'executionStats' to see the number of documents examined, returned, and stage-level timings. This is your single most important diagnostic.- Database Profiler: Set profiling level to 2 for slow operations. Look for aggregations with high 'docsExamined' vs 'nReturned' ratio.
- $planCacheStats: If a query shape has multiple plans, the cache holds the winning plan. You can clear it to force re-optimization.
- setProfilingLevel and system.profile: Log all operations taking longer than a threshold. Analyze the profile collection for pattern detection.
Real trick: When debugging, start with a pipeline that has only the first stage, add stages one by one, running explain at each step. This isolates which stage is the culprit. Never debug a 10-stage pipeline as a whole.
Here's a practical example of enabling profiling and querying slow aggregations.
You can also use $indexStats on the foreign collection to verify indexes are being used by your $lookup.
An additional pattern: if you're seeing intermittent slow aggregations, check if the plan cache is holding a stale plan after data growth. Clear it with getPlanCache().clear() and force re-optimization.
I had a situation where a $lookup was fast for weeks, then suddenly took 10x longer. The index was there, but the plan had changed. Clearing the cache restored performance.
Profiling can also reveal aggregations that return far more documents than expected — often a sign of an accidental $unwind multiplication. Check the nReturned vs totalDocsExamined ratio in explain. If nReturned is orders of magnitude larger than the input count, you've got an explosion.
Common Pitfalls and How to Avoid Them
Even senior engineers make these mistakes. Here are the three that bite hardest:
1. Assuming $lookup is cheap. It's not. Even with indexed foreign fields, $lookup adds a round-trip to the foreign collection per input document (paged, but still expensive). Minimize input to $lookup with early $match.
2. Forgetting that $unwind multiplies documents. A single document with an array of 10k elements becomes 10k documents downstream. Later $group or $sort will process 10kx more. Use $unwind with careful filtering or consider $lookup pipeline to pre-filter.
3. Not handling null or missing foreign keys in $lookup. If the foreign field is missing, $lookup returns an empty array. $unwind then removes that document entirely by default (preserveNullAndEmptyArrays: false). This silently drops documents with no match. Always check your join semantics.
4. Using $facet for simple parallel streams. $facet runs multiple sub-pipelines on the same input. If one sub-pipeline is expensive, it blocks the others. Often two separate aggregations are faster than one $facet.
5. Overusing $addFields to add computed fields early. Every $addFields triggers a document re-materialization, which can hurt performance. Use $project to only keep needed fields.
Here's an example of the silent document loss and how to fix it.
Another pitfall: using $group on high-cardinality fields without considering memory. If you group by a field with millions of unique values, $group will create millions of buckets in memory, easily hitting the 100MB limit. In such cases, consider using $bucket to group into ranges.
In one code review, I found a pipeline that did $group on user IDs for an event table with 10 million users. The fix: $bucket by user ID ranges, then aggregate.
6. Relying on allowDiskUse as a crutch. Spilling to disk is 10-100x slower than in-memory processing. A pipeline that relies on allowDiskUse needs restructuring first.
7. Ignoring the $lookup pipeline's sub-pipeline optimizations. The sub-pipeline inside $lookup can be a full aggregation pipeline — you can $sort, $limit, and $match inside it to control the joined data volume.
Aggregation in Sharded Clusters
When your collection is sharded, aggregation pipelines run across multiple shards. The mongos router merges results from each shard. This adds complexity and performance considerations that are easy to overlook.
First, $lookup across shards: if the local collection is sharded but the foreign collection is not, the foreign collection is scanned on every shard. If both are sharded, MongoDB can optimise by targeting only shards that contain matching documents, but only if the local field used in $lookup is the shard key. If not, the $lookup broadcasts to all shards.
Second, $merge and $out are your friends for large pipelines. They write the results to a collection, avoiding the memory limit and allowing incremental processing. Use $merge to periodically persist intermediate results in multi-step ETL workflows.
Third, the order of stages crossing shard boundaries matters. $match and $project are pushed down to each shard (if they don't depend on merged data). $group and $sort are split into two phases: one per shard, then a merge on mongos. This can lead to duplicate scanning if not careful.
A common production mistake: running a large $group across shards without pre-filtering with $match. Each shard processes its entire data set, then mongos merges and re-groups. You can often push a condition down by adding a $match stage that uses the shard key.
Let's see an example that uses $merge to store intermediate results and avoid memory pressure.
Another critical point: the merge stage ($merge) runs on the primary shard of the output collection. If the output collection is also sharded, the merge stage's performance depends on that shard's capacity. For very large outputs, consider using $out with a sharded output collection strategy.
Also, $sort after $group in a sharded pipeline: each shard sorts its own partial results, then mongos does a merge sort. For top-k queries, you can use $sort + $limit on each shard to reduce data sent to mongos. If you omit $limit, all partial results are sent — potentially huge network payloads.
The Unbounded $lookup That Brought Down Reporting
- Always index foreign key fields used in $lookup.
- Use
explain()to verify index usage before deploying. - Set realistic timeouts with maxTimeM
S()to fail fast. - Profile long-running aggregations in staging with production-like data volume.
explain() on each sub-pipeline independently. If one sub-pipeline is much slower, split into separate aggregations.S() to pipeline. Add index on foreign key for $lookup. Reduce input with earlier $match.Key takeaways
Common mistakes to avoid
7 patternsMemorising syntax before understanding the concept
Skipping practice and only reading theory
Putting $sort before $match
Forgetting to index foreign fields used in $lookup
Using allowDiskUse without understanding trade-offs
Ignoring $facet sub-pipeline blocking
Using $addFields unnecessarily before $group
Interview Questions on This Topic
How does MongoDB optimize an aggregation pipeline? Give an example of an automatic optimization and one that must be done manually.
Frequently Asked Questions
That's NoSQL. Mark it forged?
13 min read · try the examples if you haven't