Advanced 3 min · March 05, 2026

CTE Referenced 3 Times — 50M Rows Scanned in SQL Server

A CTE referenced three times in SQL Server caused 150M logical reads on a 50M-row table.

N
Naren · Founder
Plain-English first. Then code. Then the interview question.
About
Quick Answer
  • A CTE (WITH clause) is a named, query-scoped result set — exists only for one query execution, no schema changes needed
  • Chain multiple CTEs in one WITH clause — later CTEs can reference earlier ones, creating a readable pipeline
  • Recursive CTEs need UNION ALL (not UNION), an anchor member (base case), and a recursive member that references itself
  • Materialization is engine-specific: PostgreSQL 12+ inlines by default; SQL Server may re-execute a CTE every time it is referenced
  • Data-modifying CTEs (PostgreSQL): DELETE/INSERT with RETURNING enables atomic archive-and-delete in one transaction
  • Biggest trap: referencing an expensive CTE multiple times in SQL Server — use a #temp table instead

Every production database eventually grows a query so complex it becomes a maintenance nightmare — subqueries nested inside subqueries, the same filter logic copy-pasted five times, and a WHERE clause that scrolls off the screen. CTEs were introduced precisely because real SQL work at scale demands readable, composable, and debuggable queries.

The core problem CTEs solve is scope and readability without sacrificing power. Before CTEs, your only option was derived tables (anonymous subqueries in FROM clauses) or views (which require DDL permissions and pollute your schema). CTEs give you a named, scoped, temporary result set that lives only for the duration of your query.

By the end of this article you'll write multi-step CTEs that replace tangled subquery nests, use recursive CTEs to walk org charts, understand exactly when your database materializes a CTE versus re-evaluates it, and avoid the three production gotchas that catch even experienced engineers off guard.

The Anatomy of a CTE — Structure, Scope and Why It Beats Subqueries

A CTE is defined with the WITH keyword, followed by a name, an optional column list, and the query body in parentheses. Everything after that closing parenthesis is the outer query. The CTE only exists for the lifetime of that single outer query execution.

You can define multiple CTEs in one WITH clause by separating them with commas. Crucially, a later CTE can reference an earlier one — this is what makes chaining possible. Think of it as a pipeline: each CTE refines the data one step further before handing it to the next stage.

Why does this beat a subquery? Readability is only part of the answer. The bigger win is debuggability. When you have a three-level nested subquery and something goes wrong, you can't easily isolate which level is producing bad data. With CTEs, you SELECT directly from each named stage during development — comment out the outer query and inspect intermediate results. That feedback loop is invaluable on complex data pipelines.

Also note: a CTE referenced multiple times in the outer query does NOT automatically get computed once and cached. Whether it's materialized or re-evaluated depends on your database engine — a critical performance detail covered in Section 3.

Recursive CTEs — Walking Trees, Org Charts and Hierarchical Data

A recursive CTE has two parts separated by UNION ALL: the anchor member (the base case — where the recursion starts) and the recursive member (which references the CTE itself to walk one level deeper). The engine executes the anchor once, then keeps executing the recursive member against the previous iteration's results until it returns zero rows.

The classic use case is an organizational hierarchy stored as an adjacency list — each employee row has a manager_id pointing to another employee_id. Without a recursive CTE, flattening that requires N queries for N levels. With recursive CTEs, one query handles any depth.

Cycles in data are the most dangerous edge case. If employee A is the manager of B and B is the manager of A (data corruption happens), a naive recursive CTE loops until it hits the depth limit. Always track visited IDs and break when you detect a repeat.

CTE Materialization vs. Re-evaluation — The Performance Truth Nobody Tells You

The SQL standard says nothing about whether a CTE must be materialized (computed once, stored) or re-evaluated (re-run every time it's referenced). Each database engine makes its own choice.

PostgreSQL prior to version 12 always materialized CTEs as optimization fences — the planner could not push WHERE predicates from the outer query into the CTE. PostgreSQL 12+ inlines non-recursive CTEs by default, allowing predicate pushdown. You can force materialization with the MATERIALIZED keyword.

SQL Server never guaranteed materialization — it treats CTEs as macro expansions. If you reference a CTE three times in a UNION ALL, SQL Server may execute that query three times. For expensive CTEs referenced repeatedly, load results into a #temp_table first.

MySQL 8.0+ inlines CTEs. Oracle materializes when it determines it's beneficial and accepts the WITH MATERIALIZE hint to force it.

Production CTE Patterns — Window Functions, Data Mutations and When Not to Use CTEs

CTEs compose beautifully with window functions because window function results can't be filtered in the same SELECT where they're computed — you need a wrapper, and a CTE is cleaner than a subquery for this.

CTEs also work with INSERT, UPDATE, DELETE and MERGE in PostgreSQL and SQL Server — called data-modifying CTEs. In PostgreSQL, data-modifying CTEs execute exactly once, and their effects are visible to the outer query but not to other CTEs in the same WITH clause.

Know when NOT to use a CTE. If your query is a single flat SELECT that runs fine as a subquery, wrapping it in a CTE adds no value. CTEs shine for multi-step logic, recursion, and avoiding repeated complex expressions. A CTE over a 50-million-row unindexed table is still a full table scan — CTEs don't fix missing indexes.

AspectCTE (WITH clause)Derived Table (Subquery in FROM)Temporary TableView
ReadabilityExcellent — named, reusable within queryPoor — anonymous, nestedGood — namedGood — named and shared
Reuse within queryYes — reference same CTE multiple timesNo — must repeat the subqueryYesYes
RecursionYes — WITH RECURSIVENoNoNo
Schema footprintNone — query-scoped onlyNoneRequires CREATE/DROPPermanent schema object
MaterializationEngine-specific — may re-evaluateAlways inlinedAlways materialized — explicitEngine-specific
Best forMulti-step reads, recursion, one-time useSimple single-use subqueriesExpensive results reused many timesShared logic across many queries

Key Takeaways

  • CTEs are query-scoped named result sets — no schema changes needed, live only for one query execution, and are the cleanest way to decompose multi-step SQL into readable stages.
  • Recursive CTEs require UNION ALL (not UNION), always need a termination condition, and should include cycle-detection logic for real-world hierarchical data.
  • CTE materialization is engine-specific — PostgreSQL 12+ inlines by default, SQL Server may re-execute. Verify with EXPLAIN and use temp tables for expensive CTEs referenced multiple times in SQL Server.
  • Data-modifying CTEs in PostgreSQL enable atomic DELETE + INSERT in one transaction — but sibling CTEs cannot see each other's changes, only the outer query can.

Common Mistakes to Avoid

  • Referencing a CTE before it is defined in the WITH clause
    Symptom: 'relation does not exist' or 'invalid object name' error even though the CTE is present in the query
    Fix: CTEs must be defined in dependency order. A CTE can only reference CTEs that appear earlier in the same WITH clause. Reorder your CTEs so dependencies come first.
  • Assuming a CTE is computed once when referenced multiple times in SQL Server
    Symptom: Mysteriously slow queries where EXPLAIN shows the expensive base table scanned multiple times — one scan per reference
    Fix: In SQL Server, CTEs referenced more than once may be re-evaluated each time. Run SET STATISTICS IO ON and check logical reads. For expensive CTEs referenced multiple times, materialize into a #temp_table first.
  • Writing a recursive CTE without cycle detection on potentially dirty hierarchical data
    Symptom: Query throws 'maximum recursion depth exceeded' error — happens only in production where data has referential corruption not present in test data
    Fix: Always track visited node IDs in an ARRAY (PostgreSQL) or add a depth counter with WHERE depth < N. Never assume hierarchical data is clean — code defensively against cycles.

Interview Questions on This Topic

  • QWhat is the difference between a CTE and a subquery, and when would you choose one in production?Mid-levelReveal
    Both produce a named or anonymous result set within a query. The key differences are readability, debuggability, and recursion support. A CTE is named and can be referenced multiple times in the outer query; a subquery in FROM is anonymous and must be re-written if reused. CTEs enable recursive queries (WITH RECURSIVE) which subqueries cannot. For debugging, you can temporarily SELECT from a named CTE stage to inspect intermediate results — impossible with nested subqueries. Choose a CTE when the logic has multiple steps, when you need recursion, or when the same intermediate result is referenced more than once. Choose a subquery for simple single-use filters where the extra WITH clause adds no clarity.
  • QExplain how a recursive CTE works — what are the anchor member and recursive member, and how does the engine know when to stop?SeniorReveal
    A recursive CTE uses WITH RECURSIVE and contains two SELECT statements joined by UNION ALL. The anchor member is the base case — it executes once and produces the initial result set. The recursive member references the CTE by name and executes against the previous iteration's result set. The engine repeats the recursive member until it returns zero rows — that is the natural termination condition. You can also add explicit depth guards (WHERE depth < 100) or cycle detection (WHERE NOT id = ANY(visited_ids)) to stop earlier. Most databases enforce a maximum recursion depth (100 in PostgreSQL and SQL Server by default) to prevent infinite loops from corrupted data.
  • QIf you reference the same CTE three times in a SQL Server UNION ALL, how many times does the underlying query execute — and how do you prove it?SeniorReveal
    In SQL Server, the CTE may execute three times — once per reference. SQL Server treats CTEs as logical macro expansions by default, not as materialized result sets. To prove it: run SET STATISTICS IO ON before executing the query and examine the logical reads count in the output. If the count is approximately 3× the expected single-scan reads, the CTE is re-executing. The fix is to materialize the CTE result into a #temp_table with SELECT ... INTO #temp, then run the three UNION ALL branches against the temp table. In PostgreSQL, you can force single materialization with the MATERIALIZED keyword.

Frequently Asked Questions

Does a SQL CTE improve query performance?

Not inherently. A CTE is a logical construct, not a performance optimization. Whether it improves performance depends on whether your engine materializes it. PostgreSQL 12+ inlines CTEs by default — no performance benefit over a subquery unless you force MATERIALIZED. SQL Server may re-execute CTE logic multiple times. For genuine performance gains with reused intermediate results, use a temp table.

Can you use a CTE in an UPDATE or DELETE statement?

Yes, in most major databases. In PostgreSQL and SQL Server you can define a CTE and then UPDATE or DELETE based on it. PostgreSQL additionally supports data-modifying CTEs where the CTE itself performs the DML using RETURNING to pass affected rows to the outer query — enabling atomic archive-and-delete patterns in a single statement.

What is the difference between WITH RECURSIVE and a regular CTE?

A regular CTE is evaluated once and cannot reference itself. A recursive CTE uses WITH RECURSIVE and contains two parts joined by UNION ALL: an anchor member (base case, evaluated once) and a recursive member (references the CTE itself, runs until it returns no new rows). Recursive CTEs are essential for querying hierarchical structures like org charts, category trees, or graph data in relational tables.

🔥

That's SQL Advanced. Mark it forged?

3 min read · try the examples if you haven't

Previous
SQL Window Functions
7 / 16 · SQL Advanced
Next
SQL Query Optimisation