Senior 9 min · March 12, 2026

SQL Window Functions: No PARTITION BY Crashes 50M Rows

Unpartitioned SUM(revenue) OVER(ORDER BY date) on 50M rows hit 100% CPU and blocked all dashboards for 12 minutes.

N
Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Notes here come from systems that actually shipped.

Follow
Production
production tested
May 24, 2026
last updated
1,554
articles · all by Naren
 ● Production Incident 🔎 Debug Guide
Quick Answer
  • Window functions compute values across related rows without collapsing them — unlike GROUP BY which removes individual rows
  • OVER() defines the window: PARTITION BY groups, ORDER BY defines ordering within the group, frame clause controls which rows are included
  • ROW_NUMBER() = unique sequential rank, RANK() = gaps on ties, DENSE_RANK() = no gaps on ties
  • LAG(col, n) / LEAD(col, n) access n rows before/after the current row — essential for period-over-period comparisons
  • Window functions cannot be filtered in the same SELECT where they are computed — wrap in a CTE or subquery
  • Biggest trap: large PARTITION BY with no supporting index causes filesort on every execution
✦ Definition~90s read
What is SQL Window Functions?

Window functions let you perform calculations across a set of rows related to the current row, without collapsing them into a single output row like GROUP BY does. They solve the problem of needing both detail-level data and aggregated context in the same query — for example, showing each employee's salary alongside the department average.

Imagine you're in a running race and you want to know your position — but you also want to see everyone else's position on the same leaderboard.

Without them, you'd resort to self-joins or multiple subqueries, which tank performance and readability. The OVER clause defines the window: PARTITION BY splits rows into groups (like GROUP BY), and ORDER BY determines the order within each group. The killer mistake is omitting PARTITION BY on large tables — that forces the entire 50M-row dataset into a single window, blowing up memory and crashing the engine.

Window frames (ROWS vs RANGE) control which rows the function sees; the default RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW can produce surprising results with ties. In the SQL evaluation order, window functions execute after WHERE, GROUP BY, and HAVING but before ORDER BY, so you can't filter on them directly — that's where a subquery or CTE comes in.

Ranking functions (ROW_NUMBER, RANK, DENSE_RANK) are the most common use case, but LAG and LEAD let you compare a row to its neighbors without a self-join, which is essential for time-series analysis or detecting changes in sequential data. When you need per-group calculations without losing row-level detail, window functions are the tool — just don't forget the PARTITION BY on big data.

Plain-English First

Imagine you're in a running race and you want to know your position — but you also want to see everyone else's position on the same leaderboard. GROUP BY would give you one row per runner (just the counts). A window function gives you each runner's row plus a computed position alongside it, without collapsing the data. You can see 'Alice: 1st place', 'Bob: 2nd place', 'Clara: 2nd place (tied)' all at once.

Every senior data engineer has hit the same wall: you need running totals, percentile ranks, or 'compare this row to its neighbours' logic, and suddenly GROUP BY feels like trying to paint a wall with a toothbrush. Window functions are the feature that separates analysts who write five nested subqueries from those who write one clean, efficient query.

The key mental model: a window function computes a value for each row using a defined set of related rows (the 'window') without removing any rows from the result. GROUP BY collapses rows into summaries. Window functions add derived columns to each row while leaving all rows intact.

Why Window Functions Are Not Just Fancy GROUP BYs

A window function computes a value across a set of rows related to the current row, without collapsing them into a single output row. Unlike aggregate functions (SUM, AVG) that reduce N rows to 1, window functions preserve every row while adding a computed column. The core mechanic is the OVER() clause, which defines the window frame — the subset of rows the function operates on. Without PARTITION BY, the frame defaults to the entire result set, meaning every row sees the same aggregate over all 50 million rows. That single-pass scan over the full dataset is O(n) per row, turning a query into a full table scan with a massive memory footprint. In practice, this means a query that should run in milliseconds can balloon into a multi-minute operation, spilling to disk and hammering tempdb or swap. The key property: window functions execute after WHERE, GROUP BY, and HAVING but before ORDER BY, so they operate on the filtered, grouped intermediate result. Use them when you need running totals, moving averages, row rankings, or comparisons to a group-level value (like 'each employee's salary vs. department average') without losing detail. In real systems, they replace self-joins and correlated subqueries, cutting query complexity from O(n²) to O(n log n) or better.

Missing PARTITION BY = Accidental Global Aggregate
Omitting PARTITION BY in a window function on a large table silently computes the aggregate over all rows, causing a full scan and massive memory pressure — not a syntax error, but a performance disaster.
Production Insight
Production scenario: A reporting query on a 50M-row sales table uses SUM(amount) OVER() to compute total sales alongside each row. Symptom: query runs for 12 minutes, tempdb grows by 30 GB, and the reporting dashboard times out. Rule of thumb: Always specify PARTITION BY unless you explicitly need a global aggregate — and if you do, cache it in a CTE or temp table first.
Key Takeaway
Window functions preserve row count; aggregates reduce it — never confuse the two.
Without PARTITION BY, every row computes over the entire dataset — O(n) per row, O(n²) total.
Use window functions to replace self-joins and correlated subqueries for cleaner, faster SQL.
SQL Window Functions: OVER, PARTITION, and Frames THECODEFORGE.IO SQL Window Functions: OVER, PARTITION, and Frames Flow from window definition to row-wise analytics without GROUP BY OVER Clause Defines window: PARTITION BY + ORDER BY PARTITION BY Groups rows into partitions (optional) ORDER BY + Frame Sorts rows; ROWS/RANGE default frame Window Function Applies per row within window (e.g., RANK, LAG) WINDOW Clause Reuses named window definitions ⚠ Missing PARTITION BY processes all 50M rows as one window Always specify PARTITION BY or test with small data first THECODEFORGE.IO
thecodeforge.io
SQL Window Functions: OVER, PARTITION, and Frames
Sql Window Functions

The OVER Clause — How Windows Are Defined

Every window function requires an OVER clause — this is what makes it a window function rather than a regular aggregate. The OVER clause defines the window: the set of rows the function can 'see' when computing its value for the current row.

OVER() with nothing inside it means the window is the entire result set. PARTITION BY inside OVER divides the result set into groups — the function computes separately within each partition. ORDER BY inside OVER defines the ordering within each partition, which is required for ranking and running total functions.

The critical difference from GROUP BY: GROUP BY collapses multiple rows into one summary row. OVER keeps all rows intact and adds a computed column alongside each row. You can see the individual transactions AND the running total on every row simultaneously.

window_basics_over.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
-- Setup: monthly sales by salesperson
CREATE TABLE sales (
    sale_id    INT PRIMARY KEY,
    rep_name   VARCHAR(100),
    region     VARCHAR(50),
    sale_date  DATE,
    amount     DECIMAL(10,2)
);

-- GROUP BY collapses to one row per rep -- individual sales lost
SELECT rep_name, SUM(amount) AS total
FROM sales
GROUP BY rep_name;

-- Window function: all rows preserved + running total added
SELECT
    sale_id,
    rep_name,
    sale_date,
    amount,
    -- Running total per rep, ordered by date
    SUM(amount) OVER (
        PARTITION BY rep_name      -- separate window per rep
        ORDER BY sale_date         -- accumulate in date order
    ) AS running_total,
    -- Grand total for comparison (no PARTITION BY = whole table)
    SUM(amount) OVER () AS grand_total
FROM sales
ORDER BY rep_name, sale_date;
Output
sale_id | rep_name | sale_date | amount | running_total | grand_total
1 | Alice Tanaka| 2024-01-05 | 1200.00 | 1200.00 | 42850.00
4 | Alice Tanaka| 2024-01-12 | 850.00 | 2050.00 | 42850.00
7 | Alice Tanaka| 2024-02-03 | 1500.00 | 3550.00 | 42850.00
2 | Ben Okafor | 2024-01-08 | 2200.00 | 2200.00 | 42850.00
Window vs GROUP BY — The Core Difference
  • GROUP BY: N rows become 1 summary row — individual rows disappear
  • OVER(): N rows stay N rows — a computed column is added to each
  • PARTITION BY: divides the window into groups (like GROUP BY but without collapsing)
  • ORDER BY inside OVER: controls accumulation order for running totals and rankings
  • No PARTITION BY: window is the entire result set — dangerous on large tables
Production Insight
OVER() with no PARTITION BY = the entire table is one window = full table sort required.
On large tables always PARTITION BY a column with an index to split the sort into manageable chunks.
For EXPLAIN: look for WindowAgg node — the Sort node before it reveals whether the sort is index-backed or a filesort.
Key Takeaway
OVER() is what makes a function a window function — defines the rows it can see.
PARTITION BY splits into independent groups; ORDER BY inside OVER controls accumulation order.
No PARTITION BY on large tables = full table sort = production incident waiting to happen.

SQL Evaluation Order — Where Window Functions Live

Understanding the logical order of SQL clause evaluation is essential for writing correct window function queries. The order is: FROM → WHERE → GROUP BY → HAVING → WINDOW → SELECT → DISTINCT → ORDER BY → LIMIT. Window functions are computed after WHERE, GROUP BY, and HAVING but before ORDER BY and SELECT. This means:

  • You cannot reference a window function alias in the same SELECT's WHERE clause — it hasn't been computed yet. If you try, you get an 'column not found' error.
  • You cannot filter on a window function result in WHERE. Wrap in a CTE or subquery.
  • ORDER BY happens after window functions, so you can ORDER BY a window function alias without issues.
  • GROUP BY aggregates are available before window functions — your window can reference GROUP BY output (like SUM(salary) for that group).

The WINDOW clause in the evaluation order refers to the window function computation phase, not the WINDOW clause keyword (which is syntactic sugar to name a window definition for reuse).

evaluation_order.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
-- This query will fail because window function alias 'rn' is not available in WHERE
SELECT
    *,
    ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC) AS rn
FROM sales
WHERE rn = 1;  -- ERROR: column 'rn' does not exist

-- Proper pattern: wrap in a CTE (WITH) to allow filter after window computation
WITH ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC) AS rn
    FROM sales
)
SELECT * FROM ranked WHERE rn = 1;

-- ORDER BY can use window function alias directly
SELECT *,
       RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank_val
FROM sales
ORDER BY rank_val;  -- works fine
Output
sale_id | rep_name | region | amount | rn
3 | Alice Tanaka| East | 1500.00 | 1
6 | Ben Okafor | East | 1400.00 | 2
9 | Clara Webb | West | 2000.00 | 1
The WINDOW Phase Happens After GROUP BY
This is why you can use GROUP BY aggregates inside a window function: the GROUP BY has already been computed, so SUM(salary) is available as a base column for the window. For example: SELECT department, SUM(salary) AS dept_total, RANK() OVER (ORDER BY SUM(salary) DESC) AS dept_rank FROM employees GROUP BY department;
Production Insight
Knowing the evaluation order prevents the most common window function mistake — trying to filter on a result in WHERE. Every production code review should flag direct filtering on window function aliases. The CTE-pattern is the standard fix and is universally supported across PostgreSQL, MySQL 8+, SQL Server, and Oracle.
Key Takeaway
Window functions are computed after WHERE/GROUP BY/HAVING but before ORDER BY. To filter on window results, use a CTE. This order is logical, not physical — the database may optimise, but the result must match the logical order.
SQL Evaluation Order — Where Window Functions Are Computed
FROM / JOINWHEREGROUP BYHAVINGWINDOW FUNCTIONSSELECTDISTINCTORDER BYLIMIT / OFFSET

Ranking Functions — ROW_NUMBER, RANK, and DENSE_RANK

Three ranking functions, three subtly different behaviours when ties exist.

ROW_NUMBER() assigns a unique sequential integer to every row, even when rows are tied. The order assigned to tied rows is arbitrary (unless you add a tiebreaker to ORDER BY). Use ROW_NUMBER() when you need to pick exactly one row per group — for example, the most recent order per customer.

RANK() assigns the same number to tied rows, then skips numbers. If two rows tie for position 2, both get rank 2 and the next row gets rank 4. The gap represents the number of tied rows.

DENSE_RANK() also assigns the same number to ties, but does not skip. Two rows tied at position 2 both get 2, and the next row gets 3. Use DENSE_RANK() when you want contiguous rank values regardless of ties — 'what is the 3rd highest salary' works with DENSE_RANK, not RANK.

ranking_functions.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
-- Compare all three ranking functions on the same data
SELECT
    rep_name,
    region,
    SUM(amount)  AS total_sales,
    ROW_NUMBER() OVER (PARTITION BY region ORDER BY SUM(amount) DESC) AS row_num,
    RANK()       OVER (PARTITION BY region ORDER BY SUM(amount) DESC) AS rank_val,
    DENSE_RANK() OVER (PARTITION BY region ORDER BY SUM(amount) DESC) AS dense_rank_val
FROM sales
GROUP BY rep_name, region
ORDER BY region, total_sales DESC;

-- Common pattern: get the top 1 row per group
-- ROW_NUMBER guarantees exactly one row per region regardless of ties
WITH ranked_sales AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC) AS rn
    FROM sales
)
SELECT rep_name, region, amount
FROM ranked_sales
WHERE rn = 1;  -- cannot use window function directly in WHERE -- must wrap in CTE
Output
rep_name | region | total_sales | row_num | rank_val | dense_rank_val
Alice Tanaka | East | 15200.00 | 1 | 1 | 1
Ben Okafor | East | 14800.00 | 2 | 2 | 2
Clara Webb | East | 14800.00 | 3 | 2 | 2 <- tied
Dana Kim | East | 12100.00 | 4 | 4 | 3 <- RANK gaps, DENSE_RANK does not
Window Functions Cannot Be Filtered in WHERE
SELECT , RANK() OVER (...) AS r FROM table WHERE r = 1 throws an error — window functions are computed after WHERE filtering, so the alias is not yet available. Wrap in a CTE: WITH ranked AS (SELECT , RANK() OVER (...) AS r FROM table) SELECT * FROM ranked WHERE r = 1.
Production Insight
ROW_NUMBER() is the correct function for 'get latest record per group' — use ORDER BY created_at DESC and filter WHERE rn = 1.
Random tie-breaking in ROW_NUMBER() can cause non-deterministic query results — always add a tiebreaker (usually the PK) to ORDER BY.
Wrap ranking in a CTE before filtering on the rank value — this is a universal pattern you will write dozens of times.
Key Takeaway
ROW_NUMBER = unique rank, RANK = gaps on ties, DENSE_RANK = no gaps on ties.
To get top-1 per group: ROW_NUMBER() in a CTE, filter WHERE rn = 1.
Always add a tiebreaker column to ORDER BY inside ROW_NUMBER() — arbitrary tie-breaking produces inconsistent results.

LAG and LEAD — Comparing Rows to Their Neighbours

LAG(column, n, default) returns the value from n rows before the current row within the window. LEAD(column, n, default) returns the value from n rows ahead. Both default n to 1 if omitted. The third argument sets what to return when there is no previous or next row (otherwise NULL).

These are indispensable for period-over-period analysis: month-over-month revenue change, day-over-day user growth, previous order date for a customer. Before LAG/LEAD existed, this required a self-join — and self-joins on large tables for row comparison are notoriously slow.

lag_lead_comparison.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
-- Month-over-month revenue change using LAG
WITH monthly_revenue AS (
    SELECT
        DATE_TRUNC('month', sale_date) AS month,
        SUM(amount)                    AS revenue
    FROM sales
    GROUP BY DATE_TRUNC('month', sale_date)
)
SELECT
    month,
    revenue,
    LAG(revenue, 1, 0) OVER (ORDER BY month) AS prev_month_revenue,
    revenue - LAG(revenue, 1, 0) OVER (ORDER BY month) AS change,
    ROUND(
        100.0 * (revenue - LAG(revenue) OVER (ORDER BY month))
        / NULLIF(LAG(revenue) OVER (ORDER BY month), 0),
    1) AS pct_change
FROM monthly_revenue
ORDER BY month;

-- Find customers whose next order came >90 days after the previous
-- LEAD gives us the next order date without a self-join
SELECT
    customer_id,
    order_date                                              AS this_order,
    LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) AS next_order,
    LEAD(order_date) OVER (PARTITION BY customer_id ORDER BY order_date)
        - order_date                                        AS days_to_next
FROM orders
HAVING days_to_next > 90 OR days_to_next IS NULL;
Output
month | revenue | prev_revenue | change | pct_change
2024-01-01 | 42850.00 | 0.00 | 42850.00 | NULL
2024-02-01 | 51200.00 | 42850.00 | 8350.00 | 19.5
2024-03-01 | 48900.00 | 51200.00 | -2300.00 | -4.5
NULLIF Prevents Division by Zero in Percentage Calculations
When computing percentage change, the previous value might be zero (first month of data, new customer). Dividing by zero throws an error or returns NULL depending on the database. NULLIF(prev_value, 0) returns NULL when prev_value is zero, making the division return NULL safely instead of crashing.
Production Insight
LAG/LEAD replaced thousands of correlated subqueries and self-joins in analytics codebases — if you see a self-join comparing a table to itself on adjacent rows, it is a LAG/LEAD rewrite opportunity.
Always provide a default value as the third argument (LAG(col, 1, 0)) — NULL arithmetic produces NULL results that silently break calculations.
For large tables, PARTITION BY customer_id + an index on (customer_id, order_date) is mandatory for acceptable performance.
Key Takeaway
LAG/LEAD access neighbouring rows without self-joins — the canonical tool for period-over-period comparisons.
Always supply a default (third argument) to avoid NULL arithmetic silently corrupting your calculations.
Add a composite index on (PARTITION_BY_col, ORDER_BY_col) — these queries require sorted access.

Window Frames — ROWS vs RANGE and the Subtle Default

The frame clause in an OVER definition controls exactly which rows are included in the window for each row's computation. This is the least understood part of window functions and the source of subtle bugs.

The default frame when you specify ORDER BY inside OVER is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW — not ROWS. The RANGE default includes all rows that are tied on the ORDER BY value up to and including the current row. For dates with no duplicates this makes no visible difference. For financial data where multiple transactions share the same timestamp, the RANGE default groups all tied rows together — your running total will jump by the combined amount of all tied rows simultaneously, which is often not what you want.

ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW processes each physical row individually, regardless of ties. For running totals and moving averages, ROWS is almost always the correct choice.

window_frames.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
-- RANGE default vs ROWS explicit -- spot the difference
SELECT
    sale_date,
    amount,
    -- Default: RANGE -- ties share the same cumulative value
    SUM(amount) OVER (
        ORDER BY sale_date
        -- RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (implicit default)
    ) AS running_range,
    -- Explicit ROWS: each physical row accumulated independently
    SUM(amount) OVER (
        ORDER BY sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_rows
FROM sales
WHERE rep_name = 'Alice Tanaka'
ORDER BY sale_date;

-- 7-day moving average: ROWS frame with sliding window
SELECT
    sale_date,
    amount,
    AVG(amount) OVER (
        ORDER BY sale_date
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW  -- current row + 6 before it
    ) AS moving_avg_7day
FROM daily_sales
ORDER BY sale_date;
Output
-- When two sales share the same date:
-- RANGE groups them: both rows show the combined total immediately
-- ROWS accumulates: first row shows partial, second row shows full
sale_date | amount | running_range | running_rows
2024-01-05 | 800.00 | 2200.00 | 800.00 <- RANGE jumps to combined total
2024-01-05 | 1400.00| 2200.00 | 2200.00 <- ROWS accumulates each row
The RANGE Default Causes Surprising Behaviour on Ties
When you write SUM(...) OVER (ORDER BY date), the default frame is RANGE which groups all rows with the same date value. If two transactions happen on the same day, both rows will show the combined daily total immediately — not the individual accumulation. For running totals, always specify ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW explicitly.
Production Insight
Running total queries in financial systems that show identical values for same-day transactions are almost always hitting the RANGE default rather than ROWS.
For moving averages, ROWS BETWEEN N PRECEDING AND CURRENT ROW is the correct frame — gives exactly N+1 rows of data regardless of value ties.
Specifying the frame explicitly (ROWS BETWEEN ...) is a production best practice — it documents intent and prevents subtle default-behaviour bugs.
Key Takeaway
Default frame with ORDER BY is RANGE — groups tied values, which is rarely what you want for running totals.
For running totals and moving averages, always specify ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW explicitly.
Explicit frame clauses are self-documenting and prevent subtle tie-handling bugs.

The WINDOW Clause — Reusing Window Definitions

When you use the same window definition in multiple aggregate functions, repeating the full OVER clause becomes error-prone and hard to maintain. The WINDOW clause (introduced in SQL:2003 and supported by PostgreSQL, MySQL 8+, SQL Server, and Oracle) lets you name a window specification and reference it by name in multiple OVER clauses.

Instead of writing: SUM(amount) OVER (PARTITION BY region ORDER BY sale_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_region_total, AVG(amount) OVER (PARTITION BY region ORDER BY sale_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS moving_avg

You can define the window once and reuse it: WINDOW w AS (PARTITION BY region ORDER BY sale_date)

Then refer to it with a frame variation: SUM(amount) OVER w AS running_total, SUM(amount) OVER (w ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS moving_avg

This reduces duplication, improves readability, and ensures the partition/order definition is consistent across all uses. If you need to change the partition key later, you only modify the WINDOW clause.

window_clause.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
-- Without WINDOW clause: repetitive OVER definitions
SELECT
    region,
    sale_date,
    amount,
    SUM(amount) OVER (PARTITION BY region ORDER BY sale_date) AS running_total,
    AVG(amount) OVER (PARTITION BY region ORDER BY sale_date ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS moving_avg
FROM sales
ORDER BY region, sale_date;

-- With WINDOW clause: define once, reuse
SELECT
    region,
    sale_date,
    amount,
    SUM(amount) OVER w                           AS running_total,
    AVG(amount) OVER (w ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS moving_avg,
    COUNT(*) OVER w                              AS sales_count_in_region_so_far
FROM sales
WINDOW w AS (PARTITION BY region ORDER BY sale_date)
ORDER BY region, sale_date;
Output
region | sale_date | amount | running_total | moving_avg | sales_count
East | 2024-01-05 | 1200.00 | 1200.00 | 1200.00 | 1
East | 2024-01-12 | 850.00 | 2050.00 | 1025.00 | 2
East | 2024-02-03 | 1500.00 | 3550.00 | 1183.33 | 3
West | 2024-01-08 | 2200.00 | 2200.00 | 2200.00 | 1
Frame Can Be Overridden on Referenced Windows
When you reference a named window, you can still specify a different frame. The base partition and order must stay the same, but the frame clause can differ. This is ideal for computing both running totals (UNBOUNDED PRECEDING) and moving averages (e.g., 6 PRECEDING) from the same window.
Production Insight
The WINDOW clause shines in complex reporting queries where you compute multiple window aggregates with the same partition/order. It reduces the risk of copy-paste errors, which in production can lead to inconsistent results between supposedly identical window definitions. Always use WINDOW for any query with two or more window functions sharing the same base frame.
Key Takeaway
The WINDOW clause names a window definition for reuse, eliminating duplication and preventing copy-paste bugs. Use it when you have multiple window functions that share the same PARTITION BY and ORDER BY.

FIRST_VALUE, LAST_VALUE, and NTH_VALUE — Row Navigation Within Partitions

FIRST_VALUE(column), LAST_VALUE(column), and NTH_VALUE(column, n) return values from specified positions within a window partition. FIRST_VALUE gets the first row's value according to the window's ORDER BY. LAST_VALUE gets the last row's value — but careful: the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so LAST_VALUE returns the value from the current row within the current frame, not necessarily the partition's absolute last. To get the true last value in the partition, specify a frame of ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.

NTH_VALUE(column, n) returns the value from the nth row (1-indexed) within the window. It's useful for getting, say, the second highest sale per region without a subquery.

These functions are most valuable when you need to compare the current row to the first or last row of a group — for example, showing the first order date for a customer alongside every subsequent order, or the percentage change from the first value.

first_last_nth.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
-- FIRST_VALUE and LAST_VALUE with proper frame
SELECT
    rep_name,
    sale_date,
    amount,
    FIRST_VALUE(amount) OVER (
        PARTITION BY rep_name
        ORDER BY sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS first_sale_amount,
    LAST_VALUE(amount) OVER (
        PARTITION BY rep_name
        ORDER BY sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS last_sale_amount,
    NTH_VALUE(amount, 2) OVER (
        PARTITION BY rep_name
        ORDER BY sale_date
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    ) AS second_sale_amount
FROM sales
ORDER BY rep_name, sale_date;
Output
rep_name | sale_date | amount | first_sale | last_sale | second_sale
Alice Tanaka | 2024-01-05 | 1200.00 | 1200.00 | 1500.00 | 850.00
Alice Tanaka | 2024-01-12 | 850.00 | 1200.00 | 1500.00 | 850.00
Alice Tanaka | 2024-02-03 | 1500.00 | 1200.00 | 1500.00 | 850.00
Ben Okafor | 2024-01-08 | 2200.00 | 2200.00 | 2200.00 | NULL
LAST_VALUE Traps with Default Frame
If you omit the frame when using LAST_VALUE, the default RANGE frame will only see rows up to the current row. LAST_VALUE will then return the same value as the current row for every row — not the partition's final row. Always use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING with LAST_VALUE and NTH_VALUE to get correct partition-level results.
Production Insight
FIRST_VALUE and LAST_VALUE are useful for comparisons like 'show each order and how it compares to the customer's first order'. When combined with a CTE, you can compute differences such as days_since_first_order = sale_date - FIRST_VALUE(sale_date). Always check the frame — the default RANGE frame will not give you the true first/last of the partition. Use UNBOUNDED FOLLOWING for LAST_VALUE, and UNBOUNDED PRECEDING is fine for FIRST_VALUE but it's symmetric to be explicit.
Key Takeaway
FIRST_VALUE/LAST_VALUE/NTH_VALUE access rows by position within a partition. LAST_VALUE and NTH_VALUE require an explicit frame with UNBOUNDED FOLLOWING to return the true last/nth row of the partition. Default frame returns the current row for these functions.

Statistical Window Functions — NTILE and CUME_DIST

NTILE(n) divides the window into n as-equal-as-possible buckets (tiles) and assigns a bucket number from 1 to n to each row. It's commonly used for quartile analysis (NTILE(4)), decile analysis (NTILE(10)), or percentiles. Rows are evenly distributed across the buckets; if the count doesn't divide evenly, the first buckets get one extra row.

CUME_DIST() computes the cumulative distribution — the relative position of a row within the window. It returns a value between 0 and 1, representing the proportion of rows with a value less than or equal to the current row's value. For the row with the highest value, CUME_DIST = 1. For the lowest, it's 1/N (where N is the partition size). This is different from PERCENT_RANK(), which uses (rank-1)/(N-1).

NTILE is excellent for creating A/B test groups, splitting data into training and test sets, or identifying top/bottom deciles. CUME_DIST is ideal for percentile-based thresholds (top 10%, above median, etc.).

ntile_cume_dist.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
-- NTILE: Divide reps into 4 quartiles by total sales
SELECT
    rep_name,
    SUM(amount) AS total_sales,
    NTILE(4) OVER (ORDER BY SUM(amount) DESC) AS sales_quartile,
    CUME_DIST() OVER (ORDER BY SUM(amount) DESC) AS cumulative_dist
FROM sales
GROUP BY rep_name
ORDER BY total_sales DESC;

-- Use NTILE for A/B test bucket assignment (deterministic with ORDER BY)
SELECT
    user_id,
    NTILE(3) OVER (ORDER BY user_id) AS test_group  -- 1,2,3
FROM users;

-- Find top 10% of customers by revenue using CUME_DIST
WITH customer_revenue AS (
    SELECT
        customer_id,
        SUM(amount) AS total_revenue,
        CUME_DIST() OVER (ORDER BY SUM(amount) DESC) AS percentile
    FROM orders
    GROUP BY customer_id
)
SELECT customer_id, total_revenue
FROM customer_revenue
WHERE percentile <= 0.1
ORDER BY total_revenue DESC;
Output
rep_name | total_sales | sales_quartile | cumulative_dist
Ben Okafor | 22000.00 | 1 | 1.00
Alice Tanaka | 18500.00 | 1 | 0.67
Clara Webb | 14800.00 | 2 | 0.33
Dana Kim | 12100.00 | 2 | 0.00
NTILE Requires Stable Order for Deterministic Buckets
NTILE's bucket assignment is deterministic only if the ORDER BY is unique. If rows tie, the distribution may vary between executions. For A/B test assignment, use a unique column like user_id to guarantee each user ends up in the same bucket every time.
Production Insight
NTILE is the standard tool for splitting data into equal-sized groups. Use NTILE(5) for quintiles in performance reviews, NTILE(4) for quartile analysis, and NTILE(10) for decile reporting. CUME_DIST is more precise than NTILE for percentile thresholds — it gives the exact cumulative proportion, whereas NTILE only groups into buckets. For 'top 10%' use CUME_DIST <= 0.1, not NTILE(10) = 1, because NTILE forces equal bucket sizes and a hard cutoff at 10% boundaries.
Key Takeaway
NTILE divides rows into as-equal buckets for quantile analysis and A/B test split. CUME_DIST gives the exact cumulative proportion (0-1) for percentile-based thresholds. Use NTILE for grouping, CUME_DIST for precise cutoff thresholds.

Types of Window Functions — A Battlefield Taxonomy

You don't need a PhD in SQL to understand window functions. You need a clear head and a mental map of which tool does what. Competitors love to dump you into 'Aggregate vs Ranking' and call it a day. That's like handing a recruit a toolbox with 'hammer' and 'saw' labels — it's technically true, but it doesn't help you pick the right one when the server's on fire.

There are three real categories, not two. Aggregate window functions (SUM, AVG, COUNT, MAX, MIN) let you run rolling totals or moving averages without collapsing rows. They answer "What's the running sum up to this row?" — a GROUP BY killer. Ranking functions (ROW_NUMBER, RANK, DENSE_RANK, NTILE, CUME_DIST) assign positions. They answer "Where does this row sit relative to its peers?" — essential for leaderboards, deduplication, percentile splits. Value functions (LAG, LEAD, FIRST_VALUE, LAST_VALUE, NTH_VALUE) reach across rows within the same partition. They answer "What's the previous order value?" or "Who was the first customer this month?" — row-level comparisons without a self-join.

Why this matters: Pick the wrong category and your query either collapses into a single row (using GROUP BY when you meant a window) or explodes into a cartesian product (reaching outside your partition). Know your categories before you write a single OVER clause.

TaxonomyDemo.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
// io.thecodeforge — database tutorial

WITH sales AS (
  SELECT '2024-01-01' AS sale_date, 100.0 AS amount
  UNION ALL SELECT '2024-01-02', 150.0
  UNION ALL SELECT '2024-01-03', 200.0
  UNION ALL SELECT '2024-01-04', 120.0
)
SELECT
  sale_date,
  amount,
  SUM(amount) OVER (ORDER BY sale_date) AS rolling_total,      -- aggregate
  ROW_NUMBER() OVER (ORDER BY amount DESC) AS rank_by_amount,  -- ranking
  LAG(amount, 1) OVER (ORDER BY sale_date) AS prev_day_amount -- value
FROM sales;
Output
sale_date | amount | rolling_total | rank_by_amount | prev_day_amount
-----------+--------+---------------+----------------+----------------
2024-01-01 | 100.0 | 100.0 | 4 | NULL
2024-01-02 | 150.0 | 250.0 | 2 | 100.0
2024-01-03 | 200.0 | 450.0 | 1 | 150.0
2024-01-04 | 120.0 | 570.0 | 3 | 200.0
Production Trap:
Don't confuse aggregate window functions with aggregate GROUP BY. A window SUM retains every row; GROUP BY collapses. Accidentally omitting PARTITION BY in an aggregate window will compute a global total in every row — silent, wrong, and your dashboard will look flat. Always run a SELECT * to sanity-check row counts.
Key Takeaway
There are three categories of window functions: aggregate (rolling math), ranking (position assignment), and value (row-to-row navigation). Pick the category first, then the specific function.

PERCENT_RANK and CUME_DIST — Your Stats Toolkit for Real Data

Every dashboard has a percentile request. "Show me customers in the top 10%." "Flag orders above the 95th percentile." Your juniors will reach for NTILE and then fight with bucket boundaries. Stop them. PERCENT_RANK and CUME_DIST are the right tools for this job, and most tutorials skip them entirely.

PERCENT_RANK gives you the relative rank of a row within a partition, normalized to a 0-to-1 scale. It computes as (RANK - 1) / (total_rows - 1). First row gets 0. Last gets 1. Perfect for "top 10% high spenders" when you need a continuous score, not fixed buckets.

CUME_DIST stands for cumulative distribution. It's simpler: it tells you the proportion of rows with values less than or equal to the current row. So the 100th percentile row shows 1.0, meaning all rows are <= its value. That's your "95th percentile threshold" calculator for anomaly detection.

Real talk: PERCENT_RANK is better than NTILE when you need precise cutoffs. NTILE divides into equal buckets — if you have 101 rows, NTILE(10) creates buckets of 10 or 11 rows. That's lumpy. PERCENT_RANK gives you a smooth gradient. Use it in WHERE clauses to filter your outliers.

Here's how they differ on the same data — watch the output closely.

PercentileDemo.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
// io.thecodeforge — database tutorial

WITH scores AS (
  SELECT 50 AS score UNION ALL SELECT 60 UNION ALL SELECT 70
  UNION ALL SELECT 80 UNION ALL SELECT 90
)
SELECT
  score,
  RANK() OVER (ORDER BY score) AS simple_rank,
  PERCENT_RANK() OVER (ORDER BY score) AS pct_rank,
  CUME_DIST() OVER (ORDER BY score) AS cum_distribution
FROM scores
ORDER BY score;
Output
score | simple_rank | pct_rank | cum_distribution
------+-------------+---------------------+-----------------
50 | 1 | 0.0 | 0.2
60 | 2 | 0.25 | 0.4
70 | 3 | 0.5 | 0.6
80 | 4 | 0.75 | 0.8
90 | 5 | 1.0 | 1.0
Senior Shortcut:
Use PERCENT_RANK for 'top X%' filters. Wrap it in a subquery: SELECT FROM (SELECT , PERCENT_RANK() OVER (ORDER BY revenue DESC) AS rnk FROM customers) WHERE rnk <= 0.1. For anomaly detection, CUME_DIST is your friend: find the 95th percentile threshold by selecting the minimum value where CUME_DIST >= 0.95.
Key Takeaway
PERCENT_RANK gives smooth percentile scores; CUME_DIST gives cumulative distribution. Use them for precise thresholding instead of lumpy NTILE buckets.
● Production incidentPOST-MORTEMseverity: high

A Window Function Over an Unpartitioned 50M-Row Table Brought the Reporting Database Down

Symptom
The reporting database CPU hit 100% and all other dashboard queries queued behind the blocking query. The analyst's query was still running after 12 minutes and had to be manually killed.
Assumption
Window functions are a database feature — the engine will handle them efficiently. The query looked clean and worked in development.
Root cause
The query used SUM(revenue) OVER (ORDER BY transaction_date) with no PARTITION BY clause, computing a running total across all 50 million rows ordered by date. Without PARTITION BY, the window function operates over the entire table as a single partition. The ORDER BY inside OVER requires sorting all 50 million rows. No index on transaction_date meant a filesort on the full dataset held an exclusive sort buffer for the duration.
Fix
Added PARTITION BY to split the computation into smaller groups (by region_id), added an index on (region_id, transaction_date) to support the window partition and ordering. Query time dropped from 12 minutes to 1.8 seconds.
Key lesson
  • Window functions without PARTITION BY operate over the entire table as one window — on large tables this is a full table sort
  • Add a composite index on (PARTITION_BY_col, ORDER_BY_col) to support window function execution
  • Test window function queries on production-scale row counts before deploying — performance does not extrapolate from sample data
Production debug guideDiagnosing slow queries involving window functions4 entries
Symptom · 01
Window function query is slow and EXPLAIN shows Sort or WindowAgg with high cost
Fix
Check if PARTITION BY has an index. Create a composite index on (partition_col, order_col): CREATE INDEX idx_sales_region_date ON sales(region_id, transaction_date). The index allows the engine to skip sorting entirely.
Symptom · 02
Cannot filter on a window function result in the WHERE clause
Fix
Window function values cannot be used in WHERE in the same SELECT. Wrap in a CTE or subquery: WITH ranked AS (SELECT , ROW_NUMBER() OVER (...) AS rn FROM table) SELECT FROM ranked WHERE rn = 1.
Symptom · 03
ROW_NUMBER() results are inconsistent between runs on tied values
Fix
ROW_NUMBER() assigns arbitrary order to ties. Add a tiebreaker column to the ORDER BY: ROW_NUMBER() OVER (ORDER BY created_at DESC, id DESC). The id column (PK) guarantees a deterministic result.
Symptom · 04
Running total resets unexpectedly at certain points
Fix
Your PARTITION BY is grouping data in a way you did not intend. Add PARTITION BY explicitly or remove it entirely. Check whether the ORDER BY inside OVER matches your intended calculation direction.
FunctionPurposeHandles TiesBest Used For
ROW_NUMBER()Unique sequential rankArbitrary order for tiesSelect exactly one row per group — latest record, deduplication
RANK()Rank with gaps on tiesTied rows share rank, next rank skipsCompetition rankings where tied positions matter
DENSE_RANK()Rank without gaps on tiesTied rows share rank, no skipsNth highest value queries — 'find the 3rd highest salary'
SUM/AVG/COUNT OVER()Running aggregatesDepends on frame clauseRunning totals, moving averages, period comparisons
LAG(col, n)Access n rows before currentn/aPeriod-over-period comparison, previous value reference
LEAD(col, n)Access n rows after currentn/aFuture value reference, days-to-next-event calculations
PERCENT_RANK()Relative percentile position 0–1Tied rows share rankDistribution analysis — bottom 10% queries with CTE filter
NTILE(n)Divide into n equal bucketsDistributes evenlyQuartile/decile analysis, A/B test bucket assignment

Key takeaways

1
Window functions compute values across related rows without collapsing them
the opposite of GROUP BY which removes individual rows.
2
ROW_NUMBER for deduplication, RANK for competition rankings, DENSE_RANK for Nth-highest queries.
3
LAG/LEAD replace expensive self-joins for row-neighbour comparisons
the canonical tool for period-over-period analysis.
4
The default frame with ORDER BY is RANGE
specify ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW explicitly for running totals to avoid tie-handling bugs.
5
PARTITION BY without an index causes a filesort on every execution
always add a composite index on (partition_col, order_col) for large tables.

Common mistakes to avoid

3 patterns
×

Trying to filter on a window function result in the same SELECT's WHERE clause

Symptom
ERROR: column 'rn' does not exist — or — window functions are not allowed in WHERE
Fix
Window functions are evaluated after WHERE. Wrap the window function in a CTE: WITH ranked AS (SELECT , ROW_NUMBER() OVER (...) AS rn FROM table) SELECT FROM ranked WHERE rn = 1.
×

Using OVER() without PARTITION BY on a large table

Symptom
Query hangs or times out — EXPLAIN shows Sort node on the full table row count, high actual time on the WindowAgg node
Fix
Add PARTITION BY to split the computation into smaller groups. Create a composite index on (partition_col, order_col) to allow index-backed sorting. Test window function queries against production-scale data before deploying.
×

Relying on RANGE default for running totals when data has duplicate ORDER BY values

Symptom
Multiple rows with the same date show the same running total value — transactions from the same day all show the combined daily total instead of individual accumulation
Fix
Specify the frame explicitly: SUM(amount) OVER (ORDER BY sale_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW). ROWS processes each physical row independently regardless of ties.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01JUNIOR
What is the difference between GROUP BY and a window function with PARTI...
Q02SENIOR
What is the difference between ROW_NUMBER(), RANK(), and DENSE_RANK()?
Q03SENIOR
Explain window function frames — what is the default frame and when does...
Q01 of 03JUNIOR

What is the difference between GROUP BY and a window function with PARTITION BY?

ANSWER
GROUP BY collapses multiple rows into a single summary row per group — the individual rows are gone from the result. A window function with PARTITION BY computes the aggregate within each partition but keeps every individual row in the output. The computed value appears as an additional column on each row. Example: GROUP BY rep_name returns one row per rep with a total. SUM(amount) OVER (PARTITION BY rep_name) returns every transaction with the rep's running total alongside each row. Use GROUP BY when you want summaries. Use window functions when you want derived values computed from a group but need the individual rows too.
FAQ · 3 QUESTIONS

Frequently Asked Questions

01
Can I use window functions in a WHERE clause?
02
What is the difference between SUM() OVER() and SUM() GROUP BY?
03
Are window functions supported in MySQL?
N
Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Notes here come from systems that actually shipped.

Follow
Verified
production tested
May 24, 2026
last updated
1,554
articles · all by Naren
🔥

That's SQL Advanced. Mark it forged?

9 min read · try the examples if you haven't

Previous
SQL Transactions and ACID
6 / 16 · SQL Advanced
Next
SQL CTEs — Common Table Expressions