Top SQL Interview Questions: JOINs, Indexes, and Query Optimization Explained
Think of a database like a giant office filing system with dozens of labeled cabinets. SQL is the language you use to ask the office assistant to find, combine, and summarize files across all those cabinets. An interview tests whether you truly understand WHY the assistant needs specific instructions — not just which words to say. A JOIN is like asking the assistant to staple two related files together before handing them to you; an index is like a sticky-tab system that lets them find files in seconds instead of rifling through every drawer; and a window function is like asking the assistant to rank every file by importance while still handing you every single file — not just the top one. But here's what most tutorials won't tell you: the assistant also makes mistakes if your instructions are ambiguous. Tell them to find 'all files where the date is NULL' and they might skip files you expected. Tell them to combine two cabinets but forget to specify how, and they'll staple every file from cabinet A to every file from cabinet B — producing 10,000 stapled pairs when you wanted 50. That's what a Cartesian product is, and it's killed more production databases than any other SQL mistake.
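The stapling accident is easy to reproduce. Here is a minimal sketch using an in-memory SQLite database through Python's built-in sqlite3 module (the cabinet tables and their contents are invented for the analogy):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cabinet_a (id INTEGER);
    CREATE TABLE cabinet_b (id INTEGER);
""")
# 100 files in each cabinet; only ids 50..99 exist in both.
conn.executemany("INSERT INTO cabinet_a VALUES (?)", [(i,) for i in range(100)])
conn.executemany("INSERT INTO cabinet_b VALUES (?)", [(i,) for i in range(50, 150)])

# With a join condition: only genuinely related files are stapled together.
matched = conn.execute(
    "SELECT COUNT(*) FROM cabinet_a a JOIN cabinet_b b ON a.id = b.id"
).fetchone()[0]

# Without one (a CROSS JOIN): every file pairs with every file.
cartesian = conn.execute(
    "SELECT COUNT(*) FROM cabinet_a a CROSS JOIN cabinet_b b"
).fetchone()[0]

print(matched, cartesian)  # 50 10000
```

Fifty related pairs versus ten thousand arbitrary ones, from a single missing ON clause.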
SQL interviews trip up even experienced developers — not because the syntax is hard, but because interviewers probe whether you understand what happens inside the database engine when your query runs. The difference between a candidate who gets hired and one who doesn't usually comes down to three words: 'and why that?'
I've interviewed over 60 candidates for backend and data engineering roles. The pattern is always the same: 80% can write a SELECT with a JOIN. Maybe 30% can explain why their JOIN is slow. Maybe 10% can look at an execution plan and tell me the optimizer chose a Hash Join when it should have used a Nested Loop. That 10% gets the offer.
Most people can write a SELECT statement; very few can explain why their query is scanning 2 million rows when it should be touching 50. I once inherited a reporting service that took 45 seconds to load a dashboard. The fix was one index and a rewritten subquery. The previous developer had been 'optimizing' by adding more RAM to the server for six months. The query was doing a full table scan on a 14-million-row orders table because someone wrapped an indexed column in YEAR() — making it non-SARGable. One line of SQL, six months of wasted infrastructure spend.
This matters because slow queries cost money. A poorly written JOIN on a 10-million-row orders table can turn a 20ms API response into a 12-second timeout. Knowing the difference between a clustered and non-clustered index, or understanding why a LEFT JOIN returns NULLs you didn't expect, is the difference between shipping reliable software and debugging production fires at 2am.
By the end of this article you'll be able to explain JOINs, NULLs, window functions, indexes, subqueries vs CTEs, GROUP BY, and query optimization at a level that earns the interviewer's respect — not just recognition. Every concept comes with a real schema, runnable SQL, actual output, and the exact reasoning an interviewer wants to hear.
Understanding JOINs: Beyond the Venn Diagram
Interviewers often start with JOINs. While Venn diagrams are a popular starting point, they're misleading for senior candidates because SQL JOINs are not set operations — they're row-combination operations. A LEFT JOIN doesn't 'subtract' anything; it preserves every row from the left table and fills in NULLs when there's no match on the right.
Senior candidates should also explain how a JOIN is physically executed. The engine chooses between three physical operators: Nested Loop (best when one input is small and the other is indexed), Hash Join (best for large, unsorted inputs), and Merge Join (best when both inputs are sorted on the join key). Knowing when each is chosen and why shows you understand the engine, not just the syntax.
The most common interview trap: putting a filter in the ON clause vs the WHERE clause of a LEFT JOIN. They produce completely different results. A filter in ON controls which right-side rows participate in the join. A filter in WHERE eliminates rows from the final result — including left-side rows that had no match. This distinction is the single most missed concept in SQL interviews.
I once caught a production bug where a developer filtered on the right table's status column in the WHERE clause of a LEFT JOIN. It silently converted the LEFT JOIN into an INNER JOIN — every customer without an active order disappeared from the report. The finance team made quarterly projections based on that report. The projections were wrong for three months before anyone noticed.
```sql
-- io.thecodeforge: LEFT JOIN — ON vs WHERE filter behavior
-- Schema: customers(customer_id, customer_name)
--         orders(order_id, customer_id, order_date, total_amount, status)

-- QUERY 1: Filter in ON clause
-- Preserves ALL customers. Only joins with orders after 2025-01-01.
-- Customers with no recent orders still appear (with NULL order data).
SELECT c.customer_name,
       o.order_date,
       COALESCE(o.total_amount, 0) AS amount
FROM io_thecodeforge.customers c
LEFT JOIN io_thecodeforge.orders o
       ON c.customer_id = o.customer_id
      AND o.order_date > '2025-01-01';

-- QUERY 2: Filter in WHERE clause
-- Eliminates any row where order_date is NULL (no match or filtered out).
-- This effectively converts the LEFT JOIN into an INNER JOIN.
SELECT c.customer_name,
       o.order_date,
       COALESCE(o.total_amount, 0) AS amount
FROM io_thecodeforge.customers c
LEFT JOIN io_thecodeforge.orders o
       ON c.customer_id = o.customer_id
WHERE o.order_date > '2025-01-01';

-- QUERY 3: Find customers with NO orders after 2025-01-01
-- Classic pattern: LEFT JOIN + WHERE IS NULL
SELECT c.customer_name,
       COALESCE(o.total_amount, 0) AS amount
FROM io_thecodeforge.customers c
LEFT JOIN io_thecodeforge.orders o
       ON c.customer_id = o.customer_id
      AND o.order_date > '2025-01-01'
WHERE o.order_id IS NULL;
```
-- QUERY 1 output (filter in ON — all customers preserved):
| customer_name | order_date | amount |
|---------------|------------|--------|
| Alice Forge | 2025-06-15 | 250.00 |
| Alice Forge | 2025-09-22 | 175.50 |
| Bob Smith | NULL | 0 |
| Carol Wu | 2025-03-10 | 89.99 |
| Dave Patel | NULL | 0 |
-- QUERY 2 output (filter in WHERE — silently becomes INNER JOIN):
| customer_name | order_date | amount |
|---------------|------------|--------|
| Alice Forge | 2025-06-15 | 250.00 |
| Alice Forge | 2025-09-22 | 175.50 |
| Carol Wu | 2025-03-10 | 89.99 |
-- Bob Smith and Dave Patel are MISSING (filtered out by WHERE)
-- QUERY 3 output (customers with no recent orders):
| customer_name | amount |
|---------------|--------|
| Bob Smith | 0 |
| Dave Patel | 0 |
NULL Handling: The Landmine Nobody Practices
NULL is not zero. NULL is not an empty string. NULL is not 'false.' NULL is the absence of a value — it's the database saying 'I don't know.' And SQL's treatment of NULL is the source of more interview failures and production bugs than any other single concept.
The three rules that govern NULL:
1. NULL != NULL. NULL is not equal to anything, including itself. WHERE column = NULL returns zero rows. Always use IS NULL or IS NOT NULL.
2. Any arithmetic operation with NULL produces NULL. 5 + NULL = NULL. If your revenue column has NULLs, SUM(revenue) ignores them — but revenue * 1.1 produces NULL for those rows.
3. Comparison operators (<, >, =) with NULL produce UNKNOWN, not TRUE or FALSE. This means rows with NULL values are excluded from both WHERE column > 100 and WHERE column <= 100. They fall through both filters.
COALESCE is your safety net. It returns the first non-NULL value from its arguments. Use it to provide defaults: COALESCE(discount, 0) means 'use the discount if it exists, otherwise assume zero.' NVL (Oracle), IFNULL (MySQL), and ISNULL (SQL Server) are database-specific alternatives.
Production story: a revenue report was undercounting by 15%. The query was SELECT SUM(price * quantity) FROM order_items. Some rows had NULL quantity (items pending confirmation). Multiplying by NULL produced NULL, and SUM skips NULLs — so those items silently disappeared from the revenue total. The fix was SUM(COALESCE(price, 0) * COALESCE(quantity, 0)). One line. Three months of wrong numbers.
```sql
-- io.thecodeforge: NULL Handling Patterns
-- Schema: products(product_id, product_name, discount_percent, category)

-- WRONG: This returns ZERO rows even if NULLs exist
-- Because NULL = NULL evaluates to UNKNOWN, not TRUE
SELECT * FROM io_thecodeforge.products WHERE discount_percent = NULL;

-- RIGHT: Use IS NULL
SELECT * FROM io_thecodeforge.products WHERE discount_percent IS NULL;

-- WRONG: This excludes NULLs from both sides
-- Products with NULL discount appear in NEITHER result
SELECT * FROM io_thecodeforge.products WHERE discount_percent > 10;
SELECT * FROM io_thecodeforge.products WHERE discount_percent <= 10;
-- Together they don't cover all rows — NULL rows are missing from both.

-- RIGHT: Explicitly handle NULLs with COALESCE
SELECT product_name,
       discount_percent,
       COALESCE(discount_percent, 0) AS safe_discount,
       CASE
           WHEN discount_percent IS NULL THEN 'No discount set'
           WHEN discount_percent > 20 THEN 'Heavy discount'
           ELSE 'Standard'
       END AS discount_category
FROM io_thecodeforge.products;

-- PRODUCTION PATTERN: Safe aggregation with NULLs
-- Without COALESCE, NULL quantity items vanish from revenue totals
SELECT SUM(COALESCE(unit_price, 0) * COALESCE(quantity, 0)) AS total_revenue,
       COUNT(*) AS total_line_items,
       COUNT(quantity) AS items_with_quantity,  -- COUNT(column) skips NULLs
       COUNT(*) AS all_items                    -- COUNT(*) counts everything
FROM io_thecodeforge.order_items;
```
-- COALESCE and CASE output:
| product_name | discount_percent | safe_discount | discount_category |
|---------------|-----------------|---------------|--------------------|
| Widget A | 15 | 15 | Standard |
| Widget B | NULL | 0 | No discount set |
| Widget C | 25 | 25 | Heavy discount |
| Widget D | NULL | 0 | No discount set |
-- Revenue aggregation output:
| total_revenue | total_line_items | items_with_quantity | all_items |
|---------------|-----------------|--------------------|-----------|
| 45890.50 | 1200 | 1185 | 1200 |
-- Notice: 15 items had NULL quantity. COUNT(*) includes them, COUNT(column) doesn't.
Window Functions: The Interview Differentiator
Window functions are the topic that separates mid-level candidates from senior ones. GROUP BY collapses rows. Window functions keep every row but add computed columns based on a 'window' of related rows. If GROUP BY gives you a summary, window functions give you a summary attached to every detail row.
- ROW_NUMBER(): Assigns a unique sequential number. Ties get different numbers.
- RANK(): Assigns the same number to ties, then skips. 1, 1, 3, 4.
- DENSE_RANK(): Assigns the same number to ties, no skip. 1, 1, 2, 3.
- LEAD(column, n): Gets the value from n rows ahead.
- LAG(column, n): Gets the value from n rows behind.
- SUM() OVER(): Running total without collapsing rows.
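The tie behavior of the ranking trio is easiest to internalize by running it. A quick sketch against an in-memory SQLite database (window functions require SQLite 3.25+, bundled with modern Python; the scores table is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)",
                 [("a", 90), ("b", 90), ("c", 80), ("d", 70)])

# Two rows tie at 90 — watch how each function numbers them.
rows = conn.execute("""
    SELECT name, score,
           ROW_NUMBER() OVER (ORDER BY score DESC) AS row_num,   -- ties get different numbers
           RANK()       OVER (ORDER BY score DESC) AS rnk,       -- ties share, then skip: 1,1,3,4
           DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rnk  -- ties share, no skip:   1,1,2,3
    FROM scores
    ORDER BY score DESC, name
""").fetchall()

for name, score, row_num, rnk, dense_rnk in rows:
    print(name, score, row_num, rnk, dense_rnk)
```

Note that which of the two tied rows gets ROW_NUMBER 1 is arbitrary unless you add a tie-breaking column to the window's ORDER BY — a detail interviewers like to probe.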
The interview pattern I see most: 'Find the top 3 orders per customer' or 'Find the second-highest salary per department.' Both require ROW_NUMBER() or DENSE_RANK() with PARTITION BY. Candidates who reach for a subquery with LIMIT or a correlated subquery are showing they don't know window functions — and interviewers notice.
Production use case: I used LAG() to detect anomalies in a time-series metrics table. The query compared each hour's error rate to the previous hour's. If the error rate jumped by more than 200%, it flagged the row. That query ran against 50 million rows in 3 seconds because the window function avoided a self-join that would have taken 40+ seconds.
```sql
-- io.thecodeforge: Window Functions — Interview Patterns
-- Schema: orders(order_id, customer_id, order_date, total_amount)
--         employees(employee_id, name, department, salary)

-- PATTERN 1: Top N per group (top 2 orders per customer by amount)
-- This is the #1 window function interview question.
SELECT *
FROM (
    SELECT customer_id,
           order_id,
           total_amount,
           ROW_NUMBER() OVER (
               PARTITION BY customer_id
               ORDER BY total_amount DESC
           ) AS rn
    FROM io_thecodeforge.orders
) ranked
WHERE rn <= 2;

-- PATTERN 2: Running total (cumulative revenue by date)
SELECT order_date,
       total_amount,
       SUM(total_amount) OVER (
           ORDER BY order_date
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total
FROM io_thecodeforge.orders
ORDER BY order_date;

-- PATTERN 3: DENSE_RANK for 'second-highest salary per department'
-- DENSE_RANK handles ties correctly. ROW_NUMBER would arbitrarily pick one.
SELECT *
FROM (
    SELECT department,
           name,
           salary,
           DENSE_RANK() OVER (
               PARTITION BY department
               ORDER BY salary DESC
           ) AS dept_rank
    FROM io_thecodeforge.employees
) ranked
WHERE dept_rank = 2;

-- PATTERN 4: LAG for detecting day-over-day anomalies
SELECT metric_date,
       error_count,
       LAG(error_count, 1) OVER (ORDER BY metric_date) AS prev_day_errors,
       CASE
           WHEN error_count >
                LAG(error_count, 1) OVER (ORDER BY metric_date) * 2
           THEN 'SPIKE ALERT'
           ELSE 'Normal'
       END AS status
FROM io_thecodeforge.daily_metrics
ORDER BY metric_date;
```
-- PATTERN 1 output (top 2 orders per customer):
| customer_id | order_id | total_amount | rn |
|-------------|----------|-------------|----|
| 101 | 5001 | 450.00 | 1 |
| 101 | 5003 | 320.00 | 2 |
| 102 | 5005 | 890.00 | 1 |
| 102 | 5002 | 210.00 | 2 |
-- PATTERN 2 output (running total):
| order_date | total_amount | running_total |
|------------|-------------|---------------|
| 2025-01-01 | 500.00 | 500.00 |
| 2025-01-02 | 350.00 | 850.00 |
| 2025-01-03 | 720.00 | 1570.00 |
-- PATTERN 4 output (anomaly detection):
| metric_date | error_count | prev_day_errors | status |
|------------|------------|-----------------|-------------|
| 2025-03-01 | 45 | NULL | Normal |
| 2025-03-02 | 52 | 45 | Normal |
| 2025-03-03 | 180 | 52 | SPIKE ALERT |
GROUP BY and HAVING: Aggregation Done Right
GROUP BY collapses multiple rows into summary rows. HAVING filters those summary rows after aggregation. WHERE filters individual rows before aggregation. Mixing them up is a classic interview mistake — and a common production bug.
The logical processing order matters: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY. This means:
- WHERE cannot use aggregate functions (SUM, COUNT, AVG) because it runs before GROUP BY.
- HAVING can use aggregate functions because it runs after GROUP BY.
- You can filter on non-aggregated columns in WHERE (faster) or HAVING (slower, but sometimes necessary).
The trap interviewers love: 'Show me departments where the average salary exceeds 50,000.' That's HAVING AVG(salary) > 50000. Candidates who write WHERE AVG(salary) > 50000 get an immediate syntax error — and the interviewer watches to see if you know why.
Another trap: selecting columns not in the GROUP BY clause. In strict SQL mode (MySQL) or PostgreSQL, SELECT department, name, AVG(salary) FROM employees GROUP BY department fails because 'name' is not in the GROUP BY and is not aggregated. In older MySQL modes, it silently returns an arbitrary name from each group — which is almost never what you want. Always ensure every non-aggregated column in SELECT is in the GROUP BY.
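This trap is simple to demonstrate, because SQLite behaves like MySQL's legacy non-strict mode: it accepts a bare, non-aggregated column and silently returns a value from an arbitrary row of each group. A sketch with an invented employees table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", [
    ("Ann", "Eng", 120000), ("Ben", "Eng", 90000),
    ("Cat", "HR", 60000),
])

# 'name' is neither grouped nor aggregated. PostgreSQL and strict-mode MySQL
# reject this query; SQLite runs it and picks SOME name per group.
rows = conn.execute("""
    SELECT department, name, AVG(salary)
    FROM employees
    GROUP BY department
    ORDER BY department
""").fetchall()

print(rows)  # one row per department — but which Eng name appears is arbitrary
```

The query "works" and the averages are right, but the name column is meaningless — exactly the kind of silent wrongness that survives code review and corrupts reports.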
```sql
-- io.thecodeforge: GROUP BY and HAVING Patterns
-- Schema: employees(employee_id, name, department, salary, hire_date)
--         orders(order_id, customer_id, order_date, total_amount, status)

-- PATTERN 1: Basic aggregation with HAVING
-- 'Departments with average salary above 50,000'
SELECT department,
       COUNT(*) AS headcount,
       AVG(salary) AS avg_salary,
       MAX(salary) AS highest_salary
FROM io_thecodeforge.employees
GROUP BY department
HAVING AVG(salary) > 50000
ORDER BY avg_salary DESC;

-- PATTERN 2: WHERE vs HAVING — use WHERE for non-aggregate filters
-- 'Departments with avg salary > 50k, but only for employees hired after 2020'
-- WRONG: HAVING hire_date > '2020-01-01' — errors in most engines
--        (hire_date is neither grouped nor aggregated)
-- RIGHT: Filter rows first with WHERE, then aggregate
SELECT department,
       AVG(salary) AS avg_salary
FROM io_thecodeforge.employees
WHERE hire_date > '2020-01-01'   -- Filters BEFORE grouping (faster)
GROUP BY department
HAVING AVG(salary) > 50000;      -- Filters AFTER grouping

-- PATTERN 3: Multi-column GROUP BY with COUNT DISTINCT
-- 'Revenue per customer per year, only for completed orders'
SELECT customer_id,
       EXTRACT(YEAR FROM order_date) AS order_year,
       COUNT(*) AS order_count,
       SUM(total_amount) AS total_revenue,
       COUNT(DISTINCT EXTRACT(MONTH FROM order_date)) AS active_months
FROM io_thecodeforge.orders
WHERE status = 'completed'
GROUP BY customer_id, EXTRACT(YEAR FROM order_date)
HAVING SUM(total_amount) > 10000
ORDER BY total_revenue DESC;
```
-- PATTERN 1 output (departments with avg salary above 50,000):
| department | headcount | avg_salary | highest_salary |
|-------------|----------|------------|----------------|
| Engineering | 45 | 92000.00 | 185000 |
| Data | 12 | 87000.00 | 145000 |
| Product | 8 | 78000.00 | 130000 |
-- Finance, HR, Marketing excluded (avg salary <= 50,000)
-- PATTERN 3 output:
| customer_id | order_year | order_count | total_revenue | active_months |
|-------------|-----------|-------------|---------------|---------------|
| 101 | 2025 | 47 | 23450.00 | 11 |
| 102 | 2025 | 32 | 18900.00 | 9 |
| 103 | 2025 | 28 | 12400.00 | 8 |
Subqueries vs CTEs vs Temp Tables: When to Use What
Interviewers ask: 'What's the difference between a subquery, a CTE, and a temp table?' The answer is about readability, reusability, and performance.
A subquery is a query nested inside another query. It works, but deeply nested subqueries become unreadable fast. If you have a subquery inside a subquery inside a WHERE clause, nobody — including future you — can debug it at 2am during an incident.
A CTE (Common Table Expression), introduced with the WITH clause, gives a subquery a name. It reads top-to-bottom, can be referenced multiple times, and makes complex queries self-documenting. PostgreSQL (12 and later) and SQL Server treat CTEs as inline views rather than materialized temp tables; PostgreSQL only forces materialization if you explicitly write AS MATERIALIZED. So there's usually no performance difference from a subquery.
A temp table physically stores intermediate results. Use it when you need to reference the same intermediate result many times, when the intermediate result is expensive to compute, or when you need to index the intermediate result. The downside: it's more I/O and the optimizer can't push predicates through it.
My rule of thumb: CTEs for readability. Subqueries for simple one-off filters. Temp tables for heavy multi-step transformations where you reference the intermediate result 5+ times or need to index it.
```sql
-- io.thecodeforge: Subquery vs CTE — Same Query, Different Styles
-- Task: Find customers whose total spend exceeds the overall average

-- APPROACH 1: Correlated Subquery (harder to read)
SELECT c.customer_name,
       (SELECT SUM(o.total_amount)
        FROM io_thecodeforge.orders o
        WHERE o.customer_id = c.customer_id) AS total_spend
FROM io_thecodeforge.customers c
WHERE (SELECT SUM(o.total_amount)
       FROM io_thecodeforge.orders o
       WHERE o.customer_id = c.customer_id) >
      (SELECT AVG(customer_total)
       FROM (SELECT SUM(total_amount) AS customer_total
             FROM io_thecodeforge.orders
             GROUP BY customer_id) sub);

-- APPROACH 2: CTE (clean, readable, self-documenting)
WITH customer_spending AS (
    SELECT customer_id, SUM(total_amount) AS total_spend
    FROM io_thecodeforge.orders
    GROUP BY customer_id
),
average_spending AS (
    SELECT AVG(total_spend) AS avg_spend
    FROM customer_spending
)
SELECT c.customer_name, cs.total_spend
FROM io_thecodeforge.customers c
JOIN customer_spending cs ON c.customer_id = cs.customer_id
CROSS JOIN average_spending av
WHERE cs.total_spend > av.avg_spend
ORDER BY cs.total_spend DESC;

-- APPROACH 3: Temp table (best when intermediate result is reused many times)
CREATE TEMP TABLE customer_spending_tmp AS
SELECT customer_id,
       SUM(total_amount) AS total_spend,
       COUNT(*) AS order_count
FROM io_thecodeforge.orders
GROUP BY customer_id;

-- Now reference it multiple times without recomputing
SELECT c.customer_name, cs.total_spend, cs.order_count
FROM io_thecodeforge.customers c
JOIN customer_spending_tmp cs ON c.customer_id = cs.customer_id
WHERE cs.total_spend > (SELECT AVG(total_spend) FROM customer_spending_tmp)
ORDER BY cs.total_spend DESC;

DROP TABLE IF EXISTS customer_spending_tmp;
```
-- Output (customers above the average spend):
| customer_name | total_spend |
|---------------|-------------|
| Alice Forge | 23450.00 |
| Bob Smith | 18900.00 |
-- Carol Wu and Dave Patel excluded (below average spend)
-- Performance note: CTE and subquery have similar execution plans.
-- Temp table is faster when the intermediate result is referenced 5+ times
-- because it avoids recomputation — but adds I/O overhead for small datasets.
Indexing and SARGability: Why Your Query is Slow
A common interview question is: 'Why isn't my index being used?' This usually leads to the concept of SARGable (Search ARGument-able) queries. If you wrap an indexed column in a function (like WHERE YEAR(created_at) = 2026), the database engine cannot use the index and falls back to a full table scan. Instead, provide a range the engine can 'seek' into.
I've seen this exact pattern kill production performance. A reporting query on a 14-million-row orders table was taking 45 seconds. The developer had WHERE YEAR(order_date) = 2025 AND MONTH(order_date) = 3. The index on order_date was completely ignored because YEAR() and MONTH() wrapped the column. The fix: WHERE order_date >= '2025-03-01' AND order_date < '2025-04-01'. Same result, 12 milliseconds instead of 45 seconds. The index seek touched 40,000 rows instead of scanning all 14 million.
Beyond SARGability, understand covering indexes. If an index contains every column referenced in the query (both in WHERE and SELECT), the engine never touches the actual table — it reads everything from the index. This is called an 'index-only scan' and it's the fastest possible query path.
Clustered vs non-clustered: a clustered index IS the table — data is stored in the order of the clustered index (usually the primary key). A non-clustered index is a separate structure with pointers back to the data. A table can have at most one clustered index (a table without one is stored as a heap) but many non-clustered indexes.
```sql
-- io.thecodeforge: SARGable vs Non-SARGable Queries
-- Schema: logs(log_id, created_at, level, message, service_name)
-- Index: idx_logs_created_at ON logs(created_at)

-- BAD: Index on created_at is IGNORED due to function wrapping
-- The engine must call DATE() on every row — full table scan
SELECT * FROM io_thecodeforge.logs
WHERE DATE(created_at) = '2026-03-06';

-- GOOD: Index seek enabled — engine jumps directly to matching range
SELECT * FROM io_thecodeforge.logs
WHERE created_at >= '2026-03-06 00:00:00'
  AND created_at <  '2026-03-07 00:00:00';

-- BAD: Function wrapped around the column — index cannot be used
SELECT * FROM io_thecodeforge.logs
WHERE LOWER(service_name) = 'payment-service';

-- GOOD: Store data in consistent case, or use a functional index
-- PostgreSQL: CREATE INDEX idx_service_lower ON logs(LOWER(service_name));
-- Then the LOWER() query becomes SARGable on that functional index.

-- COVERING INDEX example. If your query is:
SELECT customer_id, order_date, total_amount
FROM io_thecodeforge.orders
WHERE customer_id = 101
  AND order_date > '2025-01-01';

-- Create a covering index that includes ALL referenced columns:
CREATE INDEX idx_orders_covering
    ON io_thecodeforge.orders(customer_id, order_date)
    INCLUDE (total_amount);

-- Now the engine reads ONLY the index — never touches the table.
-- This is an 'index-only scan' — the fastest possible path.
```
-- Non-SARGable query:
Execution Plan: Seq Scan on logs | Rows: 14,000,000 | Time: 45,200ms
-- SARGable query:
Execution Plan: Index Scan using idx_logs_created_at | Rows: 40,000 | Time: 12ms
-- The difference: 3,750× faster by removing a function call.
Query Optimization: Reading Execution Plans and Avoiding Common Traps
The ultimate interview question: 'How would you optimize a slow query?' Here's the systematic approach I use in production, and the one I want to hear from candidates.
Step 1: Run EXPLAIN. Look for Seq Scan on large tables, Nested Loop on large joins, and high row estimates. If the estimated rows are wildly different from actual rows, your table statistics are stale — run ANALYZE.
Step 2: Check for non-SARGable predicates. Functions on indexed columns, implicit type conversions, LIKE with leading wildcards.
Step 3: Check for missing indexes. If the WHERE clause filters on columns without indexes, add them. If the JOIN uses columns without indexes, add them.
Step 4: Check for SELECT *. You're pulling every column from disk when you need three. Use a covering index or select only required columns.
Step 5: Check for N+1 queries. If your application runs one query to get IDs, then one query per ID, that's N+1. Rewrite as a single JOIN or IN clause.
Step 6: Check for DISTINCT abuse. DISTINCT is expensive — it requires sorting or hashing all results to remove duplicates. If you're using DISTINCT to hide duplicate rows caused by a bad JOIN, fix the JOIN instead.
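Step 5, the N+1 pattern, is worth seeing concretely. A minimal sketch using SQLite through Python's sqlite3 module (the schema is invented to mirror the article's orders example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER, amount REAL);
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(i, f"c{i}") for i in range(1, 101)])
conn.executemany("INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
                 [(i, 10.0) for i in range(1, 101)])

# N+1 anti-pattern: 1 query for the IDs, then 1 query PER id = 101 round trips.
ids = [row[0] for row in conn.execute("SELECT customer_id FROM customers")]
n_plus_one_queries = 1 + len(ids)
totals = {cid: conn.execute(
              "SELECT SUM(amount) FROM orders WHERE customer_id = ?",
              (cid,)).fetchone()[0]
          for cid in ids}

# Fix: one set-based query — a single round trip, same answer.
single = dict(conn.execute(
    "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"))

print(n_plus_one_queries)  # 101
print(totals == single)    # True — same result from 1 query instead of 101
```

On a local in-memory database the difference is invisible; over a network, each of those 101 round trips adds latency, which is why N+1 bugs only surface in production.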
The implicit conversion trap is worth calling out explicitly: comparing a VARCHAR column to an integer literal (WHERE varchar_col = 123) forces the engine to convert every row's value to an integer before comparing. This prevents index usage and causes a full table scan. Always match types: WHERE varchar_col = '123'.
```sql
-- io.thecodeforge: Query Optimization Patterns

-- TRAP 1: Implicit type conversion — index ignored
-- phone_number is VARCHAR, but we compare to an integer
-- The engine converts EVERY row's phone_number to int before comparing
SELECT * FROM io_thecodeforge.customers
WHERE phone_number = 9876543210;     -- BAD: implicit conversion

SELECT * FROM io_thecodeforge.customers
WHERE phone_number = '9876543210';   -- GOOD: type match, index used

-- TRAP 2: DISTINCT hiding a bad JOIN
-- If you're getting duplicates and adding DISTINCT to 'fix' it,
-- your JOIN condition is wrong.
SELECT DISTINCT c.customer_name, o.total_amount
FROM io_thecodeforge.customers c
JOIN io_thecodeforge.orders o ON c.customer_id = o.customer_id;
-- If this returns duplicates, check: are there multiple orders per customer?
-- If yes, DISTINCT is removing legitimate rows. If no, the JOIN is wrong.

-- TRAP 3: IN with a large subquery — the engine materializes the whole result
SELECT * FROM io_thecodeforge.customers c
WHERE c.customer_id IN (
    SELECT customer_id FROM io_thecodeforge.orders
    WHERE total_amount > 1000
);

-- Better: use EXISTS (short-circuits on first match)
SELECT * FROM io_thecodeforge.customers c
WHERE EXISTS (
    SELECT 1 FROM io_thecodeforge.orders o
    WHERE o.customer_id = c.customer_id
      AND o.total_amount > 1000
);

-- Best: use JOIN with DISTINCT if you only need customer data
SELECT DISTINCT c.*
FROM io_thecodeforge.customers c
JOIN io_thecodeforge.orders o ON c.customer_id = o.customer_id
WHERE o.total_amount > 1000;

-- TRAP 4: LIKE with leading wildcard — index cannot be used
SELECT * FROM io_thecodeforge.logs
WHERE message LIKE '%timeout%';  -- Full table scan, every time

-- If you need full-text search, use a proper full-text index:
-- PostgreSQL: CREATE INDEX idx_message_fts
--     ON logs USING gin(to_tsvector('english', message));
-- Then: WHERE to_tsvector('english', message) @@ to_tsquery('timeout');
```
-- Integer comparison on VARCHAR:
Execution Plan: Seq Scan on customers | Rows: 500,000 | Time: 850ms
-- String comparison on VARCHAR:
Execution Plan: Index Scan using idx_phone | Rows: 1 | Time: 0.5ms
-- 1,700× slower due to implicit conversion.
| Feature | Clustered Index | Non-Clustered Index |
|---|---|---|
| Data Storage | The table IS the index (stores actual data in index order) | Separate structure with pointers back to data pages |
| Count per Table | Only 1 (usually Primary Key) | Multiple (limited by engine, typically dozens) |
| Performance | Faster for range scans and ORDER BY on the key | Faster for targeted single-row lookups |
| Leaf Nodes | Actual data pages (the table rows themselves) | Pointers to the clustered index key or row ID |
| When to Use | Primary key, columns used in range queries and sorting | Columns frequently in WHERE, JOIN, or ORDER BY that aren't the primary key |
🎯 Key Takeaways
- JOINs are not Venn diagrams — they're row-combination operations. In a LEFT JOIN, the ON clause controls which right-side rows participate; the WHERE clause eliminates rows from the final result. Mixing them up silently converts LEFT JOINs into INNER JOINs.
- NULL is the absence of a value, not a value itself. NULL != NULL, any arithmetic with NULL produces NULL, and comparison operators with NULL produce UNKNOWN. Use IS NULL, COALESCE(), and COUNT(*) vs COUNT(column) correctly.
- Window functions (ROW_NUMBER, RANK, DENSE_RANK, LEAD, LAG) add computed columns without collapsing rows. They're the correct answer for 'top N per group' and 'running total' questions — not correlated subqueries.
- SARGable queries let the engine use index seeks instead of full table scans. Never wrap indexed columns in functions (YEAR, MONTH, DATE, LOWER). Provide ranges instead: WHERE date >= X AND date < Y.
- Covering indexes contain every column a query needs, enabling index-only scans — the fastest possible query path. Design indexes around your most frequent and expensive queries.
- EXISTS short-circuits on the first match; IN materializes the entire subquery result. For large subquery result sets, EXISTS is usually faster. For small static lists, IN is more readable.
- GROUP BY collapses rows; HAVING filters after aggregation; WHERE filters before. Every non-aggregated column in SELECT must be in GROUP BY or wrapped in an aggregate function.
- CTEs improve readability and can be referenced multiple times. Subqueries are fine for simple one-off filters. Temp tables win when intermediate results are expensive and referenced many times.
- Always run EXPLAIN before optimizing. Look for Seq Scans, high row estimates, and implicit type conversions. Without execution plans, you're guessing.
- SELECT * prevents covering indexes, increases I/O, and breaks backward compatibility. Always specify only the columns you need.
Interview Questions on This Topic
- Q: Explain the difference between INNER JOIN, LEFT JOIN, RIGHT JOIN, and FULL OUTER JOIN. When would you use each one? Give a real-world example where choosing the wrong JOIN type would produce incorrect business metrics.
- Q: What is the difference between WHERE and HAVING? Can you use aggregate functions in WHERE? Why or why not? What is the logical processing order of a SQL query?
- Q: Explain the difference between ROW_NUMBER(), RANK(), and DENSE_RANK(). Given a table of employee salaries, how would you find the second-highest salary per department using each function? Which would you choose and why?
- Q: What is a SARGable query? Give an example of a non-SARGable query and explain how to rewrite it. What happens to the execution plan when you wrap an indexed column in a function?
- Q: What is the difference between EXISTS and IN? When would you prefer one over the other? What happens to performance when the subquery result set is very large?
- Q: Explain NULL behavior in SQL. Why does WHERE column = NULL return zero rows? What is the difference between COUNT(*) and COUNT(column)? How does NULL affect arithmetic operations and aggregate functions?
- Q: A query runs in 200ms on a table with 10,000 rows but takes 45 seconds on a table with 10 million rows. Walk me through how you would diagnose and optimize it. What would you check first?
- Q: What is a covering index? How does it differ from a regular index? What is an 'index-only scan' and why is it the fastest query path?
- Q: Explain the difference between a subquery, a CTE, and a temporary table. When would you choose each approach? Does a CTE always materialize a temp table, or is it an inline view?
- Q: What is the difference between DELETE, TRUNCATE, and DROP? When would you use each? What happens to the transaction log in each case?
Frequently Asked Questions
What is the difference between WHERE and HAVING in SQL?
WHERE filters individual rows before any grouping or aggregation occurs — it runs early in the logical processing order. HAVING filters groups after GROUP BY has collapsed rows — it can use aggregate functions like SUM, COUNT, and AVG because it runs after aggregation. You cannot use WHERE AVG(salary) > 50000 because WHERE runs before GROUP BY, so no aggregate has been computed yet. Use HAVING AVG(salary) > 50000. For non-aggregate filters, always prefer WHERE — it's more efficient because it reduces the number of rows before the expensive GROUP BY operation.
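A minimal runnable illustration of that ordering, using Python's built-in sqlite3 module (the employees table and its data are invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ann", "eng", 90000), ("Bob", "eng", 70000),
     ("Cat", "sales", 40000), ("Dan", "sales", 45000)])

# WHERE trims individual rows BEFORE grouping;
# HAVING filters whole groups AFTER the aggregate is computed.
rows = conn.execute("""
    SELECT dept, AVG(salary) AS avg_salary
    FROM employees
    WHERE salary > 30000          -- row-level filter, runs first
    GROUP BY dept
    HAVING AVG(salary) > 50000    -- group-level filter, runs after AVG
""").fetchall()
print(rows)  # [('eng', 80000.0)] — sales (avg 42500) is filtered out by HAVING
```

Swapping the HAVING condition into WHERE raises an error, because no aggregate exists yet at the point WHERE runs.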
What are Window Functions and why are they useful?
Window functions (ROW_NUMBER, RANK, DENSE_RANK, LEAD, LAG, SUM OVER) perform calculations across a set of related rows without collapsing them into a single output row. Unlike GROUP BY, which produces one summary row per group, window functions retain every individual row and add computed columns. This is essential for patterns like 'top N per group,' 'running totals,' 'month-over-month comparison,' and 'rank within a partition.' The syntax uses OVER (PARTITION BY column ORDER BY column) to define the window of rows each calculation considers.
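Here is the classic 'top N per group' pattern as a runnable sketch via Python's sqlite3 module (window functions need SQLite 3.25+, which Python 3.8+ normally bundles; the salaries table is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salaries (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO salaries VALUES (?, ?, ?)",
                 [("Ann", "eng", 90), ("Bob", "eng", 80), ("Cy", "eng", 70),
                  ("Dee", "sales", 60), ("Eve", "sales", 50)])

# Rank rows inside each department without collapsing them,
# then filter on the computed rank in an outer query.
top2 = conn.execute("""
    SELECT name, dept, salary FROM (
        SELECT name, dept, salary,
               ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC) AS rn
        FROM salaries
    ) WHERE rn <= 2
    ORDER BY dept, salary DESC
""").fetchall()
print(top2)  # two highest earners per department, every row intact
```

Note that the rank column must be filtered in an outer query: window functions are evaluated after WHERE, so `WHERE rn <= 2` cannot appear in the same query level that defines `rn`.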
What is Database Normalization and why do we do it?
Normalization organizes data to reduce redundancy and improve integrity. 1NF eliminates repeating groups (each cell holds a single value). 2NF removes partial dependencies (every non-key column depends on the entire primary key). 3NF removes transitive dependencies (non-key columns don't depend on other non-key columns). In practice, most production databases target 3NF. Denormalization (intentionally violating normalization) is used for read-heavy workloads like analytics and reporting, where JOIN performance matters more than storage efficiency.
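A small schema sketch of the 3NF idea, using Python's sqlite3 module (the customers/orders schema and data are invented): customer facts live in exactly one row, so updating them never touches the order rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Normalized to 3NF: city depends on the customer, not on any order,
# so it lives in customers — not repeated on every order row.
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL
    );
""")
conn.execute("INSERT INTO customers VALUES (1, 'Ann', 'Oslo')")
conn.executemany("INSERT INTO orders VALUES (?, 1, ?)", [(1, 10.0), (2, 20.0)])

# A city change is now a single UPDATE, not one per order row —
# no chance of two orders disagreeing about where Ann lives.
conn.execute("UPDATE customers SET city = 'Bergen' WHERE customer_id = 1")
city = conn.execute(
    "SELECT city FROM customers WHERE customer_id = 1").fetchone()[0]
print(city)
```

In a denormalized version with `city` on every order row, that update would have to touch every order and a missed row would leave contradictory data.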
What is the difference between EXISTS and IN?
Both check if a value exists in a subquery result set, but they behave differently at scale. EXISTS runs the subquery for each outer row and short-circuits on the first match — it stops scanning as soon as it finds one matching row. IN materializes the entire subquery result into a set, then checks membership. For large subquery results (100,000+ rows), EXISTS is usually faster because it may find a match on the first row. For small static lists (WHERE status IN ('active', 'pending')), IN is more readable. In an interview, say: 'I default to EXISTS for correlated checks and IN for small literal lists.'
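Both forms can be tried side by side with Python's sqlite3 module (a toy customers/orders schema, invented for the demo; at this size they are equivalent, the difference only matters at scale):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cat")])
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 1), (2, 1), (3, 3)])

# Correlated EXISTS: per outer customer, stop at the first matching order.
with_exists = conn.execute("""
    SELECT name FROM customers c
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
    ORDER BY name
""").fetchall()

# IN: build the set of customer_ids from the subquery, then test membership.
with_in = conn.execute("""
    SELECT name FROM customers
    WHERE id IN (SELECT customer_id FROM orders)
    ORDER BY name
""").fetchall()

print(with_exists, with_in)  # same answer either way: Ann and Cat have orders
```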
What is the difference between a Primary Key and a Foreign Key?
A Primary Key uniquely identifies each row in a table — it cannot be NULL and must be unique. A table has exactly one primary key. A Foreign Key is a column (or set of columns) in one table that references the primary key of another table, establishing a relationship between them. Foreign keys enforce referential integrity — you cannot insert an order with a customer_id that doesn't exist in the customers table, and you cannot delete a customer who has orders (unless you use CASCADE). In an interview, mention that foreign keys also help the optimizer choose better join strategies because they guarantee referential integrity.
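Referential integrity in action, sketched with Python's sqlite3 module (note that SQLite ships with foreign-key enforcement off by default, so the PRAGMA is required; the schema is invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite requires opting in to FK checks
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id)
    );
""")
conn.execute("INSERT INTO customers VALUES (1, 'Ann')")
conn.execute("INSERT INTO orders VALUES (1, 1)")       # OK: customer 1 exists

error = None
try:
    conn.execute("INSERT INTO orders VALUES (2, 99)")  # no customer 99 exists
except sqlite3.IntegrityError as exc:
    error = exc  # the engine rejects the orphan row
print(error)
```

The rejected insert is exactly the guarantee described above: an order cannot reference a customer that does not exist.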
What are Transactions and ACID properties?
A transaction is a sequence of operations treated as a single logical unit — either all operations succeed (COMMIT) or none do (ROLLBACK). ACID stands for Atomicity (all-or-nothing execution), Consistency (the database moves from one valid state to another), Isolation (concurrent transactions don't interfere with each other), and Durability (committed data survives system failures). In production, transactions prevent partial updates — like debiting one account but failing to credit another. Without transactions, every error is a potential data corruption event.
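The debit/credit example above can be sketched with Python's sqlite3 module, which opens a transaction implicitly before DML under its default settings (the accounts table is invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100.0), (2, 0.0)])
conn.commit()

# Start the transfer: the debit half runs inside an open transaction.
conn.execute("UPDATE accounts SET balance = balance - 50 WHERE id = 1")
# ...suppose the credit half fails here (crash, constraint, network error)...
conn.rollback()  # atomicity: the lone debit is undone, no money vanishes

balances = conn.execute("SELECT balance FROM accounts ORDER BY id").fetchall()
print(balances)  # [(100.0,), (0.0,)] — back to the last committed valid state
```

Had we called commit() after both UPDATEs instead, durability guarantees the new balances would survive a crash.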
What is a CTE (Common Table Expression) and when should you use it?
A CTE is a named temporary result set defined with the WITH clause that exists only for the duration of the query. It improves readability by giving meaningful names to intermediate steps, can be referenced multiple times in the same query, and makes complex queries self-documenting. Contrary to popular belief, CTEs in PostgreSQL and SQL Server are usually inlined as subqueries by the optimizer — they don't create temp tables unless you use MATERIALIZED. Use CTEs when you need to reference the same intermediate result multiple times or when a query has more than two levels of nesting that would be unreadable as plain subqueries.
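A short runnable sketch of the 'referenced multiple times' case, via Python's sqlite3 module (the sales table is invented): the CTE is defined once and used both as the row source and inside the filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("north", 100), ("north", 200), ("south", 50), ("south", 70)])

# The WITH clause names the intermediate step; the outer query then
# reads it twice — once for rows, once for the average comparison.
rows = conn.execute("""
    WITH region_totals AS (
        SELECT region, SUM(amount) AS total
        FROM sales
        GROUP BY region
    )
    SELECT region, total
    FROM region_totals
    WHERE total > (SELECT AVG(total) FROM region_totals)
""").fetchall()
print(rows)  # only 'north' (300) beats the average total (210)
```

Written with plain subqueries, the aggregation logic would have to be duplicated in both places and kept in sync by hand.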
What is the difference between UNION and UNION ALL?
UNION combines results from two queries and removes duplicate rows — it requires a sort or hash operation to deduplicate, which is expensive on large datasets. UNION ALL combines results and keeps all rows including duplicates — it's a simple concatenation with no deduplication overhead. In production, almost always use UNION ALL unless you specifically need deduplication. I've seen queries go from 12 seconds to 200 milliseconds just by switching UNION to UNION ALL when the developer didn't realize duplicates were impossible given the query logic.
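The behavioral difference is easy to see in a few lines with Python's sqlite3 module (toy tables, invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (v INTEGER);
    CREATE TABLE t2 (v INTEGER);
    INSERT INTO t1 VALUES (1), (2), (2);
    INSERT INTO t2 VALUES (2), (3);
""")

# UNION deduplicates (and pays for it with a sort/hash step).
union = conn.execute(
    "SELECT v FROM t1 UNION SELECT v FROM t2 ORDER BY v").fetchall()

# UNION ALL is a raw concatenation — every row survives, no dedup cost.
union_all = conn.execute(
    "SELECT v FROM t1 UNION ALL SELECT v FROM t2 ORDER BY v").fetchall()

print(union)      # [(1,), (2,), (3,)]
print(union_all)  # [(1,), (2,), (2,), (2,), (3,)]
```

If the two branches can never overlap (say, each selects a disjoint date range), the deduplication work in UNION is pure waste — which is exactly the 12-second-to-200ms scenario described above.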