Skip to content
Home Database SQL Subqueries

SQL Subqueries

Where developers are forged. · Structured learning · Free forever.
📍 Part of: SQL Basics → Topic 10 of 16
SQL subqueries explained — scalar subqueries, correlated subqueries, EXISTS vs IN, subqueries in FROM, and when to use CTEs instead.
⚙️ Intermediate — basic Database knowledge assumed
In this tutorial, you'll learn
SQL subqueries explained — scalar subqueries, correlated subqueries, EXISTS vs IN, subqueries in FROM, and when to use CTEs instead.
  • Correlated subqueries run once per row of the outer query — replace with JOINs for large tables.
  • EXISTS short-circuits on first match; IN collects all values first.
  • NOT IN returns empty set if the subquery contains any NULL — use NOT EXISTS instead.
✦ Plain-English analogy ✦ Real code with output ✦ Interview questions
Quick Answer

A subquery is a SELECT statement nested inside another query. It can appear in WHERE (filter rows), SELECT (compute values), or FROM (as a derived table). Correlated subqueries reference the outer query and run once per row. EXISTS is usually faster than IN for large datasets because it short-circuits.

Scalar and Non-Correlated Subqueries

Example · SQL
1234567891011121314151617181920212223
-- Scalar subquery: returns exactly one value
SELECT
    employee_name,
    salary,
    (SELECT AVG(salary) FROM employees) AS avg_salary,
    salary - (SELECT AVG(salary) FROM employees) AS diff_from_avg
FROM employees;

-- Subquery in WHERE with IN
SELECT product_name
FROM products
WHERE category_id IN (
    SELECT id FROM categories WHERE type = 'Electronics'
);

-- Subquery in FROM (derived table / inline view)
SELECT dept, avg_salary
FROM (
    SELECT department AS dept, AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department
) dept_averages
WHERE avg_salary > 60000;
▶ Output
-- Subqueries compose SELECT statements freely

Correlated Subqueries

A correlated subquery references the outer query. It re-executes for each row of the outer query — potentially slow on large tables.

Example · SQL
12345678910111213141516171819202122
-- Find employees earning more than their department's average
-- Correlated: inner query references e.department from outer query
SELECT e.name, e.salary, e.department
FROM employees e
WHERE e.salary > (
    SELECT AVG(salary)
    FROM employees
    WHERE department = e.department  -- references outer query!
);

-- This runs the subquery ONCE PER ROW of the outer employees table
-- On 100,000 employees: 100,000 AVG() computations — slow!

-- Better: use a JOIN with a derived table (computes AVG once per department)
SELECT e.name, e.salary, e.department
FROM employees e
JOIN (
    SELECT department, AVG(salary) AS dept_avg
    FROM employees
    GROUP BY department
) dept ON e.department = dept.department
WHERE e.salary > dept.dept_avg;
▶ Output
-- JOIN version computes AVG once per department — much faster

EXISTS vs IN

Example · SQL
1234567891011121314151617181920
-- IN: collects all values from subquery, then checks membership
SELECT name FROM customers
WHERE id IN (SELECT customer_id FROM orders);

-- EXISTS: short-circuits — stops as soon as it finds one match
SELECT name FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders WHERE customer_id = c.id
);

-- Performance:
-- IN: subquery runs once, returns a set. Fast when subquery returns few rows.
-- EXISTS: correlated, but stops at first match. Fast when many matches expected.
-- For NOT: NOT EXISTS is almost always faster than NOT IN
-- NOT IN fails silently if subquery contains NULL — NOT EXISTS handles it correctly

-- Dangerous NULL behaviour with NOT IN:
SELECT name FROM customers
WHERE id NOT IN (SELECT customer_id FROM orders WHERE customer_id IS NOT NULL);
-- Without IS NOT NULL: returns empty set if ANY order has NULL customer_id
▶ Output
-- EXISTS short-circuits; NOT EXISTS handles NULL correctly

🎯 Key Takeaways

  • Correlated subqueries run once per row of the outer query — replace with JOINs for large tables.
  • EXISTS short-circuits on first match; IN collects all values first.
  • NOT IN returns empty set if the subquery contains any NULL — use NOT EXISTS instead.
  • Subqueries in FROM are derived tables — they execute once and can be indexed in some databases.
  • CTEs (WITH clause) are cleaner than nested subqueries and execute once (in most databases).

Interview Questions on This Topic

  • QWhat is a correlated subquery and why can it be slow?
  • QWhat is the difference between IN and EXISTS?
  • QWhy is NOT IN dangerous when the subquery can contain NULL?

Frequently Asked Questions

When should I use a CTE instead of a subquery?

Use a CTE when the subquery is reused multiple times, when it improves readability for complex logic, or when you need recursion. CTEs defined with WITH are named and readable. In PostgreSQL and SQL Server, CTEs execute once and are memoized. In MySQL 8+, CTE behaviour is similar. For simple one-off subqueries, either works.

Can a subquery return multiple columns?

In a WHERE clause with IN, the subquery can return multiple columns if you use row value constructors: WHERE (col1, col2) IN (SELECT a, b FROM ...). In a scalar context (SELECT or comparison operators), the subquery must return exactly one column and one row.

🔥
Naren Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.

← PreviousSQL Aggregate FunctionsNext →SQL UNION and INTERSECT
Forged with 🔥 at TheCodeForge.io — Where Developers Are Forged