Junior 6 min · March 05, 2026

SQL UNION vs INTERSECT — Why UNION Dropped 1,200 Sales

Monthly reports missing 1,200 sales because UNION deduplicates rows that only matched by coincidence.

N
Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Written from production experience, not tutorials.

Follow
Production
production tested
May 23, 2026
last updated
1,554
articles · all by Naren
 ● Production Incident 🔎 Debug Guide
Quick Answer
  • UNION stacks result sets vertically — combines rows from two SELECT statements into one result
  • UNION deduplicates rows (slower); UNION ALL keeps all rows including duplicates (faster)
  • Both queries in a UNION must have the same number of columns with compatible data types
  • INTERSECT returns only rows appearing in both queries — the overlap between two result sets
  • EXCEPT (or MINUS in Oracle) returns rows in the first query but not the second — set difference
  • Biggest mistake: using UNION instead of UNION ALL when rows can't overlap — UNION's deduplication wastes CPU for no benefit
✦ Definition~90s read
What is SQL UNION and INTERSECT?

UNION and INTERSECT are SQL set operators that combine results from two or more SELECT statements, but they serve fundamentally different purposes and are not interchangeable. UNION appends rows from multiple queries into a single result set, deduplicating by default (use UNION ALL to skip dedup for performance).

Imagine you have two guest lists for two separate parties.

INTERSECT returns only rows that appear in both queries. The critical asymmetry: UNION is additive—it grows your result set—while INTERSECT is subtractive—it shrinks it. A common trap is using UNION when you need INTERSECT, which can silently inflate row counts and misrepresent data, as happened to the sales team that saw 1,200 phantom sales by combining two queries that should have been intersected.

In practice, you reach for UNION when you need a consolidated view from disjoint sources, like merging current and archived orders. You use INTERSECT when you need the overlap, such as customers who bought in both Q1 and Q2. The EXCEPT operator (or MINUS in Oracle/PostgreSQL) is the surgical counterpart—it returns rows from the first query that don't exist in the second.

These operators are part of the SQL standard and supported across PostgreSQL, SQL Server, MySQL (8.0+), and Oracle, though MySQL lacks INTERSECT and EXCEPT natively (use IN or EXISTS instead).

When NOT to use them: avoid set operators for simple filtering—WHERE clauses are faster. Don't use UNION where a JOIN would suffice, as UNION forces a full scan of both result sets. And never use INTERSECT as a hammer for every overlap problem; EXCEPT is often more efficient for exclusion patterns.

The dialect war between MINUS and EXCEPT is real—Oracle and PostgreSQL use MINUS, while SQL Server and SQLite use EXCEPT—so check your database docs before writing cross-platform queries.

Plain-English First

Imagine you have two guest lists for two separate parties. UNION combines both lists into one big list — everyone who was invited to either party. INTERSECT gives you only the names that appear on BOTH lists — the people who were invited to both parties. That's the whole idea. One operator is about 'everyone from anywhere', the other is about 'only the people everywhere'.

Every non-trivial application eventually needs to pull data from more than one source and stitch it together in a meaningful way. Maybe you're running an e-commerce platform and you need one report that shows customers from two different regional databases. Maybe you're a data analyst trying to find which products appear in both this quarter's bestseller list AND last quarter's. These aren't edge cases — they're everyday production problems that UNION and INTERSECT were built to solve cleanly and efficiently.

Before these set operators existed, developers worked around the problem with clunky JOINs or application-layer logic that merged result sets in code — slow, error-prone, and hard to read. UNION and INTERSECT push that merging logic where it belongs: inside the database engine, which is orders of magnitude better at set operations than your application server. They also make intent explicit. A UNION tells any future reader of your SQL exactly what you're doing: combining two independent result sets. That clarity is worth a lot in a codebase that needs to be maintained for years.

By the end of this article you'll understand not just the syntax but the mental model behind set operators. You'll know when to reach for UNION ALL instead of UNION to avoid a silent performance trap, when INTERSECT is cleaner than a correlated subquery, how duplicate handling works in both operators, and you'll walk away with three real-world query patterns you can adapt immediately.

Why UNION and INTERSECT Are Not Symmetric — And What That Cost a Sales Team

UNION and INTERSECT are set operations on SQL result sets. UNION combines rows from two queries, removing duplicates by default (UNION ALL skips dedup). INTERSECT returns only rows present in both queries. Both compare entire rows, not individual columns — that’s the core mechanic that catches most teams.

UNION performs a sort or hash-based deduplication, making it O(n log n) on the combined row count. INTERSECT also requires dedup, but its real cost is in the row-wise comparison — if your tables have 50 columns, every column participates. That’s why UNION dropping 1,200 sales happened: a UNION between a sales table and a refund table included a status column that differed between the two, so rows that should have matched were treated as distinct, inflating the result set.

Use UNION when you need to append distinct rows from multiple sources — e.g., combining current and archived orders. Use INTERSECT when you need the overlap — e.g., customers who bought both Product A and Product B. Never use either when a JOIN or EXISTS would be more precise; set operations are row-based, not key-based, and that mismatch causes silent data corruption.

UNION Does Not Match on Keys
UNION compares every column in the row. If two rows differ in even one column — like a timestamp or status — they are both kept. That’s how 1,200 phantom sales appeared.
Production Insight
A team UNIONed a live sales table with a refund table expecting to deduplicate by order_id. The refund table had an extra refund_reason column, so every row was unique — 1,200 duplicate orders appeared in the report.
Symptom: sales totals suddenly jumped by 15% with no code change. The UNION was silently adding rows instead of merging them.
Rule: Before any set operation, verify that both SELECT lists have identical column expressions — same number, same types, same aliases. Use UNION ALL unless you explicitly need dedup.
Key Takeaway
UNION and INTERSECT compare entire rows, not keys — a single column mismatch duplicates rows.
UNION ALL is almost always faster than UNION; only use UNION when you must remove duplicates.
For overlap detection, INTERSECT is often slower than EXISTS with a join — test both on real data volumes.
SQL Set Operators: UNION vs INTERSECT vs EXCEPT THECODEFORGE.IO SQL Set Operators: UNION vs INTERSECT vs EXCEPT Flow from combining to intersecting to excluding result sets UNION Combines two result sets, removes duplicates INTERSECT Returns rows common to both queries EXCEPT / MINUS Returns rows from first query not in second Combined Pattern Order: UNION then INTERSECT then EXCEPT ⚠ UNION drops duplicates silently; INTERSECT may surprise Use UNION ALL to keep duplicates; test INTERSECT with sample data THECODEFORGE.IO
thecodeforge.io
SQL Set Operators: UNION vs INTERSECT vs EXCEPT
Sql Union Intersect

UNION — Combining Two Result Sets Into One (And the UNION ALL Trap)

UNION stacks the rows from two SELECT statements on top of each other. Think of it as a vertical JOIN — instead of adding columns sideways, it adds rows downward. The critical rule is that both queries must return the same number of columns, and the corresponding columns must have compatible data types. The column names in the final output come from the first SELECT statement, not the second.

Here's the part that trips people up: plain UNION automatically removes duplicate rows across the combined result. This sounds helpful, but it means the database has to sort or hash the entire result set to find and remove those duplicates — even if you know there are none. That's wasted CPU and I/O on every execution.

UNION ALL skips the deduplication step entirely. It just appends. It's always faster than UNION. You should default to UNION ALL and only use plain UNION when you genuinely need duplicates removed. A surprising number of production queries use UNION where UNION ALL was intended, causing a quiet but real performance tax.

The most common real-world use case is consolidating data from partitioned tables — for example, an orders_2023 and orders_2024 table that were split for archival reasons but need to be queried together for a full-history report.

union_orders_report.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
-- Scenario: orders are split across two yearly tables.
-- We need a single list of all customers who placed an order in either year.

-- Table: orders_2023
-- | customer_id | customer_email          | order_total |
-- |-------------|-------------------------|-------------|
-- | 101         | alice@example.com       | 250.00      |
-- | 102         | bob@example.com         | 89.99       |
-- | 103         | carol@example.com       | 410.50      |

-- Table: orders_2024
-- | customer_id | customer_email          | order_total |
-- |-------------|-------------------------|-------------|
-- | 102         | bob@example.com         | 120.00      |  -- Bob ordered in both years
-- | 104         | dave@example.com        | 305.75      |
-- | 105         | eve@example.com         | 55.00       |


-- ── EXAMPLE 1: UNION (deduplicates — Bob appears only ONCE) ──────────────────
SELECT customer_id, customer_email
FROM orders_2023

UNION  -- removes duplicate rows before returning results (slower)

SELECT customer_id, customer_email
FROM orders_2024;

-- Output:
-- | customer_id | customer_email    |
-- |-------------|-------------------|
-- | 101         | alice@example.com |
-- | 102         | bob@example.com   |  -- only one row for Bob
-- | 103         | carol@example.com |
-- | 104         | dave@example.com  |
-- | 105         | eve@example.com   |


-- ── EXAMPLE 2: UNION ALL (keeps ALL rows — Bob appears TWICE) ────────────────
SELECT customer_id, customer_email
FROM orders_2023

UNION ALL  -- no deduplication — just appends. Faster.

SELECT customer_id, customer_email
FROM orders_2024;

-- Output:
-- | customer_id | customer_email    |
-- |-------------|-------------------|
-- | 101         | alice@example.com |
-- | 102         | bob@example.com   |  -- first occurrence (2023)
-- | 103         | carol@example.com |
-- | 102         | bob@example.com   |  -- second occurrence (2024)
-- | 104         | dave@example.com  |
-- | 105         | eve@example.com   |


-- ── EXAMPLE 3: UNION with ORDER BY and column alias ─────────────────────────
-- ORDER BY can only appear ONCE — at the very end, after the final SELECT.
-- It applies to the entire combined result, not just one half.
SELECT customer_id, customer_email, order_total, 2023 AS order_year
FROM orders_2023

UNION ALL

SELECT customer_id, customer_email, order_total, 2024 AS order_year
FROM orders_2024

ORDER BY customer_id, order_year;  -- sorts the combined result

-- Output:
-- | customer_id | customer_email    | order_total | order_year |
-- |-------------|-------------------|-------------|------------|
-- | 101         | alice@example.com | 250.00      | 2023       |
-- | 102         | bob@example.com   | 89.99       | 2023       |
-- | 102         | bob@example.com   | 120.00      | 2024       |
-- | 103         | carol@example.com | 410.50      | 2023       |
-- | 104         | dave@example.com  | 305.75      | 2024       |
-- | 105         | eve@example.com   | 55.00       | 2024       |
Output
Example 1 (UNION): 5 rows — Bob deduplicated
Example 2 (UNION ALL): 6 rows — Bob appears twice
Example 3 (UNION ALL + ORDER BY): 6 rows sorted by customer_id then year
Watch Out: UNION's Hidden Performance Tax
Plain UNION forces the database to run a DISTINCT operation over the entire combined result set — even on millions of rows. Always ask yourself: 'Do I actually need duplicates removed here?' If the two source queries are pulling from non-overlapping data (different date ranges, different regions), use UNION ALL. It can be 2–10x faster on large tables and the query plan will confirm the difference.
Production Insight
UNION ALL is almost always what you want — it's faster because it skips the deduplication step.
UNION's deduplication is based on full row equality across all columns — not on any primary key.
Use UNION ALL + GROUP BY/DISTINCT for explicit, intentional deduplication where you control the uniqueness definition.
Key Takeaway
UNION ALL is faster and safer than UNION — it never silently drops rows based on coincidental column value equality.
Use UNION only when you explicitly want to deduplicate by full row equality.
If the two queries can never produce identical rows (different source tables, different date ranges), UNION ALL is always correct.

INTERSECT — Finding What's Common to Both Queries

INTERSECT returns only the rows that appear in the result of BOTH queries. Where UNION is additive, INTERSECT is a filter. It's essentially asking: 'What do these two result sets have in common?'

Like UNION, INTERSECT removes duplicates by default — if a value exists three times in both queries, it still only appears once in the output. Most databases don't have an INTERSECT ALL variant (PostgreSQL does; MySQL famously doesn't support INTERSECT natively at all before version 8.0.31).

The real power of INTERSECT is replacing complex correlated subqueries or EXISTS clauses with something far more readable. Consider finding customers who bought from your platform in both January and February. You could write a self-JOIN with GROUP BY, or a nested subquery with IN, but INTERSECT expresses the intent in plain English: 'Give me everyone from the January buyers AND the February buyers.'

INTERSECT compares entire rows, not just one column. Every column in both SELECT lists must match for a row to be included. This is both its power and a common source of confusion — we'll cover that in the gotchas section.

intersect_loyal_customers.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
-- Scenario: A SaaS company wants to identify "sticky" users —
-- customers who were active in BOTH January AND February 2024.
-- These are the users worth targeting for an annual plan upgrade.

-- Table: user_activity
-- | user_id | activity_month | feature_used       |
-- |---------|----------------|--------------------|
-- | 201     | 2024-01        | dashboard          |
-- | 202     | 2024-01        | reports            |
-- | 203     | 2024-01        | dashboard          |
-- | 201     | 2024-02        | api_integration    |
-- | 204     | 2024-02        | dashboard          |
-- | 202     | 2024-02        | reports            |


-- ── EXAMPLE 1: Basic INTERSECT ───────────────────────────────────────────────
-- Find user_ids active in BOTH months
SELECT user_id
FROM user_activity
WHERE activity_month = '2024-01'

INTERSECT  -- only rows that appear in BOTH result sets pass through

SELECT user_id
FROM user_activity
WHERE activity_month = '2024-02';

-- Output:
-- | user_id |
-- |---------|
-- | 201     |  -- active in both January and February
-- | 202     |  -- active in both January and February
-- (user 203 only appears in Jan, user 204 only in Feb — excluded)


-- ── EXAMPLE 2: INTERSECT as a cleaner alternative to IN + subquery ───────────
-- Same result, but compare readability:

-- The subquery way (harder to read at a glance):
SELECT DISTINCT user_id
FROM user_activity
WHERE activity_month = '2024-01'
  AND user_id IN (
      SELECT user_id          -- buried logic — reader must mentally trace this
      FROM user_activity
      WHERE activity_month = '2024-02'
  );

-- The INTERSECT way (reads like the business requirement itself):
SELECT user_id FROM user_activity WHERE activity_month = '2024-01'
INTERSECT
SELECT user_id FROM user_activity WHERE activity_month = '2024-02';

-- Output for both: same two rows — user 201 and 202.
-- INTERSECT wins on readability. Query optimizers often produce the same plan.


-- ── EXAMPLE 3: INTERSECT with multiple columns ───────────────────────────────
-- Find (user_id, feature_used) pairs that appear in BOTH months.
-- i.e. users who used the SAME feature in both Jan AND Feb.
SELECT user_id, feature_used
FROM user_activity
WHERE activity_month = '2024-01'

INTERSECT

SELECT user_id, feature_used
FROM user_activity
WHERE activity_month = '2024-02';

-- Output:
-- | user_id | feature_used |
-- |---------|--------------|
-- | 202     | reports      |  -- User 202 used 'reports' in BOTH months
-- (User 201 is excluded here because they used 'dashboard' in Jan
--  but 'api_integration' in Feb — different feature, so no match)
Output
Example 1: 2 rows — user_id 201 and 202
Example 2: Same 2 rows via subquery approach (equivalent result)
Example 3: 1 row — only user 202 used the same feature in both months
Pro Tip: INTERSECT as a Data Audit Tool
INTERSECT is genuinely useful during database migrations and audits. Run INTERSECT between a query on the old table and the same query on the new table — if the row counts match and INTERSECT returns the same count, your migration preserved the data correctly. It's a fast sanity check that takes two minutes to write.
Production Insight
INTERSECT is rarely used in OLTP but powerful for cohort analysis: find customers who placed orders in both January AND February.
INTERSECT is equivalent to a JOIN on all columns between the two result sets — the JOIN version is often faster with proper indexes.
For finding common records between two large result sets, IN with a subquery or INNER JOIN typically outperforms INTERSECT.
Key Takeaway
INTERSECT = 'rows that appear in both results' — useful for cohort overlap and data reconciliation.
For large datasets, an INNER JOIN on the relevant key often outperforms INTERSECT.
Not all databases support INTERSECT — MySQL added it in version 8.0.31.

Combining Set Operators in Real-World Query Patterns

In production, you rarely use UNION or INTERSECT in isolation. The real skill is knowing how to chain them together and how to mix them with subqueries, CTEs, and aggregations to answer complex business questions.

Set operators follow a specific precedence order: INTERSECT binds more tightly than UNION or EXCEPT. So if you write Query A UNION Query B INTERSECT Query C, the database will evaluate B INTERSECT C first, then UNION that result with A. This is counterintuitive and the source of very subtle bugs. Always use parentheses when chaining more than two queries.

Another pattern worth knowing: wrapping a UNION or INTERSECT inside a CTE (Common Table Expression) lets you treat the combined result as a named table and then run further aggregations on top of it. This is much cleaner than nesting UNION queries inside subqueries three levels deep.

Finally, remember that set operators work on result sets, not tables directly. Each SELECT can have its own WHERE clause, JOINs, and even aggregations — as long as the final column list matches. This means you can build each 'half' of the query independently and then combine them, which is a great way to break down a complex reporting requirement into manageable pieces.

combined_set_operators_report.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
-- Scenario: A retail analytics team needs a 'VIP Customer Report'.
-- Definition of VIP:
--   (a) Customers who spent > $500 in 2023  OR  > $500 in 2024 (high spenders)
--   AND
--   (b) That combined list must ALSO have made a purchase in the last 30 days
--       (still active — not just historically high-value)

-- Tables available:
-- orders(order_id, customer_id, order_total, order_date)
-- customers(customer_id, full_name, email, signup_date)


-- ── Step 1: Use a CTE to build the 'high spender' pool via UNION ALL ─────────
WITH high_spenders AS (
    -- High spenders from 2023
    SELECT customer_id
    FROM orders
    WHERE YEAR(order_date) = 2023
    GROUP BY customer_id
    HAVING SUM(order_total) > 500  -- only customers whose 2023 total exceeded $500

    UNION  -- plain UNION because we want to deduplicate across years
           -- a customer shouldn't appear twice just for being high-value in both years

    -- High spenders from 2024
    SELECT customer_id
    FROM orders
    WHERE YEAR(order_date) = 2024
    GROUP BY customer_id
    HAVING SUM(order_total) > 500
),

-- ── Step 2: Find recently active customers ───────────────────────────────────
recently_active AS (
    SELECT DISTINCT customer_id
    FROM orders
    WHERE order_date >= CURRENT_DATE - INTERVAL '30 days'  -- activity in last 30 days
)

-- ── Step 3: INTERSECT — only customers who are in BOTH groups ────────────────
-- We join back to customers for the display columns
SELECT
    c.customer_id,
    c.full_name,
    c.email
FROM customers c
WHERE c.customer_id IN (
    -- The INTERSECT here is the heart of the query:
    -- high spenders who are also recently active
    SELECT customer_id FROM high_spenders
    INTERSECT
    SELECT customer_id FROM recently_active
)
ORDER BY c.full_name;


-- ── Alternative: explicit parentheses when chaining UNION + INTERSECT ────────
-- Without parentheses, INTERSECT binds first — this is almost never what you want
-- when mixing operators. Always parenthesize.

-- WRONG — INTERSECT runs before UNION due to precedence:
SELECT customer_id FROM orders_2023
UNION
SELECT customer_id FROM orders_2024
INTERSECT                              -- binds to orders_2024 first!
SELECT customer_id FROM recently_active;

-- RIGHT — parentheses make intent unambiguous:
(SELECT customer_id FROM orders_2023
 UNION
 SELECT customer_id FROM orders_2024)
INTERSECT
SELECT customer_id FROM recently_active;  -- now intersects the full union result
Output
VIP Customer Report:
| customer_id | full_name | email |
|-------------|------------------|-------------------------|
| 101 | Alice Mercer | alice@example.com |
| 203 | Carlos Reyes | carlos@example.com |
(Results vary by data — these are illustrative rows)
Precedence demo: the 'WRONG' and 'RIGHT' versions return different row counts
when orders_2024 and recently_active have partial overlap.
Interview Gold: Precedence Catches Everyone
Interviewers love asking 'what does UNION INTERSECT do without parentheses?' The answer: INTERSECT has higher precedence than UNION (similar to how * binds before + in arithmetic). In A UNION B INTERSECT C, the engine evaluates B INTERSECT C first. Most working developers don't know this. Knowing it signals genuine depth.
Production Insight
Set operators (UNION, INTERSECT, EXCEPT) require identical column counts and compatible types across both queries.
Column names in the final result come from the first SELECT — aliases in the second query are ignored.
For complex multi-source reports, CTEs with UNION ALL inside them are cleaner than chaining multiple UNION operators.
Key Takeaway
Column names come from the first SELECT in a UNION — aliases in subsequent queries are ignored.
All queries in a UNION chain must have the same column count and compatible types.
For readability, use CTEs (WITH clause) to name each source before combining with UNION ALL.

EXCEPT Is Your Surgical Strike — Use It Before INTERSECT Becomes a Hammer

You reach for INTERSECT when you want overlap. You reach for EXCEPT when you want differences. The junior mistake is thinking they're interchangeable. They're not.

EXCEPT returns rows from the first query that don't appear in the second. It's a left-anti-semi-join in disguise. If you're debugging a data reconciliation between two systems — say, a payments log and a ledger — EXCEPT tells you exactly what's missing from one side. INTERSECT tells you only what's duplicated. Two very different questions.

Performance gotcha: EXCEPT scans both result sets fully. If your first query returns 2M rows and the second 500K, EXCEPT will compare them all. But if you know the second query is a subset, you can often rewrite EXCEPT as a correlated NOT EXISTS and get a faster index-based plan. Don't blindly trust the optimiser.

Here's the rule: EXCEPT when you're auditing. INTERSECT when you're synchronising. Pick based on the question, not the syntax.

ReconcilePaymentLogs.sqlSQL
1
2
3
4
5
6
7
8
9
10
// io.thecodeforge — database tutorial

-- Find payments logged but not yet in the general ledger
SELECT transaction_id, amount, processed_at
FROM payment_gateway.payments
WHERE status = 'captured'
EXCEPT
SELECT transaction_id, amount, processed_at
FROM finance.general_ledger
WHERE entry_type = 'sale';
Output
transaction_id | amount | processed_at
---------------+--------+------------------
TXN-88293 | 149.99 | 2025-03-12 14:22
TXN-77124 | 22.50 | 2025-03-12 14:23
Performance Trap:
EXCEPT removes duplicates inside each result set first. If you want to count missing rows (including duplicates), use NOT EXISTS with a subquery instead. EXCEPT will silently deduplicate and lie to you.
Key Takeaway
EXCEPT answers 'what's in A but not B?' — use it for reconciliation, not listing. If duplicates matter, NOT EXISTS is your friend.

MINUS vs EXCEPT — The Dialect War That Killed an Hour of Your Life

You wrote a perfect EXCEPT query. It works on PostgreSQL. You migrate to Oracle. It explodes. Why? Because Oracle doesn't speak EXCEPT. It speaks MINUS.

Same operation. Different keyword. And if you're maintaining a multi-dialect codebase — welcome to the reason we have SQL standards that everyone ignores.

EXCEPT is ANSI SQL:2003. PostgreSQL, SQL Server, and SQLite use it. Oracle and legacy DB2 use MINUS. MySQL didn't have either until 8.0.31, and even now it's behind a flag. The real pain comes when you're writing ETL that must run across two vendors. You end up wrapping your set logic in a stored procedure that checks version first, or you build an ORM layer that translates.

Don't get philosophical. Get practical: maintain a compatibility matrix in your docs. Every time you use a set operator, tag it with the target engine. And if you're writing a migration from Oracle to Postgres, search-and-replace every MINUS to EXCEPT. Your future self will thank you.

One more thing: MINUS and EXCEPT both imply DISTINCT semantics. If you need to preserve duplicates, you're in HELL — use UNION ALL with a row-number trick instead.

PortableSetOp.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
// io.thecodeforge — database tutorial

-- PostgreSQL / SQL Server version
SELECT employee_id FROM hr.employees_current
EXCEPT
SELECT employee_id FROM hr.employees_baseline;

-- Oracle version (identical logic, different word)
SELECT employee_id FROM hr.employees_current
MINUS
SELECT employee_id FROM hr.employees_baseline;
Output
employee_id
------------
4421
5593
7710
Senior Shortcut:
If you're writing a stored procedure that must work on both Oracle and PostgreSQL, wrap the set operator in a string and use EXECUTE IMMEDIATE (PL/SQL) or EXECUTE (PL/pgSQL) after checking current_database(). It's ugly. It works.
Key Takeaway
EXCEPT and MINUS are the same operation under different names. Know your target engine, or you'll debug at 3 AM.
● Production incidentPOST-MORTEMseverity: high

Monthly Sales Report Missing 1,200 Transactions Due to Accidental UNION Deduplication

Symptom
Monthly sales totals were consistently 3-5% lower than expected. Finance found 1,200 transactions per month 'missing' from the report.
Assumption
UNION was used to 'clean up' the data, assuming duplicate rows in the combined result would be data errors that should be removed.
Root cause
Two customers placed orders for the exact same amount on the same day. UNION's deduplication compared full rows — and these rows matched on all columns (amount, date, product_id, region). UNION treated them as duplicates and kept only one, silently dropping real revenue.
Fix
Changed UNION to UNION ALL. Added a note explaining that order_id is the uniqueness key, not the full row content. Real deduplication on order_id was added as a separate step.
Key lesson
  • UNION ALL is almost always the correct choice — UNION's deduplication is based on all columns which is rarely the right uniqueness definition
  • If deduplication is needed, do it explicitly on the primary key: SELECT DISTINCT order_id, ... or GROUP BY order_id
  • Prefer UNION ALL and add explicit deduplication logic rather than relying on UNION's implicit row-equality check
Production debug guideDiagnosing missing rows, column count errors, and type mismatches3 entries
Symptom · 01
Error: each UNION query must have the same number of columns
Fix
Both SELECT statements in a UNION must return the same number of columns. Add NULL AS column_name placeholders for columns that don't exist in one query: SELECT id, name, NULL AS phone FROM employees UNION SELECT id, name, contact_number FROM contractors.
Symptom · 02
UNION returns fewer rows than expected
Fix
UNION deduplicates by full row equality — if two real rows happen to have identical values in all columns, one is dropped. Switch to UNION ALL to keep all rows, then add explicit deduplication on the primary key if needed.
Symptom · 03
INTERSECT returns empty result despite rows that appear to match
Fix
Column types or case sensitivity may differ. Check for trailing spaces: TRIM(col) in both queries. Check case: LOWER(col) in both. Also verify that all columns in the SELECT list match — INTERSECT compares full rows.
Feature / AspectUNION / UNION ALLINTERSECT
What it returnsAll rows from both queries combinedOnly rows that appear in BOTH queries
Duplicate handlingUNION removes them; UNION ALL keeps themAlways removes duplicates (most databases)
Performance defaultUNION ALL is fast; UNION runs DISTINCTSimilar cost to UNION — hashing/sorting required
MySQL supportFull support in all versionsSupported from MySQL 8.0.31 only
PostgreSQL supportFull support including UNION ALLFull support including INTERSECT ALL
Use WHENMerging partitioned tables, combining report sourcesFinding overlap between two independent datasets
Readable alternative toMultiple OR conditions across tablesIN with subquery, EXISTS, or self-JOIN
Can chain multiple?Yes — A UNION B UNION CYes — watch precedence with UNION mix
ORDER BY placementOnce, at the very end of the full queryOnce, at the very end of the full query
Column count ruleMust match across all SELECT statementsMust match across all SELECT statements

Key takeaways

1
UNION ALL is almost always what you want
plain UNION silently deduplicates and pays a sorting/hashing cost every time. Use it only when removing duplicates is a genuine requirement.
2
INTERSECT replaces correlated subqueries and IN clauses with code that reads exactly like the business requirement
'give me what's in both sets'. When your query can be stated that way, INTERSECT is the right tool.
3
INTERSECT matches on entire rows, not single columns
if your SELECT returns two columns, both must match. This surprises almost every developer the first time they hit an unexpectedly empty result.
4
INTERSECT has higher precedence than UNION
in a chained query without parentheses, INTERSECT binds first. Always parenthesize mixed set-operator queries or you'll spend an afternoon debugging a subtle logic bug.

Common mistakes to avoid

3 patterns
×

Using UNION instead of UNION ALL when rows cannot overlap

Symptom
Report is slower than necessary — UNION performs a sort-based or hash-based deduplication step even when the two queries could never produce identical rows (e.g., orders from January vs orders from February)
Fix
Use UNION ALL by default. Switch to UNION only when you explicitly need to remove duplicate rows and can verify that full-row equality is the correct uniqueness definition.
×

Different column counts or incompatible types in UNION queries

Symptom
Error: each UNION query must have the same number of columns — or a type casting error on the rows that actually differ
Fix
Match column counts by adding NULL placeholders: SELECT id, name, NULL AS phone FROM table1 UNION SELECT id, name, phone FROM table2. Cast incompatible types explicitly: CAST(int_col AS VARCHAR) to match a VARCHAR column in the other query.
×

Expecting INTERSECT to find rows matching on one key column only

Symptom
INTERSECT returns no results despite rows with matching IDs existing in both queries — because INTERSECT compares all columns, and the other columns differ
Fix
INTERSECT compares full rows across all SELECT columns. To find rows matching on a specific key, use a JOIN or IN subquery: SELECT id FROM query1 WHERE id IN (SELECT id FROM query2). This is more explicit and usually faster.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01JUNIOR
What is the difference between UNION and UNION ALL?
Q02SENIOR
When would you use INTERSECT in a real production query?
Q03JUNIOR
You need to combine employee records from two tables — employees and con...
Q01 of 03JUNIOR

What is the difference between UNION and UNION ALL?

ANSWER
UNION combines the result sets of two SELECT queries and removes duplicate rows — it performs an implicit DISTINCT on the combined result. UNION ALL combines the result sets and keeps all rows, including duplicates. UNION ALL is almost always preferable: it's faster (no deduplication step), it never silently drops real data, and deduplication based on full row equality is rarely the right uniqueness definition. Use UNION only when you explicitly need to remove rows where all column values are identical. For explicit deduplication on a specific key, use UNION ALL followed by SELECT DISTINCT on the key column.
FAQ · 3 QUESTIONS

Frequently Asked Questions

01
What is the difference between SQL UNION and INTERSECT?
02
Does MySQL support INTERSECT?
03
Can I use ORDER BY with UNION or INTERSECT?
N
Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Written from production experience, not tutorials.

Follow
Verified
production tested
May 23, 2026
last updated
1,554
articles · all by Naren
🔥

That's SQL Basics. Mark it forged?

6 min read · try the examples if you haven't

Previous
SQL Subqueries
11 / 16 · SQL Basics
Next
SQL NULL Handling