Senior 4 min · March 05, 2026

SQL JOINs — 800 Orders Dropped by Wrong INNER JOIN

A wrong INNER JOIN dropped 800 orders/month, causing a 15-20% revenue reporting gap.

N
Naren · Founder
Plain-English first. Then code. Then the interview question.
About
 ● Production Incident 🔎 Debug Guide
Quick Answer
  • INNER JOIN returns only rows matching on both sides — unmatched rows are silently dropped
  • LEFT JOIN preserves every row from the left table, NULLs fill the right side where no match
  • RIGHT JOIN is LEFT JOIN with tables swapped — prefer LEFT JOIN for readability
  • FULL OUTER JOIN shows all rows from both tables, NULLs on the unmatched side
  • Anti-join pattern: LEFT JOIN + WHERE right_table.id IS NULL finds rows with no match
  • Biggest mistake: aggregating across multiple one-to-many JOINs causes row explosion — SUM doubles silently
Plain-English First

Imagine your school has two notebooks: one lists every student's name and ID, and another lists which students signed up for after-school clubs. A JOIN is just the teacher combining those two notebooks to answer questions like 'which students are in Drama club?' or 'which students haven't joined ANY club yet?' The database does the same thing — it stitches two tables together along a shared column so you can ask questions that span both. That shared column is the bridge, and the type of JOIN you pick decides what happens when the bridge only goes one way.

Every real-world application stores data across multiple tables. An e-commerce site keeps customers in one table, orders in another, and products in a third. That's not sloppy design — it's intentional. Splitting data this way (called normalisation) eliminates duplication and keeps your database consistent. But the moment you need to answer a question like 'show me every customer and what they ordered last month,' you need a way to bring those tables back together. That's exactly what JOINs were built for, and they're one of the most-used features in SQL.

Before JOINs existed, developers would pull raw data from each table separately and stitch it together in application code. This was slow, bug-prone, and pushed enormous amounts of data across the network. JOINs moved that logic into the database engine itself — where it can be optimised, indexed, and executed orders of magnitude faster than anything you'd write by hand in Python or Java.

By the end of this article you'll know not just the syntax of each JOIN type, but — more importantly — you'll know which JOIN to reach for in a given situation, why the wrong JOIN silently returns misleading results, and how to spot JOIN-related performance traps before they bite you in production.

INNER JOIN: Only the Records That Match on Both Sides

An INNER JOIN returns rows only when a matching value exists in both tables. Think of it as the intersection in a Venn diagram. If a customer has never placed an order, they won't appear in the result. If an order somehow has no matching customer, it won't appear either. Only the overlap makes the cut.

This is the right JOIN when you genuinely only care about complete relationships. 'Show me every order alongside the customer who placed it' — that's INNER JOIN territory. You're not interested in orders without customers (data anomalies you'd fix separately) or customers who haven't ordered yet (a different business question entirely).

One subtlety worth knowing: INNER JOIN is the default JOIN in most SQL dialects. Writing JOIN without a keyword in front of it gives you an INNER JOIN. Being explicit with INNER JOIN makes your intent clearer to anyone reading the query later — including future-you at 11pm debugging a production incident.

inner_join_orders.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
-- SETUP: Two tables representing a small online bookstore
CREATE TABLE customers (
    customer_id   INT PRIMARY KEY,
    full_name     VARCHAR(100),
    email         VARCHAR(100)
);

CREATE TABLE orders (
    order_id      INT PRIMARY KEY,
    customer_id   INT,
    book_title    VARCHAR(150),
    order_total   DECIMAL(8, 2)
);

INSERT INTO customers VALUES
    (1, 'Alice Nguyen',   'alice@example.com'),
    (2, 'Ben Okafor',     'ben@example.com'),
    (3, 'Clara Svensson', 'clara@example.com');  -- Clara has NO orders yet

INSERT INTO orders VALUES
    (101, 1, 'The Pragmatic Programmer', 39.99),
    (102, 1, 'Clean Code',               34.99),
    (103, 2, 'Designing Data-Intensive Applications', 49.99),
    (104, 9, 'Unknown Book', 9.99);               -- customer_id 9 does NOT exist

SELECT
    c.full_name,
    c.email,
    o.order_id,
    o.book_title,
    o.order_total
FROM customers AS c
INNER JOIN orders AS o
    ON c.customer_id = o.customer_id
ORDER BY c.full_name, o.order_id;
Output
full_name | order_id | book_title | order_total
Alice Nguyen | 101 | The Pragmatic Programmer | 39.99
Alice Nguyen | 102 | Clean Code | 34.99
Ben Okafor | 103 | Designing Data-Intensive Applications | 49.99
-- Clara MISSING (no orders) -- Order 104 MISSING (no customer_id 9)
Watch Out:
INNER JOIN silently drops unmatched rows. If you run a report with INNER JOIN and your numbers look 'a bit low,' the first thing to check is whether you should be using a LEFT JOIN instead. Missing customers or orphaned records won't throw an error — they'll just quietly vanish from your results.
Production Insight
Every dropped row from INNER JOIN is silent — no error, no warning, just a smaller result set.
In finance or audit queries, 'slightly low' numbers from a wrong JOIN type can persist for months undetected.
Always validate JOIN result counts against SELECT COUNT(*) on the driving table before shipping a report query.
Key Takeaway
INNER JOIN = only confirmed matches survive.
Unmatched rows vanish silently — wrong for optional relationships.
Default to LEFT JOIN for reporting queries, then restrict deliberately.

LEFT JOIN: Keep Everything From the Left, Match What You Can From the Right

A LEFT JOIN returns every row from the left (first) table, and fills in matching data from the right table where it exists. Where there's no match on the right side, SQL fills those columns with NULL. The left table is never filtered — it always shows up in full.

This is the JOIN you reach for when the left table contains the 'source of truth' and the right table has optional enrichment data. Classic scenarios: 'show me all customers and their orders (including customers who haven't ordered yet),' or 'list all products and how many times each has been reviewed (including products with zero reviews).'

NULL in the right-side columns is actually meaningful signal here — it tells you 'this row exists but has no matching data on the right.' You can exploit that intentionally by filtering WHERE right_table.column IS NULL, which gives you only the rows with no match. That pattern is called an anti-join and it's surprisingly powerful for finding gaps: unreviewed products, uninvoiced contracts, customers who never completed onboarding.

left_join_customers_orders.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
-- All customers and their orders — including customers with no orders
SELECT
    c.customer_id,
    c.full_name,
    o.order_id,
    o.book_title,
    COALESCE(CAST(o.order_total AS VARCHAR), 'No orders yet') AS order_summary
FROM customers AS c
LEFT JOIN orders AS o
    ON c.customer_id = o.customer_id
ORDER BY c.customer_id, o.order_id;

-- Anti-join: find customers who have NEVER ordered
SELECT c.customer_id, c.full_name, c.email
FROM customers AS c
LEFT JOIN orders AS o ON c.customer_id = o.customer_id
WHERE o.order_id IS NULL;
Output
customer_id | full_name | order_id | order_summary
1 | Alice Nguyen | 101 | 39.99
1 | Alice Nguyen | 102 | 34.99
2 | Ben Okafor | 103 | 49.99
3 | Clara Svensson | NULL | No orders yet <- Clara appears this time
Pro Tip:
The anti-join pattern (LEFT JOIN + WHERE right_table.id IS NULL) is almost always faster than NOT IN or NOT EXISTS with a subquery, especially on large tables. It gives the query planner a chance to use index scans on the JOIN column rather than executing a subquery once per row.
Production Insight
The most common LEFT JOIN bug: filtering the right table in WHERE instead of ON.
WHERE orders.status = 'active' eliminates NULL rows, silently converting LEFT JOIN to INNER JOIN.
Rule: conditions that restrict the right-side table go in the ON clause, not WHERE.
Key Takeaway
LEFT JOIN = complete left table guaranteed, NULLs on the right where no match.
Anti-join (WHERE right.id IS NULL) finds gaps — unmatched records from the left side.
Never filter the right-side table in WHERE — it destroys the LEFT JOIN guarantee.

RIGHT JOIN and FULL OUTER JOIN: Completing the Picture

A RIGHT JOIN is the mirror image of LEFT JOIN — every row from the right table is preserved, and unmatched rows from the left get NULLs. In practice, most developers avoid RIGHT JOIN entirely because you can always rewrite it as a LEFT JOIN by swapping the table order. The result is identical, but LEFT JOIN reads more naturally left-to-right.

FULL OUTER JOIN is the union of LEFT and RIGHT JOIN — every row from both tables appears in the result. Where a match exists, the columns are populated. Where there's no match on either side, you get NULLs for the missing half. This is rare in everyday queries, but it's invaluable for data reconciliation work: comparing two systems to find records that exist in one but not the other, like syncing a CRM against a billing platform.

Not all databases support FULL OUTER JOIN natively. MySQL doesn't — but you can emulate it cleanly with a UNION of a LEFT JOIN and a RIGHT JOIN.

full_outer_join_reconciliation.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
-- Reconciling two payment systems after a migration
SELECT
    COALESCE(l.payment_ref, n.payment_ref) AS payment_ref,
    l.amount        AS legacy_amount,
    n.amount        AS new_amount,
    CASE
        WHEN l.payment_ref IS NULL THEN 'New system only'
        WHEN n.payment_ref IS NULL THEN 'Legacy only — was it migrated?'
        ELSE 'Matched'
    END AS reconciliation_status
FROM legacy_payments AS l
FULL OUTER JOIN new_payments AS n
    ON l.payment_ref = n.payment_ref
ORDER BY payment_ref;

-- MySQL workaround (no FULL OUTER JOIN support)
SELECT l.payment_ref, n.payment_ref
FROM legacy_payments l LEFT JOIN new_payments n ON l.payment_ref = n.payment_ref
UNION
SELECT l.payment_ref, n.payment_ref
FROM legacy_payments l RIGHT JOIN new_payments n ON l.payment_ref = n.payment_ref;
Output
payment_ref | legacy_amount | new_amount | reconciliation_status
PAY-001 | 250.00 | 250.00 | Matched
PAY-002 | 175.50 | 175.50 | Matched
PAY-003 | 340.00 | NULL | Legacy only — was it migrated?
PAY-004 | NULL | 99.00 | New system only
Interview Gold:
Interviewers love asking 'when would you use a FULL OUTER JOIN?' The answer that impresses: data reconciliation and migration validation — comparing two independent datasets to surface discrepancies. Saying 'I've never needed it' is a red flag; saying 'I use it for system sync audits' shows production experience.
Production Insight
FULL OUTER JOIN is the go-to tool for post-migration reconciliation — comparing old vs. new system records.
MySQL's lack of native support is a real operational gotcha; keep the UNION workaround handy.
During any data migration, run a FULL OUTER JOIN validation before decommissioning the source system.
Key Takeaway
FULL OUTER JOIN = all rows from both tables, NULLs on the unmatched half.
Primary use case: data reconciliation between two independent datasets.
MySQL needs the UNION of LEFT + RIGHT JOIN to emulate it — add that to your toolkit.

JOIN Performance: Why Your Query Slows Down at Scale

Understanding JOIN types is only half the battle. In production, a perfectly correct JOIN can bring a database to its knees if you haven't thought about how the engine executes it.

The most important thing you can do for JOIN performance is index your JOIN columns. When you write ON orders.customer_id = customers.customer_id, both columns should be indexed. The primary key side is usually indexed automatically. The foreign key side often isn't — and that's where most JOIN slowdowns originate. Without an index, the database performs a full table scan for every row on the other side.

The second trap is joining on expressions or functions: ON LOWER(a.email) = LOWER(b.email). This forces a full scan because the index can't be used against a computed result. Store data in a consistent format at write time, not query time.

Finally, watch your row explosion problem. JOINing a one-to-many relationship and then running SUM() on it is a classic mistake. If a customer has 10 orders and you join to a promotions table where they have 3 active promos, your SUM suddenly operates on 30 rows instead of 10.

join_performance_patterns.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
-- WRONG: naive join causes row multiplication before aggregation
SELECT
    c.full_name,
    SUM(o.order_total) AS total_spent  -- DOUBLES because of 2 promo rows!
FROM customers c
INNER JOIN orders o   ON c.customer_id = o.customer_id
INNER JOIN promos p   ON c.customer_id = p.customer_id
WHERE c.customer_id = 1
GROUP BY c.full_name;

-- CORRECT: pre-aggregate in CTEs before joining
WITH aggregated_orders AS (
    SELECT customer_id, SUM(order_total) AS total_spent, COUNT(*) AS order_count
    FROM orders GROUP BY customer_id
),
aggregated_promos AS (
    SELECT customer_id, COUNT(*) AS active_promos
    FROM promos GROUP BY customer_id
)
SELECT c.full_name, ao.total_spent, ao.order_count, COALESCE(ap.active_promos, 0)
FROM customers c
INNER JOIN aggregated_orders ao ON c.customer_id = ao.customer_id
LEFT  JOIN aggregated_promos ap ON c.customer_id = ap.customer_id;

-- Always index your foreign key JOIN columns
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
CREATE INDEX idx_promos_customer_id ON promos(customer_id);
Output
-- WRONG: Alice Nguyen | 149.96 (should be 74.98 — doubled by 2 promo rows)
-- CORRECT: Alice Nguyen | 74.98 | 2 orders | 2 promos
Watch Out:
The row explosion bug is insidious because the query runs without errors and returns results that look plausible. If your SUM is suspiciously round or exactly 2x what you expect, check whether you're joining multiple one-to-many tables before aggregating. Always aggregate first in a CTE, then join the summarised result.
Production Insight
Row explosion from multi-table JOINs produces wrong numbers that look completely reasonable — no error fires.
Foreign key columns are the most commonly un-indexed columns in production databases; always check with EXPLAIN.
For JOIN performance deep dives, see SQL Indexes and SQL EXPLAIN and Execution Plans.
Key Takeaway
Index every JOIN column — primary keys are auto-indexed, foreign keys often aren't.
Joining on functions (LOWER, CAST) kills index usage — normalise data at write time.
Aggregate in CTEs before joining one-to-many tables — never let row explosion reach your SUM.
● Production incidentPOST-MORTEMseverity: high

Silent Revenue Under-Reporting from a Wrong JOIN Type

Symptom
Monthly revenue figures in the analytics dashboard didn't match the payments team's manual reconciliation. The gap was 15-20% and fluctuating. Business stakeholders were making pricing decisions on stale data.
Assumption
The team assumed INNER JOIN was correct because 'we only care about completed transactions.' Every order should have a customer, so INNER JOIN seemed safe.
Root cause
A batch import process occasionally created orders with a temporary customer_id that was later reassigned. The INNER JOIN silently dropped ~800 orders per month whose customer_id no longer matched any row in the customers table. These weren't data errors — they were legitimate revenue that the JOIN was excluding.
Fix
Switched the customer JOIN to LEFT JOIN and added a WHERE clause for completed orders. Added a data quality alert: if LEFT JOIN produces NULL customer_id on a completed order, trigger an alert rather than silently excluding it.
Key lesson
  • INNER JOIN is never 'safe by default' — it makes a hard assumption that referential integrity is perfect
  • Always cross-check JOIN results against a raw COUNT(*) on the source table
  • Missing rows don't produce errors — they produce wrong numbers that look plausible
Production debug guideSymptom-to-action for the most common JOIN failures in production5 entries
Symptom · 01
Result row count is lower than expected — some records are missing
Fix
Compare with SELECT COUNT(*) on the left table alone. If counts differ, an INNER JOIN is dropping unmatched rows. Switch to LEFT JOIN and check for NULLs on the right side to understand what's missing.
Symptom · 02
SUM() or COUNT() returns values that are multiples of the expected answer
Fix
You have row explosion from joining multiple one-to-many tables before aggregating. Use EXPLAIN to see row estimates. Pre-aggregate each dimension table in a CTE first, then join the summarised results.
Symptom · 03
LEFT JOIN returns the same rows as INNER JOIN — NULLs never appear
Fix
You have a WHERE clause filtering on a right-table column (e.g. WHERE orders.status = 'active'). This eliminates NULL rows. Move the condition to the ON clause: ON a.id = b.id AND b.status = 'active'.
Symptom · 04
JOIN query runs fine on small datasets but times out in production
Fix
Run EXPLAIN ANALYZE. Look for Seq Scan on the larger table. Add an index on the JOIN column: CREATE INDEX idx_orders_customer_id ON orders(customer_id). Foreign key columns are the most commonly forgotten index targets.
Symptom · 05
FULL OUTER JOIN syntax error in MySQL
Fix
MySQL does not support FULL OUTER JOIN natively. Emulate it with: SELECT ... FROM a LEFT JOIN b ON ... UNION SELECT ... FROM a RIGHT JOIN b ON ...
JOIN TypeRows From LeftRows From RightNULLs?Best Used For
INNER JOINMatched onlyMatched onlyNoConfirmed relationships — orders with customers
LEFT JOINAll rowsMatched onlyRight sideOptional relationships — customers with or without orders
RIGHT JOINMatched onlyAll rowsLeft sideSame as LEFT JOIN with tables swapped — rarely preferred
FULL OUTER JOINAll rowsAll rowsBoth sidesData reconciliation — gaps between two independent datasets
CROSS JOINAll rowsAll rows (every combo)NoGenerating combinations — test data, calendars

Key takeaways

1
INNER JOIN silently drops unmatched rows
wrong for optional relationships and dangerous in finance or audit queries where missing rows mean wrong totals.
2
LEFT JOIN is your default workhorse for optional relationships
NULLs on the right are meaningful signal you can query against with the anti-join pattern.
3
Filtering the right table in WHERE after a LEFT JOIN converts it to an INNER JOIN
right-table conditions belong in the ON clause.
4
Joining multiple one-to-many tables before aggregating causes row explosion
always pre-aggregate in CTEs first.

Common mistakes to avoid

3 patterns
×

Using INNER JOIN when LEFT JOIN is needed

Symptom
Reports show fewer rows than expected and records quietly disappear with no error — your COUNT is lower than a direct SELECT COUNT(*) on the source table
Fix
Ask: do I need rows even when there's no match on the right side? If yes, use LEFT JOIN. Validate join result counts against SELECT COUNT(*) on the left table alone after changing any JOIN type.
×

Filtering a LEFT JOIN's right table in the WHERE clause

Symptom
Your LEFT JOIN behaves exactly like an INNER JOIN — NULL rows for unmatched records never appear, defeating the whole purpose
Fix
Move right-table conditions from WHERE into the ON clause: ON a.id = b.id AND b.status = 'active'. WHERE filters run after the JOIN and eliminate NULLs, converting LEFT JOIN to INNER JOIN silently.
×

Aggregating across multiple one-to-many JOINs in a single query

Symptom
SUM(), COUNT() or AVG() returns values that are exact multiples (2x, 3x) of the correct answer — query runs fine and returns plausible-looking results
Fix
Pre-aggregate each one-to-many table independently in a CTE or subquery before joining the results together. Never let row multiplication happen before your aggregate function runs.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01JUNIOR
What's the difference between a LEFT JOIN and an INNER JOIN, and how do ...
Q02SENIOR
If you do a LEFT JOIN and then filter on a column from the right table i...
Q03SENIOR
Your SUM of order totals is exactly double the expected value. How do yo...
Q01 of 03JUNIOR

What's the difference between a LEFT JOIN and an INNER JOIN, and how do you decide which to use?

ANSWER
INNER JOIN returns only rows with matching values in both tables — any row without a match on either side is excluded. LEFT JOIN returns every row from the left table and fills in NULLs where no right-side match exists. Decision rule: if the relationship is mandatory (every order must have a customer), INNER JOIN is appropriate. If the right-side data is optional (you want all customers regardless of whether they have orders), use LEFT JOIN. In practice, LEFT JOIN is the safer default for reporting queries because it makes missing data visible rather than hiding it.
FAQ · 3 QUESTIONS

Frequently Asked Questions

01
What is the difference between INNER JOIN and LEFT JOIN in SQL?
02
Does the order of tables in a JOIN matter?
03
Why does my LEFT JOIN give the same result as an INNER JOIN?
🔥

That's SQL Basics. Mark it forged?

4 min read · try the examples if you haven't

Previous
SQL ORDER BY and LIMIT
7 / 16 · SQL Basics
Next
SQL GROUP BY and HAVING