Intermediate 8 min · March 05, 2026

Database Relationships

ON DELETE CASCADE — The $2.3M Revenue Discrepancy

Q: What is the difference between a foreign key and a relationship in a database?

A relationship is the logical concept — the rule that a customer can have many orders. A foreign key is the physical mechanism that enforces that rule in the database. The foreign key is a column in the child table that must match a value in the parent table's primary key, and the database engine rejects any insert or update that would violate that link.

Q: Can a table have more than one foreign key?

Absolutely — and it's common. A junction table in a Many-to-Many relationship has at least two foreign keys by definition. An orders table might have a foreign key to customers and another to shipping_addresses. Each FK represents an independent relationship to a different parent table, and they don't interfere with each other.

Q: When should I use a self-referencing relationship instead of a separate parent table?

Use a self-referencing table when the parent and child are fundamentally the same type of thing — an employee managing other employees, a category containing subcategories, a comment replying to another comment. If the parent and child are genuinely different entities (like a manager role vs a developer role with different attributes), a separate table is cleaner. The test is: do both levels share the exact same columns and meaning?

A cleanup script deleted 340 customers and silently removed 14,000 orders via CASCADE.

Naren Founder & Principal Engineer

20+ years shipping high-throughput database systems. Drawn from code that ran under real load.

✓ Production tested · Last updated July 19, 2026 · 2,466 articles, all by Naren

Before you start (⏱ 25 min)

✓ Solid grasp of fundamentals
✓ Comfortable reading code examples
✓ Basic production concepts


⚡Quick Answer

Database relationships are rules describing how rows in one table connect to rows in another
One-to-Many (1:N): one parent, many children — FK lives on the child table
Many-to-Many (M:N): both sides connect to many — always requires a junction table
One-to-One (1:1): rare — use only for sparse data, security isolation, or proven query performance
Always declare ON DELETE behaviour explicitly — RESTRICT is the safe default
Missing UNIQUE on a junction table composite key silently corrupts COUNT and aggregation queries

✦ Definition · ~90s read

What Are Database Relationships?

Database relationships define how tables connect to each other through shared keys, enforcing data integrity and preventing the orphaned records that silently corrupt analytics. Without explicit relationships, you're essentially managing separate CSV files — any application bug or manual data change can create mismatches that compound into revenue discrepancies like the $2.3M example in this article.


Foreign keys are the mechanism that enforces these relationships at the database level, rejecting any insert, update, or delete that would break referential integrity.

The four relationship types map directly to real-world data patterns. One-to-many (e.g., one customer to many orders) covers roughly 80% of business relationships and is implemented by adding a foreign key column to the 'many' side. Many-to-many (e.g., products to categories) always requires a junction table with two foreign keys — never store comma-separated IDs in a single column, as that breaks queryability and indexing.

One-to-one is rare but useful for schema partitioning (splitting large tables for performance or security), while self-referencing relationships (e.g., employee to manager) use a single foreign key pointing back to the same table's primary key.

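When you skip foreign keys for 'flexibility' or performance, you trade data integrity for a ticking time bomb. With a foreign key in place, the ON DELETE clause decides what happens to children when a parent is deleted: CASCADE removes them automatically, while RESTRICT (or the default NO ACTION) rejects the delete until the children are gone. With no foreign key at all, the parent delete simply succeeds and leaves orphaned rows that silently inflate counts and skew aggregations.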

The $2.3M discrepancy in this article originated from exactly this: a missing cascade on a one-to-many relationship between orders and line items, causing stale line items to double-count revenue after a bulk cleanup operation.

Plain-English First

Think of a library. One library card belongs to exactly one person — but that person can borrow many books, and each book can be borrowed by many different people over time. That's the whole concept of database relationships: it's just a set of rules describing how rows in one table connect to rows in another. Get those rules right and your data stays clean and consistent forever. Get them wrong and you'll be untangling duplicate rows at 2am.

Every real application — a Netflix, an Airbnb, a humble todo list — is powered by tables that talk to each other. The moment you store a user's orders, a product's reviews, or a student's enrolled courses, you're dealing with database relationships. They're not optional theory; they're the skeleton your entire data model is built on.

A wrong relationship can corrupt data silently, make queries nightmarishly slow, or force you to rewrite half your schema six months into production. The problem they solve is data redundancy and integrity — without relationships, you'd copy a customer's name and address into every single order row.

By the end of this article you'll be able to identify which relationship type belongs in a given scenario, write the SQL to implement each one correctly with foreign keys, design a clean junction table for many-to-many links, and avoid the three classic mistakes that trip up even experienced developers.

Why Database Relationships Are Not Optional

A database relationship is a logical link between two tables, enforced by foreign keys that guarantee referential integrity. The core mechanic: a column in one table references the primary key of another, preventing orphaned rows and ensuring every child has a valid parent. Without relationships, your data is just a collection of unrelated spreadsheets.

In practice, relationships define cardinality — one-to-one, one-to-many, many-to-many — and dictate how deletions propagate. ON DELETE CASCADE is one such rule: when a parent row is deleted, all dependent child rows are automatically removed. This is not magic; it's a declarative constraint executed at the database level, bypassing application code entirely. The key property: it's atomic and consistent — no partial deletions, no race conditions.

Use relationships and cascade rules when child data has no meaning without its parent — orders without customers, comments without posts. In real systems, failing to define them leads to silent data corruption: dangling references that crash joins, inflate counts, and produce phantom revenue. A $2.3M discrepancy often starts with a missing foreign key.

⚠ Cascade Is Not a Default

ON DELETE CASCADE is a sharp tool — use it only when child rows are truly owned by the parent, not when they're merely associated.
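To make the difference concrete, here is a minimal sketch that runs the same parent delete under CASCADE and under RESTRICT. It uses Python's built-in sqlite3 module with an in-memory database purely so it can be run anywhere; the `build` helper and the tiny two-table schema are assumptions for illustration, and SQLite needs `PRAGMA foreign_keys = ON` before it enforces any of this.

```python
import sqlite3

def build(on_delete):
    """Create a tiny customers/orders schema with the given ON DELETE rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
    conn.executescript(f"""
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            full_name   TEXT NOT NULL
        );
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL
                REFERENCES customers(customer_id) ON DELETE {on_delete}
        );
        INSERT INTO customers VALUES (1, 'Sarah Mitchell');
        INSERT INTO orders VALUES (10, 1), (11, 1);
    """)
    return conn

# CASCADE: the parent delete succeeds and takes both orders with it.
cascade = build("CASCADE")
cascade.execute("DELETE FROM customers WHERE customer_id = 1")
orders_after_cascade = cascade.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# RESTRICT: the same delete is rejected and the orders survive.
restrict = build("RESTRICT")
try:
    restrict.execute("DELETE FROM customers WHERE customer_id = 1")
    restrict_blocked = False
except sqlite3.IntegrityError:
    restrict_blocked = True
orders_after_restrict = restrict.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

print(orders_after_cascade, restrict_blocked, orders_after_restrict)  # 0 True 2
```

The CASCADE branch produces no error and no warning, which is why a cleanup script can remove far more rows than its author intended.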

📊 Production Insight

A billing system deletes a customer without cascading to invoices — invoices remain, summing to $2.3M in phantom revenue.

Symptoms: balance sheets don't reconcile, audit logs show orphaned invoice rows with null customer_id.

Rule: if a child row's existence depends on the parent, enforce cascade at the schema level — never rely on application logic to clean up.

🎯 Key Takeaway

Foreign keys are the only way to guarantee referential integrity at the database level.

ON DELETE CASCADE is atomic — no application code can match its consistency.

Missing relationships cause silent data corruption that compounds over time, not immediate failures.

thecodeforge.io

Database Relationships

One-to-Many: The Relationship You'll Use 80% of the Time

A One-to-Many (1:N) relationship means one row in Table A can be associated with many rows in Table B, but each row in Table B points back to exactly one row in Table A. Classic examples: one customer → many orders, one blog post → many comments, one department → many employees.

The pattern is always the same: the 'many' side holds the foreign key. An order row holds a customer_id. A comment row holds a post_id. You never put a list of IDs inside the 'one' side. Even in databases that offer array or JSON columns (PostgreSQL has both), the elements of such a list cannot carry a foreign key constraint and are awkward to join, so if you find yourself wanting one, that's a design smell.

Why does this matter so much? Because it's the primary tool for eliminating redundancy. You store the customer's name and email exactly once in the customers table. Every order just references that one row. Update the email in one place and every order instantly reflects it. Referential integrity is the guarantee underneath: because you declared a FOREIGN KEY constraint, the database ensures the customer_id in every order actually exists in the customers table.
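Both halves of that claim can be checked in a few lines. The sketch below is an illustration under assumptions, not the article's own listing: it uses Python's sqlite3 module, SQLite column types, and a trimmed-down version of the customers/orders schema. It updates the email once and reads it back through every order, then tries to insert an order for a customer that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to enforce FKs
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        full_name   TEXT NOT NULL,
        email       TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        order_id     INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customers(customer_id),
        total_amount NUMERIC NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Sarah Mitchell', 'sarah@example.com');
    INSERT INTO orders VALUES (100, 1, 129.99), (101, 1, 49.00);
""")

# The email lives in one row, so a single UPDATE is visible through every order.
conn.execute("UPDATE customers SET email = 'sarah.m@example.com' WHERE customer_id = 1")
emails = [row[0] for row in conn.execute(
    "SELECT c.email FROM orders o JOIN customers c ON c.customer_id = o.customer_id"
)]

# customer_id 999 does not exist, so the constraint rejects the insert.
try:
    conn.execute("INSERT INTO orders VALUES (102, 999, 10.00)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(emails, orphan_rejected)
```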

one_to_many_orders.sqlSQL

-- ─────────────────────────────────────────────────────────
-- SCENARIO: An e-commerce store where customers place orders.
-- One customer can place many orders.
-- The foreign key lives on the 'many' side (orders table).
-- ─────────────────────────────────────────────────────────

-- Step 1: Create the 'one' side first (parent table)
CREATE TABLE customers (
    customer_id   INT           PRIMARY KEY AUTO_INCREMENT,
    full_name     VARCHAR(100)  NOT NULL,
    email         VARCHAR(150)  NOT NULL UNIQUE
);

-- Step 2: Create the 'many' side (child table)
-- Notice: customer_id here is a FOREIGN KEY pointing to the parent
CREATE TABLE orders (
    order_id      INT           PRIMARY KEY AUTO_INCREMENT,
    customer_id   INT           NOT NULL,                    -- FK column
    order_date    DATE          NOT NULL,
    total_amount  DECIMAL(10,2) NOT NULL,
    CONSTRAINT fk_order_customer
        FOREIGN KEY (customer_id)
        REFERENCES customers(customer_id)
        ON DELETE RESTRICT   -- prevent deleting a customer who has orders
        ON UPDATE CASCADE    -- if customer_id changes, propagate it
);

-- Step 3: Seed some data
INSERT INTO customers (full_name, email) VALUES
    ('Sarah Mitchell', 'sarah@example.com'),
    ('James Okafor',   'james@example.com');

INSERT INTO orders (customer_id, order_date, total_amount) VALUES
    (1, '2024-03-01', 129.99),  -- Sarah's first order
    (1, '2024-04-15', 49.00),   -- Sarah's second order
    (2, '2024-04-20', 310.50);  -- James's only order

-- Step 4: Fetch every customer alongside their order count
-- This is the most common query pattern for 1:N relationships
SELECT
    c.full_name,
    COUNT(o.order_id)        AS total_orders,
    SUM(o.total_amount)      AS lifetime_value
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id  -- LEFT JOIN keeps customers with 0 orders
GROUP BY c.customer_id, c.full_name
ORDER BY lifetime_value DESC;

Output

full_name      | total_orders | lifetime_value
---------------+--------------+---------------
James Okafor   |            1 |         310.50
Sarah Mitchell |            2 |         178.99

⚠ Watch Out: Always Declare ON DELETE Behaviour

If you omit ON DELETE, the default is NO ACTION, which blocks the delete much like RESTRICT (MySQL's InnoDB treats the two identically; PostgreSQL differs only in that a NO ACTION check can be deferred to the end of the transaction). Don't rely on the default. Be explicit. Choosing ON DELETE CASCADE on orders means deleting a customer silently wipes all their orders. That's almost never what you want in a financial system. Use RESTRICT to block the delete and force the application to handle cleanup deliberately.

📊 Production Insight

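Check that every FK column is indexed. MySQL's InnoDB creates an index on the referencing column automatically if none exists, but PostgreSQL does not.

Without an index on orders.customer_id, JOINs on that column and every DELETE on the parent (which must look for referencing rows) can fall back to a full scan of orders.

Rule: on PostgreSQL, CREATE INDEX on every foreign key column immediately after table creation; on MySQL, verify that the automatically created index matches your query patterns.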
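One way to see the effect is to compare query plans before and after the index exists. The following is a rough sketch using Python's sqlite3 module; SQLite, like PostgreSQL, does not index the referencing column on its own. The `plan` helper and the index name `idx_orders_customer_id` are assumptions made for this example, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        full_name   TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id     INTEGER PRIMARY KEY,
        customer_id  INTEGER NOT NULL REFERENCES customers(customer_id),
        total_amount NUMERIC NOT NULL
    );
""")

def plan(sql):
    """Return the human-readable 'detail' column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

lookup = "SELECT order_id, total_amount FROM orders WHERE customer_id = 1"

plan_before = plan(lookup)  # no usable index yet, so the plan is a scan of orders
conn.execute("CREATE INDEX idx_orders_customer_id ON orders(customer_id)")
plan_after = plan(lookup)   # the plan now searches via idx_orders_customer_id

print("before:", plan_before)
print("after: ", plan_after)
```

On PostgreSQL or MySQL the equivalent check is EXPLAIN on the same lookup; the point is to confirm the plan rather than assume the index is there.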

🎯 Key Takeaway

The foreign key always lives on the 'many' side — if you're unsure which table gets it, ask which side has many rows per relationship.

Make sure every FK column is indexed: the constraint guarantees integrity, not query performance, and not every engine creates the index for you.

ON DELETE RESTRICT is the safe default; CASCADE is the exception, not the rule.

When to Use One-to-Many

If: One entity clearly owns many sub-entities (customer → orders)
Use: 1:N, with the FK on the child table

If: Each child belongs to exactly one parent
Use: 1:N; the asymmetric ownership is the signal

If: You're tempted to store a comma-separated list of IDs in one column
Use: A proper child table with one row per relationship

If: Both sides can have multiple connections to the other
Use: Many-to-Many with a junction table instead of 1:N

Many-to-Many: Why You Always Need a Junction Table

A Many-to-Many (M:N) relationship means rows on both sides can relate to multiple rows on the other side. Students enrol in many courses; each course has many students. A product appears in many orders; each order contains many products. Doctors treat many patients; patients see many doctors.

Here's the critical insight: you cannot model M:N directly between two tables. There's no column you can add to students or courses that handles multiple associations cleanly. What you need is a third table — a junction table (also called a bridge or associative table) — that turns the M:N into two separate 1:N relationships.

The junction table holds the foreign keys from both sides and its own primary key. But here's where it gets interesting: the junction table often carries its own meaningful data. An enrolment isn't just a link — it has an enrolment date, a grade, a status. That extra data is what makes the junction table a first-class entity in your schema, not just plumbing. When you recognize that, your design becomes far more expressive and your queries become cleaner.

many_to_many_enrolments.sqlSQL

-- ─────────────────────────────────────────────────────────
-- SCENARIO: A university system.
-- Students enrol in many courses; each course has many students.
-- The junction table (enrolments) turns M:N into two 1:N links.
-- ─────────────────────────────────────────────────────────

CREATE TABLE students (
    student_id    INT          PRIMARY KEY AUTO_INCREMENT,
    full_name     VARCHAR(100) NOT NULL,
    email         VARCHAR(150) NOT NULL UNIQUE
);

CREATE TABLE courses (
    course_id     INT          PRIMARY KEY AUTO_INCREMENT,
    course_code   VARCHAR(10)  NOT NULL UNIQUE,  -- e.g. 'CS101'
    course_title  VARCHAR(200) NOT NULL
);

-- Junction table: each row represents ONE student enrolled in ONE course
CREATE TABLE enrolments (
    enrolment_id    INT         PRIMARY KEY AUTO_INCREMENT,
    student_id      INT         NOT NULL,
    course_id       INT         NOT NULL,
    enrolled_on     DATE        NOT NULL,
    final_grade     CHAR(2),                    -- NULL until the course ends

    -- Composite UNIQUE ensures a student can't enrol in the same course twice
    UNIQUE KEY uq_student_course (student_id, course_id),

    CONSTRAINT fk_enrolment_student
        FOREIGN KEY (student_id) REFERENCES students(student_id)
        ON DELETE CASCADE,   -- remove enrolments if student is deleted

    CONSTRAINT fk_enrolment_course
        FOREIGN KEY (course_id) REFERENCES courses(course_id)
        ON DELETE RESTRICT   -- block deleting a course that has enrolments
);

-- Seed data
INSERT INTO students (full_name, email) VALUES
    ('Priya Nair',    'priya@uni.edu'),
    ('Tom Bergmann',  'tom@uni.edu'),
    ('Aisha Mensah',  'aisha@uni.edu');

INSERT INTO courses (course_code, course_title) VALUES
    ('CS101', 'Introduction to Programming'),
    ('DB201', 'Database Design Fundamentals'),
    ('ML301', 'Machine Learning Basics');

INSERT INTO enrolments (student_id, course_id, enrolled_on) VALUES
    (1, 1, '2024-01-10'),  -- Priya in CS101
    (1, 2, '2024-01-10'),  -- Priya in DB201
    (2, 1, '2024-01-11'),  -- Tom in CS101
    (2, 3, '2024-01-11'),  -- Tom in ML301
    (3, 2, '2024-01-12'),  -- Aisha in DB201
    (3, 3, '2024-01-12');  -- Aisha in ML301

-- Query 1: Which courses is Priya enrolled in?
SELECT
    s.full_name,
    c.course_code,
    c.course_title,
    e.enrolled_on
FROM enrolments e
JOIN students s ON e.student_id = s.student_id
JOIN courses  c ON e.course_id  = c.course_id
WHERE s.full_name = 'Priya Nair'
ORDER BY e.enrolled_on;

-- Query 2: How many students are in each course?
SELECT
    c.course_code,
    c.course_title,
    COUNT(e.student_id) AS student_count
FROM courses c
LEFT JOIN enrolments e ON c.course_id = e.course_id
GROUP BY c.course_id, c.course_code, c.course_title
ORDER BY student_count DESC;

Output

-- Query 1 result:
full_name  | course_code | course_title                 | enrolled_on
-----------+-------------+------------------------------+------------
Priya Nair | CS101       | Introduction to Programming  | 2024-01-10
Priya Nair | DB201       | Database Design Fundamentals | 2024-01-10

-- Query 2 result:
course_code | course_title                 | student_count
------------+------------------------------+--------------
CS101       | Introduction to Programming  |             2
DB201       | Database Design Fundamentals |             2
ML301       | Machine Learning Basics      |             2

💡Pro Tip: Name Your Junction Table After the Relationship, Not the Tables

Don't name it students_courses — name it enrolments. Why? Because it IS an enrolment, not just a link. Naming it after the business concept signals to every future developer that this table carries meaning and may store extra attributes. It also makes your queries read like English: 'SELECT FROM enrolments' tells a story; 'SELECT FROM students_courses' does not.

📊 Production Insight

Junction tables grow faster than either parent — if each user has 50 favourites and you have 1M users, the favourites table has 50M rows.

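Index both lookup directions. A composite UNIQUE or primary key on (student_id, course_id) already serves 'find all courses for this student'; the reverse lookup needs an index that starts with course_id (MySQL's InnoDB creates one for the foreign key; PostgreSQL does not).

Without an index whose leading column matches the filter, a 'find all students in this course' query scans the entire junction table.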

🎯 Key Takeaway

Many-to-Many cannot exist without a junction table — it is not optional, it is the only correct implementation.

The junction table is a real entity — name it after the business concept and expect it to carry meaningful attributes.

Missing UNIQUE on the composite key is the #1 source of duplicate data bugs in M:N schemas.
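The duplicate-row problem is easy to reproduce. Below is a small sketch in Python's sqlite3 that loads the same three enrolment rows into two junction tables, one without the composite UNIQUE and one with it. The table names `enrolments_loose` and `enrolments_strict` are made up for the comparison, and the parent tables are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Junction table WITHOUT a uniqueness rule on the pair
    CREATE TABLE enrolments_loose (
        student_id INTEGER NOT NULL,
        course_id  INTEGER NOT NULL
    );
    -- Junction table WITH the composite UNIQUE constraint
    CREATE TABLE enrolments_strict (
        student_id INTEGER NOT NULL,
        course_id  INTEGER NOT NULL,
        UNIQUE (student_id, course_id)
    );
""")

rows = [(1, 1), (2, 1), (1, 1)]  # the third row repeats student 1 in course 1

# Loose table: the duplicate is accepted and inflates the head count.
conn.executemany("INSERT INTO enrolments_loose VALUES (?, ?)", rows)
loose_count = conn.execute(
    "SELECT COUNT(*) FROM enrolments_loose WHERE course_id = 1"
).fetchone()[0]

# Strict table: the duplicate is rejected at insert time.
rejected = 0
for row in rows:
    try:
        conn.execute("INSERT INTO enrolments_strict VALUES (?, ?)", row)
    except sqlite3.IntegrityError:
        rejected += 1
strict_count = conn.execute(
    "SELECT COUNT(*) FROM enrolments_strict WHERE course_id = 1"
).fetchone()[0]

print(loose_count, strict_count, rejected)  # 3 2 1
```

Course 1 has two students, but the loose table reports three. Nothing fails; the number is simply wrong, which is what makes this bug hard to spot.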

When to Use Many-to-Many

If: Both sides can independently have multiple connections
Use: M:N, via a junction table with a composite UNIQUE key

If: The relationship itself carries business data (grade, quantity, role)
Use: M:N; the junction table becomes a first-class entity

If: You're considering a comma-separated column to store multiple IDs
Use: A junction table; comma-separated IDs break JOINs and indexes

If: One side always has exactly one parent (each order belongs to one customer)
Use: One-to-Many with the FK on the child table instead of M:N


One-to-One: Rare but Powerful for Schema Partitioning

A One-to-One (1:1) relationship means each row in Table A corresponds to at most one row in Table B, and vice versa. It's the least common relationship type — and beginners often ask: why not just put all those columns in one table?

The answer is: sometimes you should. But there are three legitimate reasons to split into a 1:1 relationship. First, optional data: a users table might have profile details (bio, avatar, website) that only some users ever fill in. Keeping sparse, optional columns in a separate user_profiles table avoids storing NULL across millions of rows. Second, security partitioning: store sensitive data like password hashes or payment tokens in a separate table with tighter access controls. Third, performance: if you have a table with 50 columns and some queries only ever need 5 of them, splitting into a 'hot' and 'cold' table dramatically reduces the I/O per query.

The implementation is a foreign key on the dependent table that also has a UNIQUE constraint, enforcing that no two rows can point to the same parent.

one_to_one_user_profiles.sqlSQL

-- ─────────────────────────────────────────────────────────
-- SCENARIO: A SaaS app separating core login data from
-- optional profile details. Most queries only touch 'users'.
-- Profile data is only loaded when the profile page is viewed.
-- ─────────────────────────────────────────────────────────

-- Core login data — accessed on EVERY authenticated request
CREATE TABLE users (
    user_id       INT          PRIMARY KEY AUTO_INCREMENT,
    username      VARCHAR(50)  NOT NULL UNIQUE,
    email         VARCHAR(150) NOT NULL UNIQUE,
    password_hash VARCHAR(255) NOT NULL,
    created_at    TIMESTAMP    DEFAULT CURRENT_TIMESTAMP
);

-- Extended profile — only loaded when a user visits their profile page
CREATE TABLE user_profiles (
    profile_id    INT           PRIMARY KEY AUTO_INCREMENT,
    user_id       INT           NOT NULL UNIQUE,  -- UNIQUE enforces the 1:1
    display_name  VARCHAR(100),
    bio           TEXT,
    avatar_url    VARCHAR(500),
    website_url   VARCHAR(500),
    location      VARCHAR(100),

    CONSTRAINT fk_profile_user
        FOREIGN KEY (user_id) REFERENCES users(user_id)
        ON DELETE CASCADE  -- delete the profile if the user account is removed
);

-- Seed data: only some users have profiles
INSERT INTO users (username, email, password_hash) VALUES
    ('sarah_m',  'sarah@example.com', '$2b$12$hashed...'),
    ('james_o',  'james@example.com', '$2b$12$hashed...'),
    ('priya_n',  'priya@example.com', '$2b$12$hashed...');

-- Only Sarah and Priya have filled in their profiles
INSERT INTO user_profiles (user_id, display_name, bio, location) VALUES
    (1, 'Sarah Mitchell', 'Software engineer & coffee enthusiast.', 'Dublin, Ireland'),
    (3, 'Priya Nair',     'ML researcher. Writes about data.',       'Bangalore, India');

-- Fetch a user's profile — use LEFT JOIN so users without a profile still appear
SELECT
    u.username,
    u.email,
    COALESCE(p.display_name, u.username) AS display_name,  -- fallback to username
    p.bio,
    p.location
FROM users u
LEFT JOIN user_profiles p ON u.user_id = p.user_id
ORDER BY u.user_id;

Output

username | email             | display_name   | bio                                    | location
---------+-------------------+----------------+----------------------------------------+-----------------
sarah_m  | sarah@example.com | Sarah Mitchell | Software engineer & coffee enthusiast. | Dublin, Ireland
james_o  | james@example.com | james_o        | NULL                                   | NULL
priya_n  | priya@example.com | Priya Nair     | ML researcher. Writes about data.      | Bangalore, India

🔥Interview Gold: Why Not Just Use One Table?

This is a classic interview question. The correct answer is: you often should use one table. Split into 1:1 only when you have a clear reason — optional sparse columns, security isolation, or query performance partitioning. Splitting without a reason adds JOIN complexity for zero benefit. Being able to articulate this trade-off shows senior-level thinking.

📊 Production Insight

Splitting a table 1:1 without a measurable performance problem adds a mandatory JOIN to every query that needs both sets of columns.

Measure first: run EXPLAIN on your top 10 queries against the merged table.

If none scan excessive columns or show high I/O, keep it as one table — premature splitting is premature optimization.

🎯 Key Takeaway

One-to-One splits are a deliberate performance or security decision, not a default.

Merge into one table unless you have sparse columns, access control requirements, or a proven query performance problem.

The UNIQUE constraint on the FK column is what enforces the 1:1 — without it, you accidentally have a 1:N.
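A quick way to convince yourself of that last point is to try inserting a second profile for the same user. This sketch uses Python's sqlite3 with a cut-down version of the users/user_profiles schema (the column list is shortened and SQLite types are assumed).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY,
        username TEXT NOT NULL UNIQUE
    );
    CREATE TABLE user_profiles (
        profile_id   INTEGER PRIMARY KEY,
        user_id      INTEGER NOT NULL UNIQUE   -- UNIQUE is what makes this 1:1
            REFERENCES users(user_id) ON DELETE CASCADE,
        display_name TEXT
    );
    INSERT INTO users VALUES (1, 'sarah_m');
    INSERT INTO user_profiles (user_id, display_name) VALUES (1, 'Sarah Mitchell');
""")

# A second profile for user 1 violates UNIQUE(user_id) and is rejected.
try:
    conn.execute(
        "INSERT INTO user_profiles (user_id, display_name) VALUES (1, 'Second profile')"
    )
    second_profile_rejected = False
except sqlite3.IntegrityError:
    second_profile_rejected = True

profile_count = conn.execute(
    "SELECT COUNT(*) FROM user_profiles WHERE user_id = 1"
).fetchone()[0]

print(second_profile_rejected, profile_count)  # True 1
```

Drop the UNIQUE keyword from `user_id` and the second insert succeeds, quietly turning the design into a one-to-many.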

When to Use One-to-One

If: Columns are optional and sparse (profile filled by <20% of users)
Use: A 1:1 split, to avoid NULL bloat across millions of rows

If: Columns contain sensitive data (password hashes, payment tokens)
Use: A 1:1 split, isolating them in a table with tighter GRANT permissions

If: Table has 50+ columns but queries only touch 5
Use: A 1:1 split into hot/cold tables, which reduces I/O per query

If: All columns are frequently queried together and none are sparse
Use: One table; splitting adds JOIN cost with no benefit

Self-Referencing Relationships: When a Table Points to Itself

A self-referencing (or recursive) relationship is when a row in a table has a foreign key pointing to another row in the same table. It sounds strange until you see the use cases: an employees table where each employee has a manager_id that points to another employee, a categories table where subcategories have a parent_category_id, or a comments table with threaded replies.

This is one of those patterns that feels clever the first time you see it, but it comes with tradeoffs. The big advantage is that you don't need a separate managers table or a separate categories table for each level of hierarchy — the structure is infinitely deep by design. The tradeoff is that querying hierarchical data in SQL requires recursive Common Table Expressions (CTEs), which not all developers are comfortable writing.

Knowing this pattern exists — and knowing when it's cleaner than a separate table — is a mark of a developer who thinks about schema design holistically rather than just creating tables reactively.

self_referencing_employees.sqlSQL

-- ─────────────────────────────────────────────────────────
-- SCENARIO: A company org chart stored in a single table.
-- Each employee can have a manager, who is also an employee.
-- The CEO has no manager, so manager_id is NULL at the top.
-- ─────────────────────────────────────────────────────────

CREATE TABLE employees (
    employee_id   INT          PRIMARY KEY AUTO_INCREMENT,
    full_name     VARCHAR(100) NOT NULL,
    job_title     VARCHAR(100) NOT NULL,
    manager_id    INT          NULL,  -- NULL means this person is the top of the chain

    CONSTRAINT fk_employee_manager
        FOREIGN KEY (manager_id)
        REFERENCES employees(employee_id)  -- points to the SAME table
        ON DELETE SET NULL  -- if a manager is removed, reports become unmanaged
);

-- Build an org chart: CEO → VP → Managers → Developers
INSERT INTO employees (full_name, job_title, manager_id) VALUES
    ('Linda Forsythe',  'CEO',                NULL),   -- id=1, no manager
    ('Carlos Rivera',   'VP of Engineering',  1),      -- id=2, reports to Linda
    ('Aiko Tanaka',     'VP of Product',      1),      -- id=3, reports to Linda
    ('Ben Hughes',      'Engineering Manager',2),      -- id=4, reports to Carlos
    ('Fatima Al-Rashid','Senior Developer',   4),      -- id=5, reports to Ben
    ('Noah Eriksson',   'Developer',          4);      -- id=6, reports to Ben

-- Recursive CTE to walk the full org chart top-down
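-- Recursive CTEs exist in PostgreSQL, MySQL 8+, SQLite 3.8.3+, and SQL Server (which omits the RECURSIVE keyword);
-- this script uses MySQL syntax (AUTO_INCREMENT, CONCAT, REPEAT)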
WITH RECURSIVE org_chart AS (

    -- Anchor: start with the CEO (no manager)
    SELECT
        employee_id,
        full_name,
        job_title,
        manager_id,
        0 AS depth,                         -- depth 0 = top level
        CAST(full_name AS CHAR(1000)) AS reporting_chain  -- MySQL sizes this column from the anchor row, so widen it
    FROM employees
    WHERE manager_id IS NULL

    UNION ALL

    -- Recursive step: find direct reports of the current level
    SELECT
        e.employee_id,
        e.full_name,
        e.job_title,
        e.manager_id,
        oc.depth + 1,
        CONCAT(oc.reporting_chain, ' → ', e.full_name)  -- build the chain string
    FROM employees e
    JOIN org_chart oc ON e.manager_id = oc.employee_id  -- join child to parent
)
SELECT
    CONCAT(REPEAT('    ', depth), full_name) AS indented_name,  -- indent by depth ('||' is logical OR in default MySQL)
    job_title,
    depth
FROM org_chart
ORDER BY reporting_chain;

Output

indented_name                | job_title           | depth
-----------------------------+---------------------+------
Linda Forsythe               | CEO                 |     0
    Aiko Tanaka              | VP of Product       |     1
    Carlos Rivera            | VP of Engineering   |     1
        Ben Hughes           | Engineering Manager |     2
            Fatima Al-Rashid | Senior Developer    |     3
            Noah Eriksson    | Developer           |     3

Mental Model

Self-Referencing: Think Trees, Not Tables

A self-referencing table is a tree stored as flat rows — each row knows its parent, and recursive queries walk the branches.

Each row has a parent_id pointing to another row in the same table
Root nodes have parent_id = NULL — the recursion anchor
Recursive CTEs walk from root to leaves by joining child to parent at each level
The structure is infinitely deep — no fixed number of tables needed
Trade-off: querying requires recursive CTEs, which not all developers write comfortably

📊 Production Insight

Recursive CTEs on tables with >100K rows and deep hierarchies (depth > 10) can hit memory limits and slow down dramatically.

For read-heavy hierarchical queries at scale, consider a materialized path column (e.g., '1.2.5') or a closure table that pre-computes all ancestor-descendant pairs.

Rule: recursive CTEs are correct but not always fast — benchmark against your actual depth and row count before shipping to production.
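As a rough illustration of the materialized path idea, the sketch below stores each employee's ancestor chain in a `path` column and finds a whole subtree with a single prefix match instead of a recursive query. The column name, the trailing-dot format (`'1.2.'` rather than `'1.2'`, so that id 2 cannot be confused with id 20), and the `reports_under` helper are all assumptions for this example; it runs on Python's sqlite3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        full_name   TEXT NOT NULL,
        manager_id  INTEGER REFERENCES employees(employee_id),
        path        TEXT NOT NULL  -- hypothetical: ancestor ids, each followed by '.'
    );
    INSERT INTO employees VALUES
        (1, 'Linda Forsythe',   NULL, '1.'),
        (2, 'Carlos Rivera',    1,    '1.2.'),
        (3, 'Aiko Tanaka',      1,    '1.3.'),
        (4, 'Ben Hughes',       2,    '1.2.4.'),
        (5, 'Fatima Al-Rashid', 4,    '1.2.4.5.'),
        (6, 'Noah Eriksson',    4,    '1.2.4.6.');
""")

def reports_under(manager_id):
    """Return everyone below the given manager using one prefix match."""
    (prefix,) = conn.execute(
        "SELECT path FROM employees WHERE employee_id = ?", (manager_id,)
    ).fetchone()
    rows = conn.execute(
        "SELECT full_name FROM employees "
        "WHERE path LIKE ? AND employee_id <> ? "
        "ORDER BY path",
        (prefix + "%", manager_id),
    )
    return [row[0] for row in rows]

print(reports_under(2))  # ['Ben Hughes', 'Fatima Al-Rashid', 'Noah Eriksson']
print(reports_under(3))  # []
```

The cost moves to writes: moving a manager means rewriting the `path` of every descendant, so this trade only pays off when reads dominate.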

🎯 Key Takeaway

Self-referencing tables replace N separate level-tables with one table and a parent_id column.

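Always add a depth guard or recursion limit. A circular reference makes a recursive CTE loop until it hits the engine's limit (SQL Server's MAXRECURSION defaults to 100, MySQL's cte_max_recursion_depth to 1000) or, in engines without a built-in limit such as PostgreSQL, until it exhausts resources.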

For high-read, deep-hierarchy workloads, pre-compute with a closure table instead of relying on recursive CTEs at query time.
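A depth guard can be as simple as one extra condition in the recursive step. The sketch below, written against Python's sqlite3, deliberately creates a two-person cycle (which the foreign key happily allows, since both ids exist) and walks up the management chain with a cap. The `MAX_DEPTH` value and the two-row data set are assumptions chosen to keep the output small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        full_name   TEXT NOT NULL,
        manager_id  INTEGER REFERENCES employees(employee_id)
    );
    INSERT INTO employees VALUES (1, 'Ben Hughes', NULL), (2, 'Noah Eriksson', 1);
    -- A bad UPDATE closes the loop: Ben now reports to Noah, who reports to Ben.
    UPDATE employees SET manager_id = 2 WHERE employee_id = 1;
""")

MAX_DEPTH = 5  # hypothetical cap; pick one above your real hierarchy depth

rows = conn.execute("""
    WITH RECURSIVE chain(employee_id, manager_id, depth) AS (
        SELECT employee_id, manager_id, 0
        FROM employees
        WHERE employee_id = ?
        UNION ALL
        SELECT e.employee_id, e.manager_id, c.depth + 1
        FROM employees e
        JOIN chain c ON e.employee_id = c.manager_id
        WHERE c.depth < ?              -- the guard: stop expanding at MAX_DEPTH
    )
    SELECT employee_id, depth FROM chain ORDER BY depth
""", (2, MAX_DEPTH)).fetchall()

depths = [depth for _, depth in rows]
# The same employee appearing twice in one chain means the data contains a cycle.
cycle_detected = len({emp for emp, _ in rows}) < len(rows)

print(len(rows), max(depths), cycle_detected)  # 6 5 True
```

Without the `WHERE c.depth < ?` line this query would never finish on SQLite. With it, the query returns a bounded result and the repeated ids flag the bad data.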

When to Use Self-Referencing

If: Parent and child are the same entity type (employee manages employee)
Use: Self-referencing; one table handles arbitrary depth

If: Hierarchy depth is unknown or variable (categories, comments, org chart)
Use: Self-referencing; it avoids creating N tables for N levels

If: Parent and child have genuinely different columns (manager has direct_reports_count, developer has tech_stack)
Use: Separate tables; different schemas mean different entities

If: Hierarchy is queried millions of times per second with deep nesting
Use: A closure table or materialized path; recursive CTEs may not meet latency SLAs

Foreign Keys: The Only Thing Preventing Orphaned Data

You've drawn a line on a whiteboard connecting 'users' to 'orders'. That's cute. Now make the database enforce it before a bulk delete screws your production reporting.

Foreign keys aren't decoration. They're the safety net that stops you from having orders referencing deleted users, or user profiles pointing to non-existent addresses. Without them, you're relying on application-level discipline — which breaks the second someone runs a raw UPDATE.

The pain point hits hardest during cascading deletes or when you try to backfill data. If your foreign key isn't indexed, every JOIN between these tables becomes a full table scan. That's how a simple page load turns into a 30-second query.

Here's the rule: define the constraint AND make sure the referencing column is indexed. The constraint prevents data rot. The index makes your joins fast. MySQL's InnoDB creates that index for you; PostgreSQL does not, and ORM-generated migrations do not always add it. Your database doesn't care about your ORM's opinion.

ForeignKeyEnforcement.sqlSQL

-- io.thecodeforge: database tutorial

-- Bad: no foreign key, no index
-- (shown for contrast only; run either this definition or the one below, not both)
CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    user_id INT NOT NULL,  -- no constraint, no index
    total DECIMAL(10,2)
);

-- Good: foreign key with index
CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    user_id INT NOT NULL,
    total DECIMAL(10,2),
    CONSTRAINT fk_orders_users
        FOREIGN KEY (user_id)
        REFERENCES users(id)
        ON DELETE CASCADE  -- or RESTRICT, based on your business rules
);

CREATE INDEX idx_orders_user_id ON orders(user_id);  -- critical for JOIN performance

-- Check constraint violations don't exist:
SELECT count(*) FROM orders o
LEFT JOIN users u ON o.user_id = u.id
WHERE u.id IS NULL;
-- Output: 0 (if enforcement is working)

Output

 count
-------
     0

⚠ Production Trap:

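If you're inheriting a legacy schema, run the orphan check query above before adding constraints. You might discover that 20% of your orders reference users who no longer exist; adding the constraint will normally fail until those rows are repaired or removed, and a cleanup DELETE would take those orders with it. Fix the data first, then lock it down.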
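Here is what that pre-flight check might look like against a legacy-style table. This is a sketch in Python's sqlite3 with invented sample data: `orders.user_id` has no constraint, two of the four orders point at a user id that does not exist, and the orphan query reports both the rows and their share of the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Legacy-style schema: user_id has no FOREIGN KEY constraint
    CREATE TABLE users  (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, total NUMERIC);
    INSERT INTO users  VALUES (1, 'sarah_m'), (2, 'james_o');
    INSERT INTO orders VALUES
        (10, 1, 129.99),
        (11, 2, 310.50),
        (12, 7, 49.00),   -- user 7 does not exist
        (13, 7, 15.00);   -- neither does this order's owner
""")

# Same orphan check as above, but listing the offending rows instead of counting.
orphans = conn.execute("""
    SELECT o.id, o.user_id
    FROM orders o
    LEFT JOIN users u ON o.user_id = u.id
    WHERE u.id IS NULL
    ORDER BY o.id
""").fetchall()

total_orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
orphan_share = len(orphans) / total_orders

print(orphans, orphan_share)  # [(12, 7), (13, 7)] 0.5
```

Listing the rows rather than just counting them gives you something to hand to whoever decides whether the orphans get reassigned, archived, or deleted.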

🎯 Key Takeaway

If a column references a primary key in another table, it needs a foreign key constraint AND an index. Non-negotiable.
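To see what the index buys you, the sketch below asks SQLite for its query plan before and after indexing the referencing column. It assumes SQLite (via Python's sqlite3) as a convenient stand-in; the plan wording differs between engines and versions, so the code prints the plan rather than promising exact text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        total   NUMERIC
    );
""")


def plan_for_child_lookup(conn):
    """Return SQLite's plan text for fetching one user's orders."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE user_id = ?", (1,)
    ).fetchall()
    # The last column of each plan row is the human-readable step.
    return " | ".join(row[-1] for row in rows)


plan_without_index = plan_for_child_lookup(conn)

# Declaring the foreign key did not create this index; it has to be explicit.
conn.execute("CREATE INDEX idx_orders_user_id ON orders(user_id)")
plan_with_index = plan_for_child_lookup(conn)

print("before:", plan_without_index)
print("after: ", plan_with_index)
```

The first plan is a scan of orders; the second names idx_orders_user_id. On PostgreSQL the equivalent check is EXPLAIN on the same query.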

Composite Keys: When a Single Column Can't Uniquely Identify a Row

Ignore the cargo cult that tells you every table needs a surrogate integer primary key. Sometimes real-world data won't fit that mold.

A composite primary key — using two or more columns — is mandatory when no single column is unique, but the combination naturally is. Think enrollment tables: student_id + course_id. A student can take many courses, a course has many students, but the pair is unique.

Why not just slap an auto-increment 'id' on there? You can, but then you lose the built-in uniqueness enforcement for the real-world combination. You'd need a separate unique constraint on the pair. That's redundant, and it's another index burning disk space. The composite key is both primary key and unique constraint in one.

The tradeoff is real: composite keys make your JOIN queries longer to write, and they complicate foreign key references from child tables. If 'enrollments' has details in a 'grades' table, you'll need to repeat both columns in the foreign key. That's verbose but correct.

Only use composites when the pair is genuinely the natural key and stable (won't change). Avoid them for transactional tables where speed of inserts matters more than logical purity.

CompositePrimaryKey.sql · SQL

-- io.thecodeforge: database tutorial

-- Student-course enrollment with composite primary key
CREATE TABLE enrollments (
    student_id INT NOT NULL,
    course_id INT NOT NULL,
    enrolled_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    grade CHAR(2),
    
    -- Natural key: no two enrollments for same student+course pair
    PRIMARY KEY (student_id, course_id),
    
    FOREIGN KEY (student_id) REFERENCES students(id),
    FOREIGN KEY (course_id) REFERENCES courses(id)
);

-- Child table referencing composite key
CREATE TABLE assignments (
    student_id INT NOT NULL,
    course_id INT NOT NULL,
    assignment_id SERIAL,
    score DECIMAL(5,2),
    
    PRIMARY KEY (student_id, course_id, assignment_id),
    
    -- Must reference both columns
    FOREIGN KEY (student_id, course_id) 
        REFERENCES enrollments(student_id, course_id)
);

-- Query needs both columns
SELECT e.student_id, e.course_id, e.grade, a.score
FROM enrollments e
JOIN assignments a 
  ON e.student_id = a.student_id 
 AND e.course_id = a.course_id
WHERE e.student_id = 42;

Output

 student_id | course_id | grade | score
------------+-----------+-------+-------
         42 |       101 | A     | 95.50
         42 |       203 | B+    | 87.00

💡Senior Shortcut:

Composite keys shine in 'relationship tables' (many-to-many via junction table) where the combination IS the identity. But if you ever expect the natural pair to change (e.g., a course ID might be reassigned), stick with a surrogate key and add a separate unique constraint on the pair.

🎯 Key Takeaway

Composite primary keys are correct when the column pair is the natural, stable identifier. Use them to eliminate redundant unique constraints, but be ready for verbose foreign keys.
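The behaviour described in this section can be checked with a small sketch. It uses SQLite through Python's sqlite3 module as an assumption of convenience (SQLite needs PRAGMA foreign_keys = ON before it enforces anything), and it renames the serial column to a plain assignment_no since SQLite has no SERIAL type. The try_insert helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enforcement is off by default
conn.executescript("""
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL,
        course_id  INTEGER NOT NULL,
        grade      TEXT,
        PRIMARY KEY (student_id, course_id)
    );
    CREATE TABLE assignments (
        student_id    INTEGER NOT NULL,
        course_id     INTEGER NOT NULL,
        assignment_no INTEGER NOT NULL,
        score         NUMERIC,
        PRIMARY KEY (student_id, course_id, assignment_no),
        -- The child must reference both columns of the parent's key.
        FOREIGN KEY (student_id, course_id)
            REFERENCES enrollments(student_id, course_id)
    );
    INSERT INTO enrollments VALUES (42, 101, 'A');
""")


def try_insert(sql, params):
    """Run an INSERT and report whether the database accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Same student + course again: the composite primary key refuses it.
duplicate_pair = try_insert("INSERT INTO enrollments VALUES (?, ?, ?)", (42, 101, "B"))
# Assignment for an existing enrollment: fine.
valid_child = try_insert("INSERT INTO assignments VALUES (?, ?, ?, ?)", (42, 101, 1, 95.5))
# Assignment for a pair that was never enrolled: the composite FK refuses it.
missing_parent = try_insert("INSERT INTO assignments VALUES (?, ?, ?, ?)", (42, 999, 1, 80.0))

for label, outcome in [("duplicate pair", duplicate_pair),
                       ("valid child", valid_child),
                       ("missing parent", missing_parent)]:
    print(f"{label}: {outcome}")
```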

Ternary and Higher-Degree Relationships: Why Binary Assumptions Fail

Most schemas model relationships between two entities: a student enrolls in a course. But real-world constraints often involve three or more entities at once, and a ternary relationship (three entities) captures cases that a chain of binary tables cannot. Example: a doctor prescribes a specific medication to a specific patient. That's not three separate one-to-many links; it's one fact constrained by all three. If you model doctor-patient, patient-medication, and medication-doctor as separate pairs, joining them back together can produce combinations that never happened.

A ternary table Prescription(DoctorID, PatientID, MedicationID) with a composite primary key records each prescribing fact as exactly one row. Note that this key allows a given doctor-patient-medication combination only once; if repeat prescriptions matter, the key needs a date or sequence column as well. Higher-degree relationships work the same way but require more columns. The cost is query complexity: joining four or five tables for one fact. Only use ternary+ when a single business rule involves all entities simultaneously; otherwise, decompose into binary relationships.

TernaryDoctorPrescription.sql · SQL

-- io.thecodeforge: database tutorial

CREATE TABLE Doctor (
    DoctorID INT PRIMARY KEY,
    Name VARCHAR(100)
);

CREATE TABLE Patient (
    PatientID INT PRIMARY KEY,
    Name VARCHAR(100)
);

CREATE TABLE Medication (
    MedicationID INT PRIMARY KEY,
    DrugName VARCHAR(100)
);

CREATE TABLE Prescription (
    DoctorID INT,
    PatientID INT,
    MedicationID INT,
    Dosage VARCHAR(50),
    PRIMARY KEY (DoctorID, PatientID, MedicationID),
    FOREIGN KEY (DoctorID) REFERENCES Doctor(DoctorID),
    FOREIGN KEY (PatientID) REFERENCES Patient(PatientID),
    FOREIGN KEY (MedicationID) REFERENCES Medication(MedicationID)
);

Output

// Three foreign keys enforce existence. Composite PK prevents duplicate combinations.

⚠ Production Trap:

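Ternary tables tempt you to bolt every loosely related entity onto the key. Every column in the composite key must be NOT NULL, so if one of the three entities is optional for the fact you're recording, the relationship isn't truly ternary and belongs in binary tables instead.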

🎯 Key Takeaway

Use a ternary table only when all three entities must exist together for one business rule.
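Why three binary tables are not equivalent to one ternary table is easier to see with data. The sketch below is an illustration under assumed sample data (three made-up prescribing events, lower-case table names, SQLite via Python's sqlite3): it flattens the events into three pair tables, joins the pairs back together, and lists the combinations that the join invents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Ground truth: three prescribing events (doctor, patient, medication).
    CREATE TABLE prescription (
        doctor_id INTEGER, patient_id INTEGER, medication_id INTEGER,
        PRIMARY KEY (doctor_id, patient_id, medication_id)
    );
    INSERT INTO prescription VALUES (1, 1, 1), (1, 2, 2), (2, 1, 2);

    -- The same facts flattened into three binary pair tables.
    CREATE TABLE doctor_patient AS
        SELECT DISTINCT doctor_id, patient_id FROM prescription;
    CREATE TABLE patient_medication AS
        SELECT DISTINCT patient_id, medication_id FROM prescription;
    CREATE TABLE medication_doctor AS
        SELECT DISTINCT medication_id, doctor_id FROM prescription;
""")

# Rebuild triples from the pairs, then subtract the real events.
spurious = conn.execute("""
    SELECT dp.doctor_id, dp.patient_id, pm.medication_id
    FROM doctor_patient dp
    JOIN patient_medication pm ON pm.patient_id = dp.patient_id
    JOIN medication_doctor md ON md.medication_id = pm.medication_id
                             AND md.doctor_id = dp.doctor_id
    EXCEPT
    SELECT doctor_id, patient_id, medication_id FROM prescription
    ORDER BY 1, 2, 3
""").fetchall()

print(spurious)  # [(1, 1, 2)]
```

Doctor 1 saw patient 1, patient 1 takes medication 2, and doctor 1 has prescribed medication 2, yet doctor 1 never prescribed medication 2 to patient 1. Only the ternary table can tell those apart.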

Mapping Cardinalities: One Number Changes Everything

Mapping cardinality defines the maximum number of relationship instances one entity can participate in. The four formal options — one-to-one, one-to-many, many-to-one, many-to-many — dictate physical schema design. A one-to-many cardinality creates a foreign key on the 'many' side. A many-to-many cardinality always demands a junction table. Many-to-one is simply the inverse perspective of one-to-many. Choosing the wrong mapping cardinality guarantees duplicate data or missing links. For example, labeling a department-to-employee relationship as many-to-many would allow an employee to belong to multiple departments simultaneously — which might be correct for a matrix organization, but incorrect for a strict reporting hierarchy. Always audit each relationship with two questions: 'How many of entity A can relate to one B?' and 'How many of B can relate to one A?' The numbers set the foreign key placement and table structure from day one.

CardinalityExamples.sql · SQL

-- io.thecodeforge: database tutorial

-- One-to-Many: One department, many employees
CREATE TABLE Department (
    DeptID INT PRIMARY KEY
);

CREATE TABLE Employee (
    EmpID INT PRIMARY KEY,
    DeptID INT REFERENCES Department(DeptID)
);

-- Many-to-Many: Many students, many courses
CREATE TABLE Student (
    StudentID INT PRIMARY KEY
);

CREATE TABLE Course (
    CourseID INT PRIMARY KEY
);

CREATE TABLE Enrollment (
    StudentID INT REFERENCES Student(StudentID),
    CourseID INT REFERENCES Course(CourseID),
    PRIMARY KEY (StudentID, CourseID)
);

Output

// FK on Employee = one-to-many. Junction table = many-to-many.

⚠ Production Trap:

Developers often skip cardinality analysis and guess the table structure. That leads to junction tables for one-to-many relationships or missing foreign keys.

🎯 Key Takeaway

Always answer 'how many on each side?' before writing CREATE TABLE — cardinality dictates foreign keys.
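The two audit questions can be turned into a quick check against sample data. This is a hypothetical helper, not part of any library, and it only reports what the sample shows: existing data can reveal that a relationship is at least one-to-many, but the business rule decides what is allowed.

```python
from collections import defaultdict


def classify_cardinality(pairs):
    """Classify an A-to-B relationship from observed (a, b) pairs.

    Answers the two audit questions directly:
    how many B relate to one A, and how many A relate to one B.
    """
    b_per_a = defaultdict(set)
    a_per_b = defaultdict(set)
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)

    a_has_many_b = any(len(bs) > 1 for bs in b_per_a.values())
    b_has_many_a = any(len(as_) > 1 for as_ in a_per_b.values())

    if a_has_many_b and b_has_many_a:
        return "many-to-many"  # needs a junction table
    if a_has_many_b:
        return "one-to-many"   # foreign key goes on the B table
    if b_has_many_a:
        return "many-to-one"   # foreign key goes on the A table
    return "one-to-one"


# Assumed sample data for illustration.
department_employee = [("sales", "ann"), ("sales", "bob"), ("ops", "cy")]
student_course = [("s1", "c1"), ("s1", "c2"), ("s2", "c1")]
user_profile = [("u1", "p1"), ("u2", "p2")]

print(classify_cardinality(department_employee))  # one-to-many
print(classify_cardinality(student_course))       # many-to-many
print(classify_cardinality(user_profile))         # one-to-one
```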

Self-Referential Relationships: Hierarchical Data Models

Self-referential relationships occur when a table references itself. This is essential for modeling hierarchical data like organizational charts, category trees, or threaded comments. The foreign key points to the primary key within the same table. For example, an employees table can have a manager_id column that references employee_id, which lets you represent a tree where each employee reports to a manager.

Walking that tree to arbitrary depth in SQL takes a recursive Common Table Expression (CTE), and without an index on manager_id each level of the recursion scans the table. A common pitfall is creating cycles. A CHECK constraint such as manager_id <> employee_id stops an employee from being their own manager, but it cannot see longer loops (A manages B, B manages A); those need a trigger or application logic. Choose between ON DELETE SET NULL and ON DELETE CASCADE deliberately: cascading deletes can wipe out entire subtrees unintentionally. For deep, read-heavy hierarchies, consider nested sets or materialized path patterns for faster reads.

self_referential.sql · SQL

-- Create employees table with self-referential foreign key
CREATE TABLE employees (
    employee_id INT PRIMARY KEY,
    name VARCHAR(100),
    manager_id INT,
    FOREIGN KEY (manager_id) REFERENCES employees(employee_id)
        ON DELETE SET NULL
);

-- Insert sample data
INSERT INTO employees VALUES (1, 'Alice', NULL);
INSERT INTO employees VALUES (2, 'Bob', 1);
INSERT INTO employees VALUES (3, 'Charlie', 1);
INSERT INTO employees VALUES (4, 'Diana', 2);

-- Recursive CTE to get all subordinates of Alice
WITH RECURSIVE org_tree AS (
    SELECT employee_id, name, manager_id, 0 AS depth
    FROM employees
    WHERE employee_id = 1
    UNION ALL
    SELECT e.employee_id, e.name, e.manager_id, t.depth + 1
    FROM employees e
    JOIN org_tree t ON e.manager_id = t.employee_id
)
SELECT * FROM org_tree;

⚠ Beware of Cascading Deletes in Hierarchies

📊 Production Insight

For large hierarchies, consider using materialized path or nested sets to avoid recursive CTE performance issues. Index the foreign key column (e.g., manager_id) to speed up joins.

🎯 Key Takeaway

Self-referential foreign keys model hierarchies, but querying them requires recursive CTEs, and cascading deletes can be dangerous.
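The sketch below illustrates two guards for self-referencing tables: a CHECK that blocks an employee managing themselves, and a depth limit in the recursive CTE so a longer cycle cannot run forever. It assumes SQLite via Python's sqlite3; the max_depth parameter and the walk_subordinates helper are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific
conn.executescript("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT,
        manager_id  INTEGER REFERENCES employees(employee_id) ON DELETE SET NULL,
        CHECK (manager_id IS NULL OR manager_id <> employee_id)
    );
    INSERT INTO employees VALUES (1, 'Alice', NULL), (2, 'Bob', 1), (3, 'Charlie', 2);
""")

# Guard 1: the CHECK rejects a one-row cycle.
try:
    with conn:
        conn.execute("INSERT INTO employees VALUES (4, 'Diana', 4)")
    self_manager_blocked = False
except sqlite3.IntegrityError:
    self_manager_blocked = True

# The CHECK cannot see a two-row cycle: Bob now reports to Charlie,
# who already reports to Bob.
with conn:
    conn.execute("UPDATE employees SET manager_id = 3 WHERE employee_id = 2")


def walk_subordinates(root_id, max_depth):
    """Walk down from root_id, stopping after max_depth levels."""
    return conn.execute("""
        WITH RECURSIVE org_tree(employee_id, depth) AS (
            SELECT employee_id, 0 FROM employees WHERE employee_id = ?
            UNION ALL
            SELECT e.employee_id, t.depth + 1
            FROM employees e
            JOIN org_tree t ON e.manager_id = t.employee_id
            WHERE t.depth < ?          -- Guard 2: depth limit
        )
        SELECT employee_id, depth FROM org_tree ORDER BY depth
    """, (root_id, max_depth)).fetchall()


rows = walk_subordinates(root_id=2, max_depth=4)
visited = [employee_id for employee_id, _ in rows]
has_cycle = len(visited) != len(set(visited))
print(self_manager_blocked, visited, has_cycle)  # True [2, 3, 2, 3, 2] True
```

A repeated employee_id in the walk is a cheap cycle detector: in a healthy tree every row appears once.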

Many-to-Many with Junction Table: Best Practices

Many-to-many relationships require a junction table (also called an associative or linking table) to break the M:N relationship into two one-to-many relationships. The junction table contains foreign keys referencing the primary keys of the two related tables, and often includes additional attributes like timestamps or quantities. For example, orders and products connect via order_items, which carries quantity and price.

Best practices:

Define a composite primary key on the two foreign key columns to prevent duplicate associations. This also makes both columns NOT NULL.
Add a surrogate primary key only if other tables need to reference the association itself, and keep a UNIQUE constraint on the pair if you do.
Make sure each foreign key column leads an index. The composite primary key already covers lookups by its first column, so the second column is the one that needs an index of its own.
Consider ON DELETE CASCADE on both foreign keys so that deleting a parent row automatically removes the associated junction rows.
Avoid piling attributes into the junction table that really describe a separate entity.

many_to_many_junction.sql · SQL

-- Create tables for a many-to-many relationship
CREATE TABLE students (
    student_id INT PRIMARY KEY,
    name VARCHAR(100)
);

CREATE TABLE courses (
    course_id INT PRIMARY KEY,
    title VARCHAR(100)
);

-- Junction table with composite primary key
CREATE TABLE enrollments (
    student_id INT,
    course_id INT,
    enrolled_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (student_id) REFERENCES students(student_id) ON DELETE CASCADE,
    FOREIGN KEY (course_id) REFERENCES courses(course_id) ON DELETE CASCADE
);

-- The composite primary key already serves lookups by student_id (its leading column).
-- Index course_id separately so lookups from the course side are fast too.
CREATE INDEX idx_enrollments_course ON enrollments(course_id);

-- Query: Get all courses for a student
SELECT c.title
FROM courses c
JOIN enrollments e ON c.course_id = e.course_id
WHERE e.student_id = 1;

💡Composite Primary Key vs Surrogate Key

📊 Production Insight

In high-traffic systems, consider denormalizing aggregated counts (e.g., enrollment count) into the parent tables to avoid frequent joins, but keep the junction table normalized.

🎯 Key Takeaway

A junction table with composite primary key and cascading deletes ensures data integrity and efficient querying in many-to-many relationships.
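A short sketch of what ON DELETE CASCADE does on a junction table, assuming SQLite via Python's sqlite3 and made-up sample rows. Deleting a student removes that student's enrollment rows and nothing else; the courses themselves stay.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE courses  (course_id  INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL REFERENCES students(student_id) ON DELETE CASCADE,
        course_id  INTEGER NOT NULL REFERENCES courses(course_id)  ON DELETE CASCADE,
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO students VALUES (1, 'Ada'), (2, 'Lin');
    INSERT INTO courses  VALUES (10, 'SQL'), (20, 'Go');
    INSERT INTO enrollments VALUES (1, 10), (1, 20), (2, 10);
""")

links_before = conn.execute("SELECT COUNT(*) FROM enrollments").fetchone()[0]

with conn:
    conn.execute("DELETE FROM students WHERE student_id = 1")

remaining_links = conn.execute(
    "SELECT student_id, course_id FROM enrollments ORDER BY 1, 2"
).fetchall()
courses_left = conn.execute("SELECT COUNT(*) FROM courses").fetchone()[0]

print(links_before, remaining_links, courses_left)  # 3 [(2, 10)] 2
```

The cascade is reasonable here because an enrollment row means nothing without its student. The same clause on a table that holds financial records is a different decision.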

Polymorphic Associations: Anti-Pattern or Valid Design?

Polymorphic associations let one column pair stand in for a foreign key to several tables by storing both the target table name and the target ID. This is common in Rails-like frameworks for features like comments or tags that can belong to different entities. The design is often considered an anti-pattern because it bypasses referential integrity: the database cannot enforce that target_id points to a valid row in whichever table target_type names. Queries also get harder, since every join has to filter on the type column, and lookups usually need a composite index on (target_type, target_id) to stay selective.

A better approach is one comment table per parent type, or a single table with one concrete, nullable foreign key per parent. If you must use polymorphic associations, enforce integrity in the application or with triggers. PostgreSQL table inheritance does not close the gap, because foreign key and unique constraints are not inherited by child tables. For most cases, prefer explicit foreign keys. Polymorphic columns seem flexible but introduce long-term maintenance and performance issues.

polymorphic_association.sql · SQL

-- Polymorphic association (anti-pattern example)
CREATE TABLE comments (
    comment_id INT PRIMARY KEY,
    body TEXT,
    target_type VARCHAR(50),  -- e.g., 'Post', 'Video'
    target_id INT,
    -- No foreign key constraint possible
    created_at TIMESTAMP
);

-- Better alternative: one comment table per parent type
CREATE TABLE post_comments (
    comment_id INT PRIMARY KEY,
    post_id INT REFERENCES posts(post_id) ON DELETE CASCADE,
    body TEXT
);

CREATE TABLE video_comments (
    comment_id INT PRIMARY KEY,
    video_id INT REFERENCES videos(video_id) ON DELETE CASCADE,
    body TEXT
);

-- Or use a single table with nullable foreign keys (replaces the polymorphic comments table above)
CREATE TABLE comments (
    comment_id INT PRIMARY KEY,
    body TEXT,
    post_id INT REFERENCES posts(post_id) ON DELETE CASCADE,
    video_id INT REFERENCES videos(video_id) ON DELETE CASCADE,
    CHECK ( (post_id IS NOT NULL AND video_id IS NULL) OR (post_id IS NULL AND video_id IS NOT NULL) )
);

🔥Polymorphic Associations: Use with Caution

📊 Production Insight

If you must use polymorphic associations, add triggers or application-level checks to ensure referential integrity. Neither foreign data wrappers nor table inheritance in PostgreSQL will enforce it for you.

🎯 Key Takeaway

Polymorphic associations are generally an anti-pattern due to lack of referential integrity; use separate tables or nullable foreign keys with constraints instead.
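The nullable-foreign-key alternative depends on its CHECK constraint to guarantee exactly one parent per comment. The sketch below exercises that constraint under SQLite (via Python's sqlite3, an assumption of convenience); the add_comment helper and the sample IDs are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific
conn.executescript("""
    CREATE TABLE posts  (post_id  INTEGER PRIMARY KEY);
    CREATE TABLE videos (video_id INTEGER PRIMARY KEY);
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,
        body       TEXT,
        post_id    INTEGER REFERENCES posts(post_id)   ON DELETE CASCADE,
        video_id   INTEGER REFERENCES videos(video_id) ON DELETE CASCADE,
        -- Exactly one parent must be set.
        CHECK ((post_id IS NOT NULL AND video_id IS NULL)
            OR (post_id IS NULL AND video_id IS NOT NULL))
    );
    INSERT INTO posts  VALUES (1);
    INSERT INTO videos VALUES (7);
""")


def add_comment(comment_id, post_id, video_id):
    """Try to insert a comment; return True if the database accepted it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO comments VALUES (?, 'hi', ?, ?)",
                (comment_id, post_id, video_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "post only":    add_comment(1, 1, None),     # valid
    "video only":   add_comment(2, None, 7),     # valid
    "both parents": add_comment(3, 1, 7),        # CHECK fails
    "no parent":    add_comment(4, None, None),  # CHECK fails
    "missing post": add_comment(5, 404, None),   # foreign key fails
}
for label, accepted in results.items():
    print(f"{label}: {'accepted' if accepted else 'rejected'}")
```

The last case is the one a polymorphic target_type/target_id pair cannot catch: the row looks well-formed but points at a post that does not exist.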

● Production incident · POST-MORTEM · severity: high

Cascade Delete Silently Removed 14,000 Orders and Broke Revenue Reports

Symptom

Finance reported a $2.3M revenue discrepancy between the application dashboard and the accounting system. The orders table had 14,000 fewer rows than the previous day's backup.

Assumption

The cleanup script only targeted test accounts created in the last 24 hours — the team assumed it was safe to run against production.

Root cause

The orders table had ON DELETE CASCADE on customer_id. When the script deleted 340 customer rows, the database silently deleted every associated order — including legitimate orders placed by customers whose accounts shared a creation-date pattern with test accounts.

Fix

Changed ON DELETE CASCADE to ON DELETE RESTRICT on the orders.customer_id foreign key. Implemented a soft-delete pattern (deleted_at column) for customer cleanup. Added a pre-deletion audit query that counts affected child rows before any bulk delete.

Key lesson

Never use ON DELETE CASCADE on tables with financial or audit data
Always run a COUNT query on child tables before deleting parent rows
Use ON DELETE RESTRICT as your default — switch to CASCADE only when child data is genuinely meaningless without the parent
Soft-delete (deleted_at) is almost always safer than hard-delete for customer-facing data
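The fix and lessons above can be sketched end to end: count the child rows first, let ON DELETE RESTRICT refuse the hard delete, then soft-delete instead. This is an illustration under assumptions, not the team's actual script: it uses SQLite via Python's sqlite3, made-up rows, and a hypothetical targets list standing in for the cleanup script's selection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        email       TEXT,
        deleted_at  TEXT            -- NULL means the account is active
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
            REFERENCES customers(customer_id) ON DELETE RESTRICT,
        total       NUMERIC
    );
    INSERT INTO customers (customer_id, email)
        VALUES (1, 'test@example.com'), (2, 'real@example.com');
    INSERT INTO orders VALUES (100, 2, 59.00), (101, 2, 12.50);
""")

targets = [1, 2]  # hypothetical output of the cleanup script's selection step
placeholders = ",".join("?" for _ in targets)

# 1. Audit: how many child rows hang off the rows about to be deleted?
affected_orders = conn.execute(
    f"SELECT COUNT(*) FROM orders WHERE customer_id IN ({placeholders})", targets
).fetchone()[0]

# 2. RESTRICT refuses the hard delete while child rows exist.
try:
    with conn:
        conn.execute(
            f"DELETE FROM customers WHERE customer_id IN ({placeholders})", targets
        )
    hard_delete_blocked = False
except sqlite3.IntegrityError:
    hard_delete_blocked = True

# 3. Soft-delete instead: the customers are hidden, the orders stay intact.
with conn:
    conn.execute(
        f"UPDATE customers SET deleted_at = CURRENT_TIMESTAMP "
        f"WHERE customer_id IN ({placeholders})",
        targets,
    )

orders_after = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
active_customers = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE deleted_at IS NULL"
).fetchone()[0]
print(affected_orders, hard_delete_blocked, orders_after, active_customers)  # 2 True 2 0
```

A non-zero affected_orders for accounts that were supposed to be test data is the signal to stop and inspect the selection before touching anything.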

Production debug guide · Common symptoms when database relationships are misconfigured · 5 entries

Symptom · 01

COUNT query on junction table returns more rows than expected

→

Fix

Check for missing UNIQUE constraint on the composite key — duplicate rows are being inserted silently

Symptom · 02

DELETE on parent table fails with foreign key violation error

→

Fix

ON DELETE RESTRICT is blocking the delete — check for existing child rows first, then handle cleanup deliberately

Symptom · 03

JOIN returns zero rows despite data existing in both tables

→

Fix

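Verify FK column types match, then compare the key values themselves: trailing whitespace, CHAR padding, case or collation differences, and implicit casts between mismatched types (INT vs VARCHAR) all cause silent join misses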

Symptom · 04

Query on 1:N relationship is extremely slow even though the parent's primary key is indexed

→

Fix

Check whether the FK column on the child table has its own index. PostgreSQL, SQL Server, and Oracle do not create one when you declare the constraint; MySQL's InnoDB does

Symptom · 05

Recursive CTE query runs until timeout

→

Fix

Check for circular references in the self-referencing table, and add a depth guard to the recursive member's WHERE clause (SQL Server also offers OPTION (MAXRECURSION n))

★ Database Relationship Quick Debug · Fast diagnostic steps when relationship issues hit production

Duplicate rows in junction table

Immediate action

Identify duplicates and check for missing UNIQUE constraint

Commands

SELECT student_id, course_id, COUNT(*) FROM enrolments GROUP BY student_id, course_id HAVING COUNT(*) > 1;

SHOW INDEX FROM enrolments;

Fix now

ALTER TABLE enrolments ADD UNIQUE KEY uq_student_course (student_id, course_id);
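Adding the unique key fails while duplicates are still present, so the duplicates have to go first. The sketch below walks through detect, dedupe, and lock down. It assumes SQLite via Python's sqlite3, where rowid identifies physical rows and CREATE UNIQUE INDEX plays the role of ADD UNIQUE KEY; on MySQL you would pick survivors by a real column instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical junction table created without a key on the pair.
    CREATE TABLE enrolments (student_id INTEGER NOT NULL, course_id INTEGER NOT NULL);
    INSERT INTO enrolments VALUES (1, 10), (1, 10), (1, 20), (2, 10), (2, 10), (2, 10);
""")

# Step 1: find pairs that appear more than once.
duplicates = conn.execute("""
    SELECT student_id, course_id, COUNT(*) AS copies
    FROM enrolments
    GROUP BY student_id, course_id
    HAVING COUNT(*) > 1
    ORDER BY student_id, course_id
""").fetchall()

with conn:
    # Step 2: keep the earliest physical row of each pair (rowid is SQLite-specific).
    conn.execute("""
        DELETE FROM enrolments
        WHERE rowid NOT IN (
            SELECT MIN(rowid) FROM enrolments GROUP BY student_id, course_id
        )
    """)
    # Step 3: make the pair unique so duplicates cannot come back.
    conn.execute(
        "CREATE UNIQUE INDEX uq_student_course ON enrolments(student_id, course_id)"
    )

remaining = conn.execute(
    "SELECT student_id, course_id FROM enrolments ORDER BY 1, 2"
).fetchall()

try:
    with conn:
        conn.execute("INSERT INTO enrolments VALUES (1, 10)")
    duplicate_now_rejected = False
except sqlite3.IntegrityError:
    duplicate_now_rejected = True

print(duplicates)              # [(1, 10, 2), (2, 10, 3)]
print(remaining)               # [(1, 10), (1, 20), (2, 10)]
print(duplicate_now_rejected)  # True
```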

Orphaned child rows with no parent

Slow JOIN on foreign key column

Circular reference in self-referencing table

Relationship Type Comparison

Aspect	One-to-Many (1:N)	Many-to-Many (M:N)	One-to-One (1:1)	Self-Referencing
Real-world example	Customer → Orders	Students ↔ Courses	User → User Profile	Employee → Manager
Where does the FK live?	On the 'many' (child) table	In a dedicated junction table	On the dependent (optional) table	Same table — FK references own PK
Extra table needed?	No	Yes — always	No, but sometimes worth it	No — single table handles it
Can carry extra data?	Yes, on the child rows	Yes, on the junction table rows	Yes, on the dependent table	Yes, on each row
Query complexity	Simple JOIN	Two JOINs through junction	Simple LEFT JOIN	Recursive CTE required
Main design risk	Forgetting the FK constraint	Missing UNIQUE on junction pair	Unnecessary splitting of one table	Circular references causing infinite loops
Use when...	Hierarchy is clear and asymmetric	Both sides have multiple connections	Data is sparse, sensitive, or rarely accessed	Entities form a variable-depth hierarchy of the same type

⚙ Quick Reference

11 commands from this guide

File	Command / Code	Purpose
one_to_many_orders.sql	CREATE TABLE customers (	One-to-Many
many_to_many_enrolments.sql	CREATE TABLE students (	Many-to-Many
one_to_one_user_profiles.sql	CREATE TABLE users (	One-to-One
self_referencing_employees.sql	CREATE TABLE employees (	Self-Referencing Relationships
ForeignKeyEnforcement.sql	CREATE TABLE orders (	Foreign Keys
CompositePrimaryKey.sql	CREATE TABLE enrollments (	Composite Keys
TernaryDoctorPrescription.sql	CREATE TABLE Doctor (	Ternary and Higher-Degree Relationships
CardinalityExamples.sql	CREATE TABLE Department (	Mapping Cardinalities
self_referential.sql	CREATE TABLE employees (	Self-Referential Relationships
many_to_many_junction.sql	CREATE TABLE students (	Many-to-Many with Junction Table
polymorphic_association.sql	CREATE TABLE comments (	Polymorphic Associations

Key takeaways

The foreign key always lives on the 'many' side in a 1:N relationship

if you're ever unsure which table gets the FK, ask yourself which side has 'many' rows per relationship, and put it there.

Many-to-Many relationships cannot exist without a junction table

and that junction table is a real entity that often carries meaningful business data like enrolment dates, quantities, or statuses.

One-to-One splits are a deliberate performance or security decision, not a default

merging into one table is usually cleaner unless you have sparse columns, access control requirements, or a proven query performance problem.

Always declare your ON DELETE behaviour explicitly on every foreign key

relying on the database default is a silent time bomb waiting to corrupt or orphan your data when a parent row gets deleted.

INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01JUNIOR

What is the difference between a One-to-Many and a Many-to-Many relation...

Q02SENIOR

You have a products table and an orders table, and each order can contai...

Q03SENIOR

A colleague suggests storing a user's list of favourite tags as a comma-...

Q01 of 03JUNIOR

What is the difference between a One-to-Many and a Many-to-Many relationship, and how do you physically implement each one in SQL?

ANSWER

A One-to-Many means one row in Table A connects to many rows in Table B, but each B row points to exactly one A row. Implementation: put a foreign key column on the 'many' side (e.g., orders.customer_id references customers.customer_id). A Many-to-Many means rows on both sides can connect to multiple rows on the other. Implementation: create a junction table with two foreign keys — one to each side — and a composite UNIQUE key to prevent duplicates. The junction table turns M:N into two separate 1:N relationships.

FAQ · 3 QUESTIONS

Frequently Asked Questions

What is the difference between a foreign key and a relationship in a database?

Can a table have more than one foreign key?

When should I use a self-referencing relationship instead of a separate parent table?
