Junior 6 min · March 06, 2026

Database Normalization — Why UPDATE Changed One Copy

Customer address updated in one table, stale in three others.

N
Naren · Founder
Plain-English first. Then code. Then the interview question.
About
 ● Production Incident 🔎 Debug Guide
Quick Answer
  • Normalization removes data redundancy by splitting tables into related facts.
  • Each normal form eliminates a specific type of anomaly: update, insert, or delete.
  • 1NF: atomic columns, no repeating groups.
  • 2NF: 1NF + no partial dependency (all non-key attributes depend on full primary key).
  • 3NF: 2NF + no transitive dependency (non-key attributes depend only on the key).
  • BCNF: 3NF + every determinant is a candidate key. Stricter, handles edge 3NF violations.
Plain-English First

Imagine your kitchen junk drawer — phone chargers, takeout menus, batteries, and a 2019 birthday card all crammed together. Finding anything takes forever, and when you add something new, it falls on the floor. Normalization is the act of giving every item its own logical home: chargers in one drawer, menus on the fridge, batteries in a labeled box. Your database tables are that junk drawer, and normalization is the tidying system that makes sure every piece of data lives exactly where it belongs — no duplicates, no confusion, no mystery.

Every developer eventually ships a database that works perfectly in development and turns into a slow, inconsistent nightmare in production. Orders that reference customers who no longer exist. A city name spelled three different ways in the same column. A single UPDATE that should change one thing accidentally changes forty rows. These aren't bugs in your code — they're anomalies baked into your schema design. Normalization is the discipline that prevents them before they start.

The core problem normalization solves is redundancy. When the same piece of information lives in multiple places, those copies drift apart. You update one row but miss another, and now your data is lying to you. Normalization is a set of progressive rules — called Normal Forms — that restructure your tables so each fact is stored exactly once. Remove the redundancy, and you remove the entire class of update, insert, and delete anomalies that come with it.

By the end of this article you'll be able to look at any flat table and identify which normal form it violates and exactly why. You'll know how to decompose that table step by step into clean, well-structured relations up through Boyce-Codd Normal Form (BCNF). You'll also understand the real-world trade-off where sometimes you deliberately denormalize for performance — and how to make that call consciously instead of accidentally.

What is Database Normalization in DBMS?

Normalization in DBMS is a methodology for organizing relational tables to minimize redundancy and dependency. Instead of a dry textbook definition, let's ground it in a real scenario. You've got a table storing orders, customers, and items all in one place. Updating a customer's phone number means scanning every order row. Miss one, and you've got a stale number. That's the update anomaly. Normalization breaks this single table into smaller, focused tables connected by foreign keys. Each fact lives once. That's the whole goal.

Here's what trips up most devs: normalization isn't about splitting for the sake of splitting. It's about isolating each unique fact so that changes affect exactly one row. The cost is more JOINs, but the payoff is data integrity. You trade write complexity for read simplicity? No — you trade the risk of silent corruption for a slightly more complex schema. That's a trade worth making.

Throughout this article, we'll walk through each normal form step by step, with SQL you can run against any database. You'll see the before and after, and more importantly, you'll see the anomalies each form prevents.

ForgeExample.javaCS FUNDAMENTALS
1
2
3
4
5
6
7
8
// TheCodeForgeDatabase Normalization in DBMS example
// Always use meaningful names, not x or n
public class ForgeExample {
    public static void main(String[] args) {
        String topic = "Database Normalization in DBMS";
        System.out.println("Learning: " + topic + " 🔥");
    }
}
Output
Learning: Database Normalization in DBMS 🔥
Forge Tip:
Type this code yourself rather than copy-pasting. The muscle memory of writing it will help it stick.
Production Insight
Unnormalized tables cause update anomalies in production.
Changing a customer address requires updating every order row — miss one and data becomes inconsistent.
Normalize first, denormalize later with full knowledge of the cost.
Key Takeaway
Normalization eliminates redundancy.
Redundancy is the root cause of update, insert, and delete anomalies.
Always aim for at least 3NF unless performance analysis proves you need less.

First Normal Form (1NF): Atomic Columns and No Repeating Groups

A table is in 1NF if every column contains atomic (indivisible) values and there are no repeating groups. Think of atomic as "one fact per cell." Violations happen when you store a list in a single column (e.g., "red, blue, green" in a color column) or have columns like item1, item2, item3. The fix is to break the repeating group into separate rows, often in a new child table.

Consider an orders table that stores multiple items in a single column:

OrderID | Customer | Items 1 | Alice | Widget, Gizmo

This violates 1NF because the Items column contains multiple values. The canonical fix is a separate OrderItems table:

OrderID | Item 1 | Widget 1 | Gizmo

Now each cell contains exactly one value, and queries become straightforward.

Here's a production lesson: I once saw a schema where the team stored JSON arrays in a column to avoid child tables. They thought it was clever until they needed to query "all orders containing Gizmo." That query required a full table scan and JSON parsing — ~200ms per call. After normalizing to 1NF, the same query ran in 2ms with an index. The extra table and JOIN were negligible.

forge_1nf.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
-- Enforce 1NF by splitting comma-separated values into rows
CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer VARCHAR(50)
);

CREATE TABLE order_items (
    order_id INT,
    item VARCHAR(50),
    quantity INT,
    PRIMARY KEY (order_id, item),
    FOREIGN KEY (order_id) REFERENCES orders(order_id)
);

-- Insert sample data
INSERT INTO orders (order_id, customer) VALUES (1, 'Alice');
INSERT INTO order_items (order_id, item, quantity) VALUES (1, 'Widget', 2), (1, 'Gizmo', 1);
-- Now query without parsing strings
SELECT o.order_id, o.customer, oi.item, oi.quantity
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id;
-- Result: order_id=1, customer=Alice, item=Widget, quantity=2
Output
order_id | customer | item | quantity
---------|----------|--------|---------
1 | Alice | Widget | 2
1 | Alice | Gizmo | 1
The Cell Phone Rule
  • If you ever need to parse a column with commas or other delimiters, you're not in 1NF.
  • Repeating columns like color1, color2 are a design smell — they assume a fixed maximum.
  • The only way to store a variable number of values is with a child table.
Production Insight
Storing lists in a column leads to slow string-parsing queries.
When a new color is added, you must either add a column or parse and update a delimited string.
1NF eliminates this class of problems entirely — at the cost of an extra JOIN.
Key Takeaway
1NF = atomic values + no repeating groups.
If you ever parse a column's value to extract sub-values, you need 1NF.
The cost is one extra table and a JOIN — worth it for data integrity.
Is My Table in 1NF?
IfAny column contains multiple values (comma list, JSON array, etc.)
UseNot in 1NF. Extract each value into its own row in a child table.
IfColumns like item1, item2, item3 exist.
UseNot in 1NF. Replace with a single column and a child table that stores multiple rows per parent.
IfEvery cell is atomic and no repeating columns.
UsePasses 1NF. Move to 2NF checks.

Second Normal Form (2NF): Eliminate Partial Dependencies

A table is in 2NF if it is in 1NF and every non-key column is fully functionally dependent on the entire primary key. Partial dependency occurs when the primary key is composite and some non-key column depends on only part of that key. This leads to redundancy because the same partial key value repeats the same non-key data multiple times.

Example: A table Enrollments with composite key (StudentID, CourseID) and columns StudentName (depends only on StudentID) and Instructor (depends only on CourseID). StudentName repeats for every course the student takes. Instructor repeats for every student in the course. To reach 2NF, decompose into Students (StudentID, StudentName), Courses (CourseID, Instructor), and Enrollments (StudentID, CourseID).

Here's a production trap I've seen: a team used a composite key of (store_id, product_id) and stored store_name in the same table. Store name depends only on store_id. When a store rebranded, they had to update thousands of rows. One batch job timed out, and half the rows had the old name. The data was inconsistent for weeks. 2NF would have prevented it entirely by putting store_name in a separate stores table.

forge_2nf.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
-- Initial table (1NF but not 2NF)
CREATE TABLE enrollments_1nf (
    student_id INT,
    course_id INT,
    student_name VARCHAR(50),
    course_name VARCHAR(50),
    instructor VARCHAR(50),
    grade CHAR(1),
    PRIMARY KEY (student_id, course_id)
);

-- Partial dependencies: student_name depends only on student_id, instructor depends only on course_id
-- Decompose into 2NF:
CREATE TABLE students (
    student_id INT PRIMARY KEY,
    student_name VARCHAR(50)
);

CREATE TABLE courses (
    course_id INT PRIMARY KEY,
    course_name VARCHAR(50),
    instructor VARCHAR(50)
);

CREATE TABLE enrollments (
    student_id INT,
    course_id INT,
    grade CHAR(1),
    PRIMARY KEY (student_id, course_id),
    FOREIGN KEY (student_id) REFERENCES students(student_id),
    FOREIGN KEY (course_id) REFERENCES courses(course_id)
);

-- Now each fact is stored exactly once
INSERT INTO students (student_id, student_name) VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO courses (course_id, course_name, instructor) VALUES (101, 'Math', 'Dr. Smith'), (102, 'Physics', 'Dr. Jones');
INSERT INTO enrollments (student_id, course_id, grade) VALUES (1, 101, 'A'), (1, 102, 'B'), (2, 101, 'C');
-- Query to get Alice's courses with instructor:
SELECT s.student_name, c.course_name, c.instructor, e.grade
FROM enrollments e
JOIN students s ON e.student_id = s.student_id
JOIN courses c ON e.course_id = c.course_id;
Output
student_name | course_name | instructor | grade
-------------|-------------|-------------|------
Alice | Math | Dr. Smith | A
Alice | Physics | Dr. Jones | B
Bob | Math | Dr. Smith | C
Watch Out: Composite Keys Masking Partial Dependencies
If your table has a composite primary key and you see repeated values in non-key columns when sorting by one part of the key, you likely have a partial dependency. For example, the same instructor appears for every row with the same course_id. That's a red flag.
Production Insight
Partial dependencies waste storage and cause update anomalies.
If a student's name changes, you must update every enrollment row for that student — easy to miss.
2NF ensures one fact updates only one row. The JOIN penalty is negligible compared to consistency gains.
Key Takeaway
2NF = 1NF + no partial dependencies.
If a column depends on only part of a composite key, extract it.
The rule: every non-key attribute must describe the entire key.
Check for 2NF Violation
IfPrimary key is a single column.
UseIf primary key is single, partial dependency cannot exist. Table is automatically in 2NF if it's in 1NF.
IfComposite key exists. Pick a non-key column. Does its value depend on only part of the key?
UseYes: Violation. Decompose into separate tables for each partial key component.
IfComposite key exists, but every non-key column depends on the whole key.
UsePasses 2NF. Move to 3NF.

Third Normal Form (3NF): Remove Transitive Dependencies

A table is in 3NF if it is in 2NF and no non-key column is transitively dependent on the primary key. Transitive dependency means a non-key column depends on another non-key column, which in turn depends on the primary key. For example, in an Employees table with columns: EmployeeID (PK), DepartmentID, DepartmentName, DepartmentLocation. DepartmentName depends on DepartmentID, not directly on EmployeeID. So if DepartmentName changes, you must update every employee in that department. To fix, split into Departments (DepartmentID, DepartmentName, DepartmentLocation) and Employees (EmployeeID, DepartmentID).

Think of it this way: if you can derive a column's value from another non-key column, it's a transitive dependency. In production, these are insidious because they seem natural. "Of course department location depends on department name," you might think. But the rule is: all non-key columns must describe the key (EmployeeID), not each other.

I once debugged a payroll system where an employee's tax bracket was stored alongside their department. The tax bracket actually depended on the department, not the employee. When the tax bracket changed for a department, the update had to hit every employee row. A single missed row meant some employees had the wrong tax withheld — that's a real-money bug.

forge_3nf.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
-- Table violating 3NF (but in 2NF)
CREATE TABLE employees_2nf (
    employee_id INT PRIMARY KEY,
    department_id INT,
    department_name VARCHAR(50),  -- depends on department_id, not on employee_id
    department_location VARCHAR(50)  -- also depends on department_id
);

-- Fix by decomposing:
CREATE TABLE departments (
    department_id INT PRIMARY KEY,
    department_name VARCHAR(50),
    department_location VARCHAR(50)
);

CREATE TABLE employees_3nf (
    employee_id INT PRIMARY KEY,
    department_id INT,
    FOREIGN KEY (department_id) REFERENCES departments(department_id)
);

-- Insert sample
INSERT INTO departments (department_id, department_name, department_location) VALUES (10, 'Engineering', 'Building A'), (20, 'HR', 'Building B');
INSERT INTO employees_3nf (employee_id, department_id) VALUES (1, 10), (2, 10), (3, 20);

-- Query to get employee with department info:
SELECT e.employee_id, d.department_name, d.department_location
FROM employees_3nf e
JOIN departments d ON e.department_id = d.department_id;
Output
employee_id | department_name | department_location
-------------|-----------------|--------------------
1 | Engineering | Building A
2 | Engineering | Building A
3 | HR | Building B
Production Insight
Transitive dependencies cause the 'one-fact-many-places' problem.
Moving a department to another building requires updating every employee row — a classic batch mistake.
3NF moves department details to their own table, so one update fixes all.
Key Takeaway
3NF = 2NF + no transitive dependencies.
A non-key column should tell you only about the key, not about another non-key column.
If you see a column that describes another column, normalize it out.

Boyce-Codd Normal Form (BCNF): When 3NF Isn't Enough

BCNF is a stricter version of 3NF. A table is in BCNF if for every non-trivial functional dependency X → Y, X must be a superkey. This catches cases where a 3NF table still has anomalies because a non-key attribute determines part of the primary key. For example, a table with attributes (Student, Course, Instructor) where each instructor teaches only one course, but a course can have multiple instructors. The functional dependency Instructor → Course exists, but Instructor is not a superkey. This is in 3NF but violates BCNF. Decomposition required: separate tables (Instructor, Course) and (Student, Instructor).

BCNF violations are rare in practice but dangerous when they occur. I once worked on a university scheduling system where the 3NF table allowed an instructor to be assigned to two different courses — the application's validation caught it, but a direct SQL update bypassed the app. The result: a student had an instructor who supposedly taught two different courses in the same time slot. It was a BCNF violation: Instructor → Course but Instructor wasn't a key.

forge_bcnf.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
-- 3NF table that violates BCNF
CREATE TABLE assignments_3nf (
    student VARCHAR(50),
    course VARCHAR(50),
    instructor VARCHAR(50),
    PRIMARY KEY (student, course)
);
-- FD: instructor -> course (each instructor teaches exactly one course)
-- But instructor is not a superkey. Update anomaly: if an instructor changes course name, update all rows.

-- BCNF decomposition:
CREATE TABLE instructors (
    instructor VARCHAR(50) PRIMARY KEY,
    course VARCHAR(50)
);

CREATE TABLE enrollments_bcnf (
    student VARCHAR(50),
    instructor VARCHAR(50),
    PRIMARY KEY (student, instructor),
    FOREIGN KEY (instructor) REFERENCES instructors(instructor)
);

-- Insert data
INSERT INTO instructors (instructor, course) VALUES ('Dr. Smith', 'Math'), ('Dr. Jones', 'Physics');
INSERT INTO enrollments_bcnf (student, instructor) VALUES ('Alice', 'Dr. Smith'), ('Alice', 'Dr. Jones'), ('Bob', 'Dr. Smith');

-- Query: which student takes what course?
SELECT e.student, i.course
FROM enrollments_bcnf e
JOIN instructors i ON e.instructor = i.instructor;
Output
student | course
--------|--------
Alice | Math
Alice | Physics
Bob | Math
Production Insight
BCNF violations are subtle and often missed even by experienced DBAs.
They cause update anomalies that look like data corruption until the schema is examined.
If you use a surrogate key, you may unknowingly still have BCNF violations — always check functional dependencies on business keys.
Key Takeaway
BCNF = every determinant is a candidate key.
It's 3NF with the extra rule: no non-key attribute can determine a key attribute.
When in doubt, aim for BCNF unless performance testing says otherwise.

Denormalization: When Breaking the Rules Is a Conscious Choice

Normalization reduces redundancy but increases JOINs. In read-heavy systems with massive throughput, the cost of joining many tables can become a bottleneck. Denormalization is the intentional reintroduction of redundancy to optimize read performance. Common strategies: pre-joining columns into a reporting table, storing aggregated values (e.g., order total stored on order header), or using materialized views. The key is to document the trade-off and implement synchronization logic (triggers, application-level updates, or eventual consistency) to keep redundant data consistent.

Example: In an e-commerce dashboard that shows order summaries, you might store customer name directly in the order table to avoid a JOIN on every page load. You accept that if the customer changes their name, some reports may briefly show the old name until a batch job updates the order table.

But here's the hard part: every denormalization is a debt. It trades read speed for write complexity and data integrity risk. Before you denormalize, measure. If a JOIN costs 5ms at 1000 QPS, that's 5 seconds of additional database time per second — significant. But if it costs 50ms at 10 QPS, the trade-off is often not worth it. Always profile first.

forge_denormalization.sqlSQL
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
-- Denormalized schema for high-read performance
CREATE TABLE orders_denormalized (
    order_id INT PRIMARY KEY,
    customer_id INT,
    customer_name VARCHAR(100),  -- stored redundantly
    order_total DECIMAL(10,2),
    created_at TIMESTAMP
);

-- Trigger to keep customer_name in sync (simplified)
CREATE OR REPLACE FUNCTION sync_customer_name()
RETURNS TRIGGER AS $$
BEGIN
    UPDATE orders_denormalized
    SET customer_name = NEW.customer_name
    WHERE customer_id = NEW.customer_id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_sync_customer_name
AFTER UPDATE OF customer_name ON customers
FOR EACH ROW
EXECUTE FUNCTION sync_customer_name();
Output
-- On update of customer name, trigger updates all order rows for that customer.
-- We trade consistency for read speed, but keep a safety net.
Denormalization Is a Debt, Not a Free Lunch
Every piece of redundant data increases the chance of inconsistency. Always measure the actual performance problem (latency, throughput) before denormalizing. If a JOIN adds <5ms, the complexity of syncing redundant data is almost never worth it.
Production Insight
Denormalization without synchronization leads to the exact anomalies normalization was designed to prevent.
Always have a clear rollback plan — when the sync fails, data drifts.
The rule of thumb: normalize to 3NF/BCNF first, then denormalize only for proven bottlenecks.
Key Takeaway
Denormalization is a deliberate trade-off of write integrity for read speed.
Normalize first, profile the performance gap, denormalize with extreme caution.
Automated synchronization (triggers, events) is mandatory but not bulletproof.
● Production incidentPOST-MORTEMseverity: high

The Customer Address That Changed Everywhere Except the Right Place

Symptom
After a customer updated their shipping address, orders placed after the change reflected the new address, but old orders and support tickets still showed the old address. Reports became untrustworthy.
Assumption
The team assumed storing the address directly in every related table was fine — it made SELECT queries simpler and seemed fast enough.
Root cause
The schema was in 0NF. The customer address was repeated in orders, invoices, and support_tickets tables. An UPDATE on the customers table only changed one copy, leaving the rest stale.
Fix
Split the schema: create a customers table with customer_id as primary key, store address only there. In orders and other tables, reference customer_id as a foreign key. Then query by JOIN instead of reading a repeated column.
Key lesson
  • Never store the same fact in more than one place. If you do, updates become unreliable.
  • Normalization isn't academic — it's the difference between consistent data and silent corruption.
  • Before adding a column, ask: 'Can I derive this from existing data via a JOIN?' If yes, don't store it.
Production debug guideSpot the pattern, confirm with queries, then refactor4 entries
Symptom · 01
UPDATE on one row unexpectedly affects multiple rows (update anomaly).
Fix
Check if the column being updated is duplicated across many rows. Write a SELECT to count distinct values vs total rows. If they differ, you have redundancy. Decompose the table.
Symptom · 02
Cannot insert a new record because a required column depends on another missing entity (insert anomaly).
Fix
Identify if the table combines two different entities. Split into parent/child tables using foreign keys. Ensure the parent can exist independently.
Symptom · 03
Deleting a row causes loss of unrelated information (delete anomaly).
Fix
Examine whether the row contains data about two separate concepts. For example, deleting a student might delete their course info. Extract the course info into a separate table.
Symptom · 04
Same data stored in multiple tables with inconsistent values.
Fix
Run a cross-table comparison query: SELECT A.col, B.col FROM A JOIN B ON A.key = B.key WHERE A.col <> B.col. Flag all mismatches. Normalize the column into one source table.
★ Quick Normalization Health CheckRun these commands to detect common normalization violations fast. Adjust table/column names to your schema.
Repeating groups (e.g., phone1, phone2 columns)
Immediate action
Check column count > expected entity attributes.
Commands
SELECT column_name FROM information_schema.columns WHERE table_name = 'customers' AND column_name LIKE '%phone%';
SELECT COUNT(*) FROM customers WHERE phone2 IS NOT NULL;
Fix now
Create a 'phones' table with customer_id, phone_type, phone_number, and drop the duplicate columns.
Transitive dependency (non-key depends on another non-key)+
Immediate action
Identify non-key columns that could determine other non-key columns.
Commands
SELECT department_name, department_location FROM employees GROUP BY department_name, department_location;
Check if department_name is unique: SELECT department_name, COUNT(*) FROM employees GROUP BY department_name HAVING COUNT(*) > 1;
Fix now
Create a 'departments' table with department_id as PK, then reference it in employees.
Partial dependency (composite key, non-key depends on part of key)+
Immediate action
List all candidate keys and check dependencies.
Commands
SELECT course_id, instructor_name FROM enrollments GROUP BY course_id, instructor_name;
Check if instructor_name is same for all rows of a course_id: SELECT course_id, COUNT(DISTINCT instructor_name) FROM enrollments GROUP BY course_id;
Fix now
Move instructor_name to a 'courses' table keyed by course_id.
Normal Forms at a Glance
Normal FormEliminatesTypical CostProduction Example
1NFRepeating groups, non-atomic columnsOne extra table per repeating groupStoring phone numbers in a separate phones table instead of phone1, phone2 columns
2NFPartial dependencies (composite key)Decomposition into 3+ tables; additional JOINsSplitting enrollments table into students, courses, and enrollments
3NFTransitive dependenciesOne more table per transitive factMoving department details from employees to a departments table
BCNFDeterminants that are not candidate keysFurther decomposition; may increase table countSplitting assignments table into instructors (instructor → course) and enrollments

Key takeaways

1
Normalization is a progressive removal of redundancy
1NF (atomic), 2NF (full dependency), 3NF (no transitive), BCNF (every determinant is a key).
2
Each normal form eliminates a specific type of anomaly. Skipping any form leaves your data vulnerable to silent corruption.
3
Denormalization is a conscious trade-off for performance
normalize first, measure, then denormalize with automated sync.
4
The real-world rule
aim for 3NF/BCNF in OLTP systems; denormalize in OLAP/reporting only after proving the bottleneck.
5
Always think in functional dependencies
every column should tell you only about the key, the whole key, and nothing but the key.

Common mistakes to avoid

3 patterns
×

Confusing 2NF with 3NF: thinking only composite keys matter

Symptom
A table with a single-column primary key still has redundant data (e.g., department name repeating). Developers assume they're in 2NF but violate 3NF.
Fix
Identify transitive dependencies: columns that depend on another non-key column. Decompose into separate tables.
×

Over-normalizing to 4NF or 5NF without performance justification

Symptom
Schema has dozens of tables with many JOINs for simple queries. Application response time degrades under load.
Fix
Stop at BCNF in most cases. Only go further if you have a specific multi-valued dependency problem. Profile first.
×

Using denormalization as a default design instead of a last resort

Symptom
Redundant columns everywhere. Application code has to manually keep copies in sync — and frequently fails, causing data corruption.
Fix
Design normalized schema first. Only after measuring read bottlenecks and confirming the cost of JOINs is unacceptable, denormalize with automated sync mechanisms.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR
Explain the difference between 3NF and BCNF with a concrete example.
Q02SENIOR
When would you deliberately denormalize a database in production?
Q03JUNIOR
What is an update anomaly and how does normalization prevent it?
Q01 of 03SENIOR

Explain the difference between 3NF and BCNF with a concrete example.

ANSWER
BCNF is stricter: every functional dependency X → Y must have X as a superkey. A 3NF table can still have a functional dependency where X is not a superkey, as long as Y is a candidate key attribute. Example: a table (Student, Course, Instructor) with instructor → course (each instructor teaches one course). This is 3NF because course is part of the candidate key. But violates BCNF because instructor is not a superkey. Decompose into (Instructor, Course) and (Student, Instructor).
FAQ · 5 QUESTIONS

Frequently Asked Questions

01
What is the difference between 1NF and 2NF?
02
Can a table be in 3NF but not in BCNF?
03
Is it always bad to have a table that violates 3NF?
04
How do I identify a functional dependency?
05
What is the most common normalization mistake beginners make?
🔥

That's DBMS. Mark it forged?

6 min read · try the examples if you haven't

Previous
ACID Properties in DBMS
3 / 11 · DBMS
Next
Relational Algebra