
TDD — A $0.01 Floating-Point Error Cost $4,200 in Revenue

📍 Part of: Software Engineering → Topic 7 of 16
⚙️ Intermediate — basic CS Fundamentals knowledge assumed
In this tutorial, you'll learn
  • TDD is a design tool first, a bug-catching tool second — writing a test before implementation forces you to define the API from the caller's point of view, which consistently produces simpler, cleaner interfaces.
  • The Red phase is not optional or symbolic — if your test passes before you write any implementation, either the feature already exists or your test is broken. A test that never fails has never proven anything.
  • Refactor only happens while tests are green — the entire point is that your passing tests act as a safety net; if you refactor when tests are red, you're changing behaviour and fixing bugs at the same time, and you can't tell which caused the next failure.
Quick Answer
  • Core concept: Write a failing test before writing the implementation code
  • Red phase: Write a test that describes one behaviour — it must fail
  • Green phase: Write the minimum code to make that test pass
  • Refactor phase: Clean up the code with the green test as a safety net
  • Performance insight: Each cycle takes 2–10 minutes; industrial case studies at Microsoft and IBM reported 40–90% lower pre-release defect density for teams using TDD
  • Production insight: Skipping the Refactor phase guarantees code rot within weeks
  • Biggest mistake: Writing multiple tests before any implementation — you'll rewrite them all
Production Incident

The Month-Long Regression That TDD Would Have Caught in 10 Minutes

A payment processing team skipped TDD on a critical refactor and spent three weeks chasing a silent rounding bug that only surfaced in production.
Symptom: Orders over $100.00 were charged $0.01 less than expected. The discrepancy was within the acceptable rounding tolerance for individual transactions but accumulated to $4,200 in missing revenue over two weeks.
Assumption: The team assumed that since the code was 'just a refactor' of an existing discount calculation, writing tests after the fact was sufficient.
Root cause: The refactored applyBulkDiscount method used double multiplication with floating-point rounding that differed from the original BigDecimal logic. The original code used BigDecimal with HALF_UP rounding; the refactored version used double. No test caught the 0.01 discrepancy because tests were written after the change and confirmed the new (wrong) behaviour as correct.
Fix: Rewrite the discount method to use BigDecimal consistently. Add a TDD-driven test suite that specifies the exact rounding behaviour before touching the code. The test suite now includes edge cases: $100.00 exactly, $100.01, $99.99, and a bulk integration test that sums 1,000 random amounts and verifies the total matches the expected string representation.
Key Lesson
  • Any refactoring of financial logic requires TDD: first write the test that specifies the exact observable behaviour (total output), then change the implementation.
  • Floating-point errors are insidious: if you write tests after the fact, you validate the bug as a feature.
  • Always include a test that sums many small amounts and compares to a string-formatted expected value to catch accumulated rounding errors.
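The fix described above can be sketched roughly as follows. This is a minimal illustration, not the team's actual code: the class name BulkDiscount, the methods apply and sum, and the 10% discount above $100.00 are assumptions borrowed from the shopping-cart example later in this article. Only the use of BigDecimal with HALF_UP rounding comes from the incident description.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.List;

/** Hypothetical sketch: all names and the 10% / $100 rule are assumptions for illustration. */
final class BulkDiscount {

    // Built from strings so the values are exact decimals, never binary approximations
    private static final BigDecimal THRESHOLD  = new BigDecimal("100.00");
    private static final BigDecimal MULTIPLIER = new BigDecimal("0.90"); // 10% off

    private BulkDiscount() {}

    /** Discounts totals strictly above the threshold, then rounds HALF_UP to whole cents. */
    static BigDecimal apply(BigDecimal rawTotal) {
        BigDecimal result = rawTotal.compareTo(THRESHOLD) > 0
                ? rawTotal.multiply(MULTIPLIER)
                : rawTotal;
        return result.setScale(2, RoundingMode.HALF_UP);
    }

    /** Exact decimal sum; repeated double addition can drift, BigDecimal addition does not. */
    static BigDecimal sum(List<BigDecimal> amounts) {
        return amounts.stream().reduce(BigDecimal.ZERO, BigDecimal::add);
    }
}
```

A test written first would pin the boundary cases as strings, for example that apply on "100.01" yields "90.01" and that summing a thousand "0.10" amounts yields exactly "100.00". Comparing against toPlainString() keeps the expected value free of any floating-point tolerance.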
Production Debug Guide

What to do when your TDD suite fails in ways you didn't expect

Test passes locally but fails on CI: Check environment differences such as locale, timezone, JDK version, and file system encoding. Run mvn test -DskipTests=false on the exact CI image. Compare pom.xml dependencies for version mismatches.
Flaky test (sometimes passes, sometimes fails): Add thread dumps to the test output. Look for shared mutable state across tests (static variables, non-final singletons). Add @BeforeEach that resets all shared state. Check if the test depends on external resources (network, files) without proper retry or mock.
Test fails on the second run but not the first: Test order dependency. Use @TestMethodOrder(MethodOrderer.MethodName.class) to run tests in a fixed alphabetical order and reproduce the failure. Look for lingering data from a previous test (e.g., files, database records, static collections). Each test must clean up after itself.
Test fails only on a specific branch: Compare the test file between branches. Often a merge conflict resolution left an incorrect expectation. Use git diff to isolate the failing assertion. Check if the implementation changed in a way that invalidates the test's assumption.
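As one concrete case of the locale difference mentioned in the first row, here is a small sketch of how a formatting helper can behave differently on a developer machine and a CI image. PriceFormatter and its methods are hypothetical names used only for this illustration.

```java
import java.util.Locale;

/** Hypothetical helper showing why locale-sensitive code passes locally and fails on CI. */
final class PriceFormatter {

    private PriceFormatter() {}

    /** Fragile: the decimal separator follows the JVM's default locale. */
    static String formatWithDefaultLocale(double amount) {
        return String.format("%.2f", amount);
    }

    /** Deterministic: the caller states the locale, so every machine agrees. */
    static String format(double amount, Locale locale) {
        return String.format(locale, "%.2f", amount);
    }
}
```

With an explicit locale, 1234.5 formats as "1234.50" under Locale.US and "1234,50" under Locale.GERMANY. A test that asserts on formatWithDefaultLocale only passes where the default locale happens to match the expected string, which is a plausible reason for a local pass and a CI failure.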

Every developer has shipped code that worked perfectly on their machine and exploded in production. The usual culprit isn't bad intentions — it's writing code first and verifying it later, if at all. Test-Driven Development flips that script. It's a discipline practised by engineers at Google, Netflix, and Amazon not because it's trendy, but because it consistently produces code that is easier to change, easier to understand, and far less likely to blow up at 2am on a Friday.

The problem TDD solves is confidence. Without tests written up front, you're essentially guessing that your code is correct. As the codebase grows, that guess becomes less and less reliable. A small change to one class silently breaks three others, and you find out when a user files a bug report — not when you make the change. TDD forces you to define 'correct' in executable terms before you write a single line of logic, turning your test suite into a living specification that screams the moment reality diverges from expectation.

By the end of this article you'll understand exactly why TDD exists (not just what it is), how to execute the Red-Green-Refactor cycle on a real-world problem, how to avoid the three most common traps that make people give up on TDD early, and how to talk about it confidently in a technical interview.

The Red-Green-Refactor Cycle — The Heartbeat of TDD

TDD lives and dies by a three-step rhythm called Red-Green-Refactor. It's deceptively simple, but every word matters.

Red — Write a test that describes a single piece of behaviour your code doesn't have yet. Run it. It must fail. If it passes immediately, either the feature already exists or the test is broken. A passing test before any implementation is a red flag, not a green light.

Green — Write the minimum code required to make that test pass. Not clean code. Not clever code. The minimum. Seriously, return a hard-coded value if that's all it takes. The goal here is to get the test passing so you have a safety net for the next step.

Refactor — Now, with a green test as your safety net, clean up the implementation. Extract duplication, rename variables, simplify logic. Run the tests after every change. If they stay green, your refactoring is safe. This is the step most developers skip, and it's why their code rots.

The cycle typically takes 2–10 minutes per iteration. You're not writing a feature in one shot — you're stacking verified, small increments. Each green test is a permanent checkpoint you can always return to.

ShoppingCartTest.java · JAVA
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.BeforeEach;
import static org.junit.jupiter.api.Assertions.*;

/**
 * STEP 1 (RED): We write this test BEFORE ShoppingCart exists.
 * It describes exactly what we want: a cart that totals item prices
 * and applies a 10% discount when the total exceeds $100.
 *
 * Run this now and it won't even compile — that IS the red phase.
 */
class ShoppingCartTest {

    private ShoppingCart cart;

    @BeforeEach
    void setUp() {
        // Fresh cart before every test — tests must never share state
        cart = new ShoppingCart();
    }

    @Test
    void emptyCartHasZeroTotal() {
        // Simplest possible case — always start here
        assertEquals(0.0, cart.getTotal(), 0.001,
            "A brand new cart should have a total of exactly zero");
    }

    @Test
    void addingSingleItemUpdatesTotalCorrectly() {
        cart.addItem("Keyboard", 49.99);

        // We expect the total to equal the single item price — no tricks yet
        assertEquals(49.99, cart.getTotal(), 0.001,
            "Total should equal the price of the one item added");
    }

    @Test
    void addingMultipleItemsSumsAllPrices() {
        cart.addItem("Keyboard", 49.99);
        cart.addItem("Mouse",    29.99);
        cart.addItem("Monitor", 249.99);

        // 49.99 + 29.99 + 249.99 = 329.97
        assertEquals(329.97, cart.getTotal(), 0.001,
            "Total should be the sum of all added item prices");
    }

    @Test
    void totalAboveOneHundredDollarsReceivesTenPercentDiscount() {
        cart.addItem("Keyboard",  49.99);
        cart.addItem("Mouse",     29.99);
        cart.addItem("WebCam",    39.99);  // total = 119.97, triggers discount

        // 119.97 * 0.90 = 107.973
        assertEquals(107.973, cart.getTotal(), 0.001,
            "Orders over $100 should receive a 10% discount on the total");
    }

    @Test
    void cannotAddItemWithNegativePrice() {
        // Edge case: guard against bad data — the test documents this rule
        assertThrows(IllegalArgumentException.class,
            () -> cart.addItem("Broken Item", -5.00),
            "Adding an item with a negative price should throw IllegalArgumentException");
    }
}
▶ Output
// After writing ONLY the test file above, running the suite produces:
//
// COMPILATION ERROR:
// error: cannot find symbol
// symbol: class ShoppingCart
//
// This IS the Red phase. The test can't even compile because the class
// doesn't exist yet. That's correct. Now we move to Green.
⚠ Watch Out: A Test That Never Fails Is Worthless
If your test passes before you've written any implementation, it isn't testing anything. Always confirm your test fails for the RIGHT reason — 'class not found' or 'expected 107.97 but was 119.97' are good failures. 'Expected true but was true' means your assertion logic is broken.
📊 Production Insight
Teams that skip the Red phase often write tests that pass against old code and never catch regressions.
A bank's payment module passed all tests after a refactor, but the tests were written after the fact and had no assertions — they only verified no exceptions were thrown.
Rule: Always run the test before writing implementation; if it doesn't fail, it's not a real test.
🎯 Key Takeaway
Red phase must fail — it's your proof that the test can detect a missing feature.
Green phase requires minimal code — resist the urge to be clever.
Refactor is not optional — code rots without it.

Green Then Refactor — Writing the Implementation the TDD Way

With the tests written, now we build the ShoppingCart class. The TDD rule is ruthless: write only as much code as it takes to turn red tests green. No extra methods, no premature abstractions, no 'I'll need this later' code.

This constraint feels unnatural at first. You'll want to build the whole class in one shot. Resist it. The discipline of small steps is exactly what makes TDD valuable. Each green test is evidence that a specific piece of behaviour works. Stack enough evidence and you have a reliable system.
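To make "small steps" concrete, here is a rough sketch of what the first two Green steps might look like if each test were taken one at a time. The class names CartGreenStepOne and CartGreenStepTwo are hypothetical snapshots invented for this illustration; in a real session both would simply be successive versions of ShoppingCart.

```java
/** Hypothetical snapshot 1: just enough to pass emptyCartHasZeroTotal. */
final class CartGreenStepOne {

    double getTotal() {
        return 0.0; // hard-coded on purpose; no test demands more yet
    }
}

/** Hypothetical snapshot 2: the single-item test forces real state into the class. */
final class CartGreenStepTwo {

    private double total = 0.0;

    void addItem(String name, double price) {
        total += price; // name is ignored until some test needs it
    }

    double getTotal() {
        return total;
    }
}
```

The hard-coded return in the first snapshot looks silly, but it proves the test harness works end to end. The second test is what forces the generalisation, so no line of production code exists without a test that demanded it.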

Once all five tests are green, the Refactor step begins. The code below shows the result after that step: the discount logic, first written inline in getTotal(), has been extracted into a private method with a meaningful name. The tests don't change and stay green throughout, but the code becomes easier to read and modify. That's the payoff.

This is also where TDD diverges from 'writing tests after'. When you write tests after the fact, you tend to write tests that confirm what you already built. When you write them first, you write tests that describe what the software should do, which is a much stronger guarantee.

ShoppingCart.java · JAVA
import java.util.ArrayList;
import java.util.List;

/**
 * STEP 2 (GREEN): Minimum implementation to pass all five tests.
 * Then STEP 3 (REFACTOR): Clean it up with the tests as a safety net.
 */
public class ShoppingCart {

    // Each item is a small record: name + price. No over-engineering.
    private record CartItem(String name, double price) {}

    private final List<CartItem> items = new ArrayList<>();

    // Discount constants are named — magic numbers are a maintenance nightmare
    private static final double DISCOUNT_THRESHOLD  = 100.00;
    private static final double DISCOUNT_MULTIPLIER = 0.90;   // 10% off

    /**
     * Adds an item to the cart.
     * @throws IllegalArgumentException if price is negative (test 5 requires this)
     */
    public void addItem(String name, double price) {
        if (price < 0) {
            // Guard clause: fail fast and loud rather than silently corrupt the total
            throw new IllegalArgumentException(
                "Item price cannot be negative. Received: " + price
            );
        }
        items.add(new CartItem(name, price));
    }

    /**
     * Returns the cart total, with a 10% discount applied if the raw
     * sum exceeds $100. This is the core business rule our tests define.
     */
    public double getTotal() {
        // Stream the items and sum their prices — readable and concise
        double rawTotal = items.stream()
                               .mapToDouble(CartItem::price)
                               .sum();

        // REFACTOR: the discount decision is now in a named private method,
        // making getTotal() read like a sentence, not a maths puzzle
        return applyBulkDiscountIfEligible(rawTotal);
    }

    /**
     * Private helper extracted during the Refactor step.
     * The name describes the INTENT — not the mechanics.
     */
    private double applyBulkDiscountIfEligible(double rawTotal) {
        if (rawTotal > DISCOUNT_THRESHOLD) {
            return rawTotal * DISCOUNT_MULTIPLIER;
        }
        return rawTotal;
    }
}

// ─── Now re-run the test suite ───────────────────────────────────────────────

/**
 * STEP 3: RUN THE TESTS AGAIN TO CONFIRM REFACTOR DIDN'T BREAK ANYTHING
 *
 * Run: mvn test   OR   ./gradlew test   OR use your IDE's test runner
 */
▶ Output
// Console output after running ShoppingCartTest with the implementation above:
//
// [INFO] -------------------------------------------------------
// [INFO] T E S T S
// [INFO] -------------------------------------------------------
// [INFO] Running ShoppingCartTest
// [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0
// [INFO]
// [INFO] BUILD SUCCESS
//
// All 5 tests pass. The Red-Green-Refactor cycle is complete.
// Every business rule — zero total, single item, multi-item sum,
// discount threshold, and negative-price guard — is now PROVEN.
💡Pro Tip: Name Your Tests Like Sentences
Test method names like addingSingleItemUpdatesTotalCorrectly are your free documentation. When a test fails in CI, that name is the first thing a teammate reads at 3am. Treat it like a sentence in a spec document, not a code identifier. The pattern 'given_when_then' or plain English both work — just be consistent.
📊 Production Insight
A common mistake is to write all production code at once and then run the tests. That defeats the purpose — you lose the step-by-step safety net.
If you write 50 lines of code and then run tests, you don't know which 5 lines introduced the bug.
Rule: One test at a time, one green bar at a time. Never write implementation for a test you haven't seen fail.
🎯 Key Takeaway
Green phase means minimum code — nothing more.
Refactor only when all tests pass — breaking green is a sign you're fixing bugs and refactoring simultaneously.
Tests are the safety net: they let you change internal structure with confidence.
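For contrast, this is a sketch of what the cart could have looked like at the end of the Green phase, before any cleanup. ShoppingCartBeforeRefactor is a hypothetical name for that intermediate state; the behaviour is meant to match the five tests, only the structure differs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical snapshot of the cart at Green, before the Refactor step. */
final class ShoppingCartBeforeRefactor {

    private final List<Double> prices = new ArrayList<>();

    void addItem(String name, double price) {
        if (price < 0) {
            throw new IllegalArgumentException(
                "Item price cannot be negative. Received: " + price);
        }
        prices.add(price);
    }

    double getTotal() {
        double rawTotal = 0.0;
        for (double price : prices) {
            rawTotal += price;
        }
        // Magic numbers, and the business rule is tangled into the summing code
        if (rawTotal > 100.00) {
            return rawTotal * 0.90;
        }
        return rawTotal;
    }
}
```

Refactoring this version means naming the two constants and extracting the discount decision into its own method, re-running the suite after each move. Because the tests only call addItem and getTotal, none of them need to change.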

TDD vs Writing Tests After — When Each Approach Actually Makes Sense

TDD often gets presented as 'always write tests first or you're doing it wrong.' That's doctrine, not engineering. Let's be honest about the trade-offs.

TDD shines brightest when you're building business logic — validation rules, calculation engines, state machines, algorithms. Any code where the behaviour is more important than how it's structured is a perfect TDD candidate. The test becomes a precise, executable spec.

TDD is harder to apply to UI components, database integrations, and exploratory spikes where you're still figuring out the shape of the solution. Forcing TDD on a piece of code you don't yet understand often produces tests that are rewritten three times before the design settles. In those cases, many experienced engineers will prototype first, then write tests once the design stabilises.

Writing tests after the fact isn't useless — it's better than no tests. But it has a known weakness: you tend to write tests that confirm what you built rather than tests that challenge it. TDD inverts this by forcing you to think about failure modes before you're emotionally invested in the implementation.

The pragmatic position: use TDD as your default for logic-heavy code, and apply post-implementation tests where TDD genuinely slows you down — then go back and tighten those tests once the design is stable.

PasswordValidatorTest.java · JAVA
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;
import static org.junit.jupiter.api.Assertions.*;

/**
 * Parameterized TDD example — password validation rules.
 *
 * Business rules defined BEFORE PasswordValidator is written:
 *   1. Must be at least 8 characters
 *   2. Must contain at least one uppercase letter
 *   3. Must contain at least one digit
 *   4. Must not contain spaces
 *
 * @CsvSource lets us test many cases with one test method —
 * ideal when the same rule applies to many different inputs.
 */
class PasswordValidatorTest {

    private final PasswordValidator validator = new PasswordValidator();

    @ParameterizedTest(name = "''{0}'' should be valid={1} because: {2}")
    @CsvSource({
        // password,          expectedValid, reason
        "Secure99,            true,  meets all four rules",
        "short1A,             false, only 7 characters; fails length rule",
        "alllowercase1,       false, no uppercase; fails case rule",
        "ALLUPPERCASE1,       true,  no lowercase; the four rules do not require one",
        "NoDigitsHere,        false, missing a digit",
        "Has Space1A,         false, contains a space",
        "Passw0rd,            true,  exactly 8 chars with all required types"
    })
    void passwordMeetsAllValidationRules(
            String password, boolean expectedValid, String reason) {

        boolean actualResult = validator.isValid(password.trim());

        // The failure message uses 'reason' so a failing test self-documents
        assertEquals(expectedValid, actualResult,
            "Password '" + password.trim() + "': " + reason);
    }
}

// ─── Minimal Green implementation ────────────────────────────────────────────

// PasswordValidator.java
class PasswordValidator {

    private static final int    MINIMUM_LENGTH    = 8;
    private static final String UPPERCASE_PATTERN = ".*[A-Z].*";
    private static final String DIGIT_PATTERN     = ".*[0-9].*";

    public boolean isValid(String password) {
        if (password == null)             return false;
        if (password.length() < MINIMUM_LENGTH) return false;
        if (password.contains(" "))       return false;  // no spaces
        if (!password.matches(UPPERCASE_PATTERN)) return false;
        if (!password.matches(DIGIT_PATTERN))     return false;
        return true;
    }
}
▶ Output
// Running PasswordValidatorTest:
//
// [INFO] Running PasswordValidatorTest
// [INFO] 'Secure99' should be valid=true because: meets all four rules ✓
// [INFO] 'short1A' should be valid=false because: only 7 characters ✓
// [INFO] 'alllowercase1' should be valid=false because: no uppercase ✓
// [INFO] 'ALLUPPERCASE1' should be valid=true because: no lowercase is required ✓
// [INFO] 'NoDigitsHere' should be valid=false because: missing a digit ✓
// [INFO] 'Has Space1A' should be valid=false because: contains a space ✓
// [INFO] 'Passw0rd' should be valid=true because: exactly 8 chars ✓
//
// Tests run: 7, Failures: 0, Errors: 0, Skipped: 0
// BUILD SUCCESS
//
// Notice: the 4th row forced us to CLARIFY the spec.
// Is a lowercase letter required? TDD exposed an ambiguous requirement
// before it became a production bug.
🔥Interview Gold: TDD as a Design Tool
When an interviewer asks about TDD, most candidates talk about catching bugs. The stronger answer is: TDD is primarily a design tool. Writing a test first forces you to define the public API before the implementation — you can't write cart.getTotal() in a test without deciding its name, return type, and caller interface. That design pressure consistently produces cleaner APIs than writing implementation first.
📊 Production Insight
A startup used TDD for their core billing engine but skipped it for the UI. The billing engine had near-zero bugs; the UI had constant regressions. The difference wasn't test coverage — it was the order.
TDD forces you to design the API contract before getting attached to implementation. That's why the billing engine's interface never needed breaking changes.
Rule: If the behaviour is complex and the API is public, TDD is your default. If you're exploring (spike), write tests after the API stabilises.
🎯 Key Takeaway
TDD is a design tool, not just a testing technique.
Write tests first for logic-heavy code; prototype first for exploratory work.
Post-implementation tests confirm what you built; pre-implementation tests define what you should build.

Why TDD Fails in Practice — The Cultural and Technical Traps

Many teams adopt TDD with enthusiasm and abandon it within two sprints. The reasons are rarely technical. They're cultural and habitual.

Trap 1: All-or-nothing mindset. Teams decide 'we will do TDD on everything' and immediately hit friction with legacy code, UI, and database layers. When they can't test-drive a stored procedure, they declare TDD broken. The fix: carve out a 'TDD zone' — new business logic only. Legacy code gets covered later with characterisation tests.

Trap 2: Tests as a checkbox. When management requires code coverage numbers, developers write tests that exercise code but verify nothing meaningful. A test that calls a method and doesn't assert anything is worse than no test — it creates false confidence. TDD explicitly prevents this because you must see a Red phase first.

Trap 3: No refactoring step. Teams do the Red and Green phases but skip Refactor because 'the tests pass, the code works.' After three sprints, the codebase becomes a tangle of duplicated logic and unclear names. The tests still pass, but the code is hard to change — exactly the problem TDD was supposed to solve.

Trap 4: Writing tests that are too big. A single test that covers an entire use case is fragile and slow. One change in a different part of the flow breaks it, and you spend 30 minutes debugging which assertion failed. TDD's one-behaviour-per-test rule prevents this, but it requires discipline.

AvoidingCommonTraps.java · JAVA
// ─── Trap 1: All-or-nothing — carve a TDD zone ──────────────

// Antipattern: Try to test a legacy class with 500 lines and 8 dependencies
// @Test void testLegacyMonster() { ... }  // This will be painful

// Pattern: Write a characterisation test first (capture current behaviour)
// Then refactor with the safety net of that test
// Only apply TDD to new code paths

// ─── Trap 2: Tests as a checkbox — assert something meaningful ─

// Antipattern: Test that doesn't assert
// @Test void testGetTotal() {
//     ShoppingCart cart = new ShoppingCart();
//     cart.addItem("test", 10);
//     cart.getTotal();  // no assertion!
// }

// Pattern: Every test must have at least one assertion
// @Test void testGetTotal() {
//     ShoppingCart cart = new ShoppingCart();
//     cart.addItem("test", 10);
//     assertEquals(10.0, cart.getTotal(), 0.001);
// }

// ─── Trap 3: Skipping Refactor ────────────────────────────────

// After Green, look for:
// - Magic numbers: DISCOUNT_THRESHOLD = 100.00 instead of hardcoded 100
// - Duplicated logic: extract to private method
// - Unclear names: `applyBulkDiscountIfEligible` vs `x`
// Run tests after every rename or extract

// ─── Trap 4: Tests that are too big ────────────────────────────

// Antipattern: One test for the whole workflow
// @Test void testFullCheckout() { ... }  // 50 lines, 6 assertions

// Pattern: One test per behaviour
// @Test void testCartTotalForOneItem() { ... }
// @Test void testCartTotalForMultipleItems() { ... }
// @Test void testDiscountAppliedWhenOverThreshold() { ... }
▶ Output
// Applying these patterns reduces test fragility and maintenance cost.
// The key insight: TDD is sustainable only when you keep tests small,
// meaningful, and tied to a single behaviour.
Mental Model
The TDD Habit Loop
TDD is a habit, not a technique. You build it by pairing the cue (new behaviour needed) with a tiny reward (green bar) repeated every few minutes.
  • Cue: You need to implement a new piece of behaviour.
  • Routine: Write a test (Red), write minimal code (Green), clean up (Refactor).
  • Reward: Green bar — a dopamine hit of verified progress.
  • Break the loop if: You find yourself writing tests without a Red phase (no cue), or skipping Refactor (no cleanup reward).
  • Strongest habit: Pair with a timer. 5 minutes per cycle. If you're still in Red after 5 minutes, the test is too big.
📊 Production Insight
A team at a financial services company adopted TDD across all new microservices. Within two months, two services were abandoned — because the tests were too coupled to the implementation. Every refactor broke 30 tests. The team blamed TDD, but the real problem was testing implementation details instead of behaviour.
Rule: If a refactor breaks more than 5 tests, you're testing internals, not behaviour. TDD should make refactoring easier, not harder.
🎯 Key Takeaway
TDD fails when it's applied dogmatically to everything.
TDD fails when tests have no assertions.
TDD fails when you skip the Refactor step.
TDD fails when tests are too big.
Avoid these traps by keeping tests small, behaviour-focused, and refactoring with a green safety net.
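The point about testing behaviour rather than internals can be illustrated with a small sketch. Everything here is hypothetical: the Cart interface, the two implementations, and the CartBehaviourCheck helper are invented for this example and are not part of the ShoppingCart code above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical public contract: the only thing a behaviour-focused test may touch. */
interface Cart {
    void addItem(String name, double price);
    double getTotal();
}

/** One internal design: keep every price and sum on demand. */
final class ListBackedCart implements Cart {
    private final List<Double> prices = new ArrayList<>();

    @Override public void addItem(String name, double price) { prices.add(price); }

    @Override public double getTotal() {
        return prices.stream().mapToDouble(Double::doubleValue).sum();
    }
}

/** A refactored design: keep only a running total. */
final class RunningTotalCart implements Cart {
    private double total;

    @Override public void addItem(String name, double price) { total += price; }

    @Override public double getTotal() { return total; }
}

/** The check knows nothing about lists or fields, so it survives the refactor unchanged. */
final class CartBehaviourCheck {
    private CartBehaviourCheck() {}

    static boolean sumsAllPrices(Cart cart) {
        cart.addItem("Keyboard", 49.99);
        cart.addItem("Mouse", 29.99);
        return Math.abs(cart.getTotal() - 79.98) < 0.001;
    }
}
```

The same check passes against both implementations. A test that instead asserted on the size of an internal list would break the moment the list was replaced by a running total, even though no caller could observe the difference.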

TDD and Legacy Code — How to Introduce Tests Without Rewriting Everything

You land on a team with 200,000 lines of untested code. TDD feels impossible because the system wasn't designed for testability. You have three options: rewrite (expensive and risky), add tests after every change (better but slow), or use characterisation tests to capture the current behaviour as a safety net, then apply TDD to new code.

Characterisation tests are written after the fact but with the TDD mindset: you run the code, observe the output, and write a test that asserts that output. This gives you a safety net for refactoring. Once you have a characterisation test, you can refactor the implementation with confidence — and then write new features using TDD.

The Seam technique from Michael Feathers' 'Working Effectively with Legacy Code' is the practical tool. Find a seam — a place where you can intercept behaviour (a virtual method, an interface, a dependency injection point). Write a test that exercises the code through that seam. Now you have a testable unit. Over time, you extract seams, cover them with tests, and gradually introduce TDD for changes.

The 10% rule: every change to a legacy file must bring a further 10% of that file under test (characterisation or TDD), counting only code that was not already covered. After 10 such changes the file is fully covered. It starts slowly, but it lets you introduce TDD into a legacy codebase without a risky rewrite.

LegacyCodeTest.java · JAVA
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.*;

/**
 * Characterisation test — captures current behaviour of a legacy class
 * that has no tests. We run the actual method and assert what it returns NOW.
 * This test becomes the safety net for future refactoring.
 */
class LegacyDiscountCalculatorTest {

    private final LegacyDiscountCalculator calc = new LegacyDiscountCalculator();

    @Test
    void appliesDiscountForAmountAbove100() {
        // We run the legacy code and capture the actual output
        double result = calc.calculate(150.00);
        // Based on observing the output, we assert the current behaviour
        assertEquals(135.00, result, 0.001,
            "Legacy behaviour: 10% discount on amounts over $100");
    }

    @Test
    void doesNotApplyDiscountAtExactly100() {
        double result = calc.calculate(100.00);
        assertEquals(100.00, result, 0.001,
            "Legacy behaviour: exactly $100 gets no discount? Let's verify");
    }

    // Once these characterisation tests pass, we can refactor the implementation
    // with confidence — and then write new features using TDD on the refactored code
}
▶ Output
// Running LegacyDiscountCalculatorTest:
// [INFO] Tests run: 2, Failures: 0, Errors: 0
//
// If a test fails, the value we asserted does not match what the legacy code
// actually does. Re-run the code, observe the real output, and correct the
// assertion: the failure is in our recording, not necessarily a bug.
// Characterisation tests document current behaviour; they don't judge correctness.
💡The Seam Technique in Practice
To apply TDD to legacy code, find where you can break a dependency. For a class that creates its own database connection, add a constructor parameter to accept the connection. That's a seam. Now you can write a test that passes a mock connection and verify behaviour. Over time, you'll refactor the entire class to be testable, and TDD becomes natural for new features.
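A minimal sketch of a constructor seam follows. To keep it self-contained it uses a tax-rate lookup instead of a real database connection; TaxRateSource, DatabaseTaxRateSource, and LegacyInvoiceCalculator are all hypothetical names, and the stand-in for the database simply throws.

```java
/** Hypothetical dependency that the legacy class used to create internally. */
interface TaxRateSource {
    double rateFor(String region);
}

/** Stand-in for the real lookup (imagine a database call here). */
final class DatabaseTaxRateSource implements TaxRateSource {
    @Override
    public double rateFor(String region) {
        throw new UnsupportedOperationException("would hit the database");
    }
}

final class LegacyInvoiceCalculator {

    private final TaxRateSource rates;

    /** Existing callers keep using the no-arg constructor, so nothing else changes. */
    LegacyInvoiceCalculator() {
        this(new DatabaseTaxRateSource());
    }

    /** The seam: a test can pass a fake source here instead of touching the database. */
    LegacyInvoiceCalculator(TaxRateSource rates) {
        this.rates = rates;
    }

    double totalWithTax(double net, String region) {
        return net * (1.0 + rates.rateFor(region));
    }
}
```

A test can now construct the calculator with a lambda such as region -> 0.25 and assert that a net amount of 200.0 becomes 250.0, with no database involved. Production code paths are untouched because the original constructor still wires in the real source.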
📊 Production Insight
A healthcare startup had a billing engine with zero tests and a critical bug leading to underbilling. Management wanted to rewrite it with TDD. The rewrite took 6 months and introduced new bugs. The better approach: characterisation tests to capture the correct (and incorrect) behaviours, then refactor incrementally, applying TDD to each new feature.
Rule: Don't rewrite to introduce TDD. Use characterisation tests as a safety net, then apply TDD from the next feature forward. It takes longer initially but is safer.
🎯 Key Takeaway
Legacy code can be tamed with characterisation tests.
Seams — virtual methods, interfaces, parameters — make code testable.
The 10% rule: each change to a legacy file must add at least 10% test coverage.
Don't rewrite everything. Apply TDD to new code only, and build a safety net for old code.
Aspect | Test-Driven Development (TDD) | Testing After Implementation
When tests are written | Before the implementation exists | After implementation is complete
Primary benefit | Forces clear API design up front | Confirms existing behaviour works
Design influence | Tests shape the production API | Tests conform to whatever was built
Catching bad requirements | Early: test exposes ambiguity before coding | Late: ambiguity is baked into implementation
Refactoring safety | High: tests are the safety net for cleanup | Moderate: depends on test coverage quality
Learning curve | Steep initially; gets fast with practice | Familiar: mirrors how most developers start
Risk of over-testing | Lower: tests stay focused on behaviour | Higher: temptation to test implementation details
Best suited for | Business logic, algorithms, state machines | UI components, exploratory prototypes, spikes
Test quality tendency | Tests challenge the design | Tests confirm the design

🎯 Key Takeaways

  • TDD is a design tool first, a bug-catching tool second — writing a test before implementation forces you to define the API from the caller's point of view, which consistently produces simpler, cleaner interfaces.
  • The Red phase is not optional or symbolic — if your test passes before you write any implementation, either the feature already exists or your test is broken. A test that never fails has never proven anything.
  • Refactor only happens while tests are green — the entire point is that your passing tests act as a safety net; if you refactor when tests are red, you're changing behaviour and fixing bugs at the same time, and you can't tell which caused the next failure.
  • TDD is not universally applicable — use it as your default for logic-heavy code, but prototype first for exploratory work where the design is unknown, then write tests once the API stabilises.
  • Introduce TDD into legacy code using characterisation tests and the 10% rule: every change adds at least 10 percentage points of test coverage to the changed file. If each round covers previously untested code, roughly ten changes bring the file close to full coverage.

⚠ Common Mistakes to Avoid

    Writing multiple tests before writing any implementation
    Symptom

    You write 10 failing tests, then write implementation, then discover the design was wrong for tests 7-10 and rewrite everything. You waste hours rewriting tests instead of building features.

    Fix

    Strictly one test at a time. Write one test, make it green, refactor, then write the next. The cycle is per-test, not per-feature. The design emerges incrementally.

    Testing implementation details instead of behaviour
    Symptom

    You test that a private method was called, or that a specific internal data structure was used. These tests break every time you refactor, making TDD feel like a burden.

    Fix

    Only test observable behaviour — public method inputs and outputs. If you can refactor the internals without changing a single test, your tests are targeting behaviour correctly. Use mocks sparingly.

    Skipping the Refactor step
    Symptom

    After a few weeks the code is green but unreadable — duplicated logic, unclear names, long methods — because every Green phase added code but nobody ever cleaned up.

    Fix

    Treat Refactor as non-negotiable. Set a rule: no new Red test until the last Green test's code is clean. The tests only protect you if you actually use them as a net while you refactor. Dedicate 20% of development time to refactoring.

    Using TDD for everything (including exploratory code)
    Symptom

    You spend more time writing tests for code that will be thrown away than actually exploring the solution. TDD becomes a bottleneck and you abandon it.

    Fix

    Reserve TDD for logic-heavy production code. For exploratory spikes, write code first, then write tests once the design stabilises. Know when to use TDD and when to prototype.
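The second mistake above, testing implementation details instead of behaviour, is easier to see in code. This is a minimal sketch with a hypothetical `Cart` class (its name, methods, and internal list are assumptions made for illustration). The test touches only public inputs and outputs, so the internal storage could change without breaking it.

```python
from decimal import Decimal


class Cart:
    """Hypothetical cart, used only to illustrate the point."""

    def __init__(self):
        # Internal detail: a list of (price, quantity) pairs today,
        # but it could become a dict keyed by SKU after a refactor.
        self._lines = []

    def add(self, unit_price: str, quantity: int = 1) -> None:
        self._lines.append((Decimal(unit_price), quantity))

    def total(self) -> Decimal:
        return sum((price * qty for price, qty in self._lines), Decimal("0"))


def test_total_sums_price_times_quantity():
    # Behavioural test: asserts on the public result only.
    cart = Cart()
    cart.add("19.99", 2)
    cart.add("5.01")
    assert cart.total() == Decimal("44.99")

# A fragile alternative would assert on cart._lines directly.
# That test breaks the moment the internal storage changes,
# even though the observable behaviour is identical.
```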

Interview Questions on This Topic

  • Q: What is the Red-Green-Refactor cycle and what is the specific purpose of each phase? (Mid-level)
    Red: Write a test that describes a single piece of desired behaviour. It must fail. If it passes without implementation, either the feature already exists or the test is invalid. Green: Write the minimum code to pass that test. No extra logic — get the bar green as fast as possible. Refactor: With the test passing, clean up the implementation (rename, extract, restructure). The critical insight is that Refactor only happens while tests are green — if you refactor while tests are red, you're changing behaviour and fixing bugs simultaneously, and you can't attribute the next failure to either change. That's the detail that separates engineers who've actually done TDD from those who've only read about it.
  • Q: How does TDD improve software design, beyond just catching bugs? (Senior)
    TDD improves design by forcing you to define the public API from the caller's perspective before you're invested in any implementation detail. When you write cart.getTotal() in a test, you're deciding the method name, return type, and whether it takes parameters — all before writing the logic. This caller-first pressure consistently produces simpler, more cohesive interfaces because you can't hide behind implementation complexity. Additionally, TDD surfaces ambiguous requirements early (e.g., 'should the discount apply at exactly $100?') before they become expensive bugs baked into the codebase.
  • Q: When would you choose NOT to use TDD? (Senior)
    Honest engineers acknowledge that TDD slows down exploratory work where the design is unknown. For example, when prototyping a new UI component or exploring a library's API, forcing TDD at the unit level adds friction without proportional benefit. Similarly, database schema migrations, configuration-heavy code, and integration layers are often better tested at the integration or end-to-end level rather than forcing a unit-test-first approach. The pragmatic rule is: use TDD as default for business logic and algorithms; prototype first for exploratory work, then write tests once the API stabilises.
  • Q: What's the difference between TDD and BDD (Behaviour-Driven Development)? (Mid-level)
    TDD is a development technique — you write unit tests in code before writing implementation. BDD is a collaboration methodology that extends TDD by writing tests in a near-natural language (like Cucumber's Gherkin syntax) so that non-technical stakeholders can read and contribute to the test specification. BDD tests are typically higher-level and describe user-facing behaviour; TDD tests can be very granular and technical. Many teams use both: BDD for acceptance criteria (e.g., 'Given a cart with items over $100, When the total is calculated, Then a 10% discount is applied') and TDD for the unit-level implementation (e.g., 'test that the discount method returns 90% of the raw total').
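The Red-Green-Refactor answer above, and the "should the discount apply at exactly $100?" ambiguity, can be sketched with the discount example. Everything here is an assumption for illustration: the function name `discounted_total`, the decision that the discount applies only above $100.00, and half-up rounding to whole cents. The sketch shows the state after the cycle has run for each test, one test at a time.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumption: the discount applies strictly above the threshold, not at it.
DISCOUNT_THRESHOLD = Decimal("100.00")
DISCOUNT_RATE = Decimal("0.10")


def discounted_total(raw_total: Decimal) -> Decimal:
    """Apply the 10% discount above the threshold, rounded to whole cents."""
    if raw_total > DISCOUNT_THRESHOLD:
        raw_total = raw_total * (1 - DISCOUNT_RATE)
    # Decimal avoids binary floating-point drift; the rounding mode is explicit.
    return raw_total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Red: each test below was written before the code that satisfies it.
def test_total_over_threshold_gets_ten_percent_off():
    assert discounted_total(Decimal("200.00")) == Decimal("180.00")


def test_total_at_exactly_threshold_is_not_discounted():
    # Writing this test forces the boundary question to be answered.
    assert discounted_total(Decimal("100.00")) == Decimal("100.00")


def test_result_is_rounded_to_whole_cents():
    # 100.05 * 0.90 = 90.045, which rounds half-up to 90.05.
    assert discounted_total(Decimal("100.05")) == Decimal("90.05")
```

Extracting the two constants is the kind of change that belongs in the Refactor step: with all three tests green, renaming or restructuring is safe to verify in seconds.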

Frequently Asked Questions

Does TDD mean I have to write tests for every single line of code?

No. TDD means you write a test for every piece of behaviour you want your system to exhibit — not every line of implementation. One well-written test can cover a dozen lines of logic. The goal is 100% coverage of your requirements, not 100% line coverage, which is a very different thing.

Is TDD worth the extra time it takes?

The common objection is that TDD is slow. A widely cited case study of industrial teams at Microsoft and IBM reported that TDD teams spent roughly 15-35% more time on initial development but saw 40-90% lower pre-release defect density than comparable projects that did not use TDD. Time saved in debugging, regression, and production incidents can pay back the upfront investment, though how quickly depends on the team and the codebase.

How do I start using TDD on a legacy codebase with no tests?

Don't attempt a rewrite. Instead, use characterisation tests: run the existing code, observe the output, and write tests that assert that output. This gives you a safety net. Then apply the 'seam' technique: find points where you can break dependencies (e.g., add constructor injection for a database dependency) to isolate code for testability. Finally, apply the 10% rule: every time you modify a legacy file, add tests that raise that file's coverage by at least 10 percentage points. If each round covers previously untested code, roughly ten modifications bring the file close to full coverage.
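The seam technique can be sketched with constructor injection. The names `InvoiceService`, `FakeRateLookup`, and `rate_percent_for` are hypothetical; assume the original class created its own database-backed lookup internally, which made it impossible to test without a live database.

```python
class InvoiceService:
    """Hypothetical legacy class after a constructor seam has been added."""

    def __init__(self, rate_lookup):
        # Seam: the dependency is passed in rather than created here,
        # so a test can supply an in-memory stand-in.
        self._rate_lookup = rate_lookup

    def tax_cents(self, amount_cents: int, region: str) -> int:
        rate_percent = self._rate_lookup.rate_percent_for(region)
        return amount_cents * rate_percent // 100


class FakeRateLookup:
    """In-memory replacement for the (assumed) database-backed lookup."""

    def __init__(self, rates):
        self._rates = rates

    def rate_percent_for(self, region: str) -> int:
        return self._rates[region]


def test_tax_uses_rate_for_region():
    service = InvoiceService(FakeRateLookup({"CA": 8}))
    assert service.tax_cents(10_000, "CA") == 800
```

Production code keeps passing the real lookup to the constructor, so behaviour is unchanged while the class becomes testable in isolation.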

Should I use TDD for UI components?

TDD is harder to apply to UI components because the behaviour is often tied to rendering and event handling. Many experienced engineers prototype the UI first, then write tests once the design settles. For complex UI logic (e.g., a state machine for a multi-step form), TDD works well — write a test for the state transitions before implementing them. The key is to keep the logic and presentation separate.
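As a rough sketch of keeping logic and presentation separate, the state machine for a multi-step form can live in a plain class with no rendering code. The step names and the `FormFlow` class are assumptions for illustration; each transition rule would be driven by a test written first.

```python
from enum import Enum


class Step(Enum):
    DETAILS = "details"
    PAYMENT = "payment"
    CONFIRM = "confirm"
    DONE = "done"


_NEXT = {Step.DETAILS: Step.PAYMENT, Step.PAYMENT: Step.CONFIRM, Step.CONFIRM: Step.DONE}
_PREV = {Step.PAYMENT: Step.DETAILS, Step.CONFIRM: Step.PAYMENT}


class FormFlow:
    """Hypothetical form flow: pure transition logic, no UI code."""

    def __init__(self):
        self.step = Step.DETAILS

    def advance(self, current_step_valid: bool) -> Step:
        # Move forward only when the current step validated.
        if current_step_valid and self.step in _NEXT:
            self.step = _NEXT[self.step]
        return self.step

    def back(self) -> Step:
        # The first and the final step have no previous step.
        self.step = _PREV.get(self.step, self.step)
        return self.step


def test_invalid_step_does_not_advance():
    flow = FormFlow()
    assert flow.advance(current_step_valid=False) is Step.DETAILS


def test_valid_steps_advance_in_order():
    flow = FormFlow()
    assert flow.advance(True) is Step.PAYMENT
    assert flow.advance(True) is Step.CONFIRM
    assert flow.back() is Step.PAYMENT
```

The UI layer then only renders `flow.step` and forwards events, which leaves very little presentation logic that needs unit-level TDD.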

Naren, Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.
