Dutch National Flag — Why mid++ After High Swap Drops Data
Incrementing mid after high swap skipped 12% of ERROR logs in production.
- Three pointers: low (0s boundary), mid (scan pointer), high (2s boundary)
- Invariant: arr[0..low-1]=0s, arr[low..mid-1]=1s, arr[mid..high]=unknown, arr[high+1..n-1]=2s
- Time: O(n). Space: O(1). Single pass, in-place.
- Critical rule: never increment mid after swapping with high
- Used as 3-way partition in QuickSort for arrays with many duplicates — turns O(n^2) into O(n)
- Cache-friendly: single pass, no auxiliary array
- Incrementing mid after swapping with high. The swapped element is unexamined — it could be 0, 1, or 2.
Imagine you have a pile of red, white, and blue marbles all mixed together. You want to sort them so all reds are on the left, all whites are in the middle, and all blues are on the right — but you only want to touch each marble once. The Dutch National Flag algorithm is exactly that sorting strategy, named after the three-coloured flag of the Netherlands. It uses three 'hands' that crawl toward each other, swapping marbles into the right zone as they go.
Sorting an array with only three distinct values is a constraint that makes generic comparison sorts wasteful. Counting sort works but requires two passes. QuickSort is O(n log n) and overkill. The Dutch National Flag algorithm solves it in a single pass — O(n) time, O(1) space — by maintaining three pointers that carve the array into four live regions as they converge.
The algorithm was proposed by Edsger Dijkstra. It is the foundation of 3-way QuickSort, which handles duplicate-heavy arrays efficiently. The classic problem is LeetCode 75 'Sort Colors', where values are 0, 1, 2 representing red, white, blue.
The critical invariant: at every step, arr[0..low-1] contains only 0s, arr[low..mid-1] contains only 1s, arr[high+1..n-1] contains only 2s, and arr[mid..high] is unexamined. The mid pointer scans forward; low and high converge inward. The single rule that causes 90% of bugs: never increment mid after swapping with high.
How Dutch National Flag Works — Plain English
The Dutch National Flag algorithm sorts an array of three distinct values (0, 1, 2) in O(n) time, O(1) space, single pass.
Three-way partition invariant — three pointers divide the array: arr[0..low-1] = 0s. arr[low..mid-1] = 1s. arr[mid..high] = unknown. arr[high+1..n-1] = 2s.
Step-by-step: 1. Initialize low=0, mid=0, high=n-1. 2. While mid <= high: a. arr[mid]==0: swap arr[low]↔arr[mid], low++, mid++. b. arr[mid]==1: mid++ (already correct region). c. arr[mid]==2: swap arr[mid]↔arr[high], high--. Do NOT advance mid.
Worked example on [2,0,2,1,1,0]: low=0,mid=0,high=5. arr[0]=2: swap arr[0]↔arr[5]→[0,0,2,1,1,2]. high=4. arr[0]=0: swap arr[0]↔arr[0]. low=1,mid=1. arr[1]=0: swap arr[1]↔arr[1]. low=2,mid=2. arr[2]=2: swap arr[2]↔arr[4]→[0,0,1,1,2,2]. high=3. arr[2]=1: mid=3. arr[3]=1: mid=4. mid>high. Done. Result: [0,0,1,1,2,2].
- arr[0..low-1]: all 0s. Sorted. Never touched again.
- arr[low..mid-1]: all 1s. Sorted. Never touched again.
- arr[mid..high]: unknown. This is where mid scans.
- arr[high+1..n-1]: all 2s. Sorted. Never touched again.
- Loop ends when mid > high: unknown zone is empty.
The Three-Pointer Partition Strategy
The brilliance of the DNF algorithm lies in its ability to maintain four sections within the array using only three pointers: low, mid, and high.
[0 ... low-1]: Elements smaller than the pivot (0s).[low ... mid-1]: Elements equal to the pivot (1s).[mid ... high]: Elements yet to be explored (Unknown).[high+1 ... N-1]: Elements larger than the pivot (2s).
We iterate using the mid pointer. When we encounter a 0, we swap it with low and advance both. When we see a 1, we just move mid. When we see a 2, we swap it with high and decrement high—but we don't move mid yet, because the element swapped from the end is still unknown.
- Case 0: swap with low, advance both low and mid. The element at low was already examined (it was a 1 or was previously a 0).
- Case 1: advance mid only. The element is already in the correct zone.
- Case 2: swap with high, decrement high. Do NOT advance mid. The swapped element is unexamined.
- Why: the unknown zone is arr[mid..high]. After swapping with high, the new arr[mid] came from arr[high], which was in the unknown zone.
- Test with [2,2,2,2]. If mid is incremented on case 2, the result is wrong.
DNF as 3-Way QuickSort Partition — The Duplicate-Heavy Optimization
Dutch National Flag is the partition step in Dijkstra's 3-way QuickSort. Standard QuickSort partitions into two zones (< pivot, >= pivot). 3-way QuickSort partitions into three zones (< pivot, == pivot, > pivot). This is critical for arrays with many duplicate elements.
Standard QuickSort on an array of all identical elements: O(n^2) because every partition is maximally unbalanced (all elements go to one side). 3-way QuickSort with DNF: O(n) because all elements are grouped into the == pivot zone in a single pass, and only the < and > zones are recursed on.
The DNF partition in QuickSort uses the pivot value as the 'middle' value. Elements less than pivot go to the low zone, elements equal to pivot go to the mid zone, elements greater than pivot go to the high zone. The mid zone is already sorted and requires no further recursion.
- Standard QuickSort: 2-way partition (< pivot, >= pivot). O(n^2) on all-identical arrays.
- 3-way QuickSort: 3-way partition (< pivot, == pivot, > pivot). O(n) on all-identical arrays.
- DNF is the partition step in 3-way QuickSort.
- Java's Arrays.sort for primitives uses a dual-pivot QuickSort variant that incorporates 3-way partitioning.
- Use 3-way QuickSort when duplicates are expected. Use standard QuickSort when all elements are distinct.
Generalizing DNF to N Values — When 3 Pivots Aren't Enough
DNF is specialized for exactly 3 values. For 4 or more values, you need a generalized approach. The options are: counting sort (O(n+k) time, O(k) space), multiple DNF passes, or a bucket-based partition.
Counting sort is the simplest generalization: count occurrences of each value, then overwrite the array. It requires O(k) space and two passes. For k=4 or k=5, this is usually fine. For large k, the space cost dominates.
Multiple DNF passes: partition into 3 zones, then recursively apply DNF to each zone. This is essentially 3-way QuickSort with a fixed set of values. It works but adds complexity.
The practical recommendation: use counting sort for k <= 10. Use DNF (as 3-way QuickSort partition) for arbitrary values with expected duplicates. Use standard comparison sort for general-purpose sorting.
- k=3: DNF. O(n) time, O(1) space. Single pass. Cache-friendly.
- k=4-10: Counting sort. O(n+k) time, O(k) space. Two passes. Simple.
- k > 10 with duplicates: 3-way QuickSort. O(n log n) average. DNF as partition.
- k > 10, no duplicates: Standard QuickSort or
Arrays.sort(). O(n log n). - Decision rule: if k is known and small, counting sort. If k=3, DNF. Otherwise, comparison sort.
Log Priority Classifier Sorted Wrong: mid Incremented After High Swap Dropped Critical Alerts
- The mid-increment-after-high-swap bug is silent — no exception, no error, just a wrong partition that downstream systems trust.
- Always test DNF with homogeneous arrays: all 0s, all 1s, all 2s. These expose the mid-increment bug.
- Validate the partition invariant after sorting: arr[high+1..n-1] must contain only 2s.
- DNF is not just an interview algorithm. It is used in production for log classification, priority queues, and QuickSort partitioning.
- Document the no-increment-mid rule in code comments. Future developers will not know why mid is not advanced on the 2 case.
Key takeaways
Interview Questions on This Topic
Frequently Asked Questions
That's Arrays & Strings. Mark it forged?
3 min read · try the examples if you haven't