Priority Queue and Heap
A priority queue is an abstract data structure where each element has a priority and the highest-priority element is always dequeued first. A binary heap is the most common implementation — it is a complete binary tree stored as an array where each parent is smaller (min-heap) or larger (max-heap) than its children. Python's heapq module implements a min-heap.
How a Priority Queue Works — Plain English
A priority queue always dequeues the element with the highest (or lowest) priority, not the oldest one. It is implemented with a binary heap — a complete binary tree where every parent is smaller than its children (min-heap) or larger (max-heap).
Core operations:
1. insert(x, priority): add the element, then 'bubble up' to restore the heap property. O(log n).
2. extract_min() / extract_max(): remove and return the minimum/maximum, then 'sink down'. O(log n).
3. peek(): return the min/max without removing it. O(1).
Python's heapq is a min-heap. For max-heap, negate values.
Worked example — min-heap insert then extract:
- Insert 5: heap = [5].
- Insert 3: heap = [5, 3]. Bubble up: 3 < parent 5, swap. heap = [3, 5].
- Insert 7: heap = [3, 5, 7]. 7 >= parent 3 (index 0 is the parent of index 2), so no swap.
- Insert 1: heap = [3, 5, 7, 1]. Bubble up: 1 < parent 5, swap: [3, 1, 7, 5]. Then 1 < parent 3, swap: [1, 3, 7, 5].
- extract_min(): remove 1. Move 5 (the last element) to the root: [5, 3, 7]. Sink down: 5 > min(3, 7) = 3, swap: [3, 5, 7]. Extracted: 1.
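The bubble-up and sink-down steps in the worked example can be sketched directly. This is a minimal illustration, not production code — `sift_up` and `sift_down` are my own names for the two operations:

```python
def sift_up(heap, i):
    # Bubble the element at index i toward the root while it is
    # smaller than its parent.
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] >= heap[parent]:
            break
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent

def sift_down(heap, i):
    # Sink the element at index i while it is larger than a child.
    n = len(heap)
    while True:
        smallest = i
        for c in (2 * i + 1, 2 * i + 2):
            if c < n and heap[c] < heap[smallest]:
                smallest = c
        if smallest == i:
            break
        heap[i], heap[smallest] = heap[smallest], heap[i]
        i = smallest

heap = []
for x in (5, 3, 7, 1):           # same inserts as the worked example
    heap.append(x)
    sift_up(heap, len(heap) - 1)
print(heap)                      # [1, 3, 7, 5]

# extract_min: move the last element to the root, then sink it
minimum = heap[0]
heap[0] = heap.pop()
sift_down(heap, 0)
print(minimum, heap)             # 1 [3, 5, 7]
```

Running this reproduces every intermediate array from the walkthrough above.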
The Binary Heap: Array-Based Tree Logic
Picture a heap as a very disciplined family tree where every parent is stricter (smaller or larger) than its kids, and the entire tree is always packed to the left. Because it’s complete, we can throw away the pointers and just use a plain list — that’s why heaps are so memory efficient and cache-friendly.
For any index i in the array: • Left child → 2i + 1 • Right child → 2i + 2 • Parent → (i - 1) // 2
The magic rule (Heap Invariant) is simple: no parent is ever “worse” than its children. In a min-heap the smallest guy always bubbles to index 0. Break this rule even once and the whole thing collapses — that’s why we have heapify and sifting operations. I still draw this array-as-tree on a whiteboard every time I explain it to someone; it clicks instantly.
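The invariant is also trivial to verify mechanically, thanks to the parent formula. A one-liner sketch (`is_min_heap` is an illustrative name):

```python
def is_min_heap(a):
    # Heap invariant: every element is >= its parent at (i - 1) // 2.
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))

print(is_min_heap([1, 3, 5, 7, 9, 8]))   # True
print(is_min_heap([1, 3, 5, 2, 9, 8]))   # False: 2 is smaller than its parent 3
```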
```python
# Package: io.thecodeforge.algorithms.ds
# The heap [1, 3, 5, 7, 9, 8] represented as a tree:
#            1         (index 0)
#          /   \
#         3     5      (index 1, 2)
#        / \   /
#       7   9 8        (index 3, 4, 5)

def get_heap_relationships(heap):
    relationships = []
    for i in range(len(heap)):
        left_idx = 2 * i + 1
        right_idx = 2 * i + 2
        parent_idx = (i - 1) // 2 if i > 0 else None
        relationships.append({
            "val": heap[i],
            "parent": heap[parent_idx] if parent_idx is not None else None,
            "left": heap[left_idx] if left_idx < len(heap) else None,
            "right": heap[right_idx] if right_idx < len(heap) else None
        })
    return relationships

# Standard Heap Layout
my_heap = [1, 3, 5, 7, 9, 8]
for node in get_heap_relationships(my_heap):
    print(f"Node: {node['val']} | Parent: {node['parent']} | L: {node['left']} | R: {node['right']}")
```
Node: 1 | Parent: None | L: 3 | R: 5
Node: 3 | Parent: 1 | L: 7 | R: 9
Node: 5 | Parent: 1 | L: 8 | R: None
Node: 7 | Parent: 3 | L: None | R: None
Node: 9 | Parent: 3 | L: None | R: None
Node: 8 | Parent: 5 | L: None | R: None
Production Usage: Python's heapq Module
Real talk — in 99% of production code and interviews you will never implement a heap from scratch. Python’s heapq is battle-tested, written in C under the hood, and ridiculously fast. The golden pattern every senior engineer uses is storing tuples: (priority, counter, data). The counter prevents comparison crashes when priorities are equal (a bug I’ve debugged in live systems more than once).
You’ll see this everywhere: background job queues, rate-limiters, merge k sorted streams, and even in custom schedulers.
```python
import heapq

# 1. Initialize a Priority Queue for Task Management
tasks = []

# Format: (priority_level, unique_id, task_description)
# unique_id prevents comparison errors if priorities are equal
heapq.heappush(tasks, (3, 101, 'Cleanup logs'))
heapq.heappush(tasks, (1, 102, 'Fix production crash'))
heapq.heappush(tasks, (2, 103, 'Update documentation'))

# 2. Efficient Extraction
while tasks:
    priority, _id, task = heapq.heappop(tasks)
    print(f'Processing: {task} (Priority {priority})')

# 3. Heapify: O(n) construction from an unsorted list
# Much faster than n calls to heappush (O(n log n))
raw_metrics = [45, 12, 98, 3, 7, 22]
heapq.heapify(raw_metrics)
print(f"Smallest metric: {raw_metrics[0]}")  # 3
```
Processing: Fix production crash (Priority 1)
Processing: Update documentation (Priority 2)
Processing: Cleanup logs (Priority 3)
Smallest metric: 3
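One of the patterns listed above, merging k sorted streams, doesn't even need hand-rolled heap code: `heapq.merge` does exactly this, keeping only one element per stream in the heap at a time. A small sketch (the variable names are illustrative):

```python
import heapq

# Each input must already be sorted; merge yields one sorted stream
# lazily, without loading everything into memory at once.
sensor_a = [1, 4, 9]
sensor_b = [2, 3, 11]
sensor_c = [5, 6]

merged = list(heapq.merge(sensor_a, sensor_b, sensor_c))
print(merged)  # [1, 2, 3, 4, 5, 6, 9, 11]
```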
The Max-Heap Pattern & Top-K Problems
Python only gives you a min-heap, so we just flip the sign — a trick every LeetCode grinder and system designer knows by heart. This pattern shows up in “k closest points”, “k largest elements in stream”, “median in data stream”, and even in designing Twitter’s trending topics.
For huge streams where you only care about the top 10, a heap of size k eats way less memory than sorting millions of items. I still remember the first time I replaced a full sort with a heap in a production analytics pipeline — CPU dropped 70% overnight.
```python
import heapq

class MaxForgeHeap:
    """Wrapper to facilitate Max-Heap behavior in Python"""
    def __init__(self, data=None):
        self.data = [-x for x in data] if data else []
        heapq.heapify(self.data)

    def push(self, val):
        heapq.heappush(self.data, -val)

    def pop(self):
        return -heapq.heappop(self.data)

# Practical: Top 3 high scores
scores = [1500, 3200, 1200, 5000, 2800]
top_k = heapq.nlargest(3, scores)  # Optimized internal implementation
print(f"Top 3 Scores: {top_k}")
```
Top 3 Scores: [5000, 3200, 2800]
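For the streaming case, the size-k trick is a *min*-heap of the k largest items seen so far: the root is the current cutoff, and `heappushpop` replaces it in one O(log k) step. A sketch under that assumption (`top_k_stream` is an illustrative name):

```python
import heapq

def top_k_stream(stream, k):
    # Min-heap holding the k largest items seen so far; uses O(k) memory
    # no matter how long the stream is.
    heap = []
    for x in stream:
        if len(heap) < k:
            heapq.heappush(heap, x)
        else:
            # Push x and pop the smallest in one call; a no-op swap
            # if x is smaller than the current k-th largest.
            heapq.heappushpop(heap, x)
    return sorted(heap, reverse=True)

print(top_k_stream([1500, 3200, 1200, 5000, 2800, 4100], 3))
# [5000, 4100, 3200]
```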
🎯 Key Takeaways
- A min-heap guarantees the smallest element is always at the root (index 0), giving you true O(1) peek — the single most useful property in scheduling and pathfinding.
- Time Complexity: heappush and heappop are O(log n); heapify is surprisingly O(n) because it starts from the bottom; finding the min is O(1).
- Space Complexity: In-place heapify uses O(1) extra space — perfect for memory-tight environments like embedded systems or massive logs.
- Python's heapq uses 0-based indexing with the beautiful 2i+1 / 2i+2 child formula — once you internalize this, you’ll never forget heap navigation again.
- For stable priority queues that won’t explode on equal priorities, always include a unique counter: (priority, counter, item). I’ve seen this save live systems more times than I can count.
Interview Questions on This Topic
- Q: Explain the 'Heapify' process. Why do we start at the first non-leaf node instead of the root?
- Q: LeetCode 215: Find the Kth Largest Element in an Array. Why is a Min-Heap of size K more efficient than a Max-Heap of size N for this problem?
- Q: How would you implement a 'Median Finder' data structure that supports 'addNum' and 'findMedian' in O(log n) and O(1) respectively? (Hint: Two Heaps).
- Q: What is the time complexity of 'Decrease Key' in a standard binary heap, and how does that affect Dijkstra’s algorithm?
- Q: Describe a scenario where a Priority Queue would be used in a real-world operating system kernel.
- Q: How would you design a rate-limiter that serves the highest-priority requests first during a traffic spike?
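The two-heaps hint from the median question deserves a sketch: a max-heap holds the smaller half, a min-heap the larger half, and the median lives at the heap tops. A minimal version (the `addNum`/`findMedian` names follow the question; everything else is an assumption):

```python
import heapq

class MedianFinder:
    def __init__(self):
        self.lo = []  # max-heap (negated values) for the smaller half
        self.hi = []  # min-heap for the larger half

    def addNum(self, num):
        # Route through lo, then push lo's max over to hi so that
        # every element of lo <= every element of hi.
        heapq.heappush(self.lo, -num)
        heapq.heappush(self.hi, -heapq.heappop(self.lo))
        # Rebalance so lo is equal in size or one element larger.
        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def findMedian(self):
        if len(self.lo) > len(self.hi):
            return float(-self.lo[0])          # odd count: lo's max
        return (-self.lo[0] + self.hi[0]) / 2  # even count: average of tops

mf = MedianFinder()
for n in (41, 35, 62, 5):
    mf.addNum(n)
print(mf.findMedian())  # (35 + 41) / 2 = 38.0
```

Each `addNum` is a constant number of O(log n) heap operations, and `findMedian` only peeks, so it is O(1).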
Frequently Asked Questions
Why is heapify O(n) instead of O(n log n)?
Because most nodes are near the leaves. When you start from the bottom, roughly half the nodes don’t move at all, a quarter move only one level, and so on. The math sums to roughly 2n operations — a beautiful linear result that still surprises people every time they see it in interviews.
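You can see the linear bound empirically by counting swaps in a hand-rolled bottom-up heapify. A sketch (the function name and the 2x safety margin in the comment are mine; the exact ratio depends on the input):

```python
import random

def heapify_count(arr):
    # Bottom-up build-heap; returns how many element swaps occur.
    n = len(arr)
    moves = 0
    # Start at the first non-leaf and sift each node down.
    for i in range(n // 2 - 1, -1, -1):
        j = i
        while True:
            smallest = j
            for c in (2 * j + 1, 2 * j + 2):
                if c < n and arr[c] < arr[smallest]:
                    smallest = c
            if smallest == j:
                break
            arr[j], arr[smallest] = arr[smallest], arr[j]
            moves += 1
            j = smallest
    return moves

random.seed(42)
data = list(range(100_000))
random.shuffle(data)
moves = heapify_count(data)
# Swap count stays within a small constant times n, not n log n.
print(moves, moves / len(data))
```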
Is a binary heap better than a sorted array for a priority queue?
Almost always yes if you have any insertions or deletions. A sorted array makes every insert O(n) because of shifting. A heap keeps everything fast. The only time a sorted array wins is when you build once and never touch it again (rare in real systems).
Can I search for an arbitrary element in a heap in O(log n)?
Nope — and this catches so many people in interviews. A heap is NOT a BST. It only cares about parent-child order. Want fast lookup? Use a hashmap alongside the heap (common pattern in sliding window + top-k problems).
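One common shape of that heap-plus-map pattern is lazy deletion: record deletions in a map and skip stale entries when they surface at the root. A sketch (the class and method names are illustrative):

```python
import heapq

class LazyDeleteHeap:
    # Min-heap with O(1) delete bookkeeping; stale entries are
    # discarded lazily when they reach the root.
    def __init__(self):
        self.heap = []
        self.deleted = {}  # value -> pending deletion count

    def push(self, x):
        heapq.heappush(self.heap, x)

    def delete(self, x):
        self.deleted[x] = self.deleted.get(x, 0) + 1

    def pop_min(self):
        # Drop any values marked deleted before returning a live one.
        while self.heap and self.deleted.get(self.heap[0], 0):
            self.deleted[self.heap[0]] -= 1
            heapq.heappop(self.heap)
        return heapq.heappop(self.heap)

pq = LazyDeleteHeap()
for x in (3, 1, 2):
    pq.push(x)
pq.delete(1)          # O(1): just recorded, not removed yet
print(pq.pop_min())   # 2 — the deleted 1 is skipped lazily
```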
How do you implement a max-heap using Python's heapq (which is min-heap only)?
Negate all values before inserting and negate again when extracting: heapq.heappush(h, -x) inserts x as if into a max-heap, and since heapq.heappop(h) returns the negated value, flip it back with x = -heapq.heappop(h). This works because negation reverses the comparison order.
What is the difference between a heap and a sorted array for a priority queue?
Heap: insert O(log n), extract-min O(log n), peek O(1). Sorted array: insert O(n) (to maintain sort), extract-min O(1), peek O(1). Use a heap when you insert frequently; use a sorted array when you mostly extract with rare inserts.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.