Job Sequencing — First-Free-Slot Bug Costs $50k
- Sort by profit descending — greedily assign highest-profit jobs first.
- Assign each job to the latest available slot ≤ its deadline.
- Naive O(n²); Union-Find optimisation achieves O(n log n).
- Sort jobs by profit descending: highest profit first.
- For each job, find the latest free slot ≤ its deadline.
- Assign job to that slot — keeping earlier slots open for other jobs.
- Naive O(n²) slot search; Union-Find reduces to near O(n log n).
- Proven optimal via exchange argument: any optimal schedule can be transformed to this greedy result without reducing profit.
Quick Job Sequencing Troubleshooter
Profit too low
print(sorted(jobs, key=lambda x: x[2], reverse=True))Check slot-assignment loop: for s in range(deadline, 0, -1):Union-Find returns 0 jobs
print(parent[:10])Check find() path compression: parent[i] = find(parent, parent[i])Production Incident
Production Debug GuideDiagnose why your greedy scheduler is producing suboptimal profit or skipping jobs that should fit.
find() compresses properly; missing path compression leads to O(n²) anyway.sorted() or sort() with key, not custom comparison.Job scheduling with deadlines is a classic operations research problem that appears in OS scheduling, manufacturing planning, and project management. The greedy insight — schedule highest-paying jobs as late as possible within their deadlines — is counterintuitive: you might expect to schedule early, but scheduling late preserves earlier slots for potentially higher-value jobs that might arrive.
The Union-Find optimisation that reduces O(n^2) to O(n log n) is an elegant application of the disjoint set data structure in a non-graph context — a useful demonstration that Union-Find is about more than just connected components.
Greedy Algorithm
Sort jobs by profit descending. For each job, find the latest available slot before its deadline and assign it there.
This is the core greedy choice: take the highest-profit job and defer it as late as possible. Why late? Because early slots are scarce — a job with deadline 1 can only go in slot 1. If you take a low-profit job for slot 1, you block every job with deadline 1. By scheduling high-profit jobs late, you keep earlier slots flexible for jobs that have no choice but to go early.
def job_scheduling(jobs: list[tuple]) -> tuple[int, list]: """ jobs: list of (job_id, deadline, profit) Returns (total_profit, scheduled_job_ids) """ jobs = sorted(jobs, key=lambda x: x[2], reverse=True) max_deadline = max(d for _, d, _ in jobs) slots = [None] * (max_deadline + 1) # slots[1..max_deadline] total = 0 scheduled = [] for job_id, deadline, profit in jobs: # Find latest free slot at or before deadline for slot in range(deadline, 0, -1): if slots[slot] is None: slots[slot] = job_id total += profit scheduled.append(job_id) break return total, scheduled jobs = [('a',2,100), ('b',1,19), ('c',2,27), ('d',1,25), ('e',3,15)] print(job_scheduling(jobs)) # (142, ['a', 'c', 'e'])
Union-Find Optimisation — O(n log n)
The naive slot-finding is O(n) per job — O(n²) total. Using Union-Find (disjoint set), we can find the latest free slot in near-O(1) amortised time, giving O(n log n) overall.
How it works: treat each time slot as a node in a Union-Find structure. When a slot is used, union it with the slot before it (slot → slot-1). The find operation returns the largest free slot ≤ given deadline. It's essentially a 'next available lower slot' finder, with path compression making almost constant time.
def find(parent, i): if parent[i] != i: parent[i] = find(parent, parent[i]) return parent[i] def job_scheduling_uf(jobs): jobs = sorted(jobs, key=lambda x: x[2], reverse=True) max_d = max(d for _, d, _ in jobs) parent = list(range(max_d + 2)) # extra sentinel for slot 0 total, scheduled = 0, [] for job_id, deadline, profit in jobs: slot = find(parent, deadline) if slot > 0: parent[slot] = slot - 1 # union with previous slot total += profit scheduled.append(job_id) return total, scheduled # Same example yields same result jobs = [('a',2,100), ('b',1,19), ('c',2,27), ('d',1,25), ('e',3,15)] print(job_scheduling_uf(jobs)) # (142, ['a', 'c', 'e'])
Complexity
Naive: O(n log n) sort + O(n²) slot finding = O(n²) Union-Find: O(n log n) sort + O(n α(n)) slot finding ≈ O(n log n)
For most interview purposes the naive O(n²) implementation is sufficient. But you should understand both — the interviewer may ask how to optimise when n reaches 10⁵.
Proof of Optimality via Exchange Argument
We need to prove that the greedy algorithm (highest profit first, latest slot) produces maximum total profit. The exchange argument works like this:
- Let G be the set of jobs selected by greedy. Let O be an optimal set with maximum total profit.
- Compare G and O from highest profit to lowest. At the first point they differ, greedy chose job j (higher profit) while optimal chose job k (lower profit or none).
- Since j has higher profit and is scheduled at or after k's slot, we can swap k with j in O without reducing profit and without violating deadlines.
- After a finite number of swaps, O becomes identical to G, proving G is optimal.
This argument holds only because the profit-sorted order is monotonic — the greedy never picks a lower-profit job when a higher-profit one could be placed. The 'latest slot' rule ensures we don't block future jobs unnecessarily.
- Slot 1 can only fit jobs with deadline ≥ 1 — every job qualifies.
- Slot 10 can only fit jobs with deadline ≥ 10 — fewer jobs qualify.
- So slot 1 is the scarcest: fill it last, with a job that absolutely must go there.
- By scanning from deadline down to 1, you effectively 'push' each job as far right as possible.
Variations and Real-World Extensions
The classic job sequencing problem assumes each job takes exactly one time unit. In practice, jobs have varying durations and may depend on each other. Here are common variations:
Weighted Interval Scheduling: Jobs have start/end times and durations. Greedy fails; DP required. Job Sequencing with Release Times: Each job has a release time (earliest start). Greedy still works with slight modification — always pick the job with highest profit among available ones. Multi-Machine Scheduling: Multiple workers/CPUs. NP-hard in general; heuristic approaches used. Minimum Lateness Scheduling: Instead of maximising profit, minimise maximum lateness. Solved by EDD (Earliest Due Date) — a different greedy.
Understanding the classic version provides the foundation for these extensions.
| Criterion | Naive O(n²) | Union-Find O(n log n) |
|---|---|---|
| Time complexity (after sort) | O(n * max_deadline) → O(n²) worst | O(n α(n)) ≈ O(n) |
| Space complexity | O(max_deadline) | O(max_deadline) |
| Code complexity | Simple loop, no extra DS | Requires Union-Find with path compression |
| Best for n | < 5000 | ≥ 10000 |
| Off-by-one risk | Low: slots[1..D] easy | Medium: parent array size + sentinel |
| Memorisation difficulty | Straightforward | Moderate; need to recall parent[] = slot-1 |
🎯 Key Takeaways
- Sort by profit descending — greedily assign highest-profit jobs first.
- Assign each job to the latest available slot ≤ its deadline.
- Naive O(n²); Union-Find optimisation achieves O(n log n).
- Exchange argument proves optimality — a core greedy proof technique.
- Unit time per job is a critical assumption; drop it and the problem changes entirely.
- This pattern transfers to any slot-based resource allocation under deadlines.
⚠ Common Mistakes to Avoid
Interview Questions on This Topic
- QWhy do we assign jobs to the latest possible slot rather than earliest?Mid-levelReveal
- QWhat data structure optimises slot-finding to O(log n)?SeniorReveal
- QProve that sorting by profit and taking the latest available slot is optimal.SeniorReveal
- QHow would you handle a variant where each job takes X time units instead of 1?SeniorReveal
- QWhat if jobs have release times where they cannot be started before a certain time?SeniorReveal
Frequently Asked Questions
What if no slot is available for a job?
Skip it — this is the greedy choice. Taking a lower-profit job that fits would only reduce total profit. The algorithm already considered jobs in descending order of profit, so if a high-profit job can't fit, no later job will improve the total by replacing it.
Why does the Union-Find parent array need to be size max_deadline + 2?
Because find() is called with deadline values, and we also need a sentinel at index 0. When find(deadline) returns 0, it means no free slot ≤ deadline exists. So the array must have indices 0..max_deadline inclusive, which is max_deadline + 1 elements. Adding one more (+2 total) is a safety measure; some implementations use find(deadline) and then union with deadline-1, so accessing parent[deadline-1] requires index up to max_deadline-1 only. But the extra element prevents off-by-one errors.
Can this algorithm handle duplicate profits?
Yes. Sorting by profit descending is stable in Python (and most languages) — ties are broken in original order. Since all jobs with equal profit are equivalent, any tie-breaking is fine as long as you apply the 'latest slot' rule. The total profit will be the same regardless.
What is the time complexity if we sort using insertion sort?
Insertion sort is O(n²) worst-case, making the whole algorithm O(n²). Never use insertion sort for the sort step — use the built-in sorted() which is O(n log n). The sort dominates the overall complexity, so it's critical to use an efficient sort.
How does this relate to Earliest Deadline First (EDF) scheduling?
EDF schedules the job with the nearest deadline at each time slot, aiming to meet all deadlines. It minimises maximum lateness but does not maximise profit. The greedy in this article maximises profit, possibly dropping some low-profit jobs even if they could meet their deadlines. They solve different objectives.
Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.