Intermediate 9 min · March 06, 2026

Probability Problems

Probability Pitfall – Uniformity Assumption Tanked CTR

Q: What is the difference between mutually exclusive and independent events?

Mutually exclusive means the two events cannot happen at the same time, such as rolling a 1 and rolling a 2 on a single die. Independent means one event doesn't affect the probability of the other, such as flipping a coin and rolling a die. Crucially, mutually exclusive events with nonzero probability are NOT independent: if A happens, B definitely can't, so knowing A drops B's probability to zero.
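As a quick check of that distinction, the sketch below enumerates a single fair die with exact fractions. The specific events (rolling a 1, rolling a 2, rolling an even number, rolling at most 4) and the helper names are illustrative assumptions, chosen only because they are small enough to verify by hand.

```python
from fractions import Fraction

FACES = range(1, 7)  # one fair six-sided die

def prob(event):
    """Exact probability of a set of faces on a fair die."""
    return Fraction(len(set(event) & set(FACES)), len(FACES))

def mutually_exclusive(a, b):
    """True when the two events share no outcome."""
    return len(set(a) & set(b)) == 0

def independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(set(a) & set(b)) == prob(a) * prob(b)

roll_1, roll_2 = {1}, {2}
even, at_most_4 = {2, 4, 6}, {1, 2, 3, 4}

# Mutually exclusive, therefore NOT independent
print(mutually_exclusive(roll_1, roll_2), independent(roll_1, roll_2))
# Overlapping events that happen to be independent: 1/3 == 1/2 * 2/3
print(mutually_exclusive(even, at_most_4), independent(even, at_most_4))
```

The first pair prints `True False` and the second `False True`, which shows that the two properties answer different questions.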

Q: When do I use combinations vs permutations in probability problems?

Use combinations (nCr) when selecting a group where order doesn't matter — a committee, a hand of cards, a lottery ticket. Use permutations (nPr) when the arrangement itself matters — a ranked podium, a password, a sequence of draws where first and second positions are different outcomes. Most aptitude probability questions involving selections use combinations.

Q: How do I know whether to add or multiply probabilities?

Add probabilities when the question involves OR — you want event A or event B to happen. Multiply when the question involves AND — you want event A to happen and then event B to happen. A quick mental test: replace the connecting word with its symbol. 'Or' → + (then subtract overlap if events share outcomes). 'And' → × (then use conditional probability if draws are without replacement).

Q: How do you calculate the probability of at least one event across multiple independent trials?

Use the complement: P(at least one success) = 1 - P(no success). For independent events with different probabilities, P(no success) = (1-p1) * (1-p2) * ... * (1-pn). If all probabilities are equal, P(at least one) = 1 - (1-p)^n. This is usually far simpler than summing the cases of exactly one, exactly two, etc.
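A minimal sketch of that complement formula, assuming the trials really are independent. The function name `p_at_least_one` and the example probabilities are hypothetical.

```python
from math import prod

def p_at_least_one(success_probs):
    """P(at least one success) across independent trials, via the complement."""
    for p in success_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"Not a probability: {p}")
    p_none = prod(1.0 - p for p in success_probs)   # every trial fails
    return 1.0 - p_none

# Hypothetical example: three independent trials with different success chances
mixed = p_at_least_one([0.5, 0.2, 0.1])    # 1 - 0.5 * 0.8 * 0.9
# Equal probabilities reduce to 1 - (1 - p)^n
equal = p_at_least_one([0.5] * 3)          # 1 - 0.5 ** 3
print(f"{mixed:.2f}  {equal:.3f}")
```

With these made-up inputs the two results are 0.64 and 0.875. If the trials are not independent, the product of the failure probabilities no longer equals P(no success) and this shortcut does not apply.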

Top recommendations had CTR far below random from assuming equal likelihood.

Naren Founder & Principal Engineer

20+ years shipping production code across the stack, with years spent interviewing engineers. Everything here is grounded in real deployments.

Last updated: July 19, 2026

Before you start⏱ 25 min

✓Solid grasp of fundamentals
✓Comfortable reading code examples
✓Basic production concepts

⚡Quick Answer

Probability is a fraction of favourable outcomes over total equally likely outcomes
OR means add probabilities (minus overlap); AND means multiply (adjust for dependency)
The complement trick: P(at least one) = 1 - P(none) saves time on multi-case problems
With replacement: independent draws; without replacement: reduce denominator after each draw
Enumerate small sample spaces to sanity-check formulas in interviews
Biggest mistake: confusing independent and dependent events — always verify replacement

✦ Definition~90s read

What Are Probability Problems?

This article tackles a specific, high-stakes failure mode in data-driven decision-making: the uniformity assumption. In probability, this is the instinct to treat all outcomes as equally likely when they aren't—a mistake that silently destroys click-through rates (CTR) in A/B tests, recommendation systems, and ad auctions.

The article walks through why this assumption is a 'silent CTR killer' by grounding you in the actual mechanics of probability: it's a fraction (favorable/total), not a feeling. You'll learn why ignoring whether sampling is with or without replacement (e.g., showing the same ad twice vs. not) leads to wildly wrong predictions, and how combinatorics (permutations/combinations) rescues 'at least one' problems that intuition botches.

The piece then pivots to conditional probability and Bayes' Theorem—critical when new data (like a user's click history) updates your priors—and closes with expected value, the only sane way to make decisions under uncertainty. This isn't abstract math; it's the difference between a model that optimizes CTR and one that accidentally optimizes for noise.

Plain-English First

Imagine you have a bag with 3 red marbles and 2 blue marbles. Probability is just your way of answering: 'If I grab one without looking, how likely is it to be red?' You count the outcomes you WANT, divide by ALL possible outcomes, and that fraction is your probability. Every casino game, weather forecast, and spam filter runs on exactly this idea — just with more marbles.

Probability questions show up in almost every tech and finance aptitude round — not because companies want mathematicians, but because these problems reveal how you reason under uncertainty. When an interviewer asks 'what are the odds of drawing two aces from a shuffled deck?', they're watching whether you break a problem into smaller pieces, whether you remember to account for replacement vs. no-replacement, and whether you catch your own errors. These are the same mental habits that make a good engineer.

The frustrating part is that most candidates memorise formulas without understanding the logic behind them. So when a question is slightly rephrased — a bag becomes a box, marbles become cards — they freeze. The formula didn't break; their mental model was never solid in the first place. This article fixes that by building probability from the ground up: what it means, why the rules are what they are, and when each rule applies.

By the end of this article you'll be able to set up any classic probability problem from scratch without hunting for the right formula, spot the three most common traps that cost candidates marks, and explain your reasoning out loud — which is exactly what interviewers are listening for.

Why Uniformity Assumption Is a Silent CTR Killer

Probability aptitude is the ability to reason about outcomes under uncertainty, and in particular to recognize when a uniform-distribution assumption is invalid. In practice, this means noticing that not all outcomes are equally likely, even when a problem statement implies they are. The core mechanic is mapping real-world frequencies to probability spaces correctly, which often requires conditional probability or Bayes' theorem.

Key properties: uniform distributions are rare in production systems. User behavior, network latency, and hardware faults all follow skewed distributions (e.g., power-law, Poisson). Assuming uniformity when data is skewed systematically overestimates rare outcomes and underestimates common ones. For example, if you model click probability as uniform across 10 ad slots (10% each) but slot 1 actually gets 80% of clicks, your estimate for slot 1 is off by 8x.
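The ad-slot example can be sketched numerically. The 80% share for slot 1 comes from the text; spreading the remaining 20% evenly over the other nine slots is an assumption made purely for illustration.

```python
# Hypothetical click shares for 10 ad slots: slot 1 takes 80% of clicks and,
# as an illustrative assumption, the other 9 slots split the rest evenly.
num_slots = 10
observed_share = [0.80] + [0.20 / 9] * 9
uniform_share = [1 / num_slots] * num_slots

assert abs(sum(observed_share) - 1.0) < 1e-9   # shares must form a distribution

# Ratio > 1: the uniform model underestimates the slot; ratio < 1: it overestimates
ratios = [actual / assumed for actual, assumed in zip(observed_share, uniform_share)]

for slot, ratio in enumerate(ratios, start=1):
    print(f"slot {slot:2d}: actual/assumed = {ratio:.2f}x")
```

Under these numbers the uniform model undershoots slot 1 by a factor of 8 and overshoots every other slot by roughly 4.5x (a ratio of about 0.22).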

Use this when designing A/B tests, capacity planning, or any system that relies on probabilistic guarantees. It matters because real systems fail not from wrong math, but from wrong assumptions about the underlying distribution. A 5% error in probability estimation can cascade into 30% error in revenue or latency SLOs.

⚠ Uniformity Trap

Never assume uniform distribution without evidence. In production, 'random' often means 'unknown skew' — test with real traffic first.

📊 Production Insight

Ad placement system assumed uniform click probability across 10 slots; slot 1 actually drew 80% of clicks, causing 50% revenue loss from misallocated premium inventory.

Symptom: A/B test showed 'no significant difference' between slot 1 and slot 10 because variance was inflated by the false uniformity assumption.

Rule of thumb: Always plot the empirical distribution before applying any probability model — if the top 20% of outcomes account for >80% of mass, assume non-uniformity.
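One way to apply that rule of thumb before plotting anything is to compute how much of the total mass the top 20% of outcomes hold. This is a rough sketch: `top_share` is a hypothetical helper and the click counts are made-up numbers, not measured data.

```python
def top_share(counts, top_fraction=0.2):
    """Share of total mass held by the largest `top_fraction` of outcomes."""
    if not counts or sum(counts) <= 0:
        raise ValueError("Need at least one positive count.")
    ranked = sorted(counts, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))   # always include at least one outcome
    return sum(ranked[:k]) / sum(ranked)

# Hypothetical click counts per slot (illustrative only)
skewed_clicks = [800, 60, 40, 30, 20, 15, 15, 10, 5, 5]
even_clicks = [100] * 10

print(f"skewed: top 20% hold {top_share(skewed_clicks):.0%}")
print(f"even:   top 20% hold {top_share(even_clicks):.0%}")
```

For a truly uniform distribution the top 20% of outcomes hold about 20% of the mass, so a value far above that is a cheap signal that the uniform model does not fit.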

🎯 Key Takeaway

Uniformity is a default assumption that is almost always wrong in production.

Conditional probability is the tool to correct for skew — use Bayes' theorem explicitly.

Validate distribution assumptions with real data before modeling; a 10x error in probability is invisible in aggregate metrics.

thecodeforge.io

Probability Problems Aptitude

The Foundation: What Probability Actually Means (and Why It's a Fraction)

Probability is a number between 0 and 1 that measures how likely an event is. A probability of 0 means impossible. A probability of 1 means certain. Everything else lives in between.

The core formula is: P(Event) = (Number of favourable outcomes) / (Total number of equally likely outcomes).

The phrase 'equally likely' is doing serious heavy lifting there. If you roll a fair six-sided die, each face has the same chance, and that's what makes the formula valid. If the die were weighted, the counting formula breaks immediately and you must sum the individual outcome probabilities instead. Always ask: are my outcomes truly equally likely?
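To make the weighted-die point concrete, here is a small sketch that sums per-face probabilities instead of counting faces. The loaded die's weights are invented for illustration, and `event_probability` is a hypothetical helper name.

```python
from fractions import Fraction

def event_probability(weights, event):
    """Sum the probabilities of the faces in `event`; valid for fair or loaded dice."""
    if sum(weights.values()) != 1:
        raise ValueError("Face probabilities must sum to 1.")
    return sum(weights[face] for face in event)

fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

# Hypothetical loaded die: a 6 comes up half the time, the other faces share the rest
loaded_die = {face: Fraction(1, 10) for face in range(1, 6)}
loaded_die[6] = Fraction(1, 2)

even = {2, 4, 6}
p_even_fair = event_probability(fair_die, even)       # matches favourable/total = 3/6
p_even_loaded = event_probability(loaded_die, even)   # does not match 3/6
print(p_even_fair, p_even_loaded)
```

On the fair die the sum collapses to the familiar 3/6. On the loaded die it gives 7/10, which counting faces would never produce.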

There are two fundamental rules everything else builds on. The Addition Rule handles OR situations: if you want event A or event B, you add their probabilities but subtract any overlap to avoid double-counting. The Multiplication Rule handles AND situations: if you want event A and then event B, you multiply — but only if the events are independent of each other. Understanding when to add and when to multiply is the single biggest skill in aptitude probability.

basic_probability_foundation.py · PYTHON

# === Basic Probability: Die Roll and Coin Toss Examples ===
# Run this to see the core formula in action with real numbers.

def probability(favourable_outcomes, total_outcomes):
    """Returns probability as a decimal and a readable fraction string."""
    if total_outcomes == 0:
        raise ValueError("Total outcomes cannot be zero — that's not a valid sample space.")
    prob = favourable_outcomes / total_outcomes
    return prob, f"{favourable_outcomes}/{total_outcomes}"

# --- Example 1: Rolling a fair six-sided die ---
# Event: getting an even number (2, 4, or 6)
total_die_faces = 6
even_faces = [2, 4, 6]          # these are our favourable outcomes
favourable_even = len(even_faces)

prob_decimal, prob_fraction = probability(favourable_even, total_die_faces)
print(f"P(rolling an even number) = {prob_fraction} = {prob_decimal:.4f}")

# --- Example 2: Drawing a card — Addition Rule (OR) ---
# Event: drawing a King OR a Heart from a standard 52-card deck
# Kings: 4 cards. Hearts: 13 cards. BUT King of Hearts is in both — avoid double-counting!
total_cards = 52
kings = 4
hearts = 13
king_of_hearts = 1              # the overlap between the two groups

# P(King OR Heart) = P(King) + P(Heart) - P(King AND Heart)
favourable_king_or_heart = kings + hearts - king_of_hearts  # = 16
prob_king_or_heart = favourable_king_or_heart / total_cards
print(f"\nP(King OR Heart) = {favourable_king_or_heart}/{total_cards} = {prob_king_or_heart:.4f}")

# --- Example 3: Tossing two fair coins — Multiplication Rule (AND) ---
# Event: getting Heads on both coins
# The coins are independent — first coin result does NOT affect the second
p_heads_coin1 = 1 / 2          # P(Heads) on a fair coin
p_heads_coin2 = 1 / 2          # same, independent event

p_both_heads = p_heads_coin1 * p_heads_coin2   # multiply because AND + independent
print(f"\nP(Heads AND Heads) = 1/2 × 1/2 = {p_both_heads:.4f}")

# Sanity check: list every possible outcome
all_outcomes = [(c1, c2) for c1 in ['H', 'T'] for c2 in ['H', 'T']]
favourable = [o for o in all_outcomes if o == ('H', 'H')]
print(f"All outcomes: {all_outcomes}")
print(f"Favourable outcomes: {favourable}")
print(f"P(HH) by counting = {len(favourable)}/{len(all_outcomes)} = {len(favourable)/len(all_outcomes):.4f}")

Output

P(rolling an even number) = 3/6 = 0.5000

P(King OR Heart) = 16/52 = 0.3077

P(Heads AND Heads) = 1/2 × 1/2 = 0.2500

All outcomes: [('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')]

Favourable outcomes: [('H', 'H')]

P(HH) by counting = 1/4 = 0.2500

💡Pro Tip: Always enumerate small sample spaces

For any problem with fewer than ~20 total outcomes, list every single possibility before applying a formula. It takes 30 seconds and instantly confirms whether your formula gave the right answer. Interviewers love candidates who sanity-check their own work.

📊 Production Insight

In production, assuming equally likely outcomes when they aren't leads to incorrect risk assessments.

For example, assuming all traffic patterns are equally likely when load is skewed causes wrong capacity planning.

Rule: Always verify the 'equally likely' assumption before applying the formula.

🎯 Key Takeaway

Probability = favourable / total equally likely outcomes.

OR adds, AND multiplies (with dependency adjustments).

Enumerate small sample spaces to validate your reasoning.

Replacement vs. No Replacement — The Question That Trips Everyone Up

Here's the single most common source of wrong answers in probability aptitude questions: forgetting whether the item goes back into the pool after each draw.

With replacement, every draw is independent. You draw a card, note it, put it back, shuffle. The deck is always 52 cards. The second draw has no memory of the first.

Without replacement, the draws are dependent. You draw a card and keep it. Now the deck has 51 cards — and crucially, the composition of the deck has changed. The probability for the second draw must be recalculated based on what's left.

This is called conditional probability. The probability of event B given that event A has already happened is written P(B | A), pronounced 'P of B given A'. The multiplication rule for dependent events becomes: P(A and B) = P(A) × P(B | A).

A classic interview question: 'What's the probability of drawing two Aces in a row from a shuffled deck, without replacement?' The answer is (4/52) × (3/51) — four aces available first, then only three aces left in a 51-card deck. Getting this right immediately shows the interviewer you understand dependency.

replacement_vs_no_replacement.py · PYTHON

# === Replacement vs No-Replacement: Cards and Marbles ===

import math

# --- Scenario: Drawing 2 Aces from a 52-card deck ---

# WITH REPLACEMENT: deck is reset after each draw — draws are independent
p_first_ace_with = 4 / 52       # 4 aces in 52 cards
p_second_ace_with = 4 / 52     # deck is restored — still 4 aces in 52
p_two_aces_with_replacement = p_first_ace_with * p_second_ace_with
print("=== WITH REPLACEMENT ===")
print(f"P(1st Ace)  = 4/52 = {p_first_ace_with:.6f}")
print(f"P(2nd Ace)  = 4/52 = {p_second_ace_with:.6f}  (deck unchanged)")
print(f"P(Two Aces) = {p_two_aces_with_replacement:.6f}")

print()

# WITHOUT REPLACEMENT: first ace is removed — second draw sees 51 cards, 3 aces
p_first_ace_without = 4 / 52   # 4 aces in 52 cards
p_second_ace_without = 3 / 51  # 3 aces remain in 51 remaining cards — KEY CHANGE
p_two_aces_no_replacement = p_first_ace_without * p_second_ace_without
print("=== WITHOUT REPLACEMENT ===")
print(f"P(1st Ace)         = 4/52 = {p_first_ace_without:.6f}")
print(f"P(2nd Ace | 1st was Ace) = 3/51 = {p_second_ace_without:.6f}  (deck shrinks!)")
print(f"P(Two Aces)        = {p_two_aces_no_replacement:.6f}")

print()
print(f"Difference between methods: {abs(p_two_aces_with_replacement - p_two_aces_no_replacement):.6f}")
overestimate = p_two_aces_with_replacement / p_two_aces_no_replacement - 1
print(f"(Wrongly assuming replacement overestimates P(Two Aces) by ~{overestimate:.0%} here)")

print()
# --- Marble Bag Problem (classic aptitude style) ---
# Bag contains: 5 red, 3 blue, 2 green marbles (10 total)
# Q: P(drawing red then blue) WITHOUT replacement?

total_marbles = 10
red_marbles = 5
blue_marbles = 3

p_red_first = red_marbles / total_marbles          # 5/10
p_blue_second_given_red = blue_marbles / (total_marbles - 1)  # 3/9 — one red is gone

p_red_then_blue = p_red_first * p_blue_second_given_red
print("=== MARBLE BAG (No Replacement) ===")
print(f"P(Red first)          = {red_marbles}/{total_marbles} = {p_red_first:.4f}")
print(f"P(Blue second | Red)  = {blue_marbles}/{total_marbles-1} = {p_blue_second_given_red:.4f}")
print(f"P(Red then Blue)      = {p_red_then_blue:.4f}")
# Reduce the fraction to lowest terms so the last line shows 15/90 = 1/6
numerator = red_marbles * blue_marbles
denominator = total_marbles * (total_marbles - 1)
common = math.gcd(numerator, denominator)
print(f"As a fraction         = {numerator}/{denominator} = {numerator//common}/{denominator//common}")

Output

=== WITH REPLACEMENT ===

P(1st Ace) = 4/52 = 0.076923

P(2nd Ace) = 4/52 = 0.076923 (deck unchanged)

P(Two Aces) = 0.005917

=== WITHOUT REPLACEMENT ===

P(1st Ace) = 4/52 = 0.076923

P(2nd Ace | 1st was Ace) = 3/51 = 0.058824 (deck shrinks!)

P(Two Aces) = 0.004525

Difference between methods: 0.001392

(Wrongly assuming replacement overestimates P(Two Aces) by ~31% here)

=== MARBLE BAG (No Replacement) ===

P(Red first) = 5/10 = 0.5000

P(Blue second | Red) = 3/9 = 0.3333

P(Red then Blue) = 0.1667

As a fraction = 15/90 = 1/6

⚠ Watch Out: The 'restored deck' assumption

Unless a problem explicitly says 'with replacement' or 'the card is put back', always assume no replacement. Real-world draws — lottery balls, cards dealt in poker, hiring from a candidate pool — are almost always without replacement. Assuming independence when there's actually dependence is the #1 error in aptitude exams.

📊 Production Insight

Forgetting no-replacement in dependency calculations causes overestimation of success probabilities in retry logic.

Example: A system retries a database connection assuming independence, but after a failure the pool has fewer connections, changing the odds.

Rule: After each dependent draw, reduce the denominator, and reduce the numerator too if the item removed was one of the favourable ones.

🎯 Key Takeaway

With replacement: independent, same denominator.

Without replacement: dependent; reduce the denominator, and the numerator too when a favourable item was removed.

This is the #1 trap in probability problems.

Combinatorics + Probability: Solving 'At Least One' and Multi-Event Problems

Once you're comfortable with single events, interviewers escalate to multi-event problems. The phrasing 'at least one' is a classic escalation — and it has a beautiful shortcut.

Calculating P(at least one success) directly means adding up many cases: exactly one, exactly two, exactly three... It's tedious. Instead, use the complement: P(at least one) = 1 - P(none at all). It's almost always faster.

Combinations (nCr) come into play when order doesn't matter — like choosing a committee from a group, or finding the probability that a hand of cards contains exactly two hearts. The formula nCr = n! / (r! × (n-r)!) counts the number of ways to choose r items from n without caring about order.

The key decision tree is: Does order matter? If yes, use permutations. If no, use combinations. For most probability problems involving draws or selections, order doesn't matter — you're choosing a group, not arranging a sequence. When you combine nCr with the core probability formula, you can solve the most complex-looking aptitude problems in four clean steps: count favourable combinations, count total combinations, divide, simplify.

combinatorics_probability.py · PYTHON

# === Combinatorics in Probability: At-Least-One and Multi-Event Problems ===

import math

def combinations(n, r):
    """nCr — number of ways to choose r items from n (order doesn't matter)"""
    return math.comb(n, r)   # Python 3.8+ has this built-in

# ---------------------------------------------------------------
# PROBLEM 1: "At Least One" — Use the Complement Trick
# ---------------------------------------------------------------
# Q: A bag has 4 red and 6 blue marbles. Draw 3 without replacement.
#    What's P(at least one red marble)?

total_marbles = 10
red = 4
blue = 6
draw = 3

# Total ways to draw 3 from 10
total_ways = combinations(total_marbles, draw)   # 10C3 = 120

# P(at least one red) = 1 - P(NO red at all)
# 'No red' means all 3 are blue — choose 3 from 6 blue marbles
all_blue_ways = combinations(blue, draw)          # 6C3 = 20

p_no_red = all_blue_ways / total_ways
p_at_least_one_red = 1 - p_no_red               # the complement shortcut

print("=== PROBLEM 1: At Least One Red ===")
print(f"Total ways to draw 3 from 10 : 10C3 = {total_ways}")
print(f"Ways to draw 3 all-blue      : 6C3  = {all_blue_ways}")
print(f"P(no red)                    = {all_blue_ways}/{total_ways} = {p_no_red:.4f}")
print(f"P(at least one red)          = 1 - {p_no_red:.4f} = {p_at_least_one_red:.4f}")

print()

# ---------------------------------------------------------------
# PROBLEM 2: Exactly K Successes — Combinations in Numerator
# ---------------------------------------------------------------
# Q: From a group of 7 men and 5 women, a committee of 4 is formed randomly.
#    What's P(exactly 2 men and 2 women on the committee)?

total_people = 12   # 7 men + 5 women
men = 7
women = 5
committee_size = 4

# Total ways to form a committee of 4 from 12 people
total_committees = combinations(total_people, committee_size)   # 12C4 = 495

# Favourable: choose exactly 2 men from 7 AND exactly 2 women from 5
# These are independent choices for the two sub-groups
ways_to_pick_2_men   = combinations(men, 2)     # 7C2 = 21
ways_to_pick_2_women = combinations(women, 2)   # 5C2 = 10
favourable_committees = ways_to_pick_2_men * ways_to_pick_2_women  # 210

p_exactly_2_and_2 = favourable_committees / total_committees
print("=== PROBLEM 2: Exactly 2 Men and 2 Women ===")
print(f"Total committee arrangements : 12C4 = {total_committees}")
print(f"Ways to pick 2 men from 7    : 7C2  = {ways_to_pick_2_men}")
print(f"Ways to pick 2 women from 5  : 5C2  = {ways_to_pick_2_women}")
print(f"Favourable arrangements      : {ways_to_pick_2_men} × {ways_to_pick_2_women} = {favourable_committees}")
print(f"P(exactly 2 men, 2 women)    = {favourable_committees}/{total_committees} = {p_exactly_2_and_2:.4f}")

print()

# ---------------------------------------------------------------
# PROBLEM 3: Independent Repeated Trials — Coin Flipped 5 Times
# ---------------------------------------------------------------
# Q: What's P(getting exactly 3 Heads in 5 fair coin flips)?
# This is a Binomial scenario: n=5 trials, k=3 successes, p=0.5

n_flips = 5
k_heads = 3
p_head = 0.5
p_tail = 1 - p_head

# Ways to arrange 3 heads in 5 flips: 5C3
arrangements = combinations(n_flips, k_heads)   # = 10
p_exactly_3_heads = arrangements * (p_head ** k_heads) * (p_tail ** (n_flips - k_heads))

print("=== PROBLEM 3: Exactly 3 Heads in 5 Flips ===")
print(f"Arrangements of 3H in 5 flips: 5C3 = {arrangements}")
print(f"P(H)^3 = {p_head}^3 = {p_head**k_heads}")
print(f"P(T)^2 = {p_tail}^2 = {p_tail**(n_flips-k_heads)}")
print(f"P(exactly 3 Heads) = {arrangements} × {p_head**k_heads} × {p_tail**(n_flips-k_heads)} = {p_exactly_3_heads:.4f}")

Output

=== PROBLEM 1: At Least One Red ===

Total ways to draw 3 from 10 : 10C3 = 120

Ways to draw 3 all-blue : 6C3 = 20

P(no red) = 20/120 = 0.1667

P(at least one red) = 1 - 0.1667 = 0.8333

=== PROBLEM 2: Exactly 2 Men and 2 Women ===

Total committee arrangements : 12C4 = 495

Ways to pick 2 men from 7 : 7C2 = 21

Ways to pick 2 women from 5 : 5C2 = 10

Favourable arrangements : 21 × 10 = 210

P(exactly 2 men, 2 women) = 210/495 = 0.4242

=== PROBLEM 3: Exactly 3 Heads in 5 Flips ===

Arrangements of 3H in 5 flips: 5C3 = 10

P(H)^3 = 0.5^3 = 0.125

P(T)^2 = 0.5^2 = 0.25

P(exactly 3 Heads) = 10 × 0.125 × 0.25 = 0.3125

🔥Interview Gold: The Complement Trick saves you every time

Whenever you see 'at least one' or 'one or more' in a problem, immediately flip to the complement: P(at least one) = 1 - P(zero). For 'at least two', subtract one more case: 1 - P(zero) - P(exactly one). This turns a multi-case addition problem into one or two clean calculations. Interviewers watch for this because it signals mathematical maturity.
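As a sketch of how the complement extends beyond 'at least one', the snippet below reuses the bag from Problem 1 (4 red, 6 blue, draw 3 without replacement). The helper `p_exactly` is a hypothetical name for the standard hypergeometric count.

```python
from math import comb

def p_exactly(k, successes, failures, draws):
    """P(exactly k successes) when drawing without replacement (hypergeometric)."""
    return comb(successes, k) * comb(failures, draws - k) / comb(successes + failures, draws)

# Same bag as Problem 1: 4 red, 6 blue, draw 3 without replacement
red, blue, draws = 4, 6, 3

p_at_least_one = 1 - p_exactly(0, red, blue, draws)
# 'At least two' needs both the zero case and the exactly-one case removed
p_at_least_two = 1 - p_exactly(0, red, blue, draws) - p_exactly(1, red, blue, draws)

print(f"P(at least one red) = {p_at_least_one:.4f}")
print(f"P(at least two red) = {p_at_least_two:.4f}")
```

The first line reproduces the 0.8333 from Problem 1. The second gives 0.3333, which agrees with adding the 'exactly two' and 'exactly three' cases directly: (36 + 4) / 120.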

📊 Production Insight

Using permutations when order doesn't matter overcounts combinations by a factor of r!, skewing probability estimates in A/B test allocation.

Example: Calculating the number of ways to assign users to variants incorrectly inflated the control group size.

Rule: Ask 'does swapping two selected items change the outcome?' If not, use nCr.
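The r! overcount is easy to confirm with the standard library. In this sketch the values n = 10 and r = 3 are arbitrary, and the "specific group is selected" event is a made-up illustration.

```python
from math import comb, factorial, perm

n, r = 10, 3   # hypothetical: pick 3 items from 10

ordered = perm(n, r)      # 10 * 9 * 8 ordered selections
unordered = comb(n, r)    # unordered selections

# Every unordered group of r items appears r! times among the ordered ones
assert ordered == unordered * factorial(r)

# Probability that one specific group of 3 is the group selected
p_correct = 1 / unordered    # order ignored, as the event requires
p_mixed_up = 1 / ordered     # wrong: permutations used for an unordered event
print(f"{p_correct:.6f} vs {p_mixed_up:.6f} (off by a factor of {factorial(r)})")
```

Either counting scheme gives the right probability as long as the numerator and denominator use the same one; the error comes from mixing them.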

🎯 Key Takeaway

Combinations (nCr) when order doesn't matter.

Complement trick for 'at least one'.

Exact k successes: binomial formula nCk p^k (1-p)^(n-k).

Conditional Probability and Bayes' Theorem — When New Information Changes Everything

Conditional probability answers the question: "Given that something has already happened, how does that change the odds of something else?" It's written P(B|A) — the probability of event B given that event A has occurred. The multiplication rule for dependent events is P(A and B) = P(A) × P(B|A).

Bayes' Theorem takes this further. It lets you reverse the condition: if you know P(B|A) and the individual probabilities, you can compute P(A|B). The formula is: P(A|B) = [P(B|A) × P(A)] / P(B).

A classic example: A medical test for a rare disease has a 99% true positive rate and a 1% false positive rate. If you test positive, what's the probability you have the disease? Most people say 99%. But if the disease affects 1 in 10,000 people, Bayes shows the actual probability is about 1%. The false positives drown out the true positives because the base rate is so low.

In interview problems, Bayes often appears as "there's a bag of red and blue marbles, and you draw one but don't look at it; based on a clue, what's the chance it's red?" The key is to update the sample space with the new information.

bayes_theorem_example.py · PYTHON

# === Bayes' Theorem: Medical Test Example ===
# A rare disease affects 1 in 10,000 people. The test correctly identifies the disease 99% of the time
# (true positive rate = 0.99). The test has a false positive rate of 1%.
# If a person tests positive, what is the probability they actually have the disease?

# Prior probability of having the disease
p_disease = 1 / 10000   # 0.0001
# Probability of testing positive given the disease (sensitivity)
p_pos_given_disease = 0.99
# Probability of testing positive given no disease (false positive rate)
p_pos_given_no_disease = 0.01
# Probability of not having the disease
p_no_disease = 1 - p_disease

# Total probability of testing positive: P(pos) = P(pos|disease)*P(disease) + P(pos|no disease)*P(no disease)
p_pos = (p_pos_given_disease * p_disease) + (p_pos_given_no_disease * p_no_disease)

# Bayes: P(disease|pos) = P(pos|disease)*P(disease) / P(pos)
p_disease_given_pos = (p_pos_given_disease * p_disease) / p_pos

print("=== Bayes' Theorem: Medical Test ===")
print(f"P(disease) = {p_disease:.6f} (1 in 10,000)")
print(f"P(pos|disease) = {p_pos_given_disease}")
print(f"P(pos|no disease) = {p_pos_given_no_disease}")
print(f"\nTotal probability of positive test: P(pos) = {p_pos:.6f}")
print(f"Probability of having the disease given a positive test: P(disease|pos) = {p_disease_given_pos:.4f}")
print(f"\n=> Only about {p_disease_given_pos*100:.1f}% — far lower than the 99% most people guess!")

# Sanity check: use simulation to verify
import random
random.seed(42)
simulations = 100_000
positive_tests = 0
actual_disease = 0
for _ in range(simulations):
    has_disease = random.random() < p_disease
    if has_disease:
        test_positive = random.random() < p_pos_given_disease
    else:
        test_positive = random.random() < p_pos_given_no_disease
    if test_positive:
        positive_tests += 1
        if has_disease:
            actual_disease += 1

print(f"\nSimulation: {positive_tests} positive tests out of {simulations}")
print(f"Of those, {actual_disease} actually had the disease: {actual_disease/positive_tests:.4f}")

Output

=== Bayes' Theorem: Medical Test ===

P(disease) = 0.000100 (1 in 10,000)

P(pos|disease) = 0.99

P(pos|no disease) = 0.01

Total probability of positive test: P(pos) = 0.010098

Probability of having the disease given a positive test: P(disease|pos) = 0.0098

=> Only about 1.0% — far lower than the 99% most people guess!

Simulation: 1005 positive tests out of 100000

Of those, 9 actually had the disease: 0.0090

Mental Model

Mental Model: Updating Probabilities Like a Bayesian

Your initial belief is the prior. New evidence updates it to a posterior.

Prior: P(A) — your initial belief before seeing evidence.
Likelihood: P(B|A) — how likely the evidence is if your belief is true.
Marginal: P(B) — total probability of the evidence under all possibilities.
Posterior: P(A|B) — updated belief after seeing the evidence.
The formula is general: it works for any two events with P(B) > 0, not just medical tests.
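The four terms above map directly onto a small function. This is a sketch rather than a canonical implementation: `posterior` and its parameter names are hypothetical, and the second update assumes the two tests are independent given the patient's true status.

```python
def posterior(prior, likelihood, likelihood_if_not):
    """
    P(A|B) from Bayes' theorem.
    prior:             P(A)
    likelihood:        P(B|A)
    likelihood_if_not: P(B|not A)
    """
    marginal = likelihood * prior + likelihood_if_not * (1 - prior)   # P(B)
    if marginal == 0:
        raise ValueError("P(B) is zero, so P(A|B) is undefined.")
    return likelihood * prior / marginal

# Numbers from the medical test example: 1 in 10,000 prior, 99% / 1% test
after_one = posterior(1 / 10_000, 0.99, 0.01)

# Feeding the posterior back in as the next prior models a second positive test
# (assumes the two test results are independent given the true status)
after_two = posterior(after_one, 0.99, 0.01)

print(f"after one positive:  {after_one:.4f}")
print(f"after two positives: {after_two:.4f}")
```

One positive result moves the belief from 0.01% to about 1%. A second independent positive moves it to about 49.5%, which is why confirmatory testing matters so much for rare conditions.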

📊 Production Insight

In production monitoring, Bayes' theorem updates alert thresholds based on prior incident frequency.

Ignoring the prior (base rate) leads to overwhelming false positives when the event is rare.

Rule: Always incorporate the base rate when interpreting conditional probabilities.

🎯 Key Takeaway

P(A|B) = P(B|A) * P(A) / P(B)

Bayes updates beliefs with new evidence.

Base rate neglect is the leading cause of overestimating rare event probabilities.

Expected Value and Decision Making Under Uncertainty

Expected value (EV) is the average outcome you'd get if you repeated an experiment many times. It's calculated as the sum of each outcome multiplied by its probability: EV = Σ (value × probability).

For example, a game where you roll a die: if it's 6 you win $10, otherwise you lose $2. The expected value is (1/6 × $10) + (5/6 × -$2) = $1.67 - $1.67 = $0. So the game is fair — no advantage either way.

In interviews, EV problems often appear as "should you play this game?" or "what's the fair price for a ticket?" The key is to list all possible outcomes, their probabilities, and their values, then sum.

Expected value extends to decision trees — when you have choices with probabilistic outcomes, choose the one with the highest EV. But always consider risk: a game with high variance might be avoided even if EV is positive, if losing hurts too much.

expected_value_game.py · PYTHON

# === Expected Value: Decision Making Under Uncertainty ===

def expected_value(outcomes, probabilities):
    """
    outcomes: list of numerical values (e.g., winnings)
    probabilities: list of probabilities (must sum to 1)
    """
    if abs(sum(probabilities) - 1) > 1e-10:
        raise ValueError("Probabilities must sum to 1.")
    ev = sum(v * p for v, p in zip(outcomes, probabilities))
    return ev

# --- Game 1: Die roll ---
# Roll a fair die. If 6: win $10. Otherwise: lose $2.
outcomes = [10, -2]
probabilities = [1/6, 5/6]
ev1 = expected_value(outcomes, probabilities)
print("=== Game 1: Roll a Die ===")
print(f"Outcomes: win $10 (P=1/6), lose $2 (P=5/6)")
print(f"Expected Value = ${ev1:.2f}")
print("(Fair game — expected value is zero)")

print()

# --- Game 2: Raffle ticket ---
# 1000 tickets sold at $5 each. Prizes: 1 grand prize of $1000, 5 consolation prizes of $100.
# What is the expected value of buying one ticket?
ticket_price = 5
prizes = [(1000, 1), (100, 5)]
total_tickets = 1000

# Compute expected payout
payout_ev = sum(prize * count / total_tickets for prize, count in prizes)
# Net expected value: payout minus cost
net_ev = payout_ev - ticket_price
print("=== Game 2: Raffle Ticket ===")
print(f"Ticket price: ${ticket_price}")
print(f"Payout EV: ${payout_ev:.2f}")
print(f"Net EV (including cost): ${net_ev:.2f}")
print("(Negative EV — not a good investment unless you enjoy the cause)")

print()

# --- Game 3: Decision tree with two choices ---
# Choice A: Invest $100 in a 60% chance of $200 return, 40% chance of $50 return.
# Choice B: Keep the $100 (sure thing).
# Which has higher expected value?
choice_a_outcomes = [200, 50]  # returns, not net
choice_a_probs = [0.6, 0.4]
ev_a = expected_value(choice_a_outcomes, choice_a_probs) - 100  # subtract investment
ev_b = 100 - 100  # keep it: net = 0
print("=== Game 3: Decision Tree ===")
print(f"Choice A: Invest $100 -> EV = ${ev_a:.2f}")
print(f"Choice B: Keep $100 -> EV = ${ev_b:.2f}")
if ev_a > ev_b:
    print("Decision: Invest (higher EV)")
else:
    print("Decision: Keep (lower risk, EV at least as high)")

Output

=== Game 1: Roll a Die ===

Outcomes: win $10 (P=1/6), lose $2 (P=5/6)

Expected Value = $0.00

(Fair game — expected value is zero)

=== Game 2: Raffle Ticket ===

Ticket price: $5

Payout EV: $1.50

Net EV (including cost): $-3.50

(Negative EV — not a good investment unless you enjoy the cause)

=== Game 3: Decision Tree ===

Choice A: Invest $100 -> EV = $40.00

Choice B: Keep $100 -> EV = $0.00

Decision: Invest (higher EV)

💡Pro Tip: Always separate payout from net profit

When computing expected value, subtract the cost of playing separately. Many candidates miscalculate by forgetting to account for the ticket price or initial investment. Net EV = (sum of prize * probability) - cost.

📊 Production Insight

Expected value calculations drive feature prioritization: multiply impact by probability of success.

A common mistake is to ignore the variance, leading to risky bets that fail in production.

Rule: Use expected value for decisions, but also consider risk (variance) for critical systems.

🎯 Key Takeaway

EV = Σ (value × probability)

Use EV to compare probabilistic outcomes.

Don't ignore variance in high-stakes decisions.
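To make the variance point concrete, here is a minimal sketch that computes both EV and variance for the die game and the raffle described in the listing above. The helper name `ev_and_variance` and the use of `fractions.Fraction` for exact arithmetic are illustrative choices, not part of the original listing.

```python
from fractions import Fraction

def ev_and_variance(outcomes):
    """Return (EV, variance) for a list of (net_value, probability) pairs."""
    ev = sum(value * prob for value, prob in outcomes)
    # Var(X) = E[X^2] - (E[X])^2
    second_moment = sum(value**2 * prob for value, prob in outcomes)
    return ev, second_moment - ev**2

# Game 1: win $10 with P=1/6, lose $2 with P=5/6
die_game = [(10, Fraction(1, 6)), (-2, Fraction(5, 6))]

# Game 2: raffle outcomes, net of the $5 ticket price
raffle = [
    (1000 - 5, Fraction(1, 1000)),
    (100 - 5, Fraction(5, 1000)),
    (-5, Fraction(994, 1000)),
]

for name, game in [("Die game", die_game), ("Raffle", raffle)]:
    ev, var = ev_and_variance(game)
    print(f"{name}: EV = {float(ev):.2f}, variance = {float(var):.2f}")
```

The raffle's variance (1047.75) is far larger than the die game's (20), even though its EV is only a few dollars lower. Spread of that size is exactly what an EV-only comparison hides.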

Stop Simulating: How to Spot Symmetry Before You Brute-Force

Most junior engineers reach for a Monte Carlo simulation the second they see dice, cards, or random sticks. That's like debugging a segfault by re-reading the entire codebase. Stop.

The dice problem from every competitor page has a clean analytical shortcut: when you roll N fair dice, the probability that the sum is divisible by 6 is exactly 1/6. Always. For any N≥1. Why? Because no matter what the first N-1 dice sum to, the last die has exactly one face (out of six) that makes the total divisible by 6. The first N-1 dice are irrelevant to the conditional probability.

This is the symmetry argument that separates production-grade reasoning from guesswork. You don't enumerate 6^10 outcomes. You don't code a loop that runs a million iterations. You look for independence and uniform coverage modulo the divisor. If every residue class modulo 6 is equally likely after N rolls, the answer is 1/6.

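Apply this on your next interview problem: whenever you see "divisible by K" in a uniform discrete setting, ask whether the last draw lands on each residue mod K with equal probability. If it does, the answer is 1/K and you're done. A six-sided die qualifies for K = 2, 3, and 6, but not for K = 4, where residues 1 and 2 each get two faces and residues 0 and 3 get only one.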

DiceSymmetry.pyPYTHON

# io.thecodeforge - interview tutorial

import random

def simulate_dice_rolls(num_dice: int, trials: int = 100_000) -> float:
    """
    Monte Carlo validation — used only to confirm the math,
    not because we're too lazy to think.
    """
    successes = 0
    for _ in range(trials):
        total = sum(random.randint(1, 6) for _ in range(num_dice))
        if total % 6 == 0:
            successes += 1
    return successes / trials

if __name__ == "__main__":
    for n in [1, 2, 5, 10]:
        prob = simulate_dice_rolls(n)
        print(f"N={n:2d}  p(sum%6==0) = {prob:.4f}  (expected 0.1667)")

Output

N= 1 p(sum%6==0) = 0.1662 (expected 0.1667)

N= 2 p(sum%6==0) = 0.1671 (expected 0.1667)

N= 5 p(sum%6==0) = 0.1665 (expected 0.1667)

N=10 p(sum%6==0) = 0.1669 (expected 0.1667)

⚠ Production Trap:

Don't simulate first and reverse-engineer the math. That's cargo-culting. Know the symmetry argument before you write a single line — simulation is for validation, not discovery.

🎯 Key Takeaway

For any sum-divisible-by-K problem with uniform i.i.d. draws, check whether the last draw is equally likely to land on every residue mod K. If so, the answer is 1/K.
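One way to check the symmetry argument exactly, without simulation, is to propagate the distribution of the running sum modulo K one die at a time. The sketch below assumes fair dice; the function name `residue_distribution` is illustrative.

```python
from fractions import Fraction

def residue_distribution(num_dice, k, faces=6):
    """Exact distribution of (sum of num_dice fair dice) mod k."""
    dist = [Fraction(0)] * k
    dist[0] = Fraction(1)  # the empty sum is 0
    for _ in range(num_dice):
        nxt = [Fraction(0)] * k
        for residue, prob in enumerate(dist):
            for face in range(1, faces + 1):
                nxt[(residue + face) % k] += prob / faces
        dist = nxt
    return dist

print(residue_distribution(10, 6)[0])  # 1/6: a d6 is uniform mod 6
print(residue_distribution(5, 3)[0])   # 1/3: a d6 is also uniform mod 3
print(residue_distribution(3, 4)[0])   # 55/216, not 1/4: a d6 is not uniform mod 4
```

The K = 4 case shows that being able to reach every residue is not enough. The shortcut needs the last draw to hit every residue with equal probability.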

Median of Three Uniforms — Don't Sort, Think Binomially

The moment someone says "median of three random draws," your reflex should be a binomial expansion, not a sorting algorithm. The competitor problem — three draws from Uniform(0,2), median > 1.5 — is a textbook case.

For the median of three numbers to exceed a threshold T, you need at least two of the three draws to be greater than T. That's a binomial condition with n = 3 and p = P(draw > T). For Uniform(0,2), P(draw > 1.5) = 0.25. So you compute P(k ≥ 2) = C(3,2) × p² × (1 − p) + C(3,3) × p³ = 3 × 0.0625 × 0.75 + 0.015625 = 0.140625 + 0.015625 = 0.15625.

Every engineer who reaches for a sort on three elements is wasting cycles. You don't need to sort. You don't need to simulate. You only need to count successes in a binomial trial. The median is just the second order statistic — and order statistics on small samples are almost always reducible to binomial counting.

This generalizes: the k-th order statistic exceeding a threshold requires at least n-k+1 successes. Memorize that. It's cheaper than a sort.

MedianOrderStat.pyPYTHON

# io.thecodeforge - interview tutorial

import random
from math import comb

def exact_probability(n: int, threshold: float, upper_bound: float) -> float:
    """P(median of n > threshold) for Uniform(0, upper_bound)."""
    p = (upper_bound - threshold) / upper_bound
    k = n // 2 + 1  # need at least this many above threshold
    prob = 0.0
    for i in range(k, n + 1):
        prob += comb(n, i) * (p ** i) * ((1 - p) ** (n - i))
    return prob

if __name__ == "__main__":
    n, threshold, bound = 3, 1.5, 2.0
    exact = exact_probability(n, threshold, bound)
    print(f"Exact binomial: {exact:.6f}")

    # Quick sanity: Monte Carlo
    trials = 1_000_000
    count = 0
    for _ in range(trials):
        draws = [random.uniform(0, bound) for _ in range(n)]
        draws.sort()
        if draws[n // 2] > threshold:
            count += 1
    print(f"Monte Carlo:    {count / trials:.6f}")

Output

Exact binomial: 0.156250

Monte Carlo: 0.156312

🔥Senior Shortcut:

For any 'median of N > T' problem, convert immediately to a binomial count of 'draws above T'. The median condition is never about the median value — it's about whether enough draws land on one side.

🎯 Key Takeaway

The median of N i.i.d. draws (N odd) exceeds T iff at least floor(N/2) + 1 draws exceed T. That's a binomial tail. No sorting needed.
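The order-statistic rule stated earlier in this section (the k-th smallest of n draws exceeds T iff at least n - k + 1 draws exceed T) can be sketched as a single binomial tail. The function name `prob_order_stat_exceeds` is an illustrative choice, and exact fractions are used so the results can be read off directly.

```python
from fractions import Fraction
from math import comb

def prob_order_stat_exceeds(n, k, p):
    """P(k-th smallest of n i.i.d. draws > T), where p = P(single draw > T).

    The k-th smallest exceeds T iff at least n - k + 1 draws exceed T.
    """
    need = n - k + 1
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(need, n + 1))

p = Fraction(1, 4)  # P(draw > 1.5) for Uniform(0, 2)
print(prob_order_stat_exceeds(3, 2, p))  # median of three: 5/32 = 0.15625
print(prob_order_stat_exceeds(3, 1, p))  # minimum: all three above T, 1/64
print(prob_order_stat_exceeds(3, 3, p))  # maximum: at least one above T, 37/64
```

The minimum and maximum fall out as the two extreme cases of the same tail sum, which is a quick sanity check on the n - k + 1 rule.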

Conditional Probability and Bayes Theorem Problems

Conditional probability and Bayes' theorem are essential for updating beliefs when new information arrives. In interview contexts, you'll often see problems like: "A test for a disease is 99% accurate. If 1% of the population has the disease, what is the probability a person who tests positive actually has the disease?" This is a classic Bayes problem. Read "99% accurate" as a 99% true-positive rate and a 1% false-positive rate. Let P(D) = 0.01, P(positive|D) = 0.99, P(positive|no D) = 0.01. Then P(D|positive) = (0.99 × 0.01) / (0.99 × 0.01 + 0.01 × 0.99) = 0.0099 / 0.0198 = 0.5. So even with a 99% accurate test, a positive result only gives a 50% chance of having the disease when the disease is rare.

Another common problem: "A bag has 3 red and 2 blue balls. You draw one ball and then another without replacement. Given the second ball is red, what is the probability the first was blue?" Use conditional probability: P(first blue | second red) = P(first blue and second red) / P(second red). P(first blue and second red) = (2/5) × (3/4) = 6/20 = 0.3. P(second red) = P(first red, second red) + P(first blue, second red) = (3/5) × (2/4) + (2/5) × (3/4) = 6/20 + 6/20 = 12/20 = 0.6. So the answer is 0.3 / 0.6 = 0.5.

Bayes' theorem is particularly powerful in machine learning for naive Bayes classifiers, but in interviews, focus on clear step-by-step calculation. Always define events, write the formula, and compute carefully.

bayes_disease.pyPYTHON

def bayes_disease():
    # Prior: P(D) = 0.01
    p_d = 0.01
    # Likelihood: P(positive|D) = 0.99
    p_pos_given_d = 0.99
    # False-positive rate: P(positive|~D) = 0.01
    p_pos_given_not_d = 0.01
    # Marginal: P(positive) = P(positive|D)*P(D) + P(positive|~D)*P(~D)
    p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)
    # Posterior: P(D|positive)
    p_d_given_pos = (p_pos_given_d * p_d) / p_pos
    return p_d_given_pos

print(bayes_disease())  # Output: 0.5

💡Bayes in Interviews

📊 Production Insight

In production, Bayes' theorem is used in spam filters, recommendation systems, and medical diagnosis. Understanding the base rate fallacy (ignoring prior probability) is critical to avoid biased models.

🎯 Key Takeaway

Conditional probability and Bayes' theorem allow you to update probabilities when new evidence is given. Practice with disease testing and ball-drawing problems.
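The ball-drawing problem from this section can also be checked by brute-force enumeration, which is a useful habit when a conditional-probability answer feels counterintuitive. This is a minimal sketch; the function name is illustrative.

```python
from fractions import Fraction
from itertools import permutations

def p_first_blue_given_second_red():
    """Enumerate ordered two-ball draws from a bag of 3 red and 2 blue."""
    balls = ["R", "R", "R", "B", "B"]
    # Every ordered pair of distinct ball positions is equally likely
    draws = list(permutations(range(len(balls)), 2))
    second_red = [(i, j) for i, j in draws if balls[j] == "R"]
    first_blue_too = [(i, j) for i, j in second_red if balls[i] == "B"]
    return Fraction(len(first_blue_too), len(second_red))

print(p_first_blue_given_second_red())  # 1/2
```

Of the 20 ordered draws, 12 have a red second ball and 6 of those start with blue, which matches 0.3 / 0.6 = 0.5.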

Probability with Combinations: Card, Ball, Dice Problems

Many probability problems involve counting outcomes using combinations. For example: "What is the probability of drawing a full house (3 of a kind + a pair) from a standard 52-card deck?" To count full houses: choose the rank for the three (13 choices), choose 3 suits out of 4 for that rank (C(4,3) = 4), choose a different rank for the pair (12 choices), and choose 2 suits out of 4 for that pair (C(4,2) = 6). Total full house hands = 13 × 4 × 12 × 6 = 3,744. Total 5-card hands = C(52,5) = 2,598,960. Probability = 3,744 / 2,598,960 ≈ 0.00144.

Another common problem: "An urn has 5 red and 7 blue balls. You draw 3 balls without replacement. What is the probability of getting exactly 2 red balls?" Ways to choose 2 red from 5: C(5,2) = 10. Ways to choose 1 blue from 7: C(7,1) = 7. Total favorable = 10 × 7 = 70. Total ways to choose 3 from 12: C(12,3) = 220. Probability = 70/220 ≈ 0.318.

For dice problems: "If you roll two dice, what is the probability the sum is 7?" There are 6 ordered outcomes that sum to 7 (1-6, 2-5, 3-4, 4-3, 5-2, 6-1) out of 36 total, so the probability is 6/36 = 1/6.

When solving these, always identify whether order matters (permutations) or not (combinations). In card and ball problems without replacement, combinations are usually appropriate. Practice with problems like "probability of getting at least one ace in a 5-card hand" (use the complement: 1 - C(48,5)/C(52,5)).

full_house_prob.pyPYTHON

import math

def full_house_probability():
    total_hands = math.comb(52, 5)
    # Choose rank for three: 13, choose 3 suits: C(4,3)=4
    # Choose rank for pair: 12, choose 2 suits: C(4,2)=6
    favorable = 13 * math.comb(4,3) * 12 * math.comb(4,2)
    return favorable / total_hands

print(full_house_probability())  # Output: 0.0014405762304921968

🔥Combinations vs Permutations

📊 Production Insight

In A/B testing and fraud detection, combinatorial probability helps calculate expected counts under null hypotheses. For example, the probability of observing a certain number of conversions in a control group can be modeled with hypergeometric distribution.

🎯 Key Takeaway

Combinatorial probability problems require counting favorable outcomes and total outcomes using combinations. Master C(n,k) and the complement rule for 'at least one' problems.
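The urn, ace, and two-dice problems from this section can be checked in a few lines. This is a sketch with illustrative function names, using only `math.comb` and exact fractions.

```python
from fractions import Fraction
from math import comb

def exactly_k_red(red, blue, draws, k):
    """Hypergeometric: P(exactly k red) when drawing without replacement."""
    return Fraction(comb(red, k) * comb(blue, draws - k), comb(red + blue, draws))

def at_least_one_ace(hand_size=5):
    """Complement rule: 1 - P(no aces) for a standard 52-card deck."""
    return 1 - Fraction(comb(48, hand_size), comb(52, hand_size))

def two_dice_sum(target):
    """Enumerate all 36 ordered outcomes of two fair dice."""
    hits = sum(1 for a in range(1, 7) for b in range(1, 7) if a + b == target)
    return Fraction(hits, 36)

print(exactly_k_red(5, 7, 3, 2))   # 7/22, about 0.318
print(float(at_least_one_ace()))   # about 0.341
print(two_dice_sum(7))             # 1/6
```

Note that the dice helper counts ordered outcomes while the card and urn helpers count unordered selections. Either works as long as numerator and denominator use the same convention.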

Expected Value Problems for Aptitude Tests

Expected value (EV) is a core concept for decision-making under uncertainty. In aptitude tests, you might see: "A game costs $5 to play. You roll a fair die. If you roll a 6, you win $20. Otherwise, you lose. What is the expected value of playing?" EV = (1/6) × 20 + (5/6) × 0 - 5 = 20/6 - 5 ≈ 3.33 - 5 = -$1.67. So on average, you lose $1.67 per play.

Another common problem: "A company sells insurance policies. There is a 1% chance of a $10,000 claim, a 2% chance of a $5,000 claim, and 97% chance of no claim. What should the premium be to break even?" Expected payout = 0.01 × 10,000 + 0.02 × 5,000 + 0.97 × 0 = 100 + 100 = $200. So the premium should be at least $200 to break even.

More complex: "You have two investment options. Option A: 50% chance of $100 profit, 50% chance of $0. Option B: 30% chance of $200 profit, 70% chance of $50 loss. Which has higher expected value?" EV(A) = 0.5 × 100 + 0.5 × 0 = $50. EV(B) = 0.3 × 200 + 0.7 × (-50) = 60 - 35 = $25. So A is better.

Expected value is linear: the EV of a sum is the sum of the EVs, whether or not the variables are independent. For example, if you roll two dice, the expected sum is 3.5 + 3.5 = 7. In interviews, always compute EV as the sum of (probability × value) over all outcomes. Watch for hidden costs or gains. Also, consider variance if risk is a factor, but EV alone is often sufficient for basic aptitude questions.

expected_value_game.pyPYTHON

def expected_value():
    # Game: cost $5, win $20 on 6, lose otherwise
    outcomes = [(1/6, 20), (5/6, 0)]
    ev = sum(p * v for p, v in outcomes) - 5
    return ev

print(f"{expected_value():.2f}")  # Output: -1.67

⚠ Don't Forget the Cost

📊 Production Insight

In product analytics, expected value is used to prioritize features (e.g., expected revenue lift). In finance, it's the basis for portfolio optimization. Always consider the full distribution, not just EV, when risk matters.

🎯 Key Takeaway

Expected value is the long-run average outcome of a random process. Compute it by summing probability times value for each outcome, and remember to subtract any fixed costs.
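The insurance, investment, and two-dice calculations from this section all reduce to the same sum of probability × value. A minimal sketch, assuming outcomes are passed as (probability, value) pairs; the helper name `ev` and the probability check are illustrative additions.

```python
from fractions import Fraction as F

def ev(outcomes):
    """Sum of probability * value over (probability, value) pairs."""
    assert sum(p for p, _ in outcomes) == 1, "probabilities must sum to 1"
    return sum(p * v for p, v in outcomes)

# Insurance: the break-even premium equals the expected claim payout
premium = ev([(F(1, 100), 10_000), (F(2, 100), 5_000), (F(97, 100), 0)])

# Two investment options; values are profit or loss in dollars
ev_a = ev([(F(1, 2), 100), (F(1, 2), 0)])
ev_b = ev([(F(3, 10), 200), (F(7, 10), -50)])

# Linearity: the expected sum of two dice is twice the EV of one die
one_die = ev([(F(1, 6), face) for face in range(1, 7)])

print(premium, ev_a, ev_b, 2 * one_die)  # 200 50 25 7
```

Checking that the probabilities sum to 1 catches the most common setup error, a forgotten outcome such as the "no claim" case.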

● Production incident · POST-MORTEM · severity: high

The False Assumption of Uniformity in Recommendation Scoring

Symptom

Top recommendations had an abysmal click-through rate (CTR) — far below random selection — despite model training having converged.

Assumption

All content items are equally likely to be interesting to a user, so the probability of a click is uniform across all items.

Root cause

The system used a simple probability calculation that assumed equally likely outcomes, ignoring historical click distributions. Popular items had much higher prior probabilities, but the model treated every item the same, so rare items were recommended as often as popular ones, reducing overall engagement.

Fix

Estimate prior probabilities of engagement from historical data using click counts. Then use Bayes' theorem to update probabilities based on user context. This shifted recommendations toward items users were actually likely to click.

Key lesson

Never assume equal likelihood without justification — always check historical data.
Probability calculations are only as good as the input assumptions; base rates matter.
Bayesian updating turns a naive probability model into a high-performing recommendation system.
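The fix described in this post-mortem can be sketched in a few lines: estimate each item's prior from its share of historical clicks, then weight it by how likely the current user context is given that item. Everything below is hypothetical and for illustration only: the item names, the click counts, the likelihood values, and the function name `posterior_over_items`. It is not the production system's code.

```python
from fractions import Fraction

def posterior_over_items(click_counts, context_likelihood):
    """P(item | context) is proportional to P(context | item) * P(item)."""
    total_clicks = sum(click_counts.values())
    # Prior from historical click share instead of a uniform 1/len(items)
    prior = {item: Fraction(count, total_clicks) for item, count in click_counts.items()}
    unnormalized = {item: context_likelihood[item] * prior[item] for item in prior}
    evidence = sum(unnormalized.values())
    return {item: weight / evidence for item, weight in unnormalized.items()}

# Hypothetical numbers, for illustration only
clicks = {"popular_item": 90, "niche_item": 10}
likelihood = {"popular_item": Fraction(1, 5), "niche_item": Fraction(3, 5)}

for item, prob in posterior_over_items(clicks, likelihood).items():
    print(item, prob)  # popular_item 3/4, then niche_item 1/4
```

Even though the context favors the niche item three to one, the base rate keeps the popular item ahead. A uniform prior would have ranked them the other way round. A real system would likely also smooth zero click counts, which this sketch omits.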

Production debug guide

Identify and fix common probability mistakes that skew experiment results (3 entries)

Symptom · 01

Reported p-value is lower than expected for the observed difference between variants.

→

Fix

Check if independence assumption is violated — e.g., same user seeing both variants due to cross-contamination. Recalculate using proper user-level randomization.

Symptom · 02

Confidence intervals are too narrow, claiming significance when it's not there.

→

Fix

Verify sample size calculation: ensure it accounts for expected variance and that no peeking bias reduced effective sample size.

Symptom · 03

Probability of being best (P(best)) is overestimated in multi-variant tests.

→

Fix

Recalculate using Bayesian approach with a prior that reflects historical experiment results, not just flat prior. Use simulation to correct for multiple comparison bias.

Probability Rules Quick Reference

Scenario	Rule to Use	Formula	Example
Event A OR Event B (mutually exclusive)	Addition Rule — no overlap	P(A) + P(B)	P(roll 1 or roll 2) = 1/6 + 1/6 = 1/3
Event A OR Event B (can overlap)	Addition Rule — subtract overlap	P(A) + P(B) − P(A∩B)	P(King or Heart) = 4/52 + 13/52 − 1/52
Event A AND Event B (independent)	Multiplication Rule — independent	P(A) × P(B)	P(H then H) = 1/2 × 1/2 = 1/4
Event A AND Event B (dependent)	Multiplication Rule — conditional	P(A) × P(B|A)	P(Ace then Ace, no replace) = 4/52 × 3/51
Selecting a group, order doesn't matter	Combinations (nCr)	n! / (r! × (n−r)!)	3 people from 10: 10C3 = 120 ways
At least one success in multiple draws	Complement Method	1 − P(zero successes)	P(at least 1 red) = 1 − P(all blue)
Exactly k successes in n independent trials	Binomial Formula	nCk × p^k × (1−p)^(n−k)	Exactly 3H in 5 flips = 10 × 0.125 × 0.25
Update probability with new evidence	Bayes' Theorem	P(A|B) = P(B|A)×P(A) / P(B)	P(disease|positive) = (0.99×0.0001)/0.0101
Average outcome over many trials	Expected Value	Σ (value × probability)	Raffle EV = $1.50 payout − $5 cost = -$3.50
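Three of the table's worked examples can be reduced to exact fractions as a quick check. This is a small sketch; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# OR with overlap: P(King or Heart)
king_or_heart = Fraction(4, 52) + Fraction(13, 52) - Fraction(1, 52)

# AND, dependent: P(Ace then Ace) without replacement
two_aces = Fraction(4, 52) * Fraction(3, 51)

# Binomial: exactly 3 heads in 5 fair flips
half = Fraction(1, 2)
three_heads = comb(5, 3) * half**3 * half**2

print(king_or_heart, two_aces, three_heads)  # 4/13 1/221 5/16
```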

⚙ Quick Reference

10 commands from this guide

File	Command / Code	Purpose
basic_probability_foundation.py	def probability(favourable_outcomes, total_outcomes):	The Foundation
replacement_vs_no_replacement.py	p_first_ace_with = 4 / 52 # 4 aces in 52 cards	Replacement vs. No Replacement
combinatorics_probability.py	def combinations(n, r):	Combinatorics + Probability
bayes_theorem_example.py	p_disease = 1 / 10000 # 0.0001	Conditional Probability and Bayes' Theorem
expected_value_game.py	def expected_value(outcomes, probabilities):	Expected Value and Decision Making Under Uncertainty
DiceSymmetry.py	def simulate_dice_rolls(num_dice: int, trials: int = 100_000) -> float:	Stop Simulating
MedianOrderStat.py	from math import comb	Median of Three Uniforms
bayes_disease.py	def bayes_disease():	Conditional Probability and Bayes Theorem Problems
full_house_prob.py	def full_house_probability():	Probability with Combinations
expected_value_game.py	def expected_value():	Expected Value Problems for Aptitude Tests

Key takeaways

The core formula, favourable outcomes divided by total equally likely outcomes, is the single idea every probability rule derives from. If you forget a formula, come back to this and count.

OR means add (then subtract overlap). AND means multiply (but adjust for dependency if there's no replacement). Confusing these two is the root cause of most wrong answers.

The complement trick, P(at least one) = 1 − P(none), converts the hardest-looking problems into one-step calculations. Reach for it whenever you see 'at least one' or 'one or more'.

Always clarify replacement before solving. With replacement → independent events → multiply raw probabilities. Without replacement → dependent events → reduce numerator AND denominator after each draw.

Bayes' theorem updates initial beliefs with new evidence; always account for the base rate to avoid false confidence.
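The complement trick from these takeaways fits in one helper. A minimal sketch, assuming independent trials; the function name `at_least_one` and the second set of probabilities are illustrative.

```python
from fractions import Fraction
from math import prod

def at_least_one(probs):
    """P(at least one success) = 1 - P(no successes), for independent trials."""
    return 1 - prod(1 - p for p in probs)

# At least one six in two rolls of a fair die
print(at_least_one([Fraction(1, 6)] * 2))  # 11/36

# Trials with different success probabilities (illustrative values)
print(at_least_one([Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]))  # 3/4
```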

INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01SENIOR

A bag has 5 red and 3 green balls. Two balls are drawn without replaceme...

Q02SENIOR

What is the probability of getting at least one six when rolling two fai...

Q03SENIOR

Three people each independently try to solve a problem. Their individual...

Q04SENIOR

You have two urns: one with 2 black and 3 white balls, another with 4 bl...

Q01 of 04SENIOR

A bag has 5 red and 3 green balls. Two balls are drawn without replacement. What is the probability that both balls are red? Walk me through your reasoning step by step.

ANSWER

Step 1: Identify that this is a without-replacement problem, so draws are dependent. Step 2: P(first red) = 5/8 (5 red out of 8 total). Step 3: After drawing a red, there are 4 red left and 7 total balls left. So P(second red | first was red) = 4/7. Step 4: Multiply: P(both red) = (5/8) * (4/7) = 20/56 = 5/14 ≈ 0.3571. The key is to reduce both numerator and denominator after each draw.
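A quick way to double-check this answer is to compute it two ways, once with the conditional route from the steps above and once by counting unordered pairs. The variable names in this sketch are illustrative.

```python
from fractions import Fraction
from math import comb

# Sequential (conditional) route: P(first red) * P(second red | first red)
sequential = Fraction(5, 8) * Fraction(4, 7)

# Counting route: choose 2 of the 5 reds out of all 2-ball subsets of 8
counting = Fraction(comb(5, 2), comb(8, 2))

print(sequential, counting)  # 5/14 5/14
```

Both routes agreeing is a cheap sanity check worth mentioning in an interview.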

Last updated: July 19, 2026
