NumPy Random Module — Generating and Controlling Random Data
The Modern Generator API
Create a generator with np.random.default_rng(). Pass a seed for reproducibility.
import numpy as np

# Reproducible: same seed gives same numbers every run
rng = np.random.default_rng(seed=42)

print(rng.random(5))             # 5 floats in [0, 1)
print(rng.integers(0, 10, 5))    # 5 ints in [0, 10)
print(rng.normal(0, 1, 5))       # 5 standard normal samples
print(rng.uniform(2.0, 5.0, 3))  # 3 floats in [2, 5)
[0 9 5 0 2]
[-0.234 1.573 -0.462 0.241 -1.913]
Common Distributions
import numpy as np

rng = np.random.default_rng(0)

# Normal (Gaussian)
print(rng.normal(loc=170, scale=10, size=5))  # heights in cm

# Binomial: n trials, p probability
print(rng.binomial(n=10, p=0.5, size=5))      # coin flips

# Poisson: events per interval
print(rng.poisson(lam=3, size=5))

# Exponential: time between events
print(rng.exponential(scale=1.0, size=5))
[4 5 7 5 3]
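As a quick sanity check on the parameters above, the mean of a large sample should land near each distribution's theoretical mean: loc for the normal, n * p for the binomial, lam for the Poisson, and scale for the exponential. The sketch below is illustrative only; the sample size and variable names are assumptions, not part of the original example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 100_000  # illustrative sample size, large enough for stable means

heights = rng.normal(loc=170, scale=10, size=n_samples)
heads = rng.binomial(n=10, p=0.5, size=n_samples)
events = rng.poisson(lam=3, size=n_samples)
waits = rng.exponential(scale=1.0, size=n_samples)

# Compare each sample mean with its theoretical value
print(round(heights.mean(), 2))  # close to loc = 170
print(round(heads.mean(), 2))    # close to n * p = 5
print(round(events.mean(), 2))   # close to lam = 3
print(round(waits.mean(), 2))    # close to scale = 1.0
```

The printed values will not match the theoretical means exactly; they only get closer as the sample size grows.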
Shuffling and Sampling
import numpy as np

rng = np.random.default_rng(42)
arr = np.arange(10)

# Shuffle in place
rng.shuffle(arr)
print(arr)  # e.g. [0 3 7 2 5 1 9 4 6 8]; the exact order depends on the seed

# Sample without replacement
print(rng.choice(arr, size=3, replace=False))

# Sample with replacement (bootstrap)
print(rng.choice(arr, size=5, replace=True))

# Permutation: returns a copy, does not modify the original
orig = np.arange(5)
shuffled = rng.permutation(orig)
print(orig)      # [0 1 2 3 4], unchanged
print(shuffled)  # shuffled copy
[0 1 2 3 4]
[3 7 2]
🎯 Key Takeaways
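- Use np.random.default_rng(seed) for new code. It returns a Generator backed by the PCG64 bit generator, which has better statistical properties than the legacy Mersenne Twister API, is generally faster, and avoids global state.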
- Seeding makes random numbers reproducible — essential for ML experiments.
- rng.shuffle() modifies in place; rng.permutation() returns a copy.
- rng.choice() with replace=False is sampling without replacement.
- Each call to a Generator method advances the internal state — the same rng object produces different numbers on consecutive calls.
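The takeaways above can be checked directly. This is a minimal sketch, assuming only NumPy is installed; the variable names are chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# shuffle() reorders the array in place and returns None
arr = np.arange(10)
result = rng.shuffle(arr)
print(result)                # None
print(sorted(arr.tolist()))  # [0, 1, ..., 9]: same values, arr itself was reordered

# permutation() leaves its input untouched and returns a shuffled copy
orig = np.arange(10)
shuffled = rng.permutation(orig)
print(np.array_equal(orig, np.arange(10)))  # True

# replace=False never picks the same element twice
sample = rng.choice(orig, size=5, replace=False)
print(len(set(sample.tolist())) == 5)  # True

# Consecutive calls advance the internal state, so the draws differ
first = rng.random(3)
second = rng.random(3)
print(np.array_equal(first, second))  # False
```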
Interview Questions on This Topic
- Q: Why is np.random.default_rng() preferred over np.random.seed() in modern NumPy?
- Q: How do you generate reproducible random numbers in NumPy?
Frequently Asked Questions
What is the difference between np.random.seed() and np.random.default_rng()?
np.random.seed() sets a global seed that affects all legacy numpy.random functions. np.random.default_rng() creates an independent Generator object. The Generator approach is better because it avoids shared global state — multiple generators with different seeds can run independently in the same process.
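The difference is easiest to see side by side. The sketch below is one possible illustration, not the only way to structure this; rng_data and rng_model are hypothetical names for two independent consumers of randomness in the same program.

```python
import numpy as np

# Legacy API: one hidden global state shared by every np.random.* call
np.random.seed(42)
a = np.random.rand(3)

np.random.seed(42)
_ = np.random.rand(1)        # an unrelated call elsewhere in the program...
b = np.random.rand(3)        # ...shifts what this call returns
print(np.array_equal(a, b))  # False

# Generator API: each object carries its own state
rng_data = np.random.default_rng(1)
rng_model = np.random.default_rng(2)

expected = np.random.default_rng(2).random(3)  # what seed 2 yields first
_ = rng_data.random(100)     # draws from rng_data do not touch rng_model
got = rng_model.random(3)
print(np.array_equal(expected, got))  # True
```

With the legacy API, any library that calls np.random functions can shift the global sequence; with separate Generator objects, each stream depends only on its own seed and its own calls.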
Why does my random data change every time I run my script?
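You are not seeding the generator. Without a seed, np.random.default_rng() pulls fresh entropy from the operating system on every run. Pass a seed, for example np.random.default_rng(seed=42), to fix the sequence. Any non-negative integer works; what matters is that you use the same number consistently.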
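To see the effect without rerunning a script, the sketch below simulates separate runs with a small helper. The function simulate is a hypothetical stand-in for your own script, not a NumPy API.

```python
import numpy as np

def simulate(seed=None):
    """Hypothetical stand-in for a script that draws random data."""
    rng = np.random.default_rng(seed)
    return rng.normal(0, 1, 5)

# Unseeded: fresh OS entropy each time, so two "runs" differ
print(np.array_equal(simulate(), simulate()))  # False

# Seeded: the same seed reproduces the same draws
run_1 = simulate(seed=42)
run_2 = simulate(seed=42)
print(np.array_equal(run_1, run_2))  # True

# A different seed gives a different, but equally reproducible, sequence
print(np.array_equal(run_1, simulate(seed=7)))  # False
```

Note that this reproducibility is only something to rely on within the same NumPy version; the exact stream a Generator produces for a given seed may change between releases.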