Python Intermediate

NumPy Mathematical Functions — ufuncs, aggregations and statistics

Q: What is the difference between np.sum(a) and a.sum()?

Functionally identical. The method form a.sum() is slightly more common in practice. Both support the axis, keepdims, and dtype parameters.

Q: How do I compute a weighted average in NumPy?

Use np.average(a, weights=w). Unlike np.mean(), np.average() accepts a weights array and computes the weighted mean. np.mean() treats all elements equally.

📅 March 16, 2026 ⏱ 3 min read 🎯 Intermediate

Where developers are forged. · Structured learning · Free forever.

📍 Part of: Python Libraries → Topic 28 of 51

NumPy's mathematical functions — universal functions (ufuncs), aggregations like sum and mean, axis-wise operations, and statistical functions with practical examples.

⚙️ Intermediate — basic Python knowledge assumed

In this tutorial, you'll learn

NumPy's mathematical functions — universal functions (ufuncs), aggregations like sum and mean, axis-wise operations, and statistical functions with practical examples.

Ufuncs operate element-wise and support broadcasting — always prefer them over Python loops.
Aggregation functions accept an axis parameter — axis=0 collapses rows, axis=1 collapses columns.
np.std() uses ddof=0 (population) by default. Use ddof=1 for sample standard deviation.

✦ Plain-English analogy ✦ Real code with output ✦ Interview questions

⚡Quick Answer

NumPy mathematical functions are vectorised: they run in compiled C, not Python loops.
Ufuncs (universal functions) operate element-wise and support broadcasting.
Aggregation functions reduce arrays: sum, mean, min, max with axis control.
Axis=0 collapses rows (down), axis=1 collapses columns (across).
Performance: vectorised operations are 10-100x faster than equivalent Python loops.
Production trap: forgetting nan-safe aggregations (np.nanmean) silently propagates NaNs.

🚨 START HERE

Quick Debug Cheatsheet: NumPy Math Functions

Immediate commands to diagnose and fix common issues in production

🟡NaN in aggregation result

Immediate ActionRun np.isnan(data).any() to detect NaNs in input

Commands

np.isnan(data).sum()

np.nanmean(data) if using mean

Fix NowReplace np.mean with np.nanmean or clean data with np.nan_to_num(data)

🟡Broadcasting error

Immediate ActionPrint shapes with print(a.shape, b.shape)

Commands

a.shape, b.shape

np.broadcast_shapes(a.shape, b.shape) (NumPy >= 1.20)

Fix NowReshape one array: b = b.reshape(-1, 1) or use np.newaxis

🔴Memory blow-up (OOM)

Immediate ActionStop the process and estimate array sizes

Commands

sys.getsizeof(a) / (1024**3) # GB

a.nbytes / (1024**3) # more accurate

Fix NowUse out= parameter to write into pre-allocated array, or use np.linalg.norm instead of outer for distance matrices

🟡Wrong reduction dimension

Immediate ActionCheck axis value and output shape

Commands

result.shape

np.sum(a, axis=0, keepdims=True).shape

Fix NowAdd keepdims=True and verify shape before downstream logic

Production IncidentThe Silent NaN: How Aggregation Broke Our ML PipelineA production data pipeline using np.mean on a column with missing values caused the entire training job to produce NaN predictions. The bug was invisible for 3 days because logs only showed the final model metric.

SymptomModel produced NaN predictions after a data refresh. Input data had occasional missing sensor readings, but np.mean returned NaN for the entire column, which propagated through the feature pipeline.

AssumptionThe team assumed that np.mean would simply skip missing values (like Excel's AVERAGE does). They also assumed that a 'few missing values' wouldn't affect the final model.

Root causenp.mean returns NaN if any of the input elements is NaN. The team used np.mean instead of np.nanmean, and the downstream scaler and model had no NaN handling — so NaN propagated to predictions.

FixReplaced np.mean with np.nanmean in the feature processing code. Added a pre-aggregation check: if np.any(np.isnan(data)), log a warning and drop the row or impute. Also added a post-aggregation assertion: assert not np.isnan(result).any().

Key Lesson

Never assume standard aggregation functions handle missing data gracefully.Always use nan-aware functions when your data pipeline may contain missing values.Add defensive checks after aggregation to catch NaN propagation early.Review the function's documentation — np.mean vs np.nanmean, np.std vs np.nanstd.

Production Debug GuideCommon failure modes when using ufuncs and aggregations

Aggregation returns NaN or Inf unexpectedly→Check for NaN values in input: np.isnan(data).any(). If true, use nan-aware functions (np.nanmean, np.nansum, etc.) or clean the data. Also check for Inf with np.isinf().

BroadcastingError: operands could not be broadcast together with shapes (a,b) (c,d)→Print both array shapes: print(a.shape, b.shape). Ensure dimensions are compatible: they must be equal or one must be 1, aligning from the right. Reshape using .reshape() or np.newaxis.

MemoryError when using np.power or np.outer on large arrays→Check intermediate array sizes. For power, use out parameter to write to pre-allocated array. For outer, consider chunking or use np.multiply.outer with careful memory monitoring.

Unexpected result: aggregation returns a scalar instead of an array→Check if you forgot the axis parameter. By default, axis=None reduces all elements. Specify axis=0 or axis=1 for column/row operations. Use keepdims=True to preserve dimensions.

np.std result differs from expected sample standard deviation→np.std defaults to population std (ddof=0). For sample std, pass ddof=1. Compare with np.std(data, ddof=1) and confirm using a small test set.

Universal Functions (ufuncs)

Ufuncs apply a function to every element of an array. They support broadcasting, type casting, and output array specification.

Example · PYTHON

1234567891011121314

import numpy as np

a = np.array([1.0, 4.0, 9.0, 16.0])

print(np.sqrt(a))          # [1. 2. 3. 4.]
print(np.log(a))           # [0.    1.386 2.197 2.773]
print(np.exp([0, 1, 2]))   # [1.    2.718 7.389]
print(np.abs([-3, -1, 2])) # [3 1 2]

# Two-array ufuncs
b = np.array([2.0, 3.0, 4.0, 5.0])
print(np.add(a, b))       # same as a + b
print(np.maximum(a, b))   # element-wise max
print(np.power(b, 2))     # [4. 9. 16. 25.]

▶ Output

[1. 2. 3. 4.]
[3 1 2]

📊 Production Insight

Ufuncs automatically upcast dtypes (float32 + float64 = float64) which can silently double memory.

When combined with broadcasting, the intermediate result can blow up before the operation completes.

Rule: always check the dtype of the output via .dtype before processing high-volume data.

🎯 Key Takeaway

Ufuncs are fast because they're written in C.

Broadcasting + dtype promotion can cause unexpected memory spikes.

If memory is tight, specify output dtype explicitly with the dtype parameter.

Aggregation Functions and the axis Parameter

axis=0 operates down rows (along columns). axis=1 operates across columns (along rows). axis=None reduces everything to a scalar.

Example · PYTHON

123456789101112

import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(m.sum())           # 21 — all elements
print(m.sum(axis=0))     # [5 7 9] — column sums
print(m.sum(axis=1))     # [6 15] — row sums

print(m.mean(axis=0))    # [2.5 3.5 4.5]
print(m.max(axis=1))     # [3 6]
print(m.argmax(axis=1))  # [2 2] — index of max in each row

▶ Output

21
[5 7 9]
[6 15]

📊 Production Insight

The most common production bug is passing axis=0 when you meant axis=1 (or vice versa).

The result shape changes silently — a 2D array becomes 1D, which can break downstream code expecting a different shape.

Use keepdims=True to preserve dimensionality and make the intent explicit.

🎯 Key Takeaway

axis=0 collapses rows (down), axis=1 collapses columns (across).

keepdims=True keeps the reduced dimension as a size-1 axis.

Always test the output shape before passing to functions that expect a specific number of dimensions.

Statistical Functions

NumPy covers standard statistics. For variance and standard deviation, note the ddof parameter — ddof=0 is population (default), ddof=1 is sample.

Example · PYTHON

1234567891011

import numpy as np

data = np.array([2, 4, 4, 4, 5, 5, 7, 9])

print(np.mean(data))          # 5.0
print(np.median(data))        # 4.5
print(np.std(data))           # 2.0  (population)
print(np.std(data, ddof=1))   # 2.138 (sample)
print(np.var(data))           # 4.0
print(np.percentile(data, 75)) # 5.75
print(np.corrcoef([1,2,3], [1,2,3]))  # correlation matrix

▶ Output

5.0
4.5
2.0

📊 Production Insight

Default ddof=0 gives population statistics, but most real-world uses require sample statistics (ddof=1).

Using the wrong ddof can shift ML model thresholds by a non-trivial amount.

Also, np.std is not robust to outliers — consider using np.median and IQR for production anomaly detection.

🎯 Key Takeaway

np.std() defaults to population std (ddof=0).

For sample data, always pass ddof=1.

For robust statistics in production, use median and IQR instead of mean and std.

Broadcasting: How ufuncs Handle Different Shapes

Broadcasting allows ufuncs to operate on arrays of different shapes by automatically expanding dimensions. The arrays must be compatible: they align from the rightmost dimension, and each dimension must be equal or one of them must be 1.

Example · PYTHON

1234567891011

import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])  # shape (2,3)
b = np.array([10, 20, 30]) # shape (3,) -> broadcasts to (1,3) then (2,3)
print(a + b)  # element-wise addition
# [[11 22 33]
#  [14 25 36]]

# Scalar broadcasting works too
print(a * 2)  # multiplies every element by 2

▶ Output

[[11 22 33]
[14 25 36]]
[[2 4 6]
[8 10 12]]

📊 Production Insight

Broadcasting is memory-efficient because it creates virtual views — no data is copied.

But if you accidentally create a very large broadcast (e.g., shape (1, 1_000_000) + (1_000_000, 1) -> (1_000_000, 1_000_000)), you can burn gigabytes of RAM instantly.

The broadcast error message 'operands could not be broadcast together with shapes' is a frequent debugging stop.

🎯 Key Takeaway

Broadcasting aligns dimensions from the right.

It's fast and memory-efficient for small expansions.

Watch out for accidental large broadcasts — they can OOM your process in seconds.

Performance: Vectorisation vs Python Loops

Vectorised operations in NumPy run in compiled C and are orders of magnitude faster than Python loops. The performance gap grows with array size — for a 10-million-element array, vectorised sum takes ~5ms, while a Python loop takes over a minute.

Example · PYTHON

12345678910111213141516

import numpy as np
import time

arr = np.random.randn(10_000_000)

# Vectorised
start = time.time()
result = np.sum(arr)
print(f"Vectorised: {time.time() - start:.5f}s")

# Python loop
start = time.time()
total = 0.0
for x in arr:
    total += x
print(f"Loop: {time.time() - start:.5f}s")

▶ Output

Vectorised: 0.005s
Loop: 62.3s

📊 Production Insight

Writing Python loops over NumPy arrays is often a code smell in code reviews.

The performance penalty becomes critical in data pipelines processing millions of rows per second.

Always use np.vectorize or np.frompyfunc only as a last resort — they are still slower than native ufuncs, but better than loops.

🎯 Key Takeaway

Vectorised operations are 10-100x faster than Python loops.

If you see a for loop over a NumPy array in production code, it's a red flag.

For custom element-wise logic, use np.vectorize only when you cannot rewrite the function as a ufunc.

Advanced ufunc Methods: reduce, accumulate, outer

Ufuncs provide methods beyond direct element-wise application. reduce applies the operation cumulatively along an axis, accumulate returns all intermediate results, and outer computes the outer product.

Example · PYTHON

123456789101112131415

import numpy as np

a = np.array([1, 2, 3, 4])

# reduce: cumulative sum produces single value
print(np.add.reduce(a))  # 10 = 1+2+3+4

# accumulate: all intermediate sums
print(np.add.accumulate(a))  # [1 3 6 10]

# outer: outer product
print(np.multiply.outer([1,2,3], [10,20,30]))
# [[10 20 30]
#  [20 40 60]
#  [30 60 90]]

▶ Output

10
[1 3 6 10]
[[10 20 30]
[20 40 60]
[30 60 90]]

📊 Production Insight

np.add.reduce is identical to np.sum in behaviour but slightly faster for some dtypes.

accumulate is useful for running totals in time-series pipelines but can overflow with int32 arrays.

outer operations can produce huge outputs (O(n^2) memory) — always check dimensions before using them in production.

🎯 Key Takeaway

Ufunc methods (reduce, accumulate, outer) are efficient tools for specialised operations.

Outer product memory scales quadratically — dangerous for large arrays.

Use accumulate for running totals instead of manual loops.

Handling NaN in Statistical Aggregations

By default, NumPy aggregation functions (mean, sum, std) return NaN if any element is NaN. Use nan-aware versions: np.nansum, np.nanmean, np.nanstd. They ignore NaN values and compute on the remaining elements.

Example · PYTHON

123456789101112

import numpy as np

data = np.array([1.0, 2.0, np.nan, 4.0])

print(np.mean(data))   # nan
print(np.nanmean(data)) # 2.333...

print(np.std(data))     # nan
print(np.nanstd(data))  # ~1.247

# For percentage of missing values
print(np.isnan(data).sum())  # 1

▶ Output

nan
2.333333333333333
nan
1.247219128924647
1

📊 Production Insight

Silent NaN propagation is one of the most insidious bugs in data pipelines.

A single missing value can turn an entire column of statistics to NaN, which often goes unnoticed for days.

Always use np.isnan().any() in assertions before feeding aggregated results to downstream systems.

🎯 Key Takeaway

Default aggregations propagate NaN silently.

Always use nan-aware functions (np.nanmean, np.nansum, np.nanstd) when data may contain missing values.

Assert no NaN after aggregation: assert not np.any(np.isnan(result))

🎯 Key Takeaways

Ufuncs operate element-wise and support broadcasting — always prefer them over Python loops.
Aggregation functions accept an axis parameter — axis=0 collapses rows, axis=1 collapses columns.
np.std() uses ddof=0 (population) by default. Use ddof=1 for sample standard deviation.
np.argmax() and np.argmin() return indices, not values.
np.cumsum() and np.cumprod() give running totals without reducing the array size.
Broadcasting avoids memory copies but can silently explode when shapes are non-optimal.
Always use nan-aware functions (np.nanmean, np.nansum) when data may have missing values.
Ufunc methods (reduce, accumulate, outer) are efficient for specialised operations but watch for overflow and memory.

⚠ Common Mistakes to Avoid

✕Using Python loops instead of ufuncs

Symptom

Code runs extremely slowly (minutes vs milliseconds) on large arrays. CPU usage is low because Python overhead dominates.

Fix

Replace the loop with a vectorised operation (e.g., np.sum, np.sqrt, or custom ufunc using np.vectorize as last resort).

✕Forgetting the axis parameter in aggregations

Symptom

You expected an array of per-column sums but got a single scalar. Downstream code expecting a 1D array breaks with shape mismatch.

Fix

Always specify axis=0 for column-wise, axis=1 for row-wise. Use keepdims=True to retain dimensions.

✕Using np.std without ddof=1 for sample data

Symptom

Standard deviation is slightly smaller than expected (population vs sample). In hypothesis testing or anomaly detection, thresholds are off.

Fix

Use np.std(data, ddof=1) for sample standard deviation. For variance, np.var(data, ddof=1).

✕Assuming aggregations ignore NaN values

Symptom

Result is NaN when input contains any NaN. Downstream processes fail or produce NaN predictions.

Fix

Use nan-aware functions: np.nanmean, np.nansum, np.nanstd. Or clean input with np.isnan check.

Interview Questions on This Topic

QWhat does the axis parameter mean in NumPy aggregation functions?JuniorReveal
The axis parameter specifies the dimension along which to perform the aggregation. For a 2D array: axis=0 operates along the rows (collapsing rows, giving per-column results), axis=1 operates along the columns (collapsing columns, giving per-row results). axis=None (default) flattens the array and aggregates everything to a scalar. The shape of the result depends on which axes are removed. Using keepdims=True preserves the reduced axes as size-1 dimensions, which helps maintain shape compatibility.
QWhat is the difference between np.std() with ddof=0 and ddof=1?Mid-levelReveal
ddof stands for 'delta degrees of freedom'. ddof=0 (default) computes the population standard deviation: it divides by N, where N is the number of elements. ddof=1 computes the sample standard deviation: it divides by N-1, which provides an unbiased estimate for a sample drawn from a larger population. Use ddof=1 when your array is a sample of a larger population, and ddof=0 when it's the entire population. The difference is small for large N, but significant for small datasets.
QHow does broadcasting work in NumPy?Mid-levelReveal
Broadcasting allows arrays of different shapes to be used together in ufuncs. The rules are: (1) arrays are aligned from the rightmost dimension, (2) each dimension must be equal or one of them must be 1, (3) if they differ, the array with size 1 is stretched to match the other. Broadcasting creates virtual views — no data is copied, so memory usage remains low unless the result is materialized. A common error is attempting to broadcast incompatible shapes (e.g., (2,3) and (4,3) — last dims match but first dims don't). You can add dimensions manually using np.newaxis or a.reshape(-1,1).
QWhy are vectorised operations faster than Python loops in NumPy?SeniorReveal
Vectorised operations execute in compiled C code inside the CPython interpreter, bypassing the Python runtime overhead. Python loops incur costs per iteration: type checking, reference counting, and bytecode execution. For a million-element array, the difference is between ~0.5 milliseconds and ~10 seconds. NumPy's operations leverage CPU-level optimisations like SIMD instructions and cache locality. The performance gap widens with array size. Always prefer ufuncs and aggregate functions over manual iteration.

Frequently Asked Questions

What is the difference between np.sum(a) and a.sum()?

Functionally identical. The method form a.sum() is slightly more common in practice. Both support the axis, keepdims, and dtype parameters.

How do I compute a weighted average in NumPy?

Use np.average(a, weights=w). Unlike np.mean(), np.average() accepts a weights array and computes the weighted mean. np.mean() treats all elements equally.

How do I compute the running sum or cumulative product?

Use np.cumsum(a) for running sum, np.cumprod(a) for running product. These return an array of the same shape with accumulated values along the given axis (default flattens).

What is the difference between np.add.reduce and np.sum?

np.add.reduce(a) is exactly equivalent to np.sum(a). The reduce method is the more general primitive; np.sum is a convenience wrapper. For integer types, both may overflow silently — use np.add.reduce with dtype=np.float64 to avoid.

🔥

Naren Founder & Author

Developer and founder of TheCodeForge. I built this site because I was tired of tutorials that explain what to type without explaining why it works. Every article here is written to make concepts actually click.

About Naren Get in touch

Forged with 🔥 at TheCodeForge.io — Where Developers Are Forged