NumPy Linear Algebra — dot, matmul, linalg explained
- Use @ (np.matmul) for matrix multiplication — it is unambiguous across all dimensions and matches mathematical notation directly
- np.dot for 2D is equivalent to @, but for 3D+ arrays they differ: matmul treats them as stacked matrices, dot contracts axes and produces a different shape entirely
- np.linalg.solve(A, b) is 2 to 3x faster and more numerically stable than np.linalg.inv(A) @ b for solving linear systems — never compute the inverse just to solve
- linalg.eig returns eigenvalues and eigenvectors for general matrices; use linalg.eigh for symmetric matrices — it is faster and guarantees real output
- SVD (linalg.svd) is the foundation of PCA, recommender systems, and low-rank approximation — always use full_matrices=False unless you have a specific reason not to
- Biggest mistake: computing the matrix inverse to solve Ax=b when linalg.solve is both faster and more accurate
Production Debug Guide
Common symptoms: linear algebra operations produce wrong results, behave unexpectedly, or give different answers in different environments. Run np.show_config() on both environments and compare — OpenBLAS, MKL, and Apple Accelerate produce slightly different floating-point results for the same operation due to differences in parallelism and rounding order. Pin numpy, scipy, and the underlying BLAS library in your requirements. Replace any == comparisons on floating-point results with np.allclose(a, b, rtol=1e-10, atol=1e-12) and accept that small differences between environments are expected rather than bugs. The sketch below shows the minimal version of that check.
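This is a hedged sketch only; the 2×2 system and the hard-coded reference result are purely illustrative.

```python
import numpy as np

# Print which BLAS/LAPACK backend this NumPy build links against; diff the output
# across environments (OpenBLAS vs MKL vs Accelerate) when results disagree.
np.show_config()

# Never compare floating-point linear-algebra results with ==.
# Use allclose with explicit tolerances instead.
a = np.linalg.solve(np.array([[2.0, 1.0], [1.0, 3.0]]), np.array([5.0, 10.0]))
b = np.array([1.0, 3.0])  # illustrative: the same result as computed on another machine
print(np.allclose(a, b, rtol=1e-10, atol=1e-12))  # True: differences at this scale are expected
```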
Linear algebra is the backbone of machine learning, scientific computing, and data analysis. Neural network layers are matrix multiplications. Principal Component Analysis is SVD applied to centered data. Linear regression is a solved system of equations. If you work with numerical data in Python at any scale, numpy.linalg is not optional — it is foundational.

The two things that trip people up most consistently in production: the difference between np.dot and np.matmul for higher-dimensional arrays, and when to use linalg.solve instead of computing the matrix inverse directly. Getting the first one wrong produces silently incorrect output with no exception — one of the hardest categories of bug to catch. Getting the second one wrong leaves 2 to 3x performance on the table and reduces numerical accuracy on ill-conditioned systems.
This guide covers both, along with eigendecomposition and SVD, with the context needed to make the right call without having to think twice.
Matrix Multiplication — @ vs np.dot
For 2D arrays, the @ operator, np.matmul called directly, and np.dot all produce identical results. The difference only appears at 3D and above, and when it appears it is silent — no error, no warning, just a different output shape that propagates through the pipeline until something downstream produces nonsense.
The @ operator (np.matmul) treats higher-dimensional arrays as stacks of matrices. For two arrays of shape (N, M, K) and (N, K, P), @ performs N independent matrix multiplications of shape (M, K) @ (K, P) and stacks the results, giving (N, M, P). This is almost always what you want for batched computation in machine learning.
np.dot does something different: it contracts the last axis of the first array with the second-to-last axis of the second. For the same (N, M, K) and (N, K, P) inputs, np.dot produces (N, M, N, P) — a cross-product across all pairs of batch elements. This is mathematically well-defined but almost never what you want when you have batched matrices, and the output shape passes basic sanity checks just well enough to go undetected.
In production, use @ everywhere. Reserve np.dot for the one case where it is genuinely the right call: explicit 1D dot products where you want a scalar result and want to make that intent obvious in the code.
```python
# io.thecodeforge: Matrix multiplication — @ vs dot vs matmul
import numpy as np

# ── 2D ARRAYS: ALL THREE ARE EQUIVALENT ───────────────────────────────────────
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

result_at = A @ B                # preferred — unambiguous, matches math notation
result_matmul = np.matmul(A, B)  # explicit call to the same function as @
result_dot = np.dot(A, B)        # identical for 2D; dangerous for 3D+

print(result_at)
# [[19 22]
#  [43 50]]
print(np.array_equal(result_at, result_matmul))  # True
print(np.array_equal(result_at, result_dot))     # True — only because these are 2D

# ── 1D VECTORS: DOT PRODUCT ───────────────────────────────────────────────────
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
print(u @ v)         # 32 — scalar dot product, clean and clear
print(np.dot(u, v))  # 32 — identical for 1D, np.dot is acceptable here

# ── 3D ARRAYS: WHERE @ AND DOT DIVERGE ────────────────────────────────────────
# Simulates batched embedding similarity: batch of user vectors against item matrices
batch_size = 10
batch_A = np.random.randn(batch_size, 3, 4)  # 10 matrices of shape (3, 4)
batch_B = np.random.randn(batch_size, 4, 5)  # 10 matrices of shape (4, 5)

# @ treats as stacked matrices — multiplies batch_A[i] @ batch_B[i] for each i
result_matmul = batch_A @ batch_B
print(f"@ shape: {result_matmul.shape}")  # (10, 3, 5) — correct batched result

# np.dot contracts last axis of A with second-to-last of B — cross-batch product
result_dot = np.dot(batch_A, batch_B)
print(f"dot shape: {result_dot.shape}")   # (10, 3, 10, 5) — WRONG: cross-batch

# Verify that @ computes batch_A[i] @ batch_B[i] independently
for i in range(batch_size):
    assert np.allclose(result_matmul[i], batch_A[i] @ batch_B[i]), \
        f"Batch element {i} mismatch"
print("All batch elements verified: @ is correctly independent per element")

# ── SHAPE ASSERTION PATTERN: MANDATORY IN PRODUCTION ──────────────────────────
def batched_similarity(user_emb: np.ndarray, item_emb: np.ndarray) -> np.ndarray:
    """
    user_emb: (batch_size, embed_dim)
    item_emb: (batch_size, num_items, embed_dim)
    returns:  (batch_size, num_items)
    """
    # Transpose item_emb last two axes for matrix multiplication
    scores = user_emb[:, np.newaxis, :] @ item_emb.transpose(0, 2, 1)
    scores = scores.squeeze(1)  # (batch_size, num_items)
    assert scores.shape == (user_emb.shape[0], item_emb.shape[1]), \
        f"Similarity shape wrong: got {scores.shape}"
    return scores
```
```
[[19 22]
 [43 50]]
True
True
32
32
@ shape: (10, 3, 5)
dot shape: (10, 3, 10, 5)
All batch elements verified: @ is correctly independent per element
```
- @ on (N, M, K) @ (N, K, P) gives (N, M, P) — each slice of the first array is multiplied by the corresponding slice of the second
- np.dot on the same inputs gives (N, M, N, P) — contracts last axis with second-to-last, producing an N×N cross-product
- For batched inference in ML and signal processing, @ is correct and np.dot is wrong
- np.dot is appropriate only for explicit 1D dot products where the scalar result is what you intend
- When in doubt, assert the output shape immediately after any multiplication involving 3D+ arrays
Solving Linear Systems
To solve Ax = b, use linalg.solve(A, b). It is consistently faster and more numerically stable than the inverse approach, x = inv(A) @ b. The reason is not subtle: linalg.solve performs one LU decomposition with partial pivoting followed by two triangular solves, and never materializes the inverse. Computing inv(A) requires the same factorization plus n additional triangular solves to build every column of the inverse, and the extra matrix-vector multiply adds another layer of rounding error, which is exactly what hurts on ill-conditioned systems.
The performance difference is real and measurable. For a 1000×1000 system, linalg.solve runs in roughly 15ms; computing inv(A) and then multiplying takes around 40ms on the same hardware — nearly 3x slower for no accuracy benefit and with worse numerical properties.
The condition number is what tells you whether to trust the solution at all. np.linalg.cond(A) returns the ratio of the largest to smallest singular value. A condition number near 1 means the system is well-conditioned and the solution is trustworthy. Above 1e8, you are losing roughly 8 digits of precision in float64. Above 1e15, the result is numerically meaningless — you are solving a different system than the one you think you have. For ill-conditioned systems, switch to linalg.lstsq which returns the minimum-norm least-squares solution and handles rank-deficient matrices without raising an exception.
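A quick way to see this precision loss in practice is to solve systems whose condition number you can dial up. Hilbert matrices are the classic example; the sketch below uses scipy.linalg.hilbert, with sizes chosen purely for illustration.

```python
import numpy as np
from scipy.linalg import hilbert

# Hilbert matrices are a classic ill-conditioned family: cond(H) grows explosively with n.
for n in (4, 8, 12):
    H = hilbert(n)
    x_true = np.ones(n)
    b = H @ x_true                 # construct a system whose exact solution is all ones
    x_hat = np.linalg.solve(H, b)
    print(f"n={n:2d}  cond={np.linalg.cond(H):.1e}  "
          f"max error={np.abs(x_hat - x_true).max():.1e}")
# As cond(H) climbs toward 1e16, the recovered solution loses essentially all of its digits.
```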
The one case where computing the inverse is defensible: you need to solve Ax = b for many different right-hand-side vectors b, all with the same A. In that case, scipy.linalg.lu_factor(A) followed by lu_solve(lu, b) for each b is the right approach — it factorizes A once and reuses the factorization, without ever explicitly computing the inverse.
```python
# io.thecodeforge: Solving linear systems — solve vs inverse, condition numbers
import numpy as np
import time

# ── BASIC LINEAR SYSTEM ───────────────────────────────────────────────────────
# System: 2x +  y = 5
#          x + 3y = 10
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

x = np.linalg.solve(A, b)
print(f"Solution: x={x[0]:.4f}, y={x[1]:.4f}")                  # x=1.0, y=3.0
print(f"Residual ||Ax - b||: {np.linalg.norm(A @ x - b):.2e}")  # essentially 0
print(f"Verification A @ x ≈ b: {np.allclose(A @ x, b)}")       # True

# ── PERFORMANCE: solve vs inv ─────────────────────────────────────────────────
np.random.seed(42)
A_large = np.random.randn(1000, 1000)
# Make it positive definite to ensure it is well-conditioned
A_large = A_large @ A_large.T + 1000 * np.eye(1000)
b_large = np.random.randn(1000)

runs = 10
start = time.perf_counter()
for _ in range(runs):
    x_solve = np.linalg.solve(A_large, b_large)
time_solve = (time.perf_counter() - start) / runs

start = time.perf_counter()
for _ in range(runs):
    x_inv = np.linalg.inv(A_large) @ b_large
time_inv = (time.perf_counter() - start) / runs

print(f"\nPerformance on 1000×1000 system (mean of {runs} runs):")
print(f"  linalg.solve: {time_solve*1000:.1f}ms")
print(f"  inv(A) @ b:   {time_inv*1000:.1f}ms")
print(f"  Speedup:      {time_inv/time_solve:.1f}x")
print(f"  Results match: {np.allclose(x_solve, x_inv, rtol=1e-10)}")

# ── CONDITION NUMBER AND ILL-CONDITIONING ─────────────────────────────────────
A_well = np.eye(3)                                  # condition number = 1.0
A_mild = np.array([[1.0, 0.99], [0.99, 1.0]])       # mildly ill-conditioned
A_bad = np.array([[1.0, 1.0], [1.0, 1.0 + 1e-10]])  # nearly singular

print(f"\nCondition numbers:")
print(f"  Identity matrix: {np.linalg.cond(A_well):.2e}")  # 1.00e+00
print(f"  Mildly ill-cond: {np.linalg.cond(A_mild):.2e}")  # ~1.99e+02
print(f"  Nearly singular: {np.linalg.cond(A_bad):.2e}")   # ~4.00e+10

# ── HANDLING ILL-CONDITIONED SYSTEMS ──────────────────────────────────────────
# lstsq returns the minimum-norm least-squares solution — no exception on singular input
A_singular = np.array([[1.0, 2.0], [2.0, 4.0]])  # rank 1 — singular
b_singular = np.array([3.0, 6.0])

try:
    np.linalg.solve(A_singular, b_singular)  # raises LinAlgError
except np.linalg.LinAlgError as e:
    print(f"\nlinalg.solve raised: {e}")

x_ls, residuals, rank, sv = np.linalg.lstsq(A_singular, b_singular, rcond=None)
print(f"lstsq solution: {x_ls}")  # minimum-norm solution
print(f"Matrix rank: {rank}")     # 1 — correctly identified as rank-deficient
print(f"Singular values: {sv}")

# ── MULTIPLE RIGHT-HAND SIDES: FACTORIZE ONCE ─────────────────────────────────
from scipy.linalg import lu_factor, lu_solve

A_reuse = np.random.randn(500, 500)
A_reuse = A_reuse @ A_reuse.T + 500 * np.eye(500)  # well-conditioned SPD

lu, piv = lu_factor(A_reuse)        # factorize once
for i in range(100):                # solve 100 different right-hand sides
    b_i = np.random.randn(500)
    x_i = lu_solve((lu, piv), b_i)  # reuse factorization — fast
print("\n100 right-hand sides solved with one LU factorization.")
```
```
Solution: x=1.0000, y=3.0000
Residual ||Ax - b||: 8.88e-16
Verification A @ x ≈ b: True

Performance on 1000×1000 system (mean of 10 runs):
  linalg.solve: 14.8ms
  inv(A) @ b:   39.2ms
  Speedup:      2.6x
  Results match: True

Condition numbers:
  Identity matrix: 1.00e+00
  Mildly ill-cond: 1.99e+02
  Nearly singular: 4.00e+10

linalg.solve raised: Singular matrix
lstsq solution: [0.6 1.2]
Matrix rank: 1
Singular values: [5. 0.]

100 right-hand sides solved with one LU factorization.
```
Eigenvalues and SVD
Eigenvalues and eigenvectors reveal the fundamental structure of a linear transformation. For a square matrix A, the eigenvector v satisfies Av = λv — the transformation stretches v by factor λ without rotating it. Eigenvalues appear in PCA (eigendecomposition of the covariance matrix), stability analysis of dynamical systems, spectral graph algorithms, and PageRank.
The critical practical distinction: linalg.eig works on general square matrices and may return complex eigenvalues even for matrices that have real eigenvalues in theory, because floating-point asymmetries break exact symmetry. linalg.eigh is specifically designed for symmetric (or Hermitian) matrices — it exploits the symmetric structure for faster computation, guarantees real eigenvalues, and returns them in ascending order (reverse the order for PCA, where components are ranked by descending explained variance).
SVD (Singular Value Decomposition) generalizes eigendecomposition to non-square matrices: A = UΣVᵀ. The columns of U are the left singular vectors, the diagonal of Σ holds the singular values in descending order, and the rows of Vt are the right singular vectors. The singular values measure the importance of each component — truncating the smallest ones gives the mathematically optimal low-rank approximation, guaranteed by the Eckart-Young theorem.
SVD is the workhorse of dimensionality reduction. PCA is SVD applied to mean-centered data. Collaborative filtering in recommender systems decomposes a sparse user-item matrix with truncated SVD to find latent preference factors. Image compression stores only the top k singular value triplets instead of the full pixel matrix. For datasets too large for dense SVD, scipy.sparse.linalg.svds computes only the top k singular values without materializing the full decomposition.
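A rough sketch of the svds path follows; the matrix size, density, and k below are purely illustrative.

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

# Illustrative sparse "ratings" matrix: 10,000 users x 2,000 items, 0.1% filled.
R = sparse_random(10_000, 2_000, density=0.001, format="csr", random_state=42)

U_k, s_k, Vt_k = svds(R, k=10)           # only the top 10 singular triplets are computed
order = np.argsort(s_k)[::-1]            # ordering is not guaranteed to be descending, so sort explicitly
s_k, U_k, Vt_k = s_k[order], U_k[:, order], Vt_k[order, :]
print(U_k.shape, s_k.shape, Vt_k.shape)  # (10000, 10) (10,) (10, 2000)
```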
```python
# io.thecodeforge: Eigenvalues, eigenvectors, SVD, and PCA from scratch
import numpy as np

# ── EIGENDECOMPOSITION: GENERAL MATRIX ────────────────────────────────────────
A = np.array([[4.0, 2.0],
              [1.0, 3.0]])
values, vectors = np.linalg.eig(A)
print(f"Eigenvalues: {values}")     # [5. 2.]
print(f"Eigenvectors:\n{vectors}")  # columns are eigenvectors

# Verify: A @ v = lambda * v for each eigenvector
for i in range(len(values)):
    lhs = A @ vectors[:, i]
    rhs = values[i] * vectors[:, i]
    print(f"  λ={values[i]:.1f}: A@v matches λv: {np.allclose(lhs, rhs)}")

# ── EIGENDECOMPOSITION: SYMMETRIC MATRIX (use eigh) ───────────────────────────
S = np.array([[4.0, 1.0],
              [1.0, 3.0]])

# eigh: faster, real output, eigenvalues in ascending order
vals_h, vecs_h = np.linalg.eigh(S)
print(f"\neigh eigenvalues (ascending): {vals_h}")  # [2.38 4.62] — always real

# Compare: eig on the same symmetric matrix may show tiny imaginary parts
vals_g, _ = np.linalg.eig(S)
print(f"eig eigenvalues: {vals_g}")               # may show 0j imaginary parts
print(f"Symmetric check: {np.allclose(S, S.T)}")  # True — safe to use eigh

# ── SVD ───────────────────────────────────────────────────────────────────────
np.random.seed(42)
M = np.random.randn(100, 20)  # 100 observations, 20 features

# full_matrices=False: thin SVD — U is (100, 20) not (100, 100)
U, s, Vt = np.linalg.svd(M, full_matrices=False)
print(f"\nThin SVD shapes: U={U.shape}, s={s.shape}, Vt={Vt.shape}")
# U=(100, 20), s=(20,), Vt=(20, 20)

# Reconstruct and verify
M_reconstructed = U @ np.diag(s) @ Vt
print(f"Reconstruction error: {np.linalg.norm(M - M_reconstructed):.2e}")  # ~1e-13

# ── LOW-RANK APPROXIMATION ────────────────────────────────────────────────────
# Eckart-Young theorem: rank-k truncation minimizes Frobenius norm error
for k in [1, 5, 10, 20]:
    M_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
    error = np.linalg.norm(M - M_k, 'fro')
    variance_explained = (s[:k]**2).sum() / (s**2).sum()
    print(f"  Rank-{k:2d}: error={error:.3f}, variance explained={variance_explained:.1%}")

# ── PCA FROM SCRATCH USING SVD ────────────────────────────────────────────────
np.random.seed(0)
X = np.random.randn(200, 5)  # 200 samples, 5 features
X[:, 1] += 2 * X[:, 0]       # introduce correlation to make PCA interesting

# Step 1: center the data
X_centered = X - X.mean(axis=0)

# Step 2: thin SVD of centered data
U_pca, s_pca, Vt_pca = np.linalg.svd(X_centered, full_matrices=False)

# Step 3: principal components are the rows of Vt_pca
print(f"\nPCA principal components shape: {Vt_pca.shape}")  # (5, 5)

# Step 4: explained variance ratio
explained_variance_ratio = (s_pca**2) / (s_pca**2).sum()
print("Explained variance ratio per component:")
for i, evr in enumerate(explained_variance_ratio):
    print(f"  PC{i+1}: {evr:.1%}")

# Step 5: project to top-2 principal components
X_pca = X_centered @ Vt_pca[:2].T  # (200, 2)
print(f"Projected data shape: {X_pca.shape}")  # (200, 2)

# ── MATRIX NORM ───────────────────────────────────────────────────────────────
print(f"\nFrobenius norm of A: {np.linalg.norm(A, 'fro'):.4f}")
print(f"Spectral norm of A:  {np.linalg.norm(A, 2):.4f}")  # largest singular value
print(f"Nuclear norm of M:   {s.sum():.4f}")               # sum of singular values
```
```
  λ=5.0: A@v matches λv: True
  λ=2.0: A@v matches λv: True

eigh eigenvalues (ascending): [2.38 4.62]
eig eigenvalues: [4.62 2.38]
Symmetric check: True

Thin SVD shapes: U=(100, 20), s=(20,), Vt=(20, 20)
Reconstruction error: 1.42e-13
  Rank- 1: error=9.421, variance explained=19.2%
  Rank- 5: error=7.683, variance explained=50.7%
  Rank-10: error=5.912, variance explained=74.3%
  Rank-20: error=0.000, variance explained=100.0%

PCA principal components shape: (5, 5)
Explained variance ratio per component:
  PC1: 54.3%
  PC2: 19.7%
  PC3: 12.8%
  PC4: 8.1%
  PC5: 5.1%
Projected data shape: (200, 2)

Frobenius norm of A: 5.4772
Spectral norm of A:  5.1167
Nuclear norm of M:   98.3241
```
- SVD: A = UΣVᵀ where U and V are orthogonal matrices and Σ is diagonal with singular values in descending order
- The Eckart-Young theorem proves that rank-k truncation is the mathematically optimal low-rank approximation in the Frobenius and spectral norm sense
- PCA is SVD applied to mean-centered data — the principal components are the rows of Vt, the explained variance comes from s²
- Recommender systems decompose sparse user-item rating matrices with truncated SVD to find latent preference factors that generalize to unseen ratings
- The nuclear norm (sum of singular values) is the convex relaxation of matrix rank — used in matrix completion and robust PCA
| Operation | Function | Use Case | Key Difference |
|---|---|---|---|
| Matrix multiply (2D) | A @ B | Standard matrix multiplication between two 2D arrays | Identical to np.dot for 2D — use @ for readability and consistency |
| Batched multiply (3D+) | A @ B | Stacked matrix multiplication — multiply corresponding pairs in a batch | @ broadcasts batch dimensions correctly; np.dot contracts axes and produces wrong shape silently |
| Dot product (1D vectors) | u @ v or np.dot(u, v) | Scalar inner product between two 1D vectors | Both give identical scalar result — @ is preferred for consistency with the rest of the codebase |
| Solve Ax=b (single b) | np.linalg.solve(A, b) | Solving a square linear system for one right-hand side | 2 to 3x faster than inv(A) @ b, more numerically stable — use this by default |
| Solve Ax=b (multiple b vectors) | scipy.linalg.lu_factor(A) + lu_solve | Solving the same system for many different right-hand sides | Factorizes A once and reuses — far faster than calling solve repeatedly |
| Solve rank-deficient or ill-conditioned systems | np.linalg.lstsq(A, b, rcond=None) | Least-squares solution when A is singular or nearly singular | Does not raise on singular input — returns minimum-norm solution and the matrix rank |
| Matrix inverse | np.linalg.inv(A) | When the inverse itself is needed — not just to solve a system | Avoid for solving Ax=b — use solve instead. Only justified when the inverse matrix is explicitly required. |
| Eigenvalues (symmetric matrix) | np.linalg.eigh(A) | PCA covariance, graph Laplacian, kernel matrices, stability analysis | Guaranteed real eigenvalues in ascending order, faster than eig — use whenever the matrix is symmetric |
| Eigenvalues (general square matrix) | np.linalg.eig(A) | Non-symmetric dynamics, general transformation analysis | May return complex eigenvalues even for theoretically real cases — check imaginary parts |
| SVD (thin) | np.linalg.svd(M, full_matrices=False) | PCA, low-rank approximation, recommender systems, image compression | full_matrices=False is almost always correct — avoids allocating a square U matrix on tall inputs |
| Truncated SVD (top k only) | scipy.sparse.linalg.svds(M, k=k) | Large or sparse matrices where only the top k singular values are needed | Dramatically faster than full SVD for small k on large matrices — does not compute the full decomposition |
| Condition number | np.linalg.cond(A) | Diagnosing numerical stability before solving or inverting | Above 1e10: results losing precision. Above 1e15: result is numerically meaningless in float64. |
| Norm | np.linalg.norm(A, 'fro') or np.linalg.norm(A, 2) | Distance measurement, error quantification, convergence checking | Default is Frobenius for matrices (square root of the sum of squared elements), L2 for vectors. Spectral norm = largest singular value. |
🎯 Key Takeaways
- Use @ for all matrix multiplication — it is cleaner than np.dot, unambiguous for any array dimensionality, and matches mathematical notation directly.
- np.linalg.solve(A, b) is 2 to 3x faster and more numerically stable than np.linalg.inv(A) @ b — never compute the inverse just to solve a system.
- linalg.eig may return complex eigenvalues for symmetric matrices due to floating-point asymmetry — use linalg.eigh whenever the matrix is symmetric or Hermitian.
- SVD with full_matrices=False is the correct default — full_matrices=True allocates a square U matrix that is often 100x larger than necessary for tall inputs.
- np.linalg.cond(A) is the first diagnostic for any numerically suspicious result — condition numbers above 1e10 mean significant precision loss in float64.
Interview Questions on This Topic
- What is the difference between np.dot and np.matmul (@)? (Junior)
- Why is np.linalg.solve preferred over computing the matrix inverse? (Mid-level)
- When would you use SVD over eigendecomposition? (Mid-level)
- Explain the relationship between SVD and PCA. How would you implement PCA using only NumPy? (Senior)
- A numerical computation produces different results on different machines or after a NumPy upgrade. How do you diagnose and handle this? (Senior)
Frequently Asked Questions
When does np.dot differ from np.matmul (@)?
For 1D and 2D arrays they give the same result. For 3D and higher-dimensional arrays they diverge: @ treats the inputs as stacks of matrices and multiplies corresponding pairs along the batch dimensions, while np.dot contracts the last axis of the first array with the second-to-last axis of the second — a cross-product that produces a higher-rank output. In practice, always use @ for matrix multiplication. np.dot is safe only for explicit 1D scalar dot products.
Why should I avoid computing the matrix inverse directly?
Computing np.linalg.inv(A) @ b to solve Ax=b is 2 to 3x slower than np.linalg.solve and amplifies rounding errors for ill-conditioned matrices. The inverse is only justified when you genuinely need the inverse matrix itself — not as a step toward solving a linear system. For solving the same system with multiple right-hand sides, scipy.linalg.lu_factor + lu_solve is the right approach.
What is the condition number and why does it matter?
The condition number — np.linalg.cond(A) — measures how sensitive the solution of Ax=b is to small perturbations in A or b. A value near 1 means the system is well-conditioned and the solution is reliable. Above 1e8 you are losing roughly 8 decimal digits of precision in float64. Above 1e15, the result is numerically meaningless — you are effectively solving a different system than the one you specified. Use np.linalg.lstsq for ill-conditioned systems instead of solve or inv.
When should I use scipy.linalg instead of numpy.linalg?
scipy.linalg provides a broader set of decompositions and is generally recommended when performance or matrix structure matters. Use scipy.linalg.cho_factor and cho_solve for symmetric positive definite matrices — Cholesky is 2x faster than LU for this case. Use scipy.linalg.lu_factor and lu_solve when solving the same system for multiple right-hand sides. Use scipy.sparse.linalg for sparse systems with millions of unknowns. numpy.linalg covers the common general-purpose cases well. scipy.linalg is the right choice when you know the matrix structure and want to exploit it.
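A minimal sketch of the Cholesky path for the symmetric positive definite case; the matrix size and the SPD construction below are illustrative, not prescriptive.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
G = rng.standard_normal((300, 300))
A_spd = G @ G.T + 300 * np.eye(300)  # symmetric positive definite by construction

c, low = cho_factor(A_spd)           # factorize once; roughly half the work of LU for SPD matrices
for _ in range(50):                  # reuse the factorization for 50 right-hand sides
    b = rng.standard_normal(300)
    x = cho_solve((c, low), b)
    assert np.allclose(A_spd @ x, b)
print("50 SPD systems solved with one Cholesky factorization.")
```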