
3.2.6 Basic Linear Algebra Operations

NumPy Linear Algebra Toolkit

  • Master three ways to write matrix multiplication (dot, matmul, @)
  • Understand the meaning and computation of inverse matrices, determinants, and eigenvalues
  • Learn to use the numpy.linalg module for linear algebra operations
  • Understand why linear algebra matters in AI

You may feel that “linear algebra” sounds very mathematical and abstract. But in AI, it is one of the most essential mathematical foundations:

| AI Scenario | Role of Linear Algebra |
| --- | --- |
| Neural networks | The computation in each layer is matrix multiplication |
| Recommender systems | User-item matrix factorization |
| Image processing | An image is a matrix |
| Word vectors | Each word is a vector; similarity = dot product |
| Dimensionality reduction | PCA is about finding eigenvalues and eigenvectors |

For now, let’s use NumPy to work with these concepts and build intuition. Chapter 4, The Minimum Necessary Math Foundation for AI, will explain the principles in more depth.
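To make the "neural networks" row concrete, here is a minimal sketch of one dense layer written as a matrix multiplication. The input values, weights, bias, and the choice of ReLU as activation are all made-up assumptions for illustration, not values from a real model.

```python
import numpy as np

# Hypothetical toy layer: 3 input features -> 2 output units.
x = np.array([[1.0, 2.0, 3.0],    # sample 1
              [0.0, 1.0, 0.0]])   # sample 2, overall shape (2, 3)
W = np.array([[0.5, -1.0],
              [0.0,  2.0],
              [1.0,  0.5]])       # weights, shape (3, 2)
bias = np.array([0.1, -0.1])      # one bias value per output unit

z = x @ W + bias                  # (2, 3) @ (3, 2) -> (2, 2); bias is broadcast over rows
out = np.maximum(z, 0.0)          # ReLU activation (an assumed choice)

print(z.shape)  # (2, 2)
print(out)
```

Each row of `out` is the layer's output for one input sample, so a whole batch is processed by a single matrix product.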


Element-wise multiplication vs. matrix multiplication

This is one of the most common points of confusion for beginners:

import numpy as np
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
# Element-wise multiplication
print(A * B)
# [[ 5 12]
#  [21 32]]
# Calculation: 1×5=5, 2×6=12, 3×7=21, 4×8=32

# Matrix multiplication
print(A @ B)
# [[19 22]
#  [43 50]]
# Calculation:
# [1×5+2×7, 1×6+2×8] = [19, 22]
# [3×5+4×7, 3×6+4×8] = [43, 50]

The same matrix product can be written in three ways:

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
# Method 1: @ operator (recommended, most concise)
C1 = A @ B
# Method 2: np.matmul
C2 = np.matmul(A, B)
# Method 3: np.dot
C3 = np.dot(A, B)
# For these 2-D arrays, all three methods give exactly the same result
print(np.array_equal(C1, C2)) # True
print(np.array_equal(C2, C3)) # True
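
The three spellings agree for 2-D matrices, but `np.dot` and `np.matmul` follow different rules for arrays with more than two dimensions. The sketch below uses arbitrary all-ones arrays (an assumption chosen only to show the shapes) to illustrate why `@` is usually the safer choice for batches of matrices.

```python
import numpy as np

A = np.ones((2, 3, 4))   # a stack of two (3, 4) matrices
B = np.ones((2, 4, 5))   # a stack of two (4, 5) matrices

batched = A @ B            # same as np.matmul(A, B): multiplies matching pairs
all_pairs = np.dot(A, B)   # combines every matrix in A with every matrix in B

print(batched.shape)    # (2, 3, 5)
print(all_pairs.shape)  # (2, 3, 2, 5)
```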

Two matrices can be multiplied only when: the number of columns in the first matrix = the number of rows in the second matrix.

# (2, 3) @ (3, 4) → (2, 4) ✅ 3 == 3
A = np.ones((2, 3))
B = np.ones((3, 4))
C = A @ B
print(C.shape) # (2, 4)
# (2, 3) @ (2, 4) → ❌ error! 3 ≠ 2
# A = np.ones((2, 3))
# B = np.ones((2, 4))
# C = A @ B # ValueError!

Memory trick: (m, n) @ (n, p) → (m, p)
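
The shape rule can be checked before multiplying. The following is a small sketch; `can_multiply` is a hypothetical helper written for this example, not a NumPy function.

```python
import numpy as np

def can_multiply(left, right):
    """Return True if left @ right is defined for two 2-D arrays."""
    return left.shape[1] == right.shape[0]

A = np.ones((2, 3))
B_ok = np.ones((3, 4))
B_bad = np.ones((2, 4))

print(can_multiply(A, B_ok))   # True
print(can_multiply(A, B_bad))  # False

try:
    A @ B_bad
except ValueError as err:
    print("Shapes do not line up:", err)

# Transposing A makes the inner dimensions match: (3, 2) @ (2, 4) -> (3, 4)
print((A.T @ B_bad).shape)  # (3, 4)
```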

For one-dimensional arrays, @ or np.dot computes the dot product:

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
# Dot product = 1×4 + 2×5 + 3×6 = 32
print(a @ b) # 32
print(np.dot(a, b)) # 32

The dot product is very important in AI—you will use it later when learning cosine similarity and the attention mechanism.


NumPy’s linalg submodule provides a full set of linear algebra functions:

The inverse of a matrix satisfies A × A⁻¹ = identity matrix:

A = np.array([[1, 2], [3, 4]])
# Compute the inverse matrix
A_inv = np.linalg.inv(A)
print(A_inv)
# [[-2.   1. ]
#  [ 1.5 -0.5]]
# Verify: A × A_inv ≈ identity matrix
print(A @ A_inv)
# [[1.0000000e+00 0.0000000e+00]
#  [8.8817842e-16 1.0000000e+00]]
# The diagonal is 1, and the other values are close to 0 (floating-point precision error)
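
Because of this floating-point error, an exact equality test against the identity matrix is unreliable. A tolerance-based comparison with `np.allclose` is the usual way to verify an inverse, as in this short sketch:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
A_inv = np.linalg.inv(A)
identity = np.eye(2)

# Exact comparison can fail because of tiny rounding errors
print(np.array_equal(A @ A_inv, identity))

# Comparison within a small tolerance
print(np.allclose(A @ A_inv, identity))  # True
```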

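The determinant is a scalar that tells you how much the matrix scales area (in 2-D) or volume (in 3-D). A negative sign means the orientation is flipped, and a determinant of zero means the matrix has no inverse: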

A = np.array([[1, 2], [3, 4]])
det = np.linalg.det(A)
print(f"Determinant: {det:.1f}") # -2.0
# Determinant of a 2×2 matrix = ad - bc
# [[a, b], [c, d]] → 1×4 - 2×3 = -2
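
The determinant also tells you whether an inverse exists. Here is a minimal sketch with a made-up matrix `S` whose second row is twice the first, so its determinant is zero and `np.linalg.inv` raises `LinAlgError`:

```python
import numpy as np

S = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # second row = 2 × first row

det_S = np.linalg.det(S)
print(np.isclose(det_S, 0.0))  # True

try:
    np.linalg.inv(S)
    inverted = True
except np.linalg.LinAlgError as err:
    inverted = False
    print("Cannot invert:", err)
```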

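Eigenvalues and eigenvectors are the “DNA” of a matrix: they reveal its internal properties. An eigenvector v keeps its direction when multiplied by A and is only scaled by its eigenvalue λ (A v = λ v). np.linalg.eig returns the eigenvectors as the columns of its second result: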

A = np.array([[4, 2], [1, 3]])
# Compute eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)
print(f"Eigenvalues: {eigenvalues}") # [5. 2.]
print(f"Eigenvectors:\n{eigenvectors}")
# [[ 0.894 -0.707]
# [ 0.447 0.707]]
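
A quick way to build trust in these numbers is to check the defining relation A v = λ v for each eigenpair. In this sketch, note that the eigenvectors are the columns of the returned array, so column `i` goes with `eigenvalues[i]`:

```python
import numpy as np

A = np.array([[4, 2], [1, 3]])
eigenvalues, eigenvectors = np.linalg.eig(A)

checks = []
for i, lam in enumerate(eigenvalues):
    v = eigenvectors[:, i]                      # i-th column = i-th eigenvector
    checks.append(np.allclose(A @ v, lam * v))  # does A v equal λ v?

print(checks)  # both entries should be True

# The returned eigenvectors are normalized to unit length
lengths = np.linalg.norm(eigenvectors, axis=0)
print(np.allclose(lengths, 1.0))  # True
```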

Solve the system of equations:

2x + y = 5
x + 3y = 7

Write them in matrix form: Ax = b

A = np.array([[2, 1], [1, 3]])
b = np.array([5, 7])
# Solve the system
x = np.linalg.solve(A, b)
print(f"x = {x[0]:.2f}, y = {x[1]:.2f}") # x = 1.60, y = 1.80
# Verify
print(A @ x) # [5. 7.] ← equals b, so the solution is correct
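
The same solution could also be obtained by multiplying `b` with the inverse of `A`. The sketch below compares both routes. `np.linalg.solve` is generally preferred because it avoids forming the inverse explicitly.

```python
import numpy as np

A = np.array([[2, 1], [1, 3]])
b = np.array([5, 7])

x_solve = np.linalg.solve(A, b)   # solves Ax = b directly
x_inv = np.linalg.inv(A) @ b      # computes A⁻¹ first, then multiplies

print(np.allclose(x_solve, x_inv))  # True
print(np.allclose(A @ x_solve, b))  # True
```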


A norm measures the length (magnitude) of a vector or matrix:

v = np.array([3, 4])
# L2 norm (Euclidean length)
l2 = np.linalg.norm(v)
print(f"L2 norm: {l2}") # 5.0 (3² + 4² = 25, √25 = 5)
# L1 norm (sum of absolute values)
l1 = np.linalg.norm(v, ord=1)
print(f"L1 norm: {l1}") # 7.0 (|3| + |4| = 7)
# Matrix norm
M = np.array([[1, 2], [3, 4]])
print(f"Matrix Frobenius norm: {np.linalg.norm(M):.2f}") # 5.48
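
Two common uses of the L2 norm are measuring the distance between two points and rescaling a vector to unit length. The points `p` and `q` below are made-up values for illustration:

```python
import numpy as np

# Distance between two points = norm of their difference
p = np.array([1.0, 2.0])
q = np.array([4.0, 6.0])
distance = np.linalg.norm(p - q)   # sqrt(3² + 4²)
print(distance)  # 5.0

# Normalizing: divide a vector by its length to get a unit vector
v = np.array([3.0, 4.0])
unit_v = v / np.linalg.norm(v)
print(unit_v)                  # [0.6 0.8]
print(np.linalg.norm(unit_v))  # 1.0, up to floating-point rounding
```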

The rank is the number of linearly independent rows (equivalently, columns) of a matrix:

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
rank = np.linalg.matrix_rank(A)
print(f"Matrix rank: {rank}") # 2 (not full rank, because the third row = first row×(-1) + second row×2)
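
The row dependency mentioned in that comment can be confirmed directly. For contrast, this sketch also checks a full-rank matrix (the 3×3 identity, chosen here as an example):

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Third row is a combination of the first two, so it adds no new information
print(np.array_equal(A[2], -A[0] + 2 * A[1]))  # True

# A rank-deficient square matrix has a determinant of (numerically) zero
print(np.isclose(np.linalg.det(A), 0.0))  # True

# The identity matrix has independent rows, so it is full rank
I3 = np.eye(3)
print(np.linalg.matrix_rank(I3))  # 3
```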

| Function | Purpose | Example |
| --- | --- | --- |
| A @ B | Matrix multiplication | np.array([[1,2],[3,4]]) @ np.eye(2) |
| np.linalg.inv(A) | Inverse matrix | |
| np.linalg.det(A) | Determinant | |
| np.linalg.eig(A) | Eigenvalues and eigenvectors | |
| np.linalg.solve(A, b) | Solve Ax=b | |
| np.linalg.norm(v) | Norm | |
| np.linalg.matrix_rank(A) | Matrix rank | |
| A.T | Transpose | |
| np.trace(A) | Trace (sum of diagonal elements) | |

Cosine similarity is a common way in AI to measure how “similar” two vectors are. It will be used repeatedly later in word vectors, recommender systems, and RAG.

Formula: cos(θ) = (a · b) / (||a|| × ||b||)

import numpy as np
def cosine_similarity(a, b):
    """Calculate the cosine similarity between two vectors"""
    dot_product = a @ b          # Dot product
    norm_a = np.linalg.norm(a)   # Length of a
    norm_b = np.linalg.norm(b)   # Length of b
    return dot_product / (norm_a * norm_b)

# Example: compare model-serving profiles
# Dimensions represent: [accuracy, throughput, low_latency, low_memory, stability]
baseline = np.array([4, 3, 2, 2, 4])
quantized = np.array([4, 3, 3, 3, 4])
experimental = np.array([2, 5, 5, 4, 2])
print(f"Baseline vs quantized: {cosine_similarity(baseline, quantized):.4f}") # 0.9857
print(f"Baseline vs experimental: {cosine_similarity(baseline, experimental):.4f}") # 0.8137
print(f"Quantized vs experimental: {cosine_similarity(quantized, experimental):.4f}") # 0.8778
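
When there are many vectors, all pairwise cosine similarities can be computed with a single matrix multiplication. The sketch below stacks the same three profiles as rows, normalizes each row to unit length, and multiplies the result by its own transpose. It is one possible vectorized approach, not the only one.

```python
import numpy as np

profiles = np.array([
    [4, 3, 2, 2, 4],   # baseline
    [4, 3, 3, 3, 4],   # quantized
    [2, 5, 5, 4, 2],   # experimental
], dtype=float)

# Length of each row, kept as a column so it broadcasts across the row
lengths = np.linalg.norm(profiles, axis=1, keepdims=True)  # shape (3, 1)
unit_rows = profiles / lengths

# Entry [i, j] is the cosine similarity between profile i and profile j
similarity = unit_rows @ unit_rows.T   # (3, 5) @ (5, 3) -> (3, 3)

print(np.round(similarity, 4))
```

The diagonal is 1 (each profile compared with itself), and the off-diagonal entries match the three values printed by `cosine_similarity`.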

Keep this page’s proof of learning as a small evidence card:

  • Array State: shape, dtype, axis, and sample values before the operation
  • Operation: indexing, slicing, broadcasting, reshape, linear algebra, or random/stat function
  • Output: resulting array shape, values, or statistic
  • Failure Check: axis confusion, view/copy trap, broadcast mismatch, or wrong shape
  • Expected Output: printed shapes and values that make the array operation inspectable

| Concept | Description | NumPy function |
| --- | --- | --- |
| Matrix multiplication | (m,n) @ (n,p) → (m,p) | A @ B or np.matmul |
| Inverse matrix | A × A⁻¹ = I | np.linalg.inv() |
| Determinant | Matrix scaling factor | np.linalg.det() |
| Eigenvalues/vectors | The “DNA” of a matrix | np.linalg.eig() |
| Solving equations | Solve Ax = b | np.linalg.solve() |
| Norm | Vector length | np.linalg.norm() |

# Resource cost per request for 3 pipeline stages
cost_per_stage = np.array([4, 12, 6])  # [embed, rerank, generate]
# Stage counts for 3 request batches
stage_counts = np.array([
    [3, 1, 2],  # Batch 1
    [0, 2, 5],  # Batch 2
    [5, 0, 3],  # Batch 3
])
# Use matrix multiplication to calculate the total cost of each batch
# totals = ?

# Solve the system:
# 3x + 2y - z = 1
# x - y + 2z = 5
# 2x + 3y - z = 0
#
# Hint: write it in the form Ax = b

# Suppose we have feature vectors for model-serving profiles
# Dimensions represent: [accuracy, throughput, low_latency, low_memory, stability]
profiles = {
    "baseline": np.array([4, 3, 2, 2, 4]),
    "quantized": np.array([4, 3, 3, 3, 4]),
    "experimental": np.array([2, 5, 5, 4, 2]),
}
# Use cosine similarity to find the profile most similar to "baseline"
# Hint: calculate the cosine similarity between "baseline" and each other profile

Reference implementation and walkthrough
  • For the resource-cost example, stage_counts @ cost_per_stage is the clean vectorized answer. With costs [4, 12, 6] and rows [3,1,2], [0,2,5], [5,0,3], the totals are 36, 54, and 38.
  • For the linear system 3x + 2y - z = 1, x - y + 2z = 5, 2x + 3y - z = 0, np.linalg.solve should return x=1, y=0, z=2.
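  • In the profile cosine-similarity example, divide each dot product by the product of the two vector lengths before comparing. The most similar profile is the one with the largest cosine value, not simply the largest raw dot product. Using the values computed earlier, quantized (0.9857) scores higher than experimental (0.8137), so quantized is the profile most similar to baseline.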
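
The walkthrough above can be sketched in code as follows. This is one possible reference solution under the exercise's own data. The names `scores` and `best` are introduced here for illustration.

```python
import numpy as np

# Exercise 1: total cost per batch
cost_per_stage = np.array([4, 12, 6])   # [embed, rerank, generate]
stage_counts = np.array([
    [3, 1, 2],
    [0, 2, 5],
    [5, 0, 3],
])
totals = stage_counts @ cost_per_stage  # (3, 3) @ (3,) -> (3,)
print(totals)  # [36 54 38]

# Exercise 2: solve the 3×3 linear system Ax = b
A = np.array([[3, 2, -1],
              [1, -1, 2],
              [2, 3, -1]])
b = np.array([1, 5, 0])
solution = np.linalg.solve(A, b)
print(np.allclose(solution, [1, 0, 2]))  # True

# Exercise 3: profile most similar to "baseline"
def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

profiles = {
    "baseline": np.array([4, 3, 2, 2, 4]),
    "quantized": np.array([4, 3, 3, 3, 4]),
    "experimental": np.array([2, 5, 5, 4, 2]),
}
scores = {
    name: cosine_similarity(profiles["baseline"], vec)
    for name, vec in profiles.items()
    if name != "baseline"
}
best = max(scores, key=scores.get)
print(best)  # quantized
```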