
4.3.1 Calculus Roadmap: How Models Learn by Reducing Loss

Calculus explains how a model changes its parameters. The first goal is intuition: measure change, move in a better direction, repeat.

Look at the Map First

(Figure: Calculus and Optimization Learning Map)

The training flow is: compute loss -> compute gradient -> update parameters -> repeat.

(Diagram: relationship of the calculus and optimization sections)

Idea              First meaning in AI
derivative        how fast one value changes
gradient          how a whole set of parameters should change together
gradient descent  update parameters toward lower loss
chain rule        connect changes across steps
backpropagation   compute many gradients efficiently
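
To make the derivative row concrete, here is a minimal sketch in plain Python (the point w = 1.2 and the step h are arbitrary choices for illustration). It estimates the derivative of the loss (w - 3) ** 2 numerically and compares it with the exact formula 2 * (w - 3) used later in this chapter:

def loss(w):
    return (w - 3) ** 2

w = 1.2
h = 1e-6                                 # tiny step for the finite-difference estimate
numerical = (loss(w + h) - loss(w)) / h  # rise over run: how fast the loss changes
exact = 2 * (w - 3)                      # derivative of (w - 3) ** 2
print("numerical:", round(numerical, 3), "exact:", round(exact, 3))

Both numbers come out as -3.6: the derivative is just the measured rate of change.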

When you later see loss.backward() and optimizer.step(), this chapter is the background.

Run the Smallest Loop

Create gradient_descent_first_loop.py. It moves w close to 3 by repeatedly reducing the loss (w - 3)^2.

w = 0.0
learning_rate = 0.2

for step in range(1, 7):
    gradient = 2 * (w - 3)            # derivative of the loss (w - 3) ** 2
    w = w - learning_rate * gradient  # move w against the gradient
    loss = (w - 3) ** 2               # loss after the update
    print(step, "w=", round(w, 3), "loss=", round(loss, 3))

Expected output:

1 w= 1.2 loss= 3.24
2 w= 1.92 loss= 1.166
3 w= 2.352 loss= 0.42
4 w= 2.611 loss= 0.151
5 w= 2.767 loss= 0.054
6 w= 2.86 loss= 0.02

w moves toward 3 and the loss shrinks at every step. That is the whole training idea; a large neural network runs the same loop over millions of parameters.
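
For comparison, here is a sketch of the same loop written with PyTorch autograd (this assumes PyTorch is installed; it is not part of the script above). loss.backward() computes the gradient the script wrote by hand, and optimizer.step() applies the update:

import torch

w = torch.tensor(0.0, requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.2)

for step in range(1, 7):
    optimizer.zero_grad()  # clear the gradient left over from the previous step
    loss = (w - 3) ** 2
    loss.backward()        # autograd fills w.grad with 2 * (w - 3)
    optimizer.step()       # w = w - lr * w.grad
    print(step, "w=", round(w.item(), 3), "loss=", round((w.item() - 3) ** 2, 3))

If the printed numbers match the expected output above, the hand-written gradient and autograd agree.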

Learn in This Order

Order  Read                                     What to focus on first
1      4.3.2 Derivatives                        rate of change
2      4.3.3 Partial Derivatives and Gradients  many parameters changing together
3      4.3.4 Gradient Descent                   update loop, learning rate, loss curve
4      4.3.5 Backpropagation                    chain rule, loss.backward() intuition
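
Before reading 4.3.5, here is a toy preview of the chain rule (the numbers and the split into two steps are made up for illustration). Break a loss into an inner step and an outer step, then multiply their local rates of change:

w = 1.0
g = 2 * w - 3              # inner step: g(w) = 2w - 3, so dg/dw = 2
dg_dw = 2.0
df_dg = 2 * g              # outer step: f(g) = g ** 2, so df/dg = 2g
dloss_dw = df_dg * dg_dw   # chain rule: multiply the rates along the path
print(dloss_dw)            # -4.0, the derivative of (2 * w - 3) ** 2 at w = 1

Backpropagation is this same multiplication, applied automatically through every step of a network.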

Pass Check

You pass this roadmap when you can explain why gradient descent repeats “compute loss -> compute gradient -> update parameters,” and why a learning rate that is too large can make training unstable.
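
To see that instability, rerun the first script with only the learning rate changed (1.2 is one arbitrary too-large value for this loss):

w = 0.0
learning_rate = 1.2  # too large for this loss

for step in range(1, 7):
    gradient = 2 * (w - 3)
    w = w - learning_rate * gradient
    print(step, "w=", round(w, 3), "loss=", round((w - 3) ** 2, 3))

Each update now overshoots 3 and lands farther away on the opposite side, so w oscillates around 3 and the loss grows instead of shrinking.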