6.1.1 Neural Network Roadmap: Linear Layer, Activation, Loss, Update

Neural networks are not magic. A layer computes a weighted sum, an activation reshapes the signal, and training adjusts the weights to reduce the loss.

Look at the Flow First

[Figure: neural network basics chapter relationship diagram]

Keep this loop:

input -> weighted sum -> activation -> loss -> gradient -> update weights
Word | First meaning
---- | -------------
neuron | weighted sum plus bias
activation | nonlinearity such as ReLU
forward pass | compute prediction
backward pass | compute responsibility for error
optimizer | update weights using gradients
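
To ground these words before the details arrive, here is a minimal sketch that maps them onto PyTorch autograd. The tensor values and the target are arbitrary choices for illustration, not values used later in this chapter.

import torch

x = torch.tensor([[1.0, 2.0]])
weights = torch.tensor([[0.5], [-0.25]], requires_grad=True)
bias = torch.tensor([0.1], requires_grad=True)
target = torch.tensor([[1.0]])  # arbitrary target, just so a loss exists

prediction = torch.relu(x @ weights + bias)  # forward pass: weighted sum, bias, activation
loss = (prediction - target).pow(2).mean()   # loss: squared distance from the target
loss.backward()                              # backward pass: responsibility for the error

print("gradients:", weights.grad.flatten())  # what an optimizer would use to update weights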

Run One Neuron

Create nn_first_loop.py and run it after installing torch.

import torch

# One input sample with three features, shape (1, 3).
x = torch.tensor([[1.0, -2.0, 3.0]])
# One neuron: three weights, shape (3, 1), plus one bias.
weights = torch.tensor([[0.5], [-1.0], [0.25]])
bias = torch.tensor([0.1])

linear_output = x @ weights + bias     # weighted sum plus bias
activated = torch.relu(linear_output)  # ReLU keeps positives, zeros out negatives

print("linear_output:", round(linear_output.item(), 3))
print("relu_output:", round(activated.item(), 3))

Expected output:

linear_output: 3.35
relu_output: 3.35

If the linear output were negative, ReLU would turn it into 0. That small gate is what lets stacked layers model nonlinear patterns.
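
To see the gate fire, flip the signs of the inputs. This small check keeps the same weights and bias as above; only the input values are changed.

import torch

x_neg = torch.tensor([[-1.0, 2.0, -3.0]])  # signs flipped relative to the example above
weights = torch.tensor([[0.5], [-1.0], [0.25]])
bias = torch.tensor([0.1])

linear_output = x_neg @ weights + bias     # -3.15, a negative weighted sum
activated = torch.relu(linear_output)      # ReLU gates it to 0.0

print("linear_output:", round(linear_output.item(), 3))
print("relu_output:", round(activated.item(), 3))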

Learn in This Order

Order | Read | What to focus on
----- | ---- | ----------------
1 | 6.1.2 ML to DL Bridge | what changes after sklearn
2 | 6.1.3 Neurons and Activation | weighted sum, bias, ReLU
3 | 6.1.4 Forward and Backward | prediction, loss, gradient
4 | 6.1.5 Optimizers | SGD, Momentum, Adam intuition
5 | 6.1.6 Regularization | overfitting controls
6 | 6.1.7 Weight Initialization | stable starting points
7 | 6.1.8 Optional History | why backprop, CNN, RNN, Attention, and Transformer appeared

Pass Check

You pass this roadmap when you can explain one layer as input @ weights + bias, describe what an activation does, and connect loss, gradient, and optimizer into one training loop.
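
If you want that whole loop in code before reading on, here is a minimal sketch. The toy data (y = 2x), the single linear layer, the learning rate, and the step count are assumptions made for this example, not values from this chapter; a real network would also insert an activation between stacked layers.

import torch

x = torch.tensor([[1.0], [2.0], [3.0]])  # toy inputs
y = torch.tensor([[2.0], [4.0], [6.0]])  # toy targets: y = 2x

model = torch.nn.Linear(1, 1)                             # weighted sum plus bias
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)   # updates weights using gradients
loss_fn = torch.nn.MSELoss()

for step in range(200):
    prediction = model(x)          # forward pass
    loss = loss_fn(prediction, y)  # loss
    optimizer.zero_grad()          # clear old gradients
    loss.backward()                # backward pass: gradients
    optimizer.step()               # optimizer: update weights

print("learned weight:", round(model.weight.item(), 3))   # close to 2.0 after training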