6.0 Study Guide and Task Sheet: Deep Learning and Transformer Basics

The main study route now lives on the Chapter 6 entry page. Use this page only as a quick checklist while you practice.
One-Line Mental Model
batch data → model forward → loss → backward gradients → optimizer step → curves
If the code feels long, find these six steps first.
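
The six steps can be located in a minimal PyTorch loop like the sketch below. The toy regression data, the single `nn.Linear` layer, the learning rate, and the batch size of 16 are assumptions chosen for illustration; they are not taken from the chapter's own scripts.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the sketch repeatable

# Toy regression data (assumed for illustration): y = 3x + 1 plus a little noise.
X = torch.linspace(-1, 1, 64).unsqueeze(1)      # shape (64, 1)
y = 3 * X + 1 + 0.05 * torch.randn_like(X)      # shape (64, 1)

model = nn.Linear(1, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

history = []  # one average loss per epoch: the raw material for a curve
for epoch in range(20):
    epoch_loss = 0.0
    for start in range(0, len(X), 16):
        xb, yb = X[start:start + 16], y[start:start + 16]  # 1. batch data
        pred = model(xb)                                    # 2. model forward
        loss = loss_fn(pred, yb)                            # 3. loss
        optimizer.zero_grad()                               # clear stale gradients
        loss.backward()                                     # 4. backward gradients
        optimizer.step()                                    # 5. optimizer step
        epoch_loss += loss.item() * len(xb)
    history.append(epoch_loss / len(X))                     # 6. curves

print(f"first epoch loss: {history[0]:.4f}")
print(f"last epoch loss:  {history[-1]:.4f}")
```

In a longer script the same six comments usually still apply; everything else is data loading, logging, or evaluation wrapped around them.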
Expected Final Output
By the end of Chapter 6, your visible output should be a small evidence folder, not just finished reading notes:

deep_learning_evidence/
- shape_trace.txt
- training_log.csv
- loss_curve.png
- best_checkpoint_note.md
- attention_note.md
- failure_sample_note.md

If this folder is missing, the chapter is not finished yet, even if every page has been read.
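
One way to check the folder is a small script like the sketch below. The file names come from the listing above; the helper `missing_evidence` and the rule that an empty file counts as missing are assumptions made for this example.

```python
from pathlib import Path

# File names copied from the evidence folder listing.
EXPECTED_FILES = [
    "shape_trace.txt",
    "training_log.csv",
    "loss_curve.png",
    "best_checkpoint_note.md",
    "attention_note.md",
    "failure_sample_note.md",
]


def missing_evidence(folder):
    """Return the expected files that are absent or empty in `folder`."""
    root = Path(folder)
    missing = []
    for name in EXPECTED_FILES:
        path = root / name
        # Assumption: a zero-byte placeholder does not count as evidence.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


gaps = missing_evidence("deep_learning_evidence")
if gaps:
    print("Still missing:", ", ".join(gaps))
else:
    print("Evidence folder is complete.")
```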
Practice Checklist
| Check | Evidence |
|---|---|
| I can explain forward, loss, backward, optimizer | training-loop note |
| I can run a minimal PyTorch script | train.py |
| I can print tensor shapes through a model | shape trace |
| I can compare training and validation curves | curve image or CSV |
| I can explain what Attention changes | attention note |
| I can finish the evidence-pack workshop | deep_learning_workshop_run/ |
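
For the shape-trace check, one possible approach is to attach forward hooks that record each layer's input and output shape. The tiny CNN and the 28x28 single-channel input below are assumptions for illustration; replace them with your own model and batch.

```python
import torch
from torch import nn

# A tiny CNN chosen only for illustration; swap in your own model.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(8 * 14 * 14, 10),
)

trace = []


def record_shape(module, inputs, output):
    """Forward hook: store the layer name with its input and output shapes."""
    name = module.__class__.__name__
    trace.append(f"{name:<10} {tuple(inputs[0].shape)} -> {tuple(output.shape)}")


# Register one hook per layer, run a single batch, then remove the hooks.
handles = [layer.register_forward_hook(record_shape) for layer in model]
batch = torch.randn(4, 1, 28, 28)  # (batch, channels, height, width)
with torch.no_grad():
    model(batch)
for handle in handles:
    handle.remove()

print("\n".join(trace))
```

The `Flatten` line is the point where tensors enter the classifier, which is the transition a useful shape trace should make visible.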
Check reasoning and explanation
Use this checklist as a self-review rubric:
- A training-loop note should explain forward pass, loss, backward pass, and optimizer step without copying code line by line.
- A valid PyTorch script should run from a clean folder and print at least one shape, loss, or metric that proves it executed.
- A useful shape trace should include batch size, channel/feature dimensions, and the point where tensors enter the classifier or loss.
- A curve artifact should support a diagnosis: improving, underfitting, overfitting, unstable, or unclear.
- An attention note should explain what Q/K/V and masking change compared with earlier sequence models.
- A finished evidence pack should be rerunnable and should contain enough artifacts for someone else to understand the result.
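
For the attention note, it can help to compute Q/K/V and a causal mask by hand once. The sketch below is single-head scaled dot-product attention on one short sequence; the random projection matrices stand in for learned weights and the sizes are arbitrary assumptions.

```python
import math

import torch

torch.manual_seed(0)

seq_len, d_model = 4, 8
x = torch.randn(seq_len, d_model)  # one sequence of 4 token vectors

# Random matrices stand in for learned projection weights (assumption).
W_q = torch.randn(d_model, d_model)
W_k = torch.randn(d_model, d_model)
W_v = torch.randn(d_model, d_model)
Q, K, V = x @ W_q, x @ W_k, x @ W_v

# Every query is scored against every key in one matrix product.
scores = Q @ K.T / math.sqrt(d_model)  # shape (seq_len, seq_len)

# Causal mask: position i may only attend to positions <= i.
causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
scores = scores.masked_fill(~causal, float("-inf"))

weights = torch.softmax(scores, dim=-1)  # each row sums to 1
output = weights @ V                     # shape (seq_len, d_model)

print(weights)
```

Unlike a recurrent model, no state is passed step by step: each position reads directly from all earlier positions, and the mask is what prevents it from reading later ones.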
Evidence Rubric
| Artifact | It should answer |
|---|---|
| Training-loop note | What happens in forward, loss, backward, and optimizer step? |
| Shape trace | How do tensor shapes change through the model? |
| Curve image or CSV | Is the model underfitting, overfitting, or improving steadily? |
| Attention note | What information does attention add, and what remains hard? |
| Failure sample note | Which sample fails, and what does that tell you about data, model, or labels? |
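
To turn a curve into a diagnosis, a rough rule of thumb can be written down explicitly. The function below is only a heuristic sketch: the name `diagnose`, the 10% thresholds, and the labels it returns are assumptions, and a flat training loss can also mean the model had already converged.

```python
def diagnose(train_loss, val_loss, min_drop=0.10, rebound=1.10):
    """Return a rough label for two loss curves (one float per epoch)."""
    if len(train_loss) != len(val_loss) or len(train_loss) < 3:
        return "unclear"  # too little data to say anything
    if train_loss[0] <= 0:
        return "unclear"  # relative drop is undefined

    # Fraction by which the training loss fell from first to last epoch.
    train_drop = (train_loss[0] - train_loss[-1]) / train_loss[0]
    if train_drop < min_drop:
        return "underfitting"  # training loss barely moved

    # Training improved, but validation ended well above its best value.
    if val_loss[-1] > rebound * min(val_loss):
        return "overfitting"

    if val_loss[-1] < val_loss[0]:
        return "improving"
    return "unclear"


print(diagnose([1.0, 0.6, 0.4, 0.3], [1.1, 0.7, 0.5, 0.4]))
print(diagnose([1.0, 0.5, 0.2, 0.1], [1.0, 0.6, 0.7, 0.9]))
```

Treat the label as a prompt for looking at the plot, not as a replacement for it.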
Evidence to Keep
Before leaving Chapter 6, keep one compact evidence pack:
- Shape Trace
  - one model with printed tensor shapes
- Training Log
  - train and validation loss over time
- Best Checkpoint
  - how the best model was selected
- Attention Note
  - Q/K/V, mask, and next-token bridge
- Failure Sample
  - one wrong or weak prediction with next action
- Project Folder
  - runnable evidence pack or README
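
For the best-checkpoint item, the selection rule can be stated in a few lines of code. The sketch below picks the epoch with the lowest validation loss; the column names `epoch`, `train_loss`, and `val_loss` and the sample numbers are invented for illustration, so adapt them to your own training log.

```python
import csv
import io

# Sample log contents, invented for illustration only.
SAMPLE_LOG = """epoch,train_loss,val_loss
1,0.92,0.95
2,0.61,0.70
3,0.43,0.58
4,0.30,0.62
5,0.21,0.71
"""


def best_epoch(log_file):
    """Return the log row (as a dict) with the lowest validation loss."""
    rows = list(csv.DictReader(log_file))
    return min(rows, key=lambda row: float(row["val_loss"]))


best = best_epoch(io.StringIO(SAMPLE_LOG))
note = (
    f"Best checkpoint: epoch {best['epoch']} "
    f"(val_loss={best['val_loss']}, train_loss={best['train_loss']}). "
    "Selected by lowest validation loss."
)
print(note)
```

To read a real log, pass an open file handle instead of the `io.StringIO` wrapper, for example `open("training_log.csv", newline="")`.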
Ready To Continue
Continue to Chapter 7 when you can train one small model, save the training log, inspect failure cases, and explain why the model improved or failed.