
6.7.1 Training Tips Roadmap: Diagnose Before Changing Everything

Training tips are useful only when they answer a diagnosis. Do not change the optimizer, learning rate, model size, and data all at once; change one variable per run so you can tell what caused the effect.

Look at the Diagnosis Flow First

(Figure: deep learning training tips chapter relationship diagram.)

(Figure: training diagnosis dashboard map.)

| Symptom | First check |
| --- | --- |
| training loss high | model too small, learning rate too low, bad data |
| training good, validation bad | overfitting, leakage, weak augmentation |
| unstable loss | learning rate too high, bad batch, exploding gradients |
| too slow | batch size, device, model size |
| too heavy to deploy | compression, quantization, pruning |
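The table above can be sketched as a tiny lookup. This is illustrative only; the symptom strings and the `triage` helper are made up for this example, not part of any library.

```python
# Map each symptom from the table to its first checks (illustrative names).
FIRST_CHECKS = {
    "training loss high": ["model too small", "learning rate too low", "bad data"],
    "training good, validation bad": ["overfitting", "leakage", "weak augmentation"],
    "unstable loss": ["learning rate too high", "bad batch", "exploding gradients"],
    "too slow": ["batch size", "device", "model size"],
    "too heavy to deploy": ["compression", "quantization", "pruning"],
}

def triage(symptom: str) -> list[str]:
    """Return the first things to check for a given symptom."""
    return FIRST_CHECKS.get(symptom, ["unknown symptom: gather a loss log first"])

print(triage("unstable loss"))
# → ['learning rate too high', 'bad batch', 'exploding gradients']
```

The point is the discipline, not the code: name the symptom first, then pick exactly one check to act on.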

Read a Tiny Loss Log

Create training_tips_first_loop.py.

# Per-epoch validation loss from a short run.
val_loss = [0.62, 0.51, 0.48, 0.49, 0.53]
# Epoch (1-indexed) with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

print("best_epoch:", best_epoch)
print("best_val_loss:", val_loss[best_epoch - 1])
print("action: stop or reduce learning rate if validation keeps worsening")

Expected output:

best_epoch: 3
best_val_loss: 0.48
action: stop or reduce learning rate if validation keeps worsening

Before adding tricks, read the curve. A simple log often tells you what to try next.
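One way to turn "validation keeps worsening" into a concrete rule is a patience counter, as used in early stopping. This is a hedged sketch; the `should_stop` helper and the patience value of 2 are illustrative choices, not a standard API.

```python
# Early-stopping sketch: stop when the best epoch is more than
# `patience` epochs behind the most recent one.
def should_stop(val_loss: list[float], patience: int = 2) -> bool:
    """True when validation loss has not improved for `patience` epochs."""
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    return len(val_loss) - 1 - best_epoch >= patience

print(should_stop([0.62, 0.51, 0.48, 0.49, 0.53]))
# → True  (best at epoch 3, two worse epochs follow)
```

With the log above, the rule fires exactly at the point the expected output suggested stopping; a smaller patience stops sooner but risks quitting on noise.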

Learn in This Order

| Order | Read | What to practice |
| --- | --- | --- |
| 1 | 6.7.2 Hyperparameter Tuning | learning rate, batch size, optimizer |
| 2 | 6.7.3 Training Diagnosis | loss curves, overfitting, instability |
| 3 | 6.7.4 Model Compression | smaller, faster, deployable models |

Pass Check

You pass this roadmap when you can look at a training/validation curve and choose one next action with a reason.