5.0 Study Guide and Task Sheet: Machine Learning

Figure: machine learning study guide project loop

The main study route now lives on the Chapter 5 entry page. Use this page only as a quick checklist while you practice.

One-Line Mental Model

define task → split data → train baseline → evaluate → inspect errors → improve

If you do not know which model to use, start with a baseline.
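
One way to picture a baseline: always predict the most common training label and record the score a real model has to beat. The sketch below uses only the Python standard library. The labels and the helper names (`fit_majority_baseline`, `accuracy`) are invented for illustration and are not part of the workshop files.

```python
from collections import Counter


def fit_majority_baseline(train_labels):
    """Return the most frequent label seen in the training set."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels, for illustration only.
train_labels = ["ham", "ham", "spam", "ham", "ham", "spam"]
test_labels = ["ham", "spam", "ham", "ham"]

# The baseline learns from training labels only, then predicts one constant.
majority = fit_majority_baseline(train_labels)
baseline_preds = [majority] * len(test_labels)
baseline_accuracy = accuracy(test_labels, baseline_preds)

print(majority, baseline_accuracy)
```

Whatever this prints is the floor: a real model that cannot beat it under the same split has not learned anything useful yet.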

Practice Checklist

Check	Evidence
I can define the task type	problem note
I can split data without leakage	train/test split note
I can train a dummy baseline and one real model	baseline comparison
I can choose a metric for the task	metric note
I can inspect errors	error samples
I can finish the evidence-pack workshop	`ml_workshop_run/`

Check reasoning and explanation

A task note should say whether the problem is regression, classification, clustering, evaluation, or feature engineering, and what success means.
A safe split note explains when the data is split and which preprocessing steps are fitted only on training data.
A baseline comparison should include a dummy or simple model and one stronger model under the same evaluation protocol.
A metric note should justify the metric using the task goal. Accuracy alone is not enough for imbalanced classification.
Error samples should become a next action, not just a screenshot. Good next actions are controlled feature, data, threshold, or model changes.
You are ready for Chapter 6 when another person can rerun your evidence pack and understand the modeling decisions.
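
The safe-split point above can be sketched in a few lines: split first, then learn preprocessing statistics from the training rows only, and reuse those statistics on the test rows. This is a minimal standard-library sketch, not the workshop's actual pipeline. The single-feature dataset and the helpers `train_test_split`, `fit_scaler`, and `apply_scaler` are assumptions made for illustration.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out the last test_fraction as test."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[:-n_test], shuffled[-n_test:]


def fit_scaler(values):
    """Learn mean and standard deviation from the given (training) values."""
    std = statistics.pstdev(values) or 1.0  # avoid dividing by zero
    return statistics.mean(values), std


def apply_scaler(values, mean, std):
    """Standardize values with statistics that were fitted elsewhere."""
    return [(v - mean) / std for v in values]


# Hypothetical single-feature dataset, for illustration only.
feature = [float(x) for x in range(20)]

train, test = train_test_split(feature, test_fraction=0.25, seed=42)

# Split first, then fit: test rows never influence mean or std.
mean, std = fit_scaler(train)
train_scaled = apply_scaler(train, mean, std)
test_scaled = apply_scaler(test, mean, std)  # reuse training statistics

print(len(train), len(test))
```

The leaky version would call `fit_scaler(feature)` before splitting. The code would still run, which is why a split note should state the order of operations explicitly.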

Evidence Rubric

Artifact	It should answer
Problem note	What is the task type, and what counts as success?
Split note	How did you keep test data away from training?
Baseline comparison	What is the minimum score to beat?
Metric note	Why does this metric match the goal better than plain accuracy?
Error note	Which mistakes matter most, and what feature or label issue might explain them?
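
To illustrate the metric and error rows of the rubric, the sketch below compares plain accuracy with balanced accuracy on a deliberately imbalanced binary task, then lists the rows where positives were missed. The labels and both prediction lists are invented for illustration, and the helper functions are assumptions rather than any particular library's API. The code also assumes both classes appear in `y_true`.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fn, fp, tn


def accuracy(tp, fn, fp, tn):
    return (tp + tn) / (tp + fn + fp + tn)


def balanced_accuracy(tp, fn, fp, tn):
    """Average of recall on the positive class and recall on the negative class."""
    recall_pos = tp / (tp + fn)
    recall_neg = tn / (tn + fp)
    return (recall_pos + recall_neg) / 2


# Hypothetical imbalanced labels: 2 positives out of 20 rows.
y_true = [1, 1] + [0] * 18
dummy_preds = [0] * 20                    # baseline: always predict the majority class
model_preds = [1, 0] + [0] * 16 + [1, 0]  # made-up model output

dummy = confusion_counts(y_true, dummy_preds)
model = confusion_counts(y_true, model_preds)

print("accuracy:", accuracy(*dummy), accuracy(*model))
print("balanced accuracy:", balanced_accuracy(*dummy), round(balanced_accuracy(*model), 3))

# Error samples: rows where a true positive was missed.
missed_positives = [
    i for i, (t, p) in enumerate(zip(y_true, model_preds)) if t == 1 and p == 0
]
print("missed positives at rows:", missed_positives)
```

Both prediction lists reach 0.9 accuracy here, yet the dummy scores 0.5 balanced accuracy while the made-up model scores about 0.72. The list of missed rows is the starting point for an error note: inspect those rows and ask what feature or label issue they share.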

Ready To Continue

Continue to Chapter 6 when one tabular project includes a baseline, a real model, metrics, error analysis, and a README that another person can rerun.

Evidence to Keep

Keep this page’s proof of learning as a small evidence card:

Modeling Loop: data, features, model, metric, error review, and next experiment
Artifact: code, score, chart, pipeline, or project README
Failure Check: leakage, metric mismatch, unstable split, overfitting, or unclear business target
Next Action: one controlled experiment rather than many parameter changes
Expected Output: reproducible ML evidence that prepares for deep learning
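
As a rough check for the "unstable split" failure listed above, the sketch below repeats the same split-and-score step under several random seeds and reports how much the score moves. It uses a majority-class baseline on invented labels. The helper names and the choice of ten seeds are assumptions for illustration only.

```python
import random
import statistics
from collections import Counter


def split(labels, test_fraction, seed):
    """Shuffle a copy of the labels and hold out the last test_fraction as test."""
    shuffled = labels[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[:-n_test], shuffled[-n_test:]


def majority_baseline_accuracy(train, test):
    """Score a constant prediction of the most common training label."""
    majority = Counter(train).most_common(1)[0][0]
    return sum(label == majority for label in test) / len(test)


# Hypothetical small dataset: 30 labels, 70% of them class "a".
labels = ["a"] * 21 + ["b"] * 9

# Same protocol, different seeds: only the split changes between runs.
scores = []
for seed in range(10):
    train, test = split(labels, test_fraction=0.2, seed=seed)
    scores.append(majority_baseline_accuracy(train, test))

spread = max(scores) - min(scores)
print(f"mean={statistics.mean(scores):.2f} min={min(scores):.2f} max={max(scores):.2f}")
print(f"spread={spread:.2f}")
```

If a proposed improvement is smaller than this spread, a single split cannot tell it apart from noise. That is one reason to change one thing at a time and keep the split protocol fixed between experiments.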