
5.0 Study Guide and Task Sheet: Machine Learning

Machine learning study guide project loop

The main study route now lives on the Chapter 5 entry page. Use this page only as a quick checklist while you practice.

define task → split data → train baseline → evaluate → inspect errors → improve

If you do not know which model to use, start with a baseline.
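
As a rough illustration of what "start with a baseline" can mean for classification, the sketch below always predicts the most frequent training label and reports its accuracy. The labels and the helper names (`fit_majority_baseline`, `accuracy`) are hypothetical and use only the Python standard library; a library dummy model would play the same role.

```python
from collections import Counter


def fit_majority_baseline(train_labels):
    """Return the most frequent label seen in the training labels."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical toy labels, used only for illustration.
train_labels = ["ham", "ham", "spam", "ham", "ham", "spam"]
test_labels = ["ham", "spam", "ham", "ham"]

majority = fit_majority_baseline(train_labels)   # learned from training data only
baseline_pred = [majority] * len(test_labels)    # same prediction for every test row
baseline_acc = accuracy(test_labels, baseline_pred)

print(f"baseline predicts {majority!r}, accuracy = {baseline_acc:.2f}")
```

Any real model should be scored on the same test rows, so the baseline accuracy becomes the minimum score to beat.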

| Check | Evidence |
| --- | --- |
| I can define the task type | problem note |
| I can split data without leakage | train/test split note |
| I can train a dummy baseline and one real model | baseline comparison |
| I can choose a metric for the task | metric note |
| I can inspect errors | error samples |
| I can finish the evidence-pack workshop | ml_workshop_run/ |
Check reasoning and explanation
  1. A task note should say whether the problem is regression, classification, clustering, evaluation, or feature engineering, and what success means.
  2. A safe split note explains when the data is split and which preprocessing steps are fitted only on training data.
  3. A baseline comparison should include a dummy or simple model and one stronger model under the same evaluation protocol.
  4. A metric note should justify the metric using the task goal. Accuracy alone is not enough for imbalanced classification.
  5. Error samples should become a next action, not just a screenshot. Good next actions are controlled feature, data, threshold, or model changes.
  6. You are ready for Chapter 6 when another person can rerun your evidence pack and understand the modeling decisions.
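
Item 2 above can be sketched as follows: split first, then learn preprocessing statistics from the training rows only, and reuse those statistics on the test rows. This is a minimal standard-library sketch, not a prescribed implementation; the single numeric feature and the helper names (`train_test_split`, `fit_scaler`, `transform`) are assumptions made for illustration.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows with a fixed seed, then cut off a test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(values):
    """Learn mean and standard deviation from the given (training) values."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # fall back to 1.0 for constant features
    return mean, std


def transform(values, mean, std):
    """Standardize values using statistics that were fitted elsewhere."""
    return [(v - mean) / std for v in values]


# Hypothetical single feature, used only for illustration.
rows = [float(v) for v in range(1, 21)]

# 1) Split before any preprocessing is fitted.
train_rows, test_rows = train_test_split(rows)

# 2) Fit the scaler on training rows only.
mean, std = fit_scaler(train_rows)

# 3) Apply the same fitted statistics to both sets.
train_scaled = transform(train_rows, mean, std)
test_scaled = transform(test_rows, mean, std)
```

Fitting the scaler on all rows before splitting would let test-set statistics influence training, which is the leakage a split note is meant to rule out.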
| Artifact | It should answer |
| --- | --- |
| Problem note | What is the task type, and what counts as success? |
| Split note | How did you keep test data away from training? |
| Baseline comparison | What is the minimum score to beat? |
| Metric note | Why does this metric match the goal better than plain accuracy? |
| Error note | Which mistakes matter most, and what feature or label issue might explain them? |
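
To make the metric note concrete, the sketch below compares plain accuracy with precision, recall, and F1 on a hypothetical imbalanced test set (5 positives, 95 negatives). The data and both sets of predictions are invented for illustration, and the helper `classification_metrics` is an assumption rather than part of any library.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical imbalanced test set: 5 positives, then 95 negatives.
y_true = [1] * 5 + [0] * 95

# Predictor A: always predicts the majority (negative) class.
always_negative = [0] * 100

# Predictor B: finds 4 of 5 positives and raises 6 false alarms.
some_model = [1, 1, 1, 1, 0] + [1] * 6 + [0] * 89

metrics_a = classification_metrics(y_true, always_negative)
metrics_b = classification_metrics(y_true, some_model)

for name, m in (("always negative", metrics_a), ("some model", metrics_b)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

In this toy case the always-negative predictor has the higher accuracy (0.95 versus 0.93) while finding none of the positives, which is the kind of mismatch a metric note should explain.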

Continue to Chapter 6 when one tabular project includes a baseline, a real model, metrics, error analysis, and a README that another person can rerun.

Keep this page’s proof of learning as a small evidence card:

Modeling Loop: data, features, model, metric, error review, and next experiment
Artifact: code, score, chart, pipeline, or project README
Failure Check: leakage, metric mismatch, unstable split, overfitting, or unclear business target
Next Action: one controlled experiment rather than many parameter changes
Expected Output: reproducible ML evidence that prepares for deep learning
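
One way to read "one controlled experiment" is to change a single setting while holding the data, scores, and metric fixed, then compare the error samples before and after. The sketch below varies only the decision threshold; the row ids, labels, and model scores are hypothetical, and the helper names are assumptions for illustration.

```python
def predict(scores, threshold):
    """Turn scores into 0/1 predictions with a single decision threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def error_samples(ids, y_true, y_pred):
    """Return (id, true label, predicted label) for every misclassified row."""
    return [(i, t, p) for i, t, p in zip(ids, y_true, y_pred) if t != p]


def recall(y_true, y_pred):
    """Share of true positives that were predicted positive."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    positives = sum(t == 1 for t in y_true)
    return tp / positives if positives else 0.0


# Hypothetical held-out rows with fixed model scores.
ids = ["r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8"]
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.90, 0.45, 0.60, 0.20, 0.55, 0.10, 0.35, 0.42]

# Controlled change: only the threshold differs between the two runs.
pred_050 = predict(scores, threshold=0.5)
pred_040 = predict(scores, threshold=0.4)

errors_050 = error_samples(ids, y_true, pred_050)
errors_040 = error_samples(ids, y_true, pred_040)
recall_050 = recall(y_true, pred_050)
recall_040 = recall(y_true, pred_040)

print("threshold 0.5:", recall_050, errors_050)
print("threshold 0.4:", recall_040, errors_040)
```

Because only the threshold changed, the difference in the two error lists (r2 is recovered, r8 becomes a new false positive) can be attributed to that one change and recorded as the next action.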