
5 Introduction to Machine Learning: From Basics to Practice

[Figure: Main visual for machine learning]

Chapter 5 has one job: help you turn a data problem into a trainable, evaluable, improvable machine learning project.

You have already learned how data becomes numbers and how loss and gradients explain model improvement. This chapter makes those ideas practical: define a prediction problem, build a baseline, choose a metric, inspect errors, and improve only when evidence says the change helped.

This is the bridge from math intuition to model engineering. Chapter 6 will keep the same evidence habit, but the model will become a neural network trained with tensors and backpropagation.

[Figure: Main loop of machine learning modeling]

Read the picture first. Most reliable ML work follows this loop:

define task → split data → train baseline → evaluate → inspect errors → improve

Start with a baseline before chasing model names. A baseline tells you whether later changes actually improve anything.

Use this checklist as both the chapter guide and the task sheet. Establish the baseline and evaluation habit before expanding the model search.

  1. 5.1 ML Basics
     Follow along: identify classification, regression, clustering, anomaly detection, features, labels, train/test split, and the sklearn flow.
     Evidence to keep: a problem-definition note.

  2. 5.1.5 ML History
     Follow along: optional background; skim how classic algorithms appeared.
     Evidence to keep: a short “why this algorithm exists” note.

  3. 5.2 Supervised Learning
     Follow along: run regression and classification examples before comparing many models.
     Evidence to keep: one baseline score and one improved score.

  4. 5.3 Unsupervised Learning
     Follow along: try clustering, dimensionality reduction, and anomaly detection when labels are missing.
     Evidence to keep: one chart or cluster interpretation.

  5. 5.4 Evaluation
     Follow along: choose metrics, use cross-validation, diagnose bias/variance, and tune carefully.
     Evidence to keep: metric choice and error samples.

  6. 5.5 Feature Engineering
     Follow along: handle missing values, categories, scaling, feature construction, feature selection, and Pipeline.
     Evidence to keep: feature processing log and leakage check.

  7. 5.6 Projects and 5.6.6 Workshop
     Follow along: build a reproducible evidence pack before larger house-price, churn, segmentation, or Kaggle work.
     Evidence to keep: README, model comparison, errors, and next-step plan.

| Layer | What to study now | How to use it |
| --- | --- | --- |
| Required core | Task type, train/test split, baseline, metric, error samples, leakage check, Pipeline | These become the evaluation habits for LLM prompts, RAG retrieval, and Agent behavior later |
| Optional extension | Extra classic algorithms, ML history, Kaggle-style iteration | Return here when a project needs broader algorithm comparison or competition workflow |
| Depth challenge | Keep the data and metric fixed, change one feature or model choice, then explain the before/after errors | This prevents model hopping without evidence |
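The depth challenge in the table above can be sketched as a single controlled change. This is only an illustration, assuming scikit-learn is installed: the dataset, the `run_experiment` helper, and the choice of the regularization strength `C` as the one knob are assumptions made for the example, not part of the chapter's task sheet.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Fixed split: every experiment sees exactly the same train and test rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)


def run_experiment(C):
    """Hypothetical helper: train with one value of C, return F1 and the wrong test rows."""
    model = make_pipeline(StandardScaler(), LogisticRegression(C=C, max_iter=1000))
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    wrong = set((pred != y_test).nonzero()[0].tolist())
    # f1_score uses class 1 as the positive class by default.
    return f1_score(y_test, pred), wrong


# Only C changes between the two runs; data, split, and metric stay fixed.
before_f1, before_wrong = run_experiment(C=1.0)
after_f1, after_wrong = run_experiment(C=0.1)

print(f"F1 before: {before_f1:.3f}  after: {after_f1:.3f}")
print("fixed by the change:", sorted(before_wrong - after_wrong))
print("broken by the change:", sorted(after_wrong - before_wrong))
```

Comparing the sets of wrong rows, not just the two scores, shows whether a change fixed some cases while breaking others.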

Key terms for this chapter:

| Term | Meaning |
| --- | --- |
| feature | Input column the model can use |
| label / target | Answer the model should learn to predict |
| baseline | Simplest model or rule you must beat |
| metric | Ruler for judging the model, such as F1, AUC, MAE, or RMSE |
| leakage | Test or target information accidentally entering training |
| Pipeline | Preprocessing and model packaged together to reduce leakage |

Install sklearn if needed:

python -m pip install scikit-learn

Then run this self-contained baseline. It uses a built-in dataset, splits data, trains a dummy baseline, trains a real model, and compares both.

from sklearn.datasets import load_breast_cancer
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load a built-in binary classification dataset.
X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% for testing; stratify keeps the class ratio the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Baseline: always predict the most frequent training class.
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(X_train, y_train)
print("Baseline")
print(classification_report(y_test, baseline.predict(X_test), zero_division=0))

# Real model: scaling and logistic regression packaged in one Pipeline,
# so the scaler is fitted on the training data only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print("Logistic regression")
print(classification_report(y_test, model.predict(X_test), zero_division=0))

Expected shape:

Baseline
...
Logistic regression
...

Do not only compare the final scores. Ask: which classes are easy, which are hard, and what error would matter most in the real use case?

  • The baseline tells you what a naive model can do before learning useful patterns.
  • Logistic regression should beat the baseline, but the class-level precision and recall matter more than one headline score.
  • If one class has poor recall, inspect those missed examples before changing the model.
  • Keep the split, metric, and failure samples fixed when comparing the next experiment.

| Level | What you can prove |
| --- | --- |
| Minimum pass | You can name the task type, split the data, train a baseline, and read the score. |
| Project-ready | You can explain why the chosen metric matches the goal, and show one error sample instead of trusting one score. |
| Deeper check | You can test for leakage, compare two feature choices, and say what would change in a real product or dataset update. |
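To look at class-level behavior rather than one headline score, a confusion matrix and per-class recall are a reasonable starting point. The sketch below retrains the same pipeline as the baseline example so that it runs on its own; the variable names are illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42, stratify=data.target
)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes (order: label 0, label 1).
cm = confusion_matrix(y_test, pred)
print(list(data.target_names))
print(cm)

# Recall per class: the share of each true class that the model found.
per_class_recall = recall_score(y_test, pred, average=None)
for name, r in zip(data.target_names, per_class_recall):
    print(f"{name}: recall={r:.3f}")

# Positions in the test set where the prediction was wrong.
wrong_idx = (pred != y_test).nonzero()[0]
print("misclassified test rows:", wrong_idx.tolist())
```

The off-diagonal cells of the matrix show which kind of mistake occurs, and `wrong_idx` points at the rows to inspect before changing the model.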

Before leaving the chapter, save one wrong prediction or weak cluster interpretation. Write it in this format:

case_id:
input_summary:
true_or_expected:
model_output:
why_it_matters:
next_controlled_change:
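One possible way to fill the mechanical fields of this note from a wrong prediction is sketched below. The `make_failure_note` helper and the choice to summarize the input by its three most unusual features are assumptions for illustration; `why_it_matters` and `next_controlled_change` are left for you to write.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42, stratify=data.target
)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
pred = model.predict(X_test)
proba = model.predict_proba(X_test)


def make_failure_note(i):
    """Hypothetical helper: fill the note fields for test row i."""
    # Summarize the input by its three most unusual features (largest |z-score|
    # relative to the training data).
    z = model.named_steps["standardscaler"].transform(X_test[i : i + 1])[0]
    top = sorted(range(len(z)), key=lambda j: abs(z[j]), reverse=True)[:3]
    summary = ", ".join(f"{data.feature_names[j]} (z={z[j]:+.1f})" for j in top)
    return {
        "case_id": f"test_row_{i}",
        "input_summary": summary,
        "true_or_expected": str(data.target_names[y_test[i]]),
        # Labels are 0/1 here, so the predicted label doubles as the column index.
        "model_output": f"{data.target_names[pred[i]]} (p={proba[i, pred[i]]:.2f})",
        "why_it_matters": "TODO: write this yourself",
        "next_controlled_change": "TODO: write this yourself",
    }


wrong_idx = (pred != y_test).nonzero()[0]
if len(wrong_idx) > 0:
    for field, value in make_failure_note(int(wrong_idx[0])).items():
        print(f"{field}: {value}")
else:
    print("No wrong predictions on this split.")
```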

This small failure note is more useful than another model name. It teaches the habit you will reuse in deep learning curves, prompt evaluation, RAG retrieval errors, and Agent traces.

Keep this page’s proof of learning as a small evidence card:

  • Modeling Loop: data, features, model, metric, error review, and next experiment
  • Artifact: code, score, chart, pipeline, or project README
  • Failure Check: leakage, metric mismatch, unstable split, overfitting, or unclear business target
  • Next Action: one controlled experiment rather than many parameter changes
  • Expected Output: reproducible ML evidence that prepares for deep learning

| Symptom | First thing to check | Usual fix |
| --- | --- | --- |
| Score is strangely high | Leakage or wrong train/test split | Inspect features and split before training |
| Train score high, test score low | Overfitting | Simplify the model, regularize, or add data |
| All models are weak | Poor labels, weak features, or wrong metric | Inspect error samples and label definition |
| Accuracy looks fine but product risk is high | Class imbalance or costly false negatives | Use recall, precision, F1, AUC, or threshold review |
| Results cannot be reproduced | Random seed, data version, or dependency changed | Fix seeds and record versions |
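Several of these symptoms can be checked with one cross-validation run. The sketch below is a minimal example, assuming the same built-in dataset as the baseline. Keeping the scaler inside the Pipeline means it is refitted on each training fold only, a fixed `random_state` makes the folds reproducible, and the train/validation gap hints at overfitting.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Preprocessing lives inside the pipeline, so no validation-fold statistics
# reach the scaler during training.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Fixed seed: the same five folds every time the script runs.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_validate(model, X, y, cv=cv, scoring="f1", return_train_score=True)

train_f1 = scores["train_score"]
val_f1 = scores["test_score"]

print(f"train F1:      {train_f1.mean():.3f}")
print(f"validation F1: {val_f1.mean():.3f} +/- {val_f1.std():.3f}")
print(f"gap:           {train_f1.mean() - val_f1.mean():.3f}")
```

A large gap suggests overfitting, and a large spread across folds suggests that a single train/test split would give an unstable score.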

Move to Chapter 6 when you can answer these five questions:

  • Is this task classification, regression, clustering, or anomaly detection?
  • What is the baseline, and what score must a real model beat?
  • Which metric matches the goal, and when is accuracy misleading?
  • How did you check for leakage?
  • What does the model do well, what does it do poorly, and what would you improve next?

Check reasoning and explanation:
  1. Decide the task from the target: categories mean classification, numbers mean regression, no labels usually means clustering or anomaly detection.
  2. The baseline is the simplest reproducible model or rule. A real model only matters if it beats that baseline under the same split and metric.
  3. Choose the metric from the cost of mistakes. Accuracy is misleading when classes are imbalanced or when one error type is much more expensive.
  4. Check leakage by asking whether any feature contains target, future, test-set, or human-review information that would not exist at prediction time.
  5. A good next step names one weakness, one evidence sample, and one controlled change rather than changing many knobs at once.
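Point 3 can be made concrete with a toy example. The labels below are hypothetical (95 negatives, 5 positives) and exist only to show how a most-frequent baseline scores on accuracy versus recall.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical imbalanced labels: 95 negatives (0) and 5 positives (1).
y = np.array([0] * 95 + [1] * 5)
X = np.zeros((100, 1))  # features are ignored by a most-frequent baseline

# Fitting and scoring on the same rows is acceptable here only because the
# dummy model learns nothing except the majority class.
baseline = DummyClassifier(strategy="most_frequent").fit(X, y)
pred = baseline.predict(X)

accuracy = accuracy_score(y, pred)
recall = recall_score(y, pred, zero_division=0)
print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
# prints: accuracy=0.95 recall=0.00
```

The baseline never finds a single positive case, yet its accuracy is 95%. When missing a positive is the costly error, recall or a similar metric is the better ruler.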

For a printable checklist, use 5.0 Study Guide and Task Sheet. The next chapter moves from sklearn models into neural networks and deep learning training.