6.8.4 Project: Generative Models in Practice [Optional]

Section Overview

A generative project is not finished when it produces one nice sample. You need to show quality, diversity, stability, failures, and why one checkpoint is worth keeping.

Learning Objectives

Explain why generative projects need different evaluation from classification.
Track quality and diversity together.
Build a small checkpoint review table.
Identify mode collapse and blurry-output failure modes.
Package generated samples as project evidence.

See the Evaluation Loop First

Generative model project evaluation loop

train -> sample checkpoints -> review quality + diversity -> keep failures -> choose next step

For a practice project, choose a generation target that is:

visually inspectable;
small enough to train or simulate;
easy to compare across checkpoints.

Digits, icons, simple shapes, or tiny grayscale patterns are better first projects than open-ended photorealistic generation.

Lab: Checkpoint Review Dashboard

Create generative_review_dashboard.py:

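# Thresholds for the selection rule; tune them for your own project.
MIN_QUALITY = 0.60
MIN_DIVERSITY = 0.55

# Illustrative review scores; higher is better.
checkpoints = [
    {"epoch": 1, "quality": 0.20, "diversity": 0.80, "note": "mostly noise"},
    {"epoch": 10, "quality": 0.45, "diversity": 0.72, "note": "outlines appear"},
    {"epoch": 30, "quality": 0.68, "diversity": 0.60, "note": "usable and varied"},
    {"epoch": 60, "quality": 0.75, "diversity": 0.48, "note": "possible collapse"},
]


def is_candidate(row):
    """A checkpoint qualifies only if it clears both thresholds."""
    return row["quality"] >= MIN_QUALITY and row["diversity"] >= MIN_DIVERSITY


print("generation_review")
for row in checkpoints:
    status = "candidate" if is_candidate(row) else "review"
    print(
        f"epoch={row['epoch']:03d} "
        f"quality={row['quality']:.2f} "
        f"diversity={row['diversity']:.2f} "
        f"status={status}"
    )

# Pick the highest-quality candidate; default=None avoids a crash if none qualify.
candidates = [row for row in checkpoints if is_candidate(row)]
selected = max(candidates, key=lambda row: row["quality"], default=None)
print("selected_epoch:", selected["epoch"] if selected else "none")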

Run it:

python generative_review_dashboard.py

Expected output:

generation_review
epoch=001 quality=0.20 diversity=0.80 status=review
epoch=010 quality=0.45 diversity=0.72 status=review
epoch=030 quality=0.68 diversity=0.60 status=candidate
epoch=060 quality=0.75 diversity=0.48 status=review
selected_epoch: 30

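Why not pick epoch 60? Its quality is higher (0.75 vs. 0.68), but its diversity (0.48) falls below the 0.55 threshold, which suggests the model may be collapsing onto a few outputs. A good generative project does not select a checkpoint on sample quality alone.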
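To make the difference concrete, the sketch below compares a quality-only pick with a pick that also enforces the thresholds. It reuses the lab's checkpoint scores; the names naive and guarded are introduced here only for illustration.

```python
# Same illustrative scores and thresholds as the dashboard lab.
checkpoints = [
    {"epoch": 1, "quality": 0.20, "diversity": 0.80},
    {"epoch": 10, "quality": 0.45, "diversity": 0.72},
    {"epoch": 30, "quality": 0.68, "diversity": 0.60},
    {"epoch": 60, "quality": 0.75, "diversity": 0.48},
]
MIN_QUALITY, MIN_DIVERSITY = 0.60, 0.55

# Quality-only selection ignores diversity entirely.
naive = max(checkpoints, key=lambda row: row["quality"])

# Guarded selection: filter by both thresholds first, then take the best quality.
candidates = [
    row for row in checkpoints
    if row["quality"] >= MIN_QUALITY and row["diversity"] >= MIN_DIVERSITY
]
guarded = max(candidates, key=lambda row: row["quality"], default=None)

print("quality_only:", naive["epoch"])  # 60
print("with_thresholds:", guarded["epoch"] if guarded else "none")  # 30
```

The two rules disagree exactly where diversity has dropped, which is the case the review table is meant to surface.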

What to Save

Evidence	Why
samples by checkpoint	shows training progression
failure samples	reveals limits honestly
diversity notes	catches repeated outputs
quality notes	explains visual improvements
training logs	shows stability or collapse
final selection rule	makes the choice reproducible
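One possible way to keep this evidence together is a small manifest file. The sketch below is an assumption about layout, not a required format: the directory names, file names, and JSON keys are all hypothetical placeholders.

```python
import json

EPOCHS = [1, 10, 30, 60]  # checkpoints reviewed in the lab

# Hypothetical manifest: every path and key here is a placeholder.
manifest = {
    "selection_rule": "max quality among checkpoints with "
                      "quality >= 0.60 and diversity >= 0.55",
    "selected_epoch": 30,
    "training_log": "logs/train.log",
    "checkpoints": [
        {
            "epoch": epoch,
            "samples": f"samples/epoch_{epoch:03d}.png",
            "failures": f"failures/epoch_{epoch:03d}.png",
        }
        for epoch in EPOCHS
    ],
}

manifest_text = json.dumps(manifest, indent=2)
print(manifest_text)
# To keep it with the project, write manifest_text to a file of your choice.
```

Recording the selection rule next to the sample paths means a reviewer can check the choice without rerunning training.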

Quality, Diversity, Stability

Dimension	Good sign	Warning sign
Quality	samples look like target data	noisy, blurry, broken structure
Diversity	samples vary meaningfully	repeated outputs or one dominant style
Stability	checkpoints improve gradually	sudden collapse or oscillation
Interpretability	failures are documented	only best samples are shown
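The warning signs in the table can be turned into simple automated flags. The sketch below is a minimal illustration: the function flag_warnings and its thresholds (diversity_floor, max_drop) are assumptions chosen for this example, not standard values.

```python
def flag_warnings(checkpoints, diversity_floor=0.55, max_drop=0.15):
    """Return {epoch: [warnings]} for checkpoint dicts sorted by epoch."""
    warnings = {}
    previous = None
    for row in checkpoints:
        notes = []
        # Diversity warning: too many repeated or similar outputs.
        if row["diversity"] < diversity_floor:
            notes.append("diversity below floor")
        if previous is not None:
            # Stability warning: a sudden drop between consecutive checkpoints.
            if previous["diversity"] - row["diversity"] > max_drop:
                notes.append("sharp diversity drop")
            # Stability warning: quality got worse instead of improving.
            if row["quality"] < previous["quality"]:
                notes.append("quality regressed")
        warnings[row["epoch"]] = notes
        previous = row
    return warnings


checkpoints = [
    {"epoch": 1, "quality": 0.20, "diversity": 0.80},
    {"epoch": 10, "quality": 0.45, "diversity": 0.72},
    {"epoch": 30, "quality": 0.68, "diversity": 0.60},
    {"epoch": 60, "quality": 0.75, "diversity": 0.48},
]

for epoch, notes in flag_warnings(checkpoints).items():
    print(epoch, notes or ["ok"])
```

With the lab's scores, only epoch 60 is flagged, and only for falling below the diversity floor. Flags like these are prompts to look at the samples, not a replacement for looking.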

The common trade-off:

best-looking single sample != best project checkpoint

Project Upgrade Path

Version	What to add
basic	one model, fixed sampling seed, checkpoint samples
standard	quality/diversity table and failure samples
challenge	compare VAE, GAN, or diffusion-style outputs
portfolio	clear story: data, model, samples, failures, next step

Common Mistakes

Mistake	Fix
showing only best samples	show average and failure samples too
ignoring diversity	track repeated outputs or unique patterns
comparing checkpoints by memory	sample every checkpoint with the same fixed seed set and compare side by side
using a dataset too complex at first	start with small visual targets
not explaining model choice	state why VAE, GAN, or another method fits the goal
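Two of these fixes, a fixed seed set and tracking repeated outputs, can be sketched with the standard library alone. Here toy_sampler is a hypothetical stand-in for a real model's sampling call, and n_modes is an invented knob that imitates how many distinct outputs a checkpoint can still produce.

```python
import random

SEEDS = list(range(8))  # fixed seed set, reused for every checkpoint


def toy_sampler(seed, n_modes):
    """Hypothetical stand-in for a model's sampler.

    n_modes=1 imitates full mode collapse: every seed yields the same output.
    """
    rng = random.Random(seed)      # same seed -> same draw on every run
    mode = rng.randrange(n_modes)  # which "style" this seed lands on
    # Encode the mode as a 4-pixel binary pattern.
    return tuple((mode >> bit) & 1 for bit in range(4))


def unique_fraction(samples):
    """Share of distinct samples: 1.0 means all differ, 1/len means all identical."""
    return len(set(samples)) / len(samples)


varied = [toy_sampler(seed, n_modes=16) for seed in SEEDS]
collapsed = [toy_sampler(seed, n_modes=1) for seed in SEEDS]

print("varied checkpoint:", unique_fraction(varied))
print("collapsed checkpoint:", unique_fraction(collapsed))  # 0.125
```

Because both checkpoints are sampled with the same seeds, any difference in the unique fraction comes from the checkpoint rather than from sampling luck. For real images, exact-match counting is too strict, so a distance-based measure would be a reasonable substitute.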

Exercises

Add an epoch 90 with quality 0.80 and diversity 0.30. Should it be selected?
Add a failure field to each checkpoint.
Write a 4-row table for your own generative project idea.
Explain mode collapse using the checkpoint table.
Draft a portfolio section titled “Why I selected this checkpoint.”

Key Takeaways

Generative projects need evaluation stories, not just galleries.
Quality and diversity must be read together.
Failure samples make the project more credible.
A clear checkpoint selection rule is part of the deliverable.
