6.8.4 Project: Generative Models in Practice [Optional]

Section Overview

A generative project is not finished when it produces one nice sample. You need to show quality, diversity, stability, failures, and why one checkpoint is worth keeping.

Learning Objectives

  • Explain why generative projects need different evaluation from classification.
  • Track quality and diversity together.
  • Build a small checkpoint review table.
  • Identify mode collapse and blurry-output failure modes.
  • Package generated samples as project evidence.

See the Evaluation Loop First

Generative model project evaluation loop

train -> sample checkpoints -> review quality + diversity -> keep failures -> choose next step

For a practice project, choose a generation target that is:

  • visually inspectable;
  • small enough to train or simulate;
  • easy to compare across checkpoints.

Digits, icons, simple shapes, or tiny grayscale patterns are better first projects than open-ended photorealistic generation.

Lab: Checkpoint Review Dashboard

Create generative_review_dashboard.py:

checkpoints = [
    {"epoch": 1, "quality": 0.20, "diversity": 0.80, "note": "mostly noise"},
    {"epoch": 10, "quality": 0.45, "diversity": 0.72, "note": "outlines appear"},
    {"epoch": 30, "quality": 0.68, "diversity": 0.60, "note": "usable and varied"},
    {"epoch": 60, "quality": 0.75, "diversity": 0.48, "note": "possible collapse"},
]

print("generation_review")
for row in checkpoints:
    # A checkpoint is a candidate only if it clears both thresholds.
    status = "candidate" if row["quality"] >= 0.6 and row["diversity"] >= 0.55 else "review"
    print(
        f"epoch={row['epoch']:03d} "
        f"quality={row['quality']:.2f} "
        f"diversity={row['diversity']:.2f} "
        f"status={status}"
    )

# Among checkpoints that keep enough diversity, pick the highest quality.
selected = max(
    [row for row in checkpoints if row["diversity"] >= 0.55],
    key=lambda row: row["quality"],
)
print("selected_epoch:", selected["epoch"])

Run it:

python generative_review_dashboard.py

Expected output:

generation_review
epoch=001 quality=0.20 diversity=0.80 status=review
epoch=010 quality=0.45 diversity=0.72 status=review
epoch=030 quality=0.68 diversity=0.60 status=candidate
epoch=060 quality=0.75 diversity=0.48 status=review
selected_epoch: 30

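Why not pick epoch 60? Its quality is higher (0.75 versus 0.68), but its diversity (0.48) falls below the 0.55 threshold, which is a warning sign of mode collapse. A good generative project does not select a checkpoint because it produced the prettiest sample.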
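The selection rule can also be written as a small reusable function. This is a minimal sketch, not part of the lab script: the name select_checkpoint and its parameters are hypothetical, and it assumes the final pick should pass both thresholds used for the "candidate" status. It also returns None when nothing qualifies, instead of letting max() fail on an empty list.

```python
# Hypothetical helper; the default thresholds mirror the lab's "candidate" rule.
def select_checkpoint(checkpoints, min_quality=0.60, min_diversity=0.55):
    """Return the highest-quality row that passes both thresholds, or None."""
    candidates = [
        row for row in checkpoints
        if row["quality"] >= min_quality and row["diversity"] >= min_diversity
    ]
    if not candidates:
        return None  # nothing passes; avoids max() raising on an empty list
    return max(candidates, key=lambda row: row["quality"])


checkpoints = [
    {"epoch": 1, "quality": 0.20, "diversity": 0.80},
    {"epoch": 10, "quality": 0.45, "diversity": 0.72},
    {"epoch": 30, "quality": 0.68, "diversity": 0.60},
    {"epoch": 60, "quality": 0.75, "diversity": 0.48},
]

selected = select_checkpoint(checkpoints)
print("selected_epoch:", selected["epoch"] if selected else "none")

# With a much stricter diversity floor, no checkpoint qualifies.
strict = select_checkpoint(checkpoints, min_diversity=0.90)
print("strict_selection:", strict)
```

With the lab data this prints selected_epoch: 30 and strict_selection: None. Keeping the thresholds as parameters makes it easy to state the rule explicitly in the project write-up.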

What to Save

Evidence | Why
samples by checkpoint | shows training progression
failure samples | reveals limits honestly
diversity notes | catches repeated outputs
quality notes | explains visual improvements
training logs | shows stability or collapse
final selection rule | makes the choice reproducible
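One way to keep the review table and the selection rule together is to store them in a single JSON record. The sketch below is an assumption about how that could look, not a required format: the field names (selection_rule, checkpoints, selected_epoch) are made up for illustration, and the rule shown is the diversity floor from the lab script.

```python
import json

checkpoints = [
    {"epoch": 1, "quality": 0.20, "diversity": 0.80, "note": "mostly noise"},
    {"epoch": 10, "quality": 0.45, "diversity": 0.72, "note": "outlines appear"},
    {"epoch": 30, "quality": 0.68, "diversity": 0.60, "note": "usable and varied"},
    {"epoch": 60, "quality": 0.75, "diversity": 0.48, "note": "possible collapse"},
]

# Hypothetical record layout: the rule is stored next to the data it was applied to.
rule = {"min_diversity": 0.55, "pick": "highest quality among passing rows"}
passing = [row for row in checkpoints if row["diversity"] >= rule["min_diversity"]]
selected = max(passing, key=lambda row: row["quality"])

evidence = {
    "selection_rule": rule,
    "checkpoints": checkpoints,
    "selected_epoch": selected["epoch"],
}

# Serialize, then read back to confirm the record survives a round trip.
text = json.dumps(evidence, indent=2)
restored = json.loads(text)
print("selected_epoch:", restored["selected_epoch"])
```

To keep the record with the project, the string in text could be written to a file next to the sample images, for example with pathlib.Path("review_evidence.json").write_text(text). The file name is only a suggestion.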

Quality, Diversity, Stability

Dimension | Good sign | Warning sign
Quality | samples look like target data | noisy, blurry, broken structure
Diversity | samples vary meaningfully | repeated outputs or one dominant style
Stability | checkpoints improve gradually | sudden collapse or oscillation
Interpretability | failures are documented | only best samples are shown

The common trade-off:

best-looking single sample != best project checkpoint
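The stability row in the table above can be partly checked from the review table itself. The following is a rough sketch under stated assumptions: the function name flag_diversity_drops and the 0.10 drop threshold are hypothetical, and the check only looks at diversity falling between consecutive checkpoints. It does not detect oscillation. A flag means "look at the samples from this checkpoint", not "reject it".

```python
# Hypothetical check; max_drop=0.10 is an arbitrary illustrative threshold.
def flag_diversity_drops(checkpoints, max_drop=0.10):
    """Return epochs where diversity fell by more than max_drop since the previous row."""
    flagged = []
    for prev, curr in zip(checkpoints, checkpoints[1:]):
        drop = prev["diversity"] - curr["diversity"]
        if drop > max_drop:
            flagged.append(curr["epoch"])
    return flagged


checkpoints = [
    {"epoch": 1, "quality": 0.20, "diversity": 0.80},
    {"epoch": 10, "quality": 0.45, "diversity": 0.72},
    {"epoch": 30, "quality": 0.68, "diversity": 0.60},
    {"epoch": 60, "quality": 0.75, "diversity": 0.48},
]

flagged = flag_diversity_drops(checkpoints)
print("diversity_drop_flags:", flagged)
```

With the lab data this flags epochs 30 and 60, since diversity falls by about 0.12 at each of those steps. Epoch 30 still passes the diversity threshold, so the flag is a prompt to inspect its samples, while epoch 60 combines the drop with a value below the threshold.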

Project Upgrade Path

Version | What to add
basic | one model, fixed sampling seed, checkpoint samples
standard | quality/diversity table and failure samples
challenge | compare VAE, GAN, or diffusion-style outputs
portfolio | clear story: data, model, samples, failures, next step

Common Mistakes

Mistake | Fix
showing only best samples | show average and failure samples too
ignoring diversity | track repeated outputs or unique patterns
comparing checkpoints by memory | use the same fixed seed set
using a dataset too complex at first | start with small visual targets
not explaining model choice | state why VAE, GAN, or another method fits the goal
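Two of the fixes above, a fixed seed set and tracking repeated outputs, can be sketched with the standard library alone. Everything here is illustrative: SEED_SET, make_latents, and unique_ratio are hypothetical names, the latent vectors stand in for whatever input your model samples from, and the tiny binary patterns are hand-written rather than generated by a model. Exact-match counting is also a crude diversity measure, since it misses near-duplicates.

```python
import random

SEED_SET = [0, 1, 2, 3, 4, 5, 6, 7]  # assumption: any fixed list of integers works


def make_latents(seeds, dim=4):
    """Build one latent vector per seed; the same seeds always give the same vectors."""
    latents = []
    for seed in seeds:
        rng = random.Random(seed)  # independent, reproducible stream per seed
        latents.append([rng.gauss(0.0, 1.0) for _ in range(dim)])
    return latents


def unique_ratio(samples):
    """Fraction of samples that are distinct (1.0 means no exact repeats)."""
    if not samples:
        raise ValueError("samples must not be empty")
    return len({tuple(sample) for sample in samples}) / len(samples)


# Reusing the seed set gives identical inputs for every checkpoint you compare.
assert make_latents(SEED_SET) == make_latents(SEED_SET)

# Hand-written 4-pixel patterns standing in for generated samples.
varied = [(0, 1, 1, 0), (1, 0, 0, 1), (1, 1, 0, 0), (0, 0, 1, 1)]
collapsed = [(0, 1, 1, 0), (0, 1, 1, 0), (0, 1, 1, 0), (1, 0, 0, 1)]

print(f"varied_unique_ratio={unique_ratio(varied):.2f}")
print(f"collapsed_unique_ratio={unique_ratio(collapsed):.2f}")
```

The varied set scores 1.00 and the collapsed set scores 0.50. Feeding the same latents to every checkpoint means differences between sample grids come from the model, not from the random inputs.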

Exercises

  1. Add an epoch 90 with quality 0.80 and diversity 0.30. Should it be selected?
  2. Add a failure field to each checkpoint.
  3. Write a 4-row table for your own generative project idea.
  4. Explain mode collapse using the checkpoint table.
  5. Draft a portfolio section titled “Why I selected this checkpoint.”

Key Takeaways

  • Generative projects need evaluation stories, not just galleries.
  • Quality and diversity must be read together.
  • Failure samples make the project more credible.
  • A clear checkpoint selection rule is part of the deliverable.