12.0 Learning Checklist: AIGC and Multimodal

Use this page as a printable checklist. If you need the full explanation, return to the Chapter 12 entry page.

Multimodal portfolio evidence pack

## Two-Hour First Pass

| Time box | Do this | Stop when you can say |
| --- | --- | --- |
| 20 min | Read the workflow loop on the entry page | "Multimodal work starts with source-preserved inputs." |
| 25 min | Run the visual record script | "I can turn visual content into a checkable structured record." |
| 25 min | Skim multimodal basics and image generation | "Understanding and generation need prompts, models, outputs, and review." |
| 25 min | Skim ethics and compliance | "External use needs copyright, portrait, sensitive, and factual checks." |
| 25 min | Read the RAG/Agent bridge | "Multimodal can extend RAG, Agent, and the final capstone." |

## Required Evidence

| Evidence | Minimum version |
| --- | --- |
| `multimodal_pipeline.md` | input, parsing, generation/understanding, review, export |
| `visual_records.jsonl` | source, page/region/time reference, visible text, objects, uncertainty |
| `prompts/` | prompt versions, reference assets, negative requirements, selection notes |
| `outputs/` | candidate outputs, selected output, rejected output, reason |
| `safety_review.md` | copyright, portrait rights, sensitive content, factuality, usage boundary |
| `README.md` | goal, run command, source materials, sample output, limitations |
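
To make the `visual_records.jsonl` row concrete, here is a minimal sketch of one record and how it could be written as JSON Lines. The field names, the `VisualRecord` class, and the sample values are illustrative assumptions, not the schema used by the chapter's visual record script.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class VisualRecord:
    """One line of visual_records.jsonl (hypothetical field names)."""
    source: str        # path or URL of the original asset
    reference: dict    # page, region, or time reference into the source
    visible_text: list[str] = field(default_factory=list)
    objects: list[str] = field(default_factory=list)
    uncertainty: list[str] = field(default_factory=list)  # what a reviewer should double-check


def to_jsonl(records: list[VisualRecord]) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(asdict(r), ensure_ascii=False) for r in records)


# Made-up sample values, for illustration only.
record = VisualRecord(
    source="assets/slide_deck.pdf",
    reference={"page": 3, "region": [40, 120, 600, 380]},
    visible_text=["Quarterly summary"],
    objects=["bar chart"],
    uncertainty=["axis labels are too small to read reliably"],
)
jsonl_text = to_jsonl([record])
print(jsonl_text)
```

Keeping `uncertainty` as an explicit field, rather than leaving it out when the extraction looks clean, gives a reviewer a fixed place to look before the record is reused downstream.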

## Quality Gates

| Gate | Pass condition |
| --- | --- |
| Source trace | Every input and output keeps source, owner or license, version, and page/region/time reference when relevant. |
| Prompt/version | Candidate outputs link back to prompt, model or tool, reference assets, and selection reason. |
| Review | Copyright, portrait or voice, sensitive content, factuality, accessibility, and export scope are checked. |
| Export | README, manifest, selected outputs, rejected outputs, limits, and next fix can be inspected by another person. |
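
A gate is easier to apply when it is a small check rather than a judgment call. The sketch below shows one possible form of the source trace gate. The key names (`owner_or_license`, `needs_reference`, and so on) and the sample items are assumptions made for illustration; adapt them to whatever your manifest actually stores.

```python
# Hypothetical manifest keys for the source trace gate.
REQUIRED_TRACE_FIELDS = ("source", "owner_or_license", "version")


def source_trace_problems(item: dict) -> list[str]:
    """Return human-readable problems; an empty list means the gate passes."""
    problems = [f"missing {name}" for name in REQUIRED_TRACE_FIELDS if not item.get(name)]
    # A page/region/time reference is only required "when relevant",
    # so each item declares whether it needs one.
    if item.get("needs_reference") and not item.get("reference"):
        problems.append("missing page/region/time reference")
    return problems


# Made-up sample items, for illustration only.
items = [
    {"source": "assets/demo.mp4", "owner_or_license": "CC BY 4.0", "version": "v1",
     "needs_reference": True, "reference": {"time": "00:01:12"}},
    {"source": "assets/screenshot.png", "version": "v2"},
]
report = {item["source"]: source_trace_problems(item) for item in items}
for source, problems in report.items():
    print(source, "PASS" if not problems else problems)
```

The other gates could follow the same pattern: a function that returns a list of problems, so another person can see why an item failed rather than only that it failed.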

## Exit Questions

1. Can you preserve source references for screenshots, PDFs, images, audio, or video?
2. Can you turn a non-text input into a structured record that RAG or an Agent can use?
3. Can you compare generated outputs with prompt versions and review notes?
4. Can you explain what must be checked before external release?
5. Can you package the result as a final portfolio or capstone demo?

If you can answer yes to all five questions, you have the multimodal delivery path. Move to Chapter 13 when the project needs open-source model hosting, runtime ownership, or fine-tuning decisions.

Check reasoning and explanation

1. Yes means every non-text input has a source, owner, version, and review status, not just a final file.
2. A good structured record contains extracted content, modality metadata, confidence or review notes, and a stable link back to the source artifact.
3. Generated outputs should be tied to prompt versions, candidate ids, selected/rejected decisions, and reviewer notes so iteration is explainable.
4. Before external release, check factual grounding, consent and rights, privacy, sensitive content, safety policy, and whether a human approved high-risk material.
5. A portfolio-ready package should include the brief, manifest, prompts, selected assets, rejected cases, review checklist, final export, and a README that explains the workflow.
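
The point about tying generated outputs to prompt versions, candidate ids, decisions, and reviewer notes can be sketched as a small consistency check. The `Candidate` class, the `unexplained` helper, and the sample rows are hypothetical and serve only to show the linkage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One generated output and the decision made about it (hypothetical shape)."""
    candidate_id: str
    prompt_version: str
    decision: str       # expected to be "selected" or "rejected"
    reviewer_note: str


def unexplained(candidates, known_prompt_versions):
    """Return ids of candidates whose history cannot be explained from the records."""
    bad = []
    for c in candidates:
        linked = c.prompt_version in known_prompt_versions
        decided = c.decision in {"selected", "rejected"}
        noted = bool(c.reviewer_note.strip())
        if not (linked and decided and noted):
            bad.append(c.candidate_id)
    return bad


# Made-up sample rows, for illustration only.
candidates = [
    Candidate("c1", "v1", "rejected", "text in image is misspelled"),
    Candidate("c2", "v2", "selected", "matches brief; text legible"),
    Candidate("c3", "v3", "selected", ""),  # unknown prompt version, no note
]
print(unexplained(candidates, {"v1", "v2"}))
```

Rejected candidates are checked the same way as selected ones, since a rejection without a reason is as hard to learn from as a selection without one.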

## Evidence to Keep

Keep this page’s proof of learning as a small evidence card:

- Brief: user goal, audience, assets, constraints, and export format
- Artifacts: source files, prompts, generated candidates, selected output, and rejected versions
- Review: factual check, copyright/portrait/sensitive-content check, and human decision
- Integration: RAG record, Agent trace, creative package, storyboard, or export preview
- Expected Output: reproducible asset package with README, review checklist, and failure notes
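
Before handing the package to someone else, a quick existence check against the Required Evidence list on this page can catch missing pieces. This is a minimal sketch; it assumes all evidence lives directly under one project directory, which may not match your layout, and the `missing_evidence` helper is hypothetical.

```python
from pathlib import Path

# File and folder names taken from the Required Evidence table.
REQUIRED_EVIDENCE = [
    "multimodal_pipeline.md",
    "visual_records.jsonl",
    "prompts",
    "outputs",
    "safety_review.md",
    "README.md",
]


def missing_evidence(project_dir: str) -> list[str]:
    """Return the required evidence entries that do not exist under project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED_EVIDENCE if not (root / name).exists()]


if __name__ == "__main__":
    # Checks the current directory; pass another path to check a different project.
    print(missing_evidence("."))
```

The check only confirms that the files exist. Whether each one meets its minimum version still needs a human read.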