12.0 Learning Checklist: AIGC and Multimodal

Use this page as a printable checklist. If you need the full explanation, return to the Chapter 12 entry page.

[Figure: Multimodal portfolio evidence pack]

Two-Hour First Pass

| Time box | Do this | Stop when you can say |
| --- | --- | --- |
| 20 min | Read the workflow loop on the entry page | "Multimodal work starts with source-preserved inputs." |
| 25 min | Run the visual record script | "I can turn visual content into a checkable structured record." |
| 25 min | Skim multimodal basics and image generation | "Understanding and generation need prompts, models, outputs, and review." |
| 25 min | Skim ethics and compliance | "External use needs copyright, portrait, sensitive, and factual checks." |
| 25 min | Read the RAG/Agent bridge | "Multimodal can extend RAG, Agent, and the final capstone." |

Required Evidence

| Evidence | Minimum version |
| --- | --- |
| multimodal_pipeline.md | input, parsing, generation/understanding, review, export |
| visual_records.jsonl | source, page/region/time reference, visible text, objects, uncertainty |
| prompts/ | prompt versions, reference assets, negative requirements, selection notes |
| outputs/ | candidate outputs, selected output, rejected output, reason |
| safety_review.md | copyright, portrait rights, sensitive content, factuality, usage boundary |
| README.md | goal, run command, source materials, sample output, limitations |
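
As an illustration of the visual_records.jsonl row, here is a minimal sketch of one record written as a JSON Lines entry. The field names (`source`, `reference`, `visible_text`, `objects`, `uncertainty`) are assumptions derived from the minimum version listed above, and the file path and values are hypothetical; the course's own visual record script may use a different schema.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class VisualRecord:
    # Assumed field names, mirroring the checklist's minimum version.
    source: str                      # path or URL of the original asset
    reference: Dict[str, object]     # page, region, or time locator in the source
    visible_text: List[str] = field(default_factory=list)
    objects: List[str] = field(default_factory=list)
    uncertainty: List[str] = field(default_factory=list)


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(asdict(r), ensure_ascii=False) for r in records)


records = [
    VisualRecord(
        source="inputs/sample_slide.png",  # hypothetical example asset
        reference={"page": 1, "region": [0, 0, 640, 360]},
        visible_text=["Quarterly summary"],
        objects=["bar chart"],
        uncertainty=["axis labels too small to read"],
    ),
]

jsonl_text = to_jsonl(records)
print(jsonl_text)
```

Keeping `uncertainty` as an explicit list, even when empty, makes it easier for a reviewer to see what the record does not claim.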

Quality Gates

| Gate | Pass condition |
| --- | --- |
| Source trace | Every input and output keeps source, owner or license, version, and page/region/time reference when relevant. |
| Prompt/version | Candidate outputs link back to prompt, model or tool, reference assets, and selection reason. |
| Review | Copyright, portrait or voice, sensitive content, factuality, accessibility, and export scope are checked. |
| Export | README, manifest, selected outputs, rejected outputs, limits, and next fix can be inspected by another person. |
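
The first two gates can be partly automated. The sketch below assumes each input and output is described by a plain dictionary in a manifest; the key names (`source`, `owner`, `license`, `version`, `prompt`, `model_or_tool`, `reference_assets`, `selection_reason`) and the sample values are hypothetical, chosen to match the pass conditions above. It does not check the page/region/time reference, because whether one is relevant needs human judgment, and it does not replace the Review gate.

```python
def check_source_trace(item):
    """Return the source-trace fields missing from one manifest item."""
    missing = [key for key in ("source", "version") if not item.get(key)]
    # The gate accepts either an owner or a license.
    if not (item.get("owner") or item.get("license")):
        missing.append("owner_or_license")
    return missing


def check_prompt_link(output):
    """Return the prompt/version fields missing from one candidate output."""
    required = ("prompt", "model_or_tool", "reference_assets", "selection_reason")
    # An empty list of reference assets is allowed; an absent or blank value is not.
    return [key for key in required if key not in output or output[key] in ("", None)]


# Hypothetical manifest entries for illustration.
item = {"source": "inputs/report.pdf", "license": "CC BY 4.0", "version": "v1"}
output = {
    "prompt": "prompts/v2.txt",
    "model_or_tool": "example-model",
    "reference_assets": [],
    "selection_reason": "",
}

source_gaps = check_source_trace(item)
prompt_gaps = check_prompt_link(output)
print(source_gaps, prompt_gaps)
```

An empty list means the item passes that gate; any listed name is something to fix before export.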

Exit Questions

  • Can you preserve source references for screenshots, PDFs, images, audio, or video?
  • Can you turn a non-text input into a structured record that RAG or an Agent can use?
  • Can you compare generated outputs with prompt versions and review notes?
  • Can you explain what must be checked before external release?
  • Can you package the result as a final portfolio or capstone demo?

If the answer to every question is yes, the course has a complete end-to-end path: foundations, data, models, LLM apps, Agents, and multimodal product workflows.
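
For the second exit question, one possible way to make a structured record usable by RAG or an Agent is to flatten it into a text chunk while keeping the source reference as metadata. This is a sketch under assumptions: the record keys follow the visual_records.jsonl minimum version, the function name `record_to_chunk` and the chunk layout are hypothetical, and a real retrieval stack will have its own document format.

```python
def record_to_chunk(record):
    """Flatten one visual record into retrievable text plus source metadata."""
    reference = record.get("reference", {})
    # Sort keys so the locator string is stable across runs.
    locator = ", ".join(f"{key}={reference[key]}" for key in sorted(reference))

    lines = []
    if record.get("visible_text"):
        lines.append("Visible text: " + "; ".join(record["visible_text"]))
    if record.get("objects"):
        lines.append("Objects: " + ", ".join(record["objects"]))
    if record.get("uncertainty"):
        lines.append("Uncertain: " + "; ".join(record["uncertainty"]))

    return {
        "text": "\n".join(lines),
        "metadata": {"source": record["source"], "locator": locator},
    }


# Hypothetical record for illustration.
chunk = record_to_chunk({
    "source": "inputs/sample_slide.png",
    "reference": {"page": 1},
    "visible_text": ["Quarterly summary"],
    "objects": ["bar chart"],
    "uncertainty": [],
})
print(chunk["text"])
print(chunk["metadata"])
```

Carrying `source` and `locator` in the metadata lets an answer cite the exact page, region, or timestamp it came from.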