
10.0 Learning Checklist: Computer Vision

Use this page as a printable checklist. If you need the full explanation, return to the Chapter 10 entry page.

Together, the items below form your vision portfolio evidence pack.

Two-Hour First Pass

| Time box | Do this | Stop when you can say |
| --- | --- | --- |
| 20 min | Read the output-granularity ladder | "Classification, detection, and segmentation differ by output." |
| 25 min | Run the pixel lab | "I can inspect size, channels, RGB, and grayscale." |
| 25 min | Skim 10.1 image basics | "Preprocessing changes the data the model sees." |
| 25 min | Skim the classification, detection, and segmentation roadmaps | "I know which metric belongs to which task." |
| 25 min | Read the debugging loop | "I should inspect data and labels before blaming architecture." |
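If you have not run the pixel lab yet, a minimal sketch follows. It assumes OpenCV (`cv2`) and NumPy are installed; the image path is a placeholder, so point it at any local file.

```python
# pixel_lab.py -- minimal sketch; "sample.jpg" is a placeholder path.
import cv2

img = cv2.imread("sample.jpg")          # BGR uint8 array, or None if the path is wrong
if img is None:
    raise SystemExit("Could not read sample.jpg")

print("shape (H, W, C):", img.shape)    # size and channel count
print("dtype:", img.dtype)              # usually uint8, values 0-255

b, g, r = cv2.split(img)                # OpenCV stores channels as BGR, not RGB
print("mean per channel (B, G, R):", b.mean(), g.mean(), r.mean())

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print("grayscale shape (H, W):", gray.shape)  # single channel, no C axis
cv2.imwrite("sample_gray.png", gray)
```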

Required Evidence

| Evidence | Minimum version |
| --- | --- |
| opencv_demo.py or pixel_lab.py | Image load or generated image, preprocessing, saved output |
| vision_dataset.md | Data source, classes, sample count, annotation method, limitations |
| eval_results.md | Accuracy/F1, mAP, IoU/Dice, OCR hit rate, or chosen metric |
| failure_cases.md | Failed images, possible cause, fix direction |
| README.md | Task goal, run command, input/output examples, scenario boundary |
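For the first row, a minimal opencv_demo.py might look like the sketch below. It generates its own input so it runs without a dataset; every filename here is illustrative, not required by the checklist.

```python
# opencv_demo.py -- one way to meet the minimum evidence bar:
# generated image, one preprocessing step, saved output.
import cv2
import numpy as np

# Generate an input image: a gray canvas with a white rectangle
# and a black circle, so the demo needs no external files.
canvas = np.full((240, 320, 3), 128, dtype=np.uint8)
cv2.rectangle(canvas, (40, 40), (160, 160), (255, 255, 255), -1)
cv2.circle(canvas, (240, 120), 50, (0, 0, 0), -1)
cv2.imwrite("original.png", canvas)

# Preprocessing: grayscale, blur, then a binary threshold.
gray = cv2.cvtColor(canvas, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
_, binary = cv2.threshold(blurred, 127, 255, cv2.THRESH_BINARY)

# Saved output with a filename that matches the original,
# which is what the visual-trace gate below asks for.
cv2.imwrite("original_processed.png", binary)
```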

Quality Gates

| Gate | Pass condition |
| --- | --- |
| Visual trace | Original, processed, prediction, and failure images are saved with matching filenames. |
| Annotation | Dataset notes define classes, boxes or masks, source, split, and known label uncertainty. |
| Metric fit | Accuracy/F1, mAP, IoU/Dice, or OCR hit rate matches the task output. |
| Real-world boundary | Report names lighting, angle, camera or source, latency, image size, and device limits. |
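The metric-fit gate is easiest to check with code. Below is a sketch of IoU and Dice for binary masks in plain NumPy; the function names and toy masks are my own, not from the chapter.

```python
# Sketch of the two mask metrics named in the metric-fit gate.
import numpy as np

def iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection over union for binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return inter / union if union else 1.0  # two empty masks agree perfectly

def dice(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice coefficient: 2|A∩B| / (|A| + |B|)."""
    pred, target = pred.astype(bool), target.astype(bool)
    total = pred.sum() + target.sum()
    return 2 * np.logical_and(pred, target).sum() / total if total else 1.0

# Tiny worked example: 4x4 masks overlapping in two pixels.
p = np.array([[1, 1, 0, 0]] * 2 + [[0, 0, 0, 0]] * 2)
t = np.array([[0, 1, 1, 0]] * 2 + [[0, 0, 0, 0]] * 2)
print(f"IoU  = {iou(p, t):.3f}")   # 2 / 6 = 0.333
print(f"Dice = {dice(p, t):.3f}")  # 4 / 8 = 0.500
```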

Exit Questions

  • Can you explain classification, detection, segmentation, and OCR by output shape? (A shape sketch follows this list.)
  • Can you show the original image, processed image, and prediction visualization?
  • Can you explain why annotation quality affects metrics?
  • Can you choose accuracy/F1, mAP, IoU, or Dice for the right task?
  • Can you explain why a demo may fail on real images?
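For the first question, one way to anchor the answer is to write out typical output shapes for a batch of N images. The shapes below are common conventions, not definitions from this chapter; detection and OCR formats in particular vary by model.

```python
# Typical per-task output shapes for a batch of N images.
# These are common conventions only; detection and OCR formats
# differ widely between models and libraries.
N, num_classes, H, W = 8, 10, 224, 224

# Classification: one score per class per image.
print("classification:", (N, num_classes))        # (8, 10)

# Detection: a variable number of boxes per image, each with
# coordinates, a confidence score, and a class id.
print("detection:", ("num_boxes", 4 + 1 + 1))      # (num_boxes, 6)

# Segmentation: one class score per pixel.
print("segmentation:", (N, num_classes, H, W))     # (8, 10, 224, 224)

# OCR: per text region, a box plus a decoded string and confidence.
print("ocr:", ("num_regions", "box + text + confidence"))
```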

If the answer is yes, you can connect vision to multimodal work in Chapter 12.