7.0 Learning Checklist: LLM Principles, Prompt, and Fine-tuning

Use this page as a printable checklist. If you need the full explanation, return to the Chapter 7 entry page.

LLM study guide evolution path

Two-Hour First Pass

Time box	Do this	Stop when you can say
20 min	Read the token-to-answer picture on the entry page	”Text becomes tokens, vectors, context, then next-token prediction.”
25 min	Skim 7.1 and run one tokenizer example	”Token count affects cost and context limits.”
25 min	Skim 7.2 and the LLM history page	”Scale, data, Transformer, and alignment changed what models can do.”
30 min	Run the prompt testing script from the entry page	”I can compare prompt versions with fixed cases.”
20 min	Read the solution-choice table	”I should not fine-tune before checking Prompt, RAG, tools, and validation.”

Required Evidence

Evidence	Minimum version
`prompts/`	Three prompt versions for one task
`prompt_eval_cases.csv`	At least five fixed inputs and a simple score column
`structured_output_schema.json`	Required fields and allowed value types
`failure_cases.md`	At least three failed outputs and the likely cause
`gpu_train_log.txt`	`device: cuda` training log from 7.4.5 Rent a GPU and Train a Hand-Built GPT-2
`llm_stage_workshop_output.txt`	Output from 7.8.4 Hands-on: Full Chapter 7 Workshop
`README.md`	How to run, what passed, what failed, what to try next

Evidence to Keep

Keep this page’s proof of learning as a small evidence card:

Prompt Versions: at least three versions for one task
Eval Cases: fixed inputs with scores and failure notes
Schema Check: structured output is parsed and validated
Method Choice: Prompt/RAG/fine-tuning/tools decision is written down
Gpt2 Record: mini GPT-2 GPU training log, environment info, and sample output
Exit Proof: workshop output plus README notes

Quality Gates

Gate	Pass condition
Prompt comparison	Same cases, one changed variable, saved outputs and scores.
Structured output	Parser rejects missing fields or wrong types.
Failure analysis	Each failure has a likely cause: instruction, input, schema, missing knowledge, or safety.
Method choice	Decision table explains why Prompt, RAG, fine-tuning, tools, or Agent comes first.
Hand-built GPT-2	Run mini GPT-2 on a CUDA GPU and identify embedding, attention, loss, and generate in the code.

Expected result: your Chapter 7 folder contains prompt versions, fixed eval cases, parser/schema checks, failure notes, a device: cuda mini GPT-2 training log, workshop output, and a README that explains the method choice.

Exit Questions

Can you explain token, embedding, attention, context window, pretraining, Prompt, fine-tuning, and alignment without copying definitions?
Can you change one prompt variable at a time and compare results with the same input cases?
Can you validate JSON output instead of trusting text that only looks like JSON?
Can you explain when missing information calls for RAG instead of a longer Prompt?
Can you explain when repeated behavior adaptation might justify fine-tuning?
Can you open or rent a GPU notebook, run mini GPT-2 with device: cuda, and save loss plus generated text?

Check reasoning and explanation

Treat each term as part of one flow: token and embedding are the representation layer, attention routes context, the context window limits what can be seen at once, pretraining builds the base model, Prompt steers the run, fine-tuning changes behavior with data, and alignment keeps outputs useful and safe.
Keep the same cases, change only one prompt variable, and save both the outputs and the score so the comparison is reproducible instead of anecdotal.
Use a schema or parser to validate structure, required fields, and types. If parsing fails, reject the output instead of reading it as if it were correct.
Use RAG when the answer depends on fresh, private, or citable facts from documents rather than what the model may remember.
Fine-tuning becomes worth considering when the same behavior keeps showing up across many high-quality examples and Prompt plus validation still is not enough.

If the answer is yes, move to Chapter 8. Chapter 8 will connect these ideas to real LLM applications and RAG systems.

Evidence to Keep

Keep this page’s proof of learning as a small evidence card:

Prompt Versions: at least three versions for one task
Eval Cases: fixed inputs with scores and failure notes
Schema Check: structured output is parsed and validated
Method Choice: Prompt/RAG/fine-tuning/tools decision is written down
Gpt2 Record: mini GPT-2 GPU training log, environment info, and sample output
Exit Proof: workshop output plus README notes