7.8.1 Project Roadmap: Choose Prompt, RAG, or Finetuning
This capstone turns Chapter 7 into one engineering decision: is the problem task expression, missing knowledge, unstable format, unsafe behavior, or weak evaluation?
See the Project Route First
Section titled “See the Project Route First”


Do not start with the strongest model or the most complex framework. Start with a small domain task, a prompt baseline, fixed examples, and a failure log.
Run an Evidence Pack Check
Section titled “Run an Evidence Pack Check”Use this tiny project log before writing a report. It forces you to show the baseline, improvement, next route, and whether finetuning is actually justified.
project = { "task": "classify course questions", "baseline_pass_rate": 0.62, "prompt_v2_pass_rate": 0.78, "rag_needed": True, "finetune_needed": False,}
improvement = project["prompt_v2_pass_rate"] - project["baseline_pass_rate"]
print("task:", project["task"])print("improvement:", round(improvement, 2))print("next_route:", "RAG" if project["rag_needed"] else "Prompt")print("fine_tune_now:", project["finetune_needed"])Expected output:
task: classify course questionsimprovement: 0.16next_route: RAGfine_tune_now: FalseIf your project cannot fill these fields, keep the project smaller. A clear comparison beats a large but untestable demo.
Learn in This Order
Section titled “Learn in This Order”| Step | Do | Evidence |
|---|---|---|
| 1 | Pick one domain task | One-sentence task definition and 10 fixed examples |
| 2 | Build a prompt baseline | Prompt version, outputs, pass/fail notes |
| 3 | Classify failure types | Task wording, missing knowledge, format drift, safety boundary |
| 4 | Choose the next method | Prompt iteration, RAG, or finetuning decision note |
| 5 | Package the result | README, run command, screenshots, failure case, next steps |
If you want a guided starter, run 7.8.4 Hands-on: Full Chapter 7 Workshop before designing your own domain project.
Decision Rule: Name the Failure Before Choosing the Method
Section titled “Decision Rule: Name the Failure Before Choosing the Method”The capstone should not say “we used RAG” or “we fine-tuned” just because the method sounds advanced. First name the dominant failure.
| Dominant failure | Better first route | Evidence you need |
|---|---|---|
| The model does not know the answer source | RAG | Retrieved document supports the answer |
| The output format drifts | Structured output + validation | Parser pass rate improves |
| The instruction is vague | Prompt iteration | Same cases improve after one prompt change |
| The behavior repeats incorrectly across many cases | Fine-tuning / LoRA candidate | Enough labeled examples and held-out eval cases |
| The task needs external action | Tool / Agent route | Tool call trace and recovery behavior |
This one table keeps the project from becoming a method showcase. It turns it into an engineering decision.
Evidence to Keep
Section titled “Evidence to Keep”Keep this page’s proof of learning as a small evidence card:
- Project Choice
- Prompt, RAG, fine-tuning, or hybrid route
- Baseline
- simplest working method first
- Evaluation
- fixed cases and scoring rule
- Deliverable
- README, prompts, outputs, failures, decision log
- Bridge
- Chapter 8 turns this into retrieval-backed applications
Project Deliverable Standards
Section titled “Project Deliverable Standards”| Deliverable | Minimum Standard | Stronger Portfolio Version |
|---|---|---|
| README | Goal, run command, model or API choice, sample input/output | Add method trade-offs, cost notes, evaluation, and retrospective |
| Examples | At least 10 fixed cases | Compare prompt, RAG, finetuning, or rule-based versions |
| Evaluation | Clear pass/fail rule | Add scores, failure-type statistics, and regression notes |
| Prompt/data record | Save prompt versions or sample format | Add schema validation, data quality checks, and safety notes |
| Presentation | Screenshot or short GIF proving it runs | Explain why the chosen route beats alternatives |
Pass Check
Section titled “Pass Check”You pass this chapter when you can clearly explain “why not finetune here,” “why RAG is needed here,” or “why this prompt change works,” with a fixed evaluation set rather than a single good answer.
The final project can be basic: compare two prompt versions on one domain task. The stronger version adds RAG or a small finetuning experiment, but only after the baseline and failure log prove the need.
Check reasoning and explanation
- A passing answer explains how tokens, context, attention, prompts, and generation behavior connect in one request-response path.
- The evidence should include at least one reproducible prompt or structured-output test, plus notes on why the output passed or failed.
- A good self-check separates prompt design, RAG, fine-tuning, and alignment: use the lightest method that fixes the observed problem.