
8.5.1 Project Roadmap: Build a Cited Knowledge Assistant

This capstone proves you can connect knowledge, model calls, application flow, and engineering evidence into one reproducible LLM application.

[Figure: LLM application capstone project roadmap]

[Figure: LLM application project learning order diagram]

[Figure: LLM application project delivery loop diagram]

The project is not “connect a vector database.” It is a traceable loop: documents, chunks, retrieval, context, answer, citations, logs, evaluation, and improvement.
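As a rough illustration of that loop, the sketch below walks one query from documents to chunks, retrieval, context, a cited answer, and a printable trace. Everything in it is an assumption made for the example: the two sample documents, the sentence-level chunking, the word-overlap score (a stand-in for embedding search), and the placeholder answer that simply quotes the top chunk instead of calling a model.

```python
import json
import re

# Hypothetical sample documents; a real project would load files instead.
DOCUMENTS = {
    "refund_policy.md": "Refunds are issued within 14 days of purchase. "
                        "Contact support to start a refund.",
    "shipping.md": "Standard shipping takes 3 to 5 business days. "
                   "Express shipping takes 1 day.",
}

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def chunk(doc_id, text):
    """Split a document into sentence-level chunks that keep their source."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return [{"source": doc_id, "chunk_id": f"{doc_id}#{i}", "text": s}
            for i, s in enumerate(sentences)]

def retrieve(query, chunks, top_k=2):
    """Rank chunks by word overlap with the query (stand-in for embeddings)."""
    q = tokenize(query)
    scored = [(len(q & tokenize(c["text"])), c) for c in chunks]
    hits = [(s, c) for s, c in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return hits[:top_k]

def answer_with_trace(query, chunks):
    """Return an answer together with the evidence needed to audit it."""
    hits = retrieve(query, chunks)
    context = [c["text"] for _, c in hits]
    # Placeholder for a model call: quote the top-ranked chunk.
    answer = context[0] if context else "No supporting passage found."
    return {
        "query": query,
        "retrieved": [{"chunk_id": c["chunk_id"], "score": s} for s, c in hits],
        "answer": answer,
        "citations": [c["chunk_id"] for _, c in hits[:1]],
    }

chunks = [c for doc_id, text in DOCUMENTS.items() for c in chunk(doc_id, text)]
trace = answer_with_trace("How many days do refunds take?", chunks)
print(json.dumps(trace, indent=2))
```

The printed trace is the point of the exercise: it records which chunks were retrieved, how they scored, and which one the answer cites, so a wrong answer can be traced back to a specific step.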

Use this checklist before calling the project done.

# Evidence the project currently has.
project = {
    "project_type": "knowledge-base assistant",
    "documents": 5,
    "eval_questions": 10,
    "citations": True,
    "empty_retrieval_handled": True,
    "failure_cases": 3,
}

# Minimum bar: every condition must hold before the project counts as done.
ready = (
    project["documents"] >= 3
    and project["eval_questions"] >= 10
    and project["citations"]
    and project["empty_retrieval_handled"]
    and project["failure_cases"] >= 1
)

print("ready:", ready)
print("project_type:", project["project_type"])
print("evidence:", "docs, eval, citations, failures")

Expected output:

ready: True
project_type: knowledge-base assistant
evidence: docs, eval, citations, failures

If ready is False, do not add another feature yet. Complete the evidence loop first.

| Step | Project | What It Trains |
| --- | --- | --- |
| 1 | Enterprise or course knowledge base | Retrieval, permissions, citations, traceable answers |
| 2 | Intelligent assistant | Retrieval, session state, and tool calling as product features |
| 3 | RAG + finetuning system | Separate missing knowledge from unstable behavior |
| 4 | SOP document assistant | Document parsing, structured output, and template rendering |
| 5 | Full hands-on workshop | A minimum reproducible loop before adding real APIs or databases |

If you need a guided baseline, start with 8.5.6 Hands-on: Full Chapter 8 RAG App Workshop.

Keep this page’s proof of learning as a small evidence card:

Project Goal: user task and business boundary
Baseline: simplest prompt/RAG/app version first
Evaluation: fixed cases, retrieval evidence, answer quality, and citation check
Failure Log: at least one failed case with likely cause
Deliverable: README, run command, screenshots/logs, next step

| Deliverable | Minimum Requirement | Stronger Portfolio Version |
| --- | --- | --- |
| README | Goal, run command, dependencies, and examples | Add architecture diagram, design trade-offs, cost, and retrospective |
| Knowledge base sample | Raw documents, chunks, metadata, and source fields | Add permission rules, document version, and update notes |
| Retrieval logs | Matched passages, scores, and ranking | Add failure-type statistics and before/after comparison |
| Answer citations | Final answers show supporting sources | Add citation faithfulness checks |
| Failure cases | At least one documented failure | Add 3 or more cases with cause, fix, and regression check |
| Evaluation | Fixed questions with pass/fail rules | Add baseline, metrics, and regression testing |
| Deployment note | How to run and required environment variables | Add Docker, monitoring, and fallback notes |
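The Evaluation row asks for fixed questions with pass/fail rules. One possible shape for that, sketched under assumptions: the case fields (`must_cite`, `must_contain`), the two sample questions, and `fake_app` are all hypothetical, and `fake_app` stands in for whatever function runs your real application.

```python
# Hypothetical fixed cases: each names the source a correct answer must cite.
EVAL_CASES = [
    {"id": "q1", "question": "How long do refunds take?",
     "must_cite": "refund_policy.md", "must_contain": "14 days"},
    {"id": "q2", "question": "What is the office wifi password?",
     "must_cite": None, "must_contain": "not found"},
]

def fake_app(question):
    """Stand-in for the real application; returns an answer and citations."""
    if "refund" in question.lower():
        return {"answer": "Refunds are issued within 14 days.",
                "citations": ["refund_policy.md"]}
    return {"answer": "Not found in the knowledge base.", "citations": []}

def check(case, result):
    """Apply the pass/fail rules: expected text and expected citation."""
    text_ok = case["must_contain"].lower() in result["answer"].lower()
    if case["must_cite"] is None:
        # Out-of-scope question: the answer must not cite anything.
        cite_ok = result["citations"] == []
    else:
        cite_ok = case["must_cite"] in result["citations"]
    return {"id": case["id"], "text_ok": text_ok, "cite_ok": cite_ok,
            "passed": text_ok and cite_ok}

results = [check(case, fake_app(case["question"])) for case in EVAL_CASES]
pass_rate = sum(r["passed"] for r in results) / len(results)

for r in results:
    print(r)
print(f"pass_rate: {pass_rate:.2f}")
```

Keeping `text_ok` and `cite_ok` separate makes it easier to see whether a failing case is an answer-quality problem or a citation problem. String matching is a deliberately crude rule here; it is enough for a baseline that can be rerun after every change.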

You pass this chapter when the project can answer with citations, show retrieval logs, handle empty retrieval, keep evaluation cases, and explain at least one failure.
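Handling empty retrieval can be as simple as refusing to answer when no passage clears a score threshold. The sketch below assumes each hit is a dict with `source`, `text`, and `score` keys and uses an arbitrary `min_score` of 0.3; both the shape and the threshold are illustrative and should be tuned against your own retrieval logs.

```python
FALLBACK = "I could not find this in the knowledge base."

def answer_or_fallback(hits, min_score=0.3):
    """Answer only from passages at or above min_score; otherwise fall back.

    `hits` is assumed to be a list of dicts with "source", "text", "score".
    """
    usable = [h for h in hits if h["score"] >= min_score]
    if not usable:
        # Empty retrieval: refuse instead of letting the model guess.
        return {"answer": FALLBACK, "citations": [], "fallback": True}
    best = max(usable, key=lambda h: h["score"])
    # Placeholder for a model call: return the best passage with its source.
    return {"answer": best["text"], "citations": [best["source"]],
            "fallback": False}

weak = [{"source": "faq.md", "text": "Reset it from Settings.", "score": 0.12}]
strong = [{"source": "faq.md", "text": "Reset it from Settings.", "score": 0.81}]

print(answer_or_fallback([]))      # nothing retrieved
print(answer_or_fallback(weak))    # retrieved, but below the threshold
print(answer_or_fallback(strong))  # usable evidence
```

Logging the `fallback` flag alongside each answer gives you a direct count of how often the assistant declined, which is useful evidence for the evaluation and failure sections.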

The strongest portfolio version is not the largest one. It is the version where another developer can reproduce the run, inspect the evidence, and understand how you would improve the next iteration.

Check reasoning and explanation
  1. A passing answer traces the full path from query to chunks, retrieval scores, cited evidence, answer, and fallback behavior.
  2. The evidence should include retrieved passages, source metadata, a cited answer, and at least one empty-retrieval or wrong-retrieval case.
  3. A good self-check explains whether a failure came from chunking, retrieval, ranking, prompt assembly, missing sources, or unsupported generation.
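A failure log stays useful when each entry names one likely cause from a fixed list. The following is a minimal sketch of such a record, using the six causes from the self-check above as labels; the `FailureCase` fields and the two sample entries are hypothetical.

```python
from collections import Counter
from dataclasses import asdict, dataclass

# Cause labels taken from the self-check list above.
CAUSES = {"chunking", "retrieval", "ranking", "prompt_assembly",
          "missing_sources", "unsupported_generation"}

@dataclass
class FailureCase:
    question: str
    observed: str
    cause: str
    next_step: str

    def __post_init__(self):
        # Reject free-form causes so the log can be counted later.
        if self.cause not in CAUSES:
            raise ValueError(f"unknown cause: {self.cause}")

# Hypothetical entries for illustration only.
failure_log = [
    FailureCase(
        question="How long do refunds take?",
        observed="Top passage came from the shipping document.",
        cause="retrieval",
        next_step="Add the question to the eval set and compare scores.",
    ),
    FailureCase(
        question="What is the warranty period?",
        observed="Answer stated a period that no document contains.",
        cause="unsupported_generation",
        next_step="Require a citation for every factual sentence.",
    ),
]

counts = Counter(case.cause for case in failure_log)
print(asdict(failure_log[0]))
print(counts.most_common())
```

Counting causes over time shows where the next iteration should focus, for example on retrieval quality rather than prompt wording.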