9.0 Learning Checklist: AI Agents and Agent Systems

Use this page as a printable checklist. If you need the full explanation, return to the Chapter 9 entry page.

Agent trace evidence pack

| Time box | Do this | Stop when you can say |
| --- | --- | --- |
| 20 min | Read the execution loop on the entry page | “An Agent is a goal-state-tool-observation loop.” |
| 25 min | Run the trace script | “I can replay every action and observation.” |
| 25 min | Skim 9.1 and 9.2 | “I can separate Agent, workflow, RAG, ReAct, and Plan-and-Execute.” |
| 25 min | Skim 9.3 tool safety | “Tool schema and permissions matter more than clever prompting.” |
| 25 min | Read the boundary map | “I know when not to use an Agent.” |

| Evidence | Minimum version |
| --- | --- |
| `tools_schema.md` | 1-2 tools with name, purpose, parameters, return value, errors, and risk level |
| `agent_traces.jsonl` | At least three runs with goal, step, action, input, observation, and result |
| `safety_boundary.md` | Maximum steps, tool whitelist, blocked actions, human approval rules |
| `protocol_card.md` | MCP/server capability, peer-agent handoff fields, authorization rule, trace fields |
| `agent_sandbox_trace.json` | Allow/confirm/deny decisions with at least one blocked injection or tool-poisoning case |
| `failure_cases.md` | At least three failures: wrong tool, bad parameter, loop, blocked permission, unsupported answer |
| `eval_tasks.csv` | 3-5 fixed tasks with expected outcome and success criteria |
| `README.md` | Run command, trace example, safety example, evaluation result, limitation |
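
The trace fields listed for `agent_traces.jsonl` can be sketched as a small record type plus a replay helper. This is a minimal illustration, not the course's trace script: the class name `TraceStep`, the helpers `to_jsonl` and `replay`, the tool name `lookup_order`, and the sample values are all assumptions made for the example. Only the six field names come from the checklist.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class TraceStep:
    """One line of agent_traces.jsonl (field names follow the checklist)."""
    goal: str
    step: int
    action: str        # tool name, or "finish"
    input: dict        # arguments passed to the tool
    observation: str   # what the tool returned
    result: str        # assumed labels such as "ok", "error", "blocked"


def to_jsonl(steps):
    """Serialize steps as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(asdict(s), sort_keys=True) for s in steps)


def replay(jsonl_text):
    """Rebuild a readable action/observation log from JSONL text."""
    lines = []
    for raw in jsonl_text.splitlines():
        if not raw.strip():
            continue  # tolerate blank lines between records
        rec = json.loads(raw)
        lines.append(
            f"step {rec['step']}: {rec['action']}({rec['input']}) "
            f"-> {rec['observation']} [{rec['result']}]"
        )
    return lines


# Hypothetical run used only to exercise the helpers.
run = [
    TraceStep("find order status", 1, "lookup_order",
              {"order_id": "A1"}, "status=shipped", "ok"),
    TraceStep("find order status", 2, "finish", {}, "answered user", "ok"),
]
text = to_jsonl(run)
for line in replay(text):
    print(line)
```

Writing one JSON object per line keeps each step independently parseable, so a reviewer can replay a run even if a later line is truncated.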

Keep this page’s proof of learning as a small evidence card:

  • Single Agent Trace: one complete goal-plan-action-observation loop
  • Tool Contract: schema, permission, protocol card, error behavior, and observation
  • Memory Note: what is written, retrieved, forgotten, or updated
  • Eval Note: success score, safety check, sandbox trace, and failure reason
  • Project Readme: run command, trace, limitations, and next action

| Gate | Pass condition |
| --- | --- |
| Tool schema | Each tool has purpose, parameters, return value, errors, and risk level. |
| Trace replay | A reviewer can replay why every tool call happened. |
| Safety boundary | Non-whitelisted or risky actions are blocked or routed to human approval. |
| Stop control | Max steps and stop conditions prevent loops and cost spikes. |
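
The tool schema and safety boundary gates can be illustrated together as a check that runs before any tool call. The sketch below is one possible shape under stated assumptions: `ToolSpec`, `ALLOWLIST`, `gate`, the two example tools, and the risk labels "low" and "high" are hypothetical names chosen for the example, not an API from the course or from any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    """Schema fields named in the checklist, for one tool."""
    name: str
    purpose: str
    parameters: dict   # parameter name -> expected Python type
    returns: str
    errors: tuple
    risk_level: str    # assumed labels: "low" or "high"


# Hypothetical allowlist with one low-risk and one high-risk tool.
ALLOWLIST = {
    "search_docs": ToolSpec(
        name="search_docs",
        purpose="Search the course notes",
        parameters={"query": str},
        returns="list of matching passages",
        errors=("no_results",),
        risk_level="low",
    ),
    "send_email": ToolSpec(
        name="send_email",
        purpose="Send a message on the user's behalf",
        parameters={"to": str, "body": str},
        returns="delivery status",
        errors=("invalid_address",),
        risk_level="high",
    ),
}


def gate(tool_name, args):
    """Return (decision, reason); decision is "allow", "confirm", or "deny"."""
    spec = ALLOWLIST.get(tool_name)
    if spec is None:
        return "deny", "tool is not on the allowlist"
    if set(args) != set(spec.parameters):
        return "deny", "arguments do not match the schema"
    for key, expected in spec.parameters.items():
        if not isinstance(args[key], expected):
            return "deny", f"parameter {key!r} has the wrong type"
    if spec.risk_level == "high":
        return "confirm", "high-risk tool needs human approval"
    return "allow", "allowlisted and schema-valid"


print(gate("search_docs", {"query": "stop conditions"}))
print(gate("send_email", {"to": "a@example.com", "body": "hi"}))
print(gate("delete_files", {"path": "/"}))
```

Logging each `(decision, reason)` pair alongside the proposed call would give the kind of allow/confirm/deny record that `agent_sandbox_trace.json` asks for.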

Expected result: your Chapter 9 project folder contains tool schemas, replayable traces, safety boundaries, fixed eval tasks, failure notes, and a README that explains why the design stays single-Agent until the loop is reliable.

  • Can you explain why an Agent is different from a normal LLM application?
  • Can you show a trace and explain why each tool call happened?
  • Can you block a risky or non-whitelisted tool?
  • Can you separate tool discovery from authorization and agent handoff?
  • Can you define a stop condition and maximum step count?
  • Can you explain why multi-agent should come after single-Agent reliability?
Check reasoning and explanation
  1. An Agent keeps a goal-plan-tool-observation loop, so the system can act, inspect the result, and decide the next step instead of only generating one reply.
  2. A useful trace shows the goal, planned step, tool call, input, observation, and why the next step followed from that observation.
  3. Block risky or non-whitelisted tools with a tool allowlist, schema checks, risk labels, maximum step limits, and human approval when needed.
  4. Good stop conditions include success, no progress, max steps reached, or risk escalation.
  5. Single-Agent stability comes first because multi-agent systems are harder to trace, debug, and control safely.
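
The loop and the four stop conditions above can be sketched in a few lines. This is a simplified illustration under stated assumptions: `run_agent`, `scripted_policy`, `looping_policy`, and the `lookup` tool are hypothetical, the policy functions stand in for an LLM, and "no progress" is approximated as two identical observations in a row.

```python
def run_agent(goal, policy, tools, max_steps=5):
    """Run a goal-plan-tool-observation loop; return (stop_reason, trace)."""
    trace = []
    last_observation = None
    for step in range(1, max_steps + 1):
        action, args = policy(goal, trace)        # plan the next step
        if action == "finish":
            return "success", trace
        if action not in tools:
            # Unknown tool: stop and hand the decision to a human.
            return "risk_escalation", trace
        observation = tools[action](**args)       # act, then observe
        trace.append({"step": step, "action": action,
                      "input": args, "observation": observation})
        if observation == last_observation:
            return "no_progress", trace           # same result twice in a row
        last_observation = observation
    return "max_steps", trace


# Hypothetical tool and policies used only to exercise each stop reason.
TOOLS = {"lookup": lambda key: {"capital_of_france": "Paris"}.get(key, "unknown")}


def scripted_policy(goal, trace):
    """Look something up once, then finish."""
    if not trace:
        return "lookup", {"key": "capital_of_france"}
    return "finish", {}


def looping_policy(goal, trace):
    """Repeat the same call forever; the loop must stop it."""
    return "lookup", {"key": "capital_of_france"}


reason, trace = run_agent("name the capital", scripted_policy, TOOLS)
print(reason, len(trace))
reason, trace = run_agent("name the capital", looping_policy, TOOLS)
print(reason, len(trace))
```

Returning the stop reason together with the trace makes every run explainable: a reviewer can see not only what the Agent did but also why it stopped.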

If you can answer yes to every question, continue to the next direction: deployment, multimodal Agents, or the final course project.
