9.1.2 What Is an AI Agent

Learning Goals
By the end of this section, you will be able to:
- clearly explain the differences between workflows, chatbots, and Agents
- understand the minimal components of an Agent
- run a mini Agent example with tool calling
- understand why an Agent is more than “just wrapping a prompt”
How This Section Connects to the Earlier Main Thread of LLM Applications
If you just finished Stage 8B, you can understand this section as follows:
- earlier, you already learned to build systems with “model + knowledge + tools + application”
- this section starts answering: when do these systems step into Agent territory instead of remaining fixed workflows?
So the real point of this section is not a single definition. It is to:
- separate Agents from workflows, chat systems, and function-calling systems first
A Better Overall Analogy for Beginners
You can think of these three system types like this:
- workflow: a fixed subway line
- chatbot: a front desk receptionist
- Agent: an assistant that decides for itself what to do next
Of course, this assistant can talk, but what makes it an Agent is not “being able to chat.” The real test is:
- whether it can organize a chain of actions toward a goal
Don’t Rush to Put Agent on a Pedestal
When many people first hear about Agents, they imagine “AI employees that can think and execute tasks autonomously.”
That description is not entirely wrong, but it is too abstract to build on.
A more grounded definition is:
Agent = a system that completes tasks step by step based on goals, state, and tools.
It usually has these abilities:
- receive a goal
- break down steps
- call tools
- continue acting based on results
- stop when appropriate
When Learning Agent for the First Time, What Should You Focus on First?
The first thing to focus on is not the word “autonomous,” but this sentence:
The key to an Agent is not that it can talk, but that it can organize a chain of actions around a goal.
Once this idea is firmly in place, when you later look at:
- planning
- tools
- memory
- multi-Agent systems
you will more naturally understand why they exist.
What Is the Difference Between Workflows, Chatbots, and Agents?
Workflow
Every step is written in advance:
- user asks a question
- query the database
- assemble the prompt
- return the answer
This is more like a fixed pipeline.
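To make the "fixed pipeline" idea concrete, here is a minimal sketch of the four steps above as one hard-coded function. The names `FAKE_DB`, `query_database`, `build_prompt`, and `fake_model` are invented for illustration, and `fake_model` is only a stand-in for a real model call.

```python
# Hypothetical stand-in for a database: one stored fact.
FAKE_DB = {"refund": "You can apply for a refund within 7 days of purchase."}


def query_database(question):
    """Return the first stored fact whose key appears in the question."""
    for key, fact in FAKE_DB.items():
        if key in question.lower():
            return fact
    return "No matching record."


def build_prompt(question, context):
    """Assemble the prompt from the retrieved context and the question."""
    return f"Context: {context}\nQuestion: {question}"


def fake_model(prompt):
    """Placeholder for a real model call: it just echoes the context line."""
    first_line = prompt.splitlines()[0]
    return first_line[len("Context: "):]


def workflow(question):
    # The order of steps never changes, whatever the question is.
    context = query_database(question)
    prompt = build_prompt(question, context)
    return fake_model(prompt)


print(workflow("What is the refund policy?"))
```

Nothing in `workflow` chooses between alternatives. That absence of a decision point is what separates it from the Agent examples later in this section.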
Chatbot
The focus is on “conversation.” It may not proactively break down tasks or use external tools.
Agent as a System Role
The focus is on “dynamically choosing actions to achieve the goal.”
For example, an Agent may:
- first judge what the user wants
- then decide whether to check the weather, search docs, or calculate
- after getting the result, organize the output
Why Must These Three Concepts Be Separated First?
Because many systems look like they “just connect a model,” but their engineering forms are completely different:
- workflow is more like a fixed route
- chatbot is more like a conversation interface
- Agent is more like a goal-driven execution system
If you don’t separate these boundaries at the beginning, later it becomes easy to:
- call anything with many tools an Agent
- think something is an Agent just because it has state
- mistake a chatting system for an Agent
A System Boundary Diagram That Beginners Can Remember First
    flowchart LR
        A["Fixed steps"] --> B["Workflow"]
        C["Conversation-focused"] --> D["Chatbot"]
        E["Continuous goal-based decision-making"] --> F["Agent"]

This diagram is very important because it helps beginners first see:
- an Agent is not “a smarter chat box”
- instead, the way the system is controlled changes

The Minimal Components of an Agent
You can first break an Agent into 4 parts:
| Component | Role |
|---|---|
| Goal | What needs to be accomplished this time |
| Model / decision-maker | What to do next |
| Tools | What external capabilities can be called |
| State / memory | Where the current task has progressed to |
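One way to picture the four parts in the table is as fields of a single data structure. The sketch below is only an illustration: the class name `MiniAgent`, the rule inside `decide`, and the `lookup` tool are all assumptions, not a standard interface.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MiniAgent:
    goal: str                                        # what to accomplish this time
    decide: Callable[["MiniAgent"], str]             # decision-maker: names the next action
    tools: Dict[str, Callable[[], str]]              # external capabilities, by name
    state: List[str] = field(default_factory=list)   # record of progress so far


def decide(agent):
    # Hypothetical rule: look something up once, then finish.
    return "lookup" if not agent.state else "finish"


agent = MiniAgent(
    goal="Find the refund policy",
    decide=decide,
    tools={"lookup": lambda: "Refund within 7 days"},
)

action = agent.decide(agent)                 # decision
agent.state.append(agent.tools[action]())    # tool call, result stored in state
print(action, agent.state, agent.decide(agent))
```

After one tool call the state is no longer empty, so the same `decide` function returns a different action. The state is what lets the decision change from step to step.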
When You See These Four Parts for the First Time, Which Sentence Is Most Worth Remembering?
You can remember this first:
Agent = goal + decision + tools + state.
Later, many chapters in 9 AI Agent and Intelligent Agent Systems essentially expand on these four parts.
An analogy:
An Agent is like an intern who can get things done: there is a task goal, a toolbox, work records, and they still need to decide the next step on their own.
Look at Another Minimal “Candidate Action” Example
    def choose_action(query):
        if "weather" in query:
            return "use_weather_tool"
        if "refund" in query or "certificate" in query:
            return "use_docs_tool"
        if "calculate" in query:
            return "use_calculator"
        return "reply_directly"

    for query in ["What's the weather in Beijing", "What is the refund policy", "calculate 7 * 8"]:
        print(query, "->", choose_action(query))

Expected output:

    What's the weather in Beijing -> use_weather_tool
    What is the refund policy -> use_docs_tool
    calculate 7 * 8 -> use_calculator

This example is very suitable for beginners because it helps you grasp one core action first:
- an Agent does not answer first
- it first decides what to do next
A System Boundary Diagram That Beginners Can Remember First
    flowchart LR
        A["User goal"] --> B["Decision maker"]
        B --> C["Tools"]
        C --> D["Result observation"]
        D --> B
        B --> E["Final answer"]

This diagram is especially important because it reminds you:
- the key to an Agent is not just outputting one sentence
- instead, it enters a closed loop of “goal -> action -> observation”
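The closed loop can be sketched in a few lines. This is a rule-based illustration, not a real Agent framework: `TOOLS`, `decide`, and `run_agent` are hypothetical names, and the docs tool returns fake data.

```python
# Hypothetical tool registry; a real system would call external services here.
TOOLS = {
    "search_docs": lambda keyword: {"refund": "Refund window: 7 days"}.get(keyword, "not found"),
}


def decide(goal, observations):
    """Rule-based stand-in for the decision maker: returns (action, argument)."""
    if not observations:                       # nothing observed yet, so act
        return "search_docs", "refund"
    return "final_answer", observations[-1]    # enough information, so stop


def run_agent(goal, max_steps=3):
    observations = []
    for _ in range(max_steps):                 # guard against endless loops
        action, argument = decide(goal, observations)
        if action == "final_answer":
            return argument
        observations.append(TOOLS[action](argument))   # observe the tool result
    return "Stopped: step limit reached"


print(run_agent("Find the refund window"))
```

The decision function runs again after every observation, which is the `D --> B` arrow in the diagram. The `max_steps` limit is the simplest possible stop condition.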

A Mini Agent That Does Not Depend on a Large Model
To make the principle clearer, let’s not use a real large model yet. Instead, we’ll write a “rule-based Agent.”
    import ast
    import operator

    # Only these arithmetic operators are allowed in expressions.
    OPS = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: operator.truediv,
    }

    def safe_calculate(expression):
        # Walk the parsed syntax tree instead of calling eval(),
        # so anything other than plain arithmetic is rejected.
        def visit(node):
            if isinstance(node, ast.Expression):
                return visit(node.body)
            if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
                return node.value
            if isinstance(node, ast.BinOp) and type(node.op) in OPS:
                return OPS[type(node.op)](visit(node.left), visit(node.right))
            if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
                return -visit(node.operand)
            raise ValueError("unsupported_expression")

        return visit(ast.parse(expression, mode="eval"))

    def tool_weather(city):
        fake_weather = {
            "Beijing": "Sunny, 22°C",
            "Shanghai": "Cloudy, 25°C",
            "Shenzhen": "Light rain, 28°C"
        }
        return fake_weather.get(city, "No weather data available for this city")

    def tool_calculate(expression):
        return str(safe_calculate(expression))

    def tool_search_docs(keyword):
        docs = {
            "refund": "You can apply for a refund within 7 days of purchase and if your learning progress is below 20%.",
            "certificate": "You can receive a certificate after completing all required items and passing the final assessment."
        }
        for k, v in docs.items():
            if k in keyword:
                return v
        return "No relevant document found."

    def simple_agent(user_query):
        steps = []

        if "weather" in user_query:
            # Naive city extraction: falls back to Shenzhen when no known city is mentioned.
            city = "Beijing" if "Beijing" in user_query else "Shanghai" if "Shanghai" in user_query else "Shenzhen"
            steps.append(f"Detected a weather query, preparing to call the weather tool, city={city}")
            result = tool_weather(city)
            steps.append(f"Tool returned: {result}")
            final_answer = f"Current weather in {city}: {result}"

        elif "refund" in user_query or "certificate" in user_query:
            steps.append("Detected a knowledge query, preparing to call the docs tool")
            result = tool_search_docs(user_query)
            steps.append(f"Tool returned: {result}")
            final_answer = result

        elif "calculate" in user_query:
            expression = user_query.replace("calculate", "").strip()
            steps.append(f"Detected a calculation task, preparing to call the calculator tool, expression={expression}")
            result = tool_calculate(expression)
            steps.append(f"Tool returned: {result}")
            final_answer = f"The calculation result is: {result}"

        else:
            steps.append("No tool matched, replying with the default answer directly")
            final_answer = "I don’t yet know which tool to call."

        return steps, final_answer

    query = "calculate 23 * 7"
    steps, answer = simple_agent(query)

    print("User question:", query)
    print("Execution steps:")
    for step in steps:
        print("-", step)
    print("Final answer:", answer)

Expected output:

    User question: calculate 23 * 7
    Execution steps:
    - Detected a calculation task, preparing to call the calculator tool, expression=23 * 7
    - Tool returned: 161
    Final answer: The calculation result is: 161

This example is simple, but it already contains the core flavor of an Agent:
- recognize the task
- choose a tool
- get the result
- organize the output
What Is the Relationship Between an Agent and “Function Calling”?
Agents often use function calling (also called tool calling), but the two are not the same thing.
Function Calling
The focus is on whether the model can produce structured parameters and correctly call a tool.
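As a rough illustration of "structured parameters," the sketch below pretends the model produced a JSON string and shows the parse, validate, and dispatch steps around it. The JSON shape (`name` plus `arguments`) and the registry format are assumptions made for this example, not any vendor's API format.

```python
import json


def get_weather(city):
    """Hypothetical tool backed by fake data."""
    return {"Beijing": "Sunny, 22°C"}.get(city, "No weather data available for this city")


# Tool registry: name -> (function, required argument names). Illustrative only.
TOOLS = {"get_weather": (get_weather, ["city"])}


def execute_tool_call(raw):
    """Parse a structured call, validate it, then dispatch to the tool."""
    call = json.loads(raw)
    name, arguments = call["name"], call["arguments"]
    if name not in TOOLS:
        return f"error: unknown tool {name}"
    func, required = TOOLS[name]
    missing = [arg for arg in required if arg not in arguments]
    if missing:
        return f"error: missing arguments {missing}"
    return func(**arguments)


# Pretend this JSON string came from the model.
model_output = '{"name": "get_weather", "arguments": {"city": "Beijing"}}'
print(execute_tool_call(model_output))
```

Everything here happens in a single pass: one call in, one result out. Deciding whether to call again, or what to do with the result, is outside the scope of function calling itself.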
The Boundary of Agent and Function Calling
The focus is on whether the model or system can dynamically decide around a goal:
- when to call a tool
- which tool to call
- how many times to call it
- what to do next after the call
So you can remember it like this:
Tool calling is a common capability of an Agent, but an Agent is not just tool calling.
Why Is This Step So Easy for Beginners to Mix Up?
Because many early demos all look like:
- identify intent
- call one tool
- answer with the result
But a real Agent goes further and cares about:
- when to call
- which tool to call
- what to do next after the call
- whether it needs to iterate further
Why Is an Agent Harder Than a Normal Q&A System?
Because it adds an extra layer of “action.”
A normal Q&A system is more like:
- look at the input
- generate an answer
An Agent is more like:
- look at the input
- plan
- try to act
- observe the result
- then decide the next step
This brings more challenges:
- errors accumulate over multiple steps
- tool calls may fail
- cost and latency are higher
- safety risks are also greater
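A small calculation shows why errors accumulate. Assume, purely for illustration, that each step succeeds independently with the same probability; the 0.95 figure below is a made-up number, not a measurement.

```python
def chain_success_rate(per_step_success, steps):
    """Probability that every step succeeds, assuming independent steps."""
    return per_step_success ** steps


for steps in (1, 5, 10):
    print(steps, round(chain_success_rate(0.95, steps), 3))
# prints:
# 1 0.95
# 5 0.774
# 10 0.599
```

Under this assumption, a step that works 95% of the time gives a ten-step chain that finishes cleanly only about 60% of the time. This is one reason why shorter chains and explicit checks after each step matter.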
A Looping Idea That Feels More Like an Agent
A real Agent system often looks like this:

    flowchart LR
        A["User goal"] --> B["Decide next action"]
        B --> C["Call tools / reason"]
        C --> D["Observe result"]
        D --> B
        D --> E["Task completed, output answer"]

        style A fill:#e3f2fd,stroke:#1565c0,color:#333
        style B fill:#fff3e0,stroke:#e65100,color:#333
        style C fill:#f3e5f5,stroke:#6a1b9a,color:#333
        style D fill:#fffde7,stroke:#f9a825,color:#333
        style E fill:#e8f5e9,stroke:#2e7d32,color:#333

This is why Agents emphasize:
- planning
- observation
- feedback
- iteration
What Is Most Worth Understanding in This Loop Is Not the Diagram, But the “Closed Loop”
In other words, the key to an Agent is not a one-time output, but:
- look at the goal
- take an action
- observe the result
- then decide the next step
This is one of the most fundamental differences between it and a normal Q&A system.
The Safest Default Order When Building Your First Agent Project
A more stable sequence is usually:
- first implement single-step tool calling
- then make the system able to choose actions
- then add next-step judgment after observing results
- finally introduce more complex planning and memory
This is easier than trying to build a “fully autonomous Agent” from the start, and it is much more likely to produce a truly controllable system.
If You Turn This Into a Project or Notes, What Is Most Worth Showing?
What is most worth showing is usually not:
- a demo video that simply shows “it can call tools”
But rather:
- the user goal
- what action the Agent chose
- why it chose that action
- what the tool returned
- how the Agent continued to the next step based on the result
This makes it easier for others to see:
- you understand the action loop
- you are not just wiring a model and tools together
What Tasks Are Suitable for Agents?
More Suitable
- multi-step tasks
- tasks that need external tools
- tasks that need strategy adjustments based on intermediate results
For example:
- research assistant
- automated reports
- data analysis assistant
- code fixing assistant
Less Suitable
- simple FAQs that can be answered in one step
- tasks with completely fixed workflows
- scenarios that require extremely high stability and cannot tolerate free-form behavior
In many cases, a workflow is actually more suitable than an Agent.
Common Beginner Mistakes
Thinking “able to chat” means Agent
Wrong. A chatbot does not necessarily act autonomously in steps.
Thinking an Agent is definitely more advanced than a workflow
Not necessarily. For simple and stable tasks, a workflow may be cheaper and more reliable.
Thinking that adding tool calling solves everything
The more tools and steps you add, the harder debugging and safety become.
Checklist: How to Tell Whether a System Is an Agent
Many beginners call chatbots, RAG applications, and tool-calling applications “Agents.” A safer approach is to check against the table below first.
| Question | If the answer is “yes” | More like |
|---|---|---|
| Are the steps completely fixed? | It follows the same process every time | Workflow |
| Is the main goal continuous conversation? | The focus is on understanding context and replying | Chatbot |
| Is it only calling one tool once? | The user asks something and it calls one corresponding function | Tool-calling application |
| Does it decide the next step based on intermediate results? | Tool results affect later actions | Agent |
| Does it have clear stop conditions and execution records? | You can tell why it continued or stopped | Closer to a controllable Agent |
You can remember one rule first: if a system does not “observe the result and then decide the next step,” it is usually still just a workflow or a tool-calling application, and there is no need to rush into calling it an Agent.
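The checklist can also be written as a small function, which makes the priority of the questions explicit. The function name, the boolean parameters, and especially the order of the checks are assumptions of this sketch; the table itself does not prescribe an order.

```python
def classify_system(fixed_steps, conversation_focused,
                    single_tool_call, decides_from_results):
    """Rough label from the checklist answers (all booleans).

    The order of the checks is an assumption of this sketch.
    """
    if decides_from_results:
        return "Agent"
    if single_tool_call:
        return "Tool-calling application"
    if fixed_steps:
        return "Workflow"
    if conversation_focused:
        return "Chatbot"
    return "Unclear"


# A pipeline with fixed retrieval and answer steps
print(classify_system(True, False, False, False))
# A system whose tool results change what it does next
print(classify_system(False, False, False, True))
```

Checking `decides_from_results` first reflects the rule above: observing a result and then choosing the next step is the deciding property.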
How Should Your First Agent Project Be Built to Stay Stable?
For your first Agent project, it is not recommended to start with a “fully automatic complex assistant.” A more stable version path is:
| Version | Goal | Acceptance Criteria |
|---|---|---|
| v0.1 Single-step tools | Can choose one tool based on the user question | Print tool_call, parameters, and tool result |
| v0.2 Multi-step execution | Can complete a 2–3 step task | Each step has a trace, and it never loops forever |
| v0.3 Failure recovery | Can explain and try an alternative when a tool fails | Has error logs and fallback answers |
| v0.4 Human confirmation | Requires confirmation before high-risk actions | Can distinguish read-only tools from write tools |
| v0.5 Project showcase | Has README, examples, failure cases, and safety boundaries | Can explain why the Agent acted this way |
This path helps you focus on the “controllable action loop” instead of chasing flashy autonomy.
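For the v0.4 row, one possible shape for "distinguish read-only tools from write tools" is a flag on each tool plus a confirmation callback. The registry layout, the `writes` flag, and the `confirm` parameter are hypothetical; in a real project `confirm` might prompt a person.

```python
# Hypothetical tool registry: each tool is tagged as read-only or write.
TOOLS = {
    "search_docs": {"writes": False, "run": lambda arg: f"docs about {arg}"},
    "send_message": {"writes": True, "run": lambda arg: f"sent: {arg}"},
}


def call_tool(name, argument, confirm=lambda name, argument: False):
    """Run read-only tools directly; run write tools only after confirmation."""
    tool = TOOLS[name]
    if tool["writes"] and not confirm(name, argument):
        return f"blocked: {name} needs human confirmation"
    return tool["run"](argument)


print(call_tool("search_docs", "RAG"))
print(call_tool("send_message", "hello"))
print(call_tool("send_message", "hello", confirm=lambda name, argument: True))
```

The default `confirm` denies everything, so a write tool cannot run by accident when the caller forgets to pass a confirmation step.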
Agent Execution Trace Template
The most valuable thing to show in an Agent project is the execution process. It is recommended to record at least these fields for each run:
| Field | Example | Purpose |
|---|---|---|
| `goal` | Help me make a study plan for this week | User goal |
| `step` | 1 | Which step it is |
| `thought_type` | plan / tool / observe / final | Current stage type |
| `action` | search_course_docs | Action taken |
| `arguments` | {topic: "RAG"} | Tool parameters |
| `observation` | Found 3 relevant chapters | Return from the tool or environment |
| `next_decision` | Continue generating the plan | Why continue or stop |
A minimal trace might look like this:
    goal: Create a RAG study plan
    step 1: action=search_course_docs, arguments={topic: RAG}
    observation: Found 3 chapters on RAG basics, document processing, and retrieval strategies
    step 2: action=build_plan, arguments={days: 3}
    observation: A 3-day plan has been generated
    final: Return the study plan and explain which chapters were referenced

Without a trace, when an Agent goes wrong, it is hard to locate the problem: did it misunderstand the goal, choose the wrong tool, pass the wrong parameters, get an abnormal tool return, or forget to define a stop condition?
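The trace fields can be captured in a small record type so that every step is logged in the same shape. This is one possible layout: the class name `TraceStep` and the choice of one JSON object per line are assumptions of this sketch.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TraceStep:
    goal: str
    step: int
    thought_type: str      # plan / tool / observe / final
    action: str
    arguments: dict
    observation: str
    next_decision: str


trace = [
    TraceStep(
        goal="Create a RAG study plan",
        step=1,
        thought_type="tool",
        action="search_course_docs",
        arguments={"topic": "RAG"},
        observation="Found 3 relevant chapters",
        next_decision="Continue generating the plan",
    )
]

# One JSON object per line keeps the log easy to search and compare.
for entry in trace:
    print(json.dumps(asdict(entry)))
```

With a fixed record shape, the questions above map to specific fields: a wrong tool shows up in `action`, wrong parameters in `arguments`, and a missing stop condition in `next_decision`.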
What an Agent Should Not Do
The stronger an Agent becomes, the more boundaries it needs. Especially in the beginner stage, remember: not everything is suitable to be executed autonomously by an Agent.
| Risky action | Safer approach |
|---|---|
| Delete files, submit code, send messages, place orders or payments | Require human confirmation |
| Execute arbitrary code without whitelist restrictions | Restrict tool permissions and runtime environment |
| Keep trying forever until success | Set a maximum number of steps, maximum cost, and timeout |
| Make up conclusions when evidence is insufficient | Clearly allow “I don’t know” |
| Rely on memory to overwrite current facts | Read the current state first, then act |
This table is not meant to weaken the Agent, but to make it more like a reliable system. A truly engineered Agent is not mainly about “whether it can do many things by itself,” but about “whether it knows when to stop, when to ask a human, and when it should not act.”
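The "maximum number of steps" and "timeout" rows can be sketched as a wrapper around whatever performs a single step. `run_with_budget` and `step_fn` are hypothetical names. The timeout is only checked between steps, so it cannot interrupt a step that hangs.

```python
import time


def run_with_budget(step_fn, max_steps=5, timeout_s=2.0):
    """Call step_fn until it returns an answer or a limit is hit.

    step_fn is a hypothetical callable: it receives the step index and
    returns an answer string, or None when it has no answer yet.
    """
    start = time.monotonic()
    for step in range(max_steps):
        if time.monotonic() - start > timeout_s:
            return "I don't know (timed out)"
        answer = step_fn(step)
        if answer is not None:
            return answer
    return "I don't know (step limit reached)"


print(run_with_budget(lambda step: "done" if step == 2 else None))
print(run_with_budget(lambda step: None, max_steps=3))
```

When a limit is hit, the wrapper returns an explicit "I don't know" instead of a guess, which matches the fourth row of the table.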
Evidence to Keep
Keep this page’s proof of learning as a small evidence card:
- Agent Boundary
  - how this differs from chatbot or fixed workflow
- Goal State Action
  - goal, current state, next action, observation
- Architecture Parts
  - planner, tools, memory, guardrails, evaluator
- Failure Check
  - over-autonomy, vague goal, missing state, or no trace
- Next Action
  - build the smallest traceable single-agent loop
Summary
The most important sentence in this section is:
An Agent is not a “talking model,” but a “system that can take actions around a goal.”
Its value is not only in answering, but in completing tasks. In the next chapters, we will continue to expand on reasoning, tools, memory, multi-Agent systems, deployment, and safety.
What You Should Take Away From This Section
- The key to an Agent is not conversation, but the action loop
- Workflows, chat systems, function calling, and Agents must first be clearly separated
- In 9 AI Agent and Intelligent Agent Systems, all later modules are essentially expansions of these four parts: “goal + decision + tools + state”
Exercises
- Add another tool to `simple_agent()`, such as “check course schedule.”
- Make the Agent support a two-step task like “first check the docs, then calculate.”
- Think about this: if a tool returns an error message, how should the Agent handle it more safely?
Reference implementation and walkthrough
- A new tool should have a clear name, input shape, return shape, and failure mode. For `check_course_schedule`, return a small structured result instead of a long paragraph.
- A two-step task needs state between steps: the result of document lookup should be stored and passed into the calculation step, with a maximum-step guard.
- The Agent should not hide tool errors or continue as if they were facts. It should classify the error, retry only when safe, ask for missing information if needed, or return a controlled failure message.
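The three points can be combined in one small sketch. It is a variant of the exercise rather than an official solution: the lookup step uses a hypothetical `check_course_schedule` tool instead of the docs tool, and the schedule data, the result fields (`ok`, `data`, `error`, `retryable`), and `total_hours` are all invented for illustration.

```python
def check_course_schedule(course):
    """Hypothetical tool: returns a small structured result and never raises."""
    schedule = {"RAG": {"weeks": 3, "hours_per_week": 4}}   # made-up data
    if course not in schedule:
        return {"ok": False, "error": "not_found", "retryable": False}
    return {"ok": True, "data": schedule[course]}


def total_hours(course, max_steps=2):
    state = {}                                   # carries results between steps
    for _ in range(max_steps):                   # maximum-step guard
        if "schedule" not in state:              # step 1: look up the schedule
            result = check_course_schedule(course)
            if not result["ok"]:
                # Surface the error instead of treating it as a fact.
                return f"Could not complete the task: {result['error']}"
            state["schedule"] = result["data"]
        else:                                    # step 2: calculate from stored state
            plan = state["schedule"]
            hours = plan["weeks"] * plan["hours_per_week"]
            return f"{course}: {hours} hours in total"
    return "Stopped: step limit reached"


print(total_hours("RAG"))
print(total_hours("Unknown"))
```

The `retryable` flag is not used here, but it marks where a retry policy would plug in: retry only when the tool itself says a second attempt is safe.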