9.2.2 LLM Reasoning Capability

LLM reasoning and intermediate state checkpoints

Section Overview

When people first encounter Agents, they tend to think of a large model as something that is:

able to chat
able to write
able to call tools

But what really gives an Agent a “brain” is not whether it can talk. It is the answer to this question:

When faced with a complex problem, can it break the problem apart, keep track of intermediate states, and gradually reach a conclusion?

That is the problem reasoning capability is meant to address.

Learning Objectives

Understand that “knowledge retrieval” and “reasoning-based problem solving” are not the same thing
Understand the three common task patterns of LLM reasoning
Learn through a runnable example why “intermediate state” matters
Understand why Agents cannot rely on tools alone and still need a reasoning layer

How This Section Connects to the Previous Agent Basics

If you just finished learning “What is an Agent?”, you can think of this section as:

You already know that an Agent needs to take actions around a goal
This section starts answering: what does it actually rely on to break down complex problems and maintain intermediate states?

So what matters most in this section is not “reasoning sounds advanced,” but:

What exactly does the reasoning layer do in an Agent system?

What Is “Reasoning,” and How Is It Different from “Remembering Answers”?

Remembering answers: like looking up an internal dictionary

If I ask:

What is the capital of France?

Here the model is mostly drawing on a knowledge pattern it has already learned. This kind of question is closer to:

memory
retrieval
pattern matching

Reasoning: the answer is not directly written in the question

If I ask:

What is 3 * (4 + 2) - 5?

The model cannot rely only on having memorized this exact expression. It needs to:

Calculate the expression inside the parentheses first
Then do multiplication
Finally do subtraction

In other words:

The key to a reasoning problem is not whether the model has seen the exact question before, but whether it can maintain a correct chain of intermediate states.
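The contrast between the two kinds of question can be sketched in a few lines of Python. This is only an illustration, not a model of how an LLM works internally: the `FACTS` table and both function names are invented for this sketch.

```python
# Hypothetical illustration of "remembering" versus "reasoning".
# FACTS, recall, and reason_step_by_step are names invented for this sketch.

FACTS = {"capital of France": "Paris"}


def recall(question):
    """Remembering: the answer is stored as-is and only needs to be looked up."""
    return FACTS[question]


def reason_step_by_step():
    """Reasoning: each step consumes the state produced by the step before it."""
    state = {}
    state["parentheses"] = 4 + 2                 # step 1: inside the parentheses
    state["product"] = 3 * state["parentheses"]  # step 2: multiplication
    state["answer"] = state["product"] - 5       # step 3: subtraction
    return state


print(recall("capital of France"))
print(reason_step_by_step())
```

The lookup either hits or misses. The step-by-step version only reaches 13 if every entry written into `state` along the way is correct.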

When learning reasoning for the first time, what should you focus on most?

What you should focus on first is not terminology, but this sentence:

The hardest part of reasoning is not the final answer, but whether each intermediate step stays correct all the way through.

Once this idea is solid, later when you see:

Chain-of-Thought
ReAct
Plan-and-Execute

you can naturally understand them as ways of helping the model preserve the chain of intermediate states.

An analogy: reasoning is like cooking, not just naming a dish

“Knowing what Kung Pao Chicken is” is more like knowledge. “Marinate the chicken, then stir-fry the aromatics, then add the sauce, then reduce it” is more like reasoning.

It requires the system to:

know the order
remember intermediate results
avoid skipping steps

What Three Types of Problems Does LLM Reasoning Commonly Face?

Arithmetic and symbolic reasoning

For example:

multi-step arithmetic
date and time calculation
unit conversion

The characteristics of these problems are:

the conclusion depends on the steps
if one step is wrong, everything that follows is wrong
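A date calculation makes this propagation easy to see. The sketch below assumes a made-up shipping scenario; the function name `estimate_arrival` and the specific dates and day counts are hypothetical and chosen only for illustration.

```python
from datetime import date, timedelta


def estimate_arrival(order_day, handling_days, transit_days):
    """Compute an arrival date in two dependent steps and record each one."""
    steps = []

    # Step 1: the shipping date depends on the order date.
    ship_day = order_day + timedelta(days=handling_days)
    steps.append(f"ship_day = {order_day} + {handling_days} days = {ship_day}")

    # Step 2: the arrival date depends on the result of step 1.
    arrival_day = ship_day + timedelta(days=transit_days)
    steps.append(f"arrival_day = {ship_day} + {transit_days} days = {arrival_day}")

    return arrival_day, steps


order_day = date(2024, 2, 20)

# Correct chain: 3 handling days, then 10 days in transit.
arrival, steps = estimate_arrival(order_day, 3, 10)

# Same chain with a mistake in step 1 (2 handling days instead of 3).
wrong_arrival, wrong_steps = estimate_arrival(order_day, 2, 10)

for step in steps:
    print("-", step)
print("arrival:", arrival)
print("arrival with a wrong first step:", wrong_arrival)
```

Step 2 is computed correctly in both runs, yet the second run still ends on the wrong date because it inherits the error from step 1.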

Constraint satisfaction and comparative decision-making

For example:

choosing a plan under budget, time, and inventory constraints
scheduling
route planning

These problems are not necessarily just arithmetic. They emphasize:

multiple conditions must hold at the same time
intermediate judgments must not contradict each other
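As a minimal sketch of this pattern, the code below filters a few candidate plans against three constraints at once and keeps every intermediate judgment visible. The plans, the limits, and the function names `check_plan` and `choose_plan` are all hypothetical.

```python
# Hypothetical plans, invented for this sketch.
PLANS = [
    {"name": "A", "cost": 120, "days": 3, "in_stock": True},
    {"name": "B", "cost": 80, "days": 7, "in_stock": True},
    {"name": "C", "cost": 60, "days": 2, "in_stock": False},
    {"name": "D", "cost": 140, "days": 4, "in_stock": True},
]


def check_plan(plan, budget, max_days):
    """Return every intermediate judgment so that none is silently dropped."""
    return {
        "within_budget": plan["cost"] <= budget,
        "on_time": plan["days"] <= max_days,
        "in_stock": plan["in_stock"],
    }


def choose_plan(plans, budget, max_days):
    """Keep only plans that satisfy all constraints, then pick the cheapest."""
    feasible = []
    for plan in plans:
        checks = check_plan(plan, budget, max_days)
        print(plan["name"], checks)
        if all(checks.values()):
            feasible.append(plan)
    if not feasible:
        return None
    return min(feasible, key=lambda p: p["cost"])


best = choose_plan(PLANS, budget=150, max_days=5)
print("chosen:", best["name"] if best else None)
```

Plan C is the cheapest and fastest, but it fails the inventory check, so a decision that looked only at cost and time would contradict one of the stated conditions.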

State integration before and after tool use

This is the most common type in Agent scenarios.

For example:

Check the weather first
Then check the flight
Finally decide whether to reschedule

The tool gives you external information, but turning that information into a conclusion still requires reasoning.
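The weather, flight, and reschedule sequence can be sketched as follows. Everything here is an assumption made for illustration: `get_weather` and `get_flight_status` are stand-ins that return canned data rather than real APIs, and the decision rule (bad weather plus a long delay) is just one possible policy.

```python
# Hypothetical stand-ins for real tools; they return canned data.
def get_weather(city):
    return {"city": city, "condition": "storm"}


def get_flight_status(flight_no):
    return {"flight": flight_no, "delay_minutes": 190}


def decide_reschedule(city, flight_no, max_delay_minutes=120):
    """Collect tool observations into one state, then derive a conclusion."""
    state = {}

    # Observations: the tools only supply raw facts.
    state["weather"] = get_weather(city)
    state["flight"] = get_flight_status(flight_no)

    # Reasoning: turn the observations into intermediate judgments.
    state["bad_weather"] = state["weather"]["condition"] in {"storm", "snow"}
    state["long_delay"] = state["flight"]["delay_minutes"] > max_delay_minutes

    # Conclusion: assumed policy, invented for this sketch.
    state["reschedule"] = state["bad_weather"] and state["long_delay"]
    return state


result = decide_reschedule("Paris", "XY123")
print(result)
```

Neither tool knows anything about rescheduling. The conclusion exists only because the two observations were kept in a shared state and combined.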

Why are these three types especially suitable as an entry point for learning Agents?

Because they are not “look it up and you’re done” problems. Instead, they require:

breaking the task into steps
saving intermediate results
integrating external observations back in

That is exactly the core work of the reasoning layer in an Agent.

Let’s First Run a Real Example That Clearly Shows “Intermediate State”

The code below parses an expression into an abstract syntax tree, then evaluates it recursively while recording each step of the calculation.

It is not simulating the LLM itself, but it helps you build a very important intuition:

The core of a multi-step problem is not the final answer, but how intermediate states are passed along correctly.

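import ast
import operator

# Map each supported AST operator node to its printable symbol and function.
OPS = {
    ast.Add: ("+", operator.add),
    ast.Sub: ("-", operator.sub),
    ast.Mult: ("*", operator.mul),
    ast.Div: ("/", operator.truediv),
}


def solve(node):
    """Evaluate an expression node and return (value, list of recorded steps)."""
    # A literal number is already a value, so it produces no step.
    if isinstance(node, ast.Constant):
        return node.value, []

    if isinstance(node, ast.BinOp):
        # Solve both operands first; each returns its value and its own steps.
        left_value, left_steps = solve(node.left)
        right_value, right_steps = solve(node.right)

        symbol, fn = OPS[type(node.op)]
        result = fn(left_value, right_value)
        step = f"{left_value} {symbol} {right_value} = {result}"

        # Operand steps come first, so the log reads in evaluation order.
        return result, left_steps + right_steps + [step]

    raise TypeError(f"Unsupported node: {type(node)}")


expression = "3 * (4 + 2) - 5"
# mode="eval" parses a single expression; .body is the root node of that expression.
tree = ast.parse(expression, mode="eval").body
answer, steps = solve(tree)

print("expression:", expression)
print("steps:")
for step in steps:
    print("-", step)
print("answer:", answer)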

Expected output:

expression: 3 * (4 + 2) - 5
steps:
- 4 + 2 = 6
- 3 * 6 = 18
- 18 - 5 = 13
answer: 13

What is most worth learning from this code is not `ast`

What is truly worth taking away is:

each step produces a clear intermediate result
the next step depends on the previous one
the final answer is only the last layer of state

This is a useful mental model for what complex reasoning demands of an LLM.

Why is intermediate state more important than the “final answer”?

Because if an intermediate step is wrong, the result is hard to reproduce reliably, even if the final answer happens to be right.

What a reasoning system should really pursue is:

a dependable process
errors that can be traced

not accidentally hitting the right answer once.
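One way to make “errors that can be traced” concrete is to re-check a recorded trace step by step. The sketch below is an illustration under assumptions: the tuple format `(left, symbol, right, claimed_result)` and the function name `first_bad_step` are invented here.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def first_bad_step(steps):
    """Return the index of the first step whose claimed result is wrong, or None."""
    for index, (left, symbol, right, claimed) in enumerate(steps):
        if OPS[symbol](left, right) != claimed:
            return index
    return None


# A correct trace for 3 * (4 + 2) - 5.
good_trace = [(4, "+", 2, 6), (3, "*", 6, 18), (18, "-", 5, 13)]

# The same trace with a slip in the multiplication step.
bad_trace = [(4, "+", 2, 6), (3, "*", 6, 16), (16, "-", 5, 11)]

print("good trace:", first_bad_step(good_trace))
print("bad trace:", first_bad_step(bad_trace))
```

In the bad trace, the last step (16 - 5 = 11) is locally correct. Only the recorded intermediate states make it possible to locate the real fault at index 1 instead of blaming the final subtraction.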

Why do Agents especially depend on this ability?

Because the problems Agents handle are usually not completed in one step. They may need to:

read the requirement first
then query data
then compare constraints
and finally choose an action

This is essentially maintaining a longer chain of intermediate states.

Why is this code more educational than just “getting the final answer right”?

Because it clearly shows you:

how each step comes about
what the next step depends on
exactly where an error starts to propagate

And that is precisely what an Agent system needs most:

a readable process
checkable intermediate states
traceable errors

Why Is LLM Reasoning Sometimes Strong, but Sometimes Suddenly Unstable?

It is good at patterned step-by-step structure

If the task can be organized into fairly clear steps, LLMs often perform well, for example:

decomposing a problem
explaining reasons
generating candidate solutions

It can drift on long chains

Common issues include:

missing steps
repeating steps
making mistakes in intermediate numbers
constraints stated early being contradicted later

In other words, LLM reasoning is not a “stable logic engine,” but more like:

a language-based reasoner that is good at drafting steps

That is why many complex tasks need external tools

For example:

use a calculator for precise numeric calculations
use a database to check the real state
use a rules engine to detect constraint conflicts

In an Agent, the reasoning layer often does not work alone, but cooperates with tools.
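A minimal sketch of this cooperation, assuming three toy tools: a calculator built on `decimal.Decimal` for exact money arithmetic, a dictionary standing in for a database, and a small rule check. The names `INVENTORY`, `calculator`, `lookup_stock`, `check_rules`, and `plan_purchase` are hypothetical.

```python
from decimal import Decimal

# Hypothetical stand-in for a database table.
INVENTORY = {"widget": 40}


def calculator(unit_price, quantity):
    """Exact arithmetic: Decimal avoids binary floating-point rounding."""
    return Decimal(unit_price) * quantity


def lookup_stock(item):
    """Real state comes from the data source, not from a guess."""
    return INVENTORY.get(item, 0)


def check_rules(total, quantity, stock, budget):
    """Rule check: list every constraint the draft plan violates."""
    conflicts = []
    if quantity > stock:
        conflicts.append("quantity exceeds stock")
    if total > Decimal(budget):
        conflicts.append("total exceeds budget")
    return conflicts


def plan_purchase(item, unit_price, quantity, budget):
    """The reasoning layer's role here: order the tool calls and merge results."""
    stock = lookup_stock(item)
    total = calculator(unit_price, quantity)
    conflicts = check_rules(total, quantity, stock, budget)
    return {
        "stock": stock,
        "total": total,
        "conflicts": conflicts,
        "approved": not conflicts,
    }


result = plan_purchase("widget", "19.99", 30, "500.00")
print(result)
```

Each tool does one precise job. Deciding which tool to call, in what order, and what the combined result means is still left to the code in `plan_purchase`, which plays the part of the reasoning layer in this sketch.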

Why are “can reason” and “can chat” completely different things?

Because chatting is more like:

fluent language
natural style

while reasoning is more like:

steps must be correct
constraints must not conflict
intermediate states must remain stable

This is why a model that seems “very good at talking” can still fail in complex multi-step tasks.

When Should You Turn on a “Stronger Reasoning Mode”?

When the answer requires multiple steps of derivation

If the problem clearly requires:

step-by-step calculation
sequential judgment
condition filtering

then it is worth using a more explicit reasoning strategy.

When the cost of errors is high

For example:

financial calculations
configuration changes
assisted medical advice

In these problems, “looks reasonable” is not enough. What is needed is:

a process that can be verified

When the problem depends on external observations

For example:

check inventory first, then make a purchasing suggestion
check flights first, then decide whether to reschedule

At this point, reasoning must be combined with tool-based observations.
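The three signals in this part can be turned into a simple checklist. The sketch below is a hypothetical heuristic, not a feature of any particular model or framework: the `Task` fields and the function `reasons_for_explicit_reasoning` are invented for illustration, and a real system would need its own way of setting these flags.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical description of a task, with one flag per signal."""
    needs_multiple_steps: bool
    high_error_cost: bool
    needs_external_observation: bool


def reasons_for_explicit_reasoning(task):
    """Return the signals that argue for a more explicit reasoning strategy."""
    reasons = []
    if task.needs_multiple_steps:
        reasons.append("multi-step derivation")
    if task.high_error_cost:
        reasons.append("high cost of errors")
    if task.needs_external_observation:
        reasons.append("depends on external observations")
    return reasons


small_talk = Task(False, False, False)
rebooking = Task(True, False, True)

print(reasons_for_explicit_reasoning(small_talk))
print(reasons_for_explicit_reasoning(rebooking))
```

An empty list suggests that a direct answer is probably enough. Any non-empty list names the reason the task deserves a verifiable, step-by-step process.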

The Most Common Misconceptions

Misconception 1: If the model is big enough, it will naturally reason well

Larger models usually bring a higher ceiling, but that does not mean all complex reasoning will be stable.

Misconception 2: Reasoning just means writing out more steps

No. Truly effective reasoning means:

the steps depend on one another
intermediate states can be reused
the final answer is supported by the process

Misconception 3: If you have tools, you do not need reasoning

Tools only provide external capabilities. Deciding:

when to call them
which one to call
how to integrate the results

still depends on reasoning.

Summary

The goal of this lesson is not to make “reasoning” sound mysterious, but to establish one clear idea:

The essence of LLM reasoning capability is to maintain correct intermediate states in multi-step problems, and to integrate external information, constraints, and local results into a final conclusion.

Once you establish this understanding, when you later learn:

CoT
ReAct
Plan-and-Execute

you will see that they are all helping the model complete this task more stably.

What You Should Take Away from This Lesson

Reasoning is not “writing a longer answer,” but maintaining a correct chain of intermediate states
Agents need reasoning because many tasks are not completed in a single step
Tools provide capabilities, while reasoning organizes those capabilities and results

Exercises

Replace the expression in the example with 12 / (3 + 1) + 7 and see whether the step output matches your expectations.
Explain in your own words: why is the key to a reasoning problem in the “intermediate state” and not only the “final answer”?
Think of an Agent task you have done, and identify at least two places where one step clearly depends on the result of the previous step.
Why does “having tools” not mean “being able to reason”?
