7.6.2 Finetuning Overview

Learning Objectives
Section titled “Learning Objectives”- Understand which kinds of problems finetuning is truly good at solving
- Understand why not every task should start with finetuning
- Distinguish the basic ideas behind full finetuning and parameter-efficient fine-tuning (PEFT)
- Build a more practical intuition for finetuning decisions
First, Build a Map
Section titled “First, Build a Map”Start with a more realistic scenario
Section titled “Start with a more realistic scenario”Suppose you are building a course Q&A assistant. After launch, you find three kinds of problems:
- Some answers are wrong because it does not know the latest course rules
- Some answers are correct, but the format is always unstable
- Some responses always drift away from your customer service tone and do not fit the brand style over time
These three kinds of problems all look like “the model is not performing well,” but the solutions are not the same. The first is more like a knowledge problem, and you would usually consider RAG first; the second is more like an output constraint problem, and you would usually start with Prompt or structured output; the third is more like a long-term behavior shaping problem, where finetuning starts to become valuable.
So before learning finetuning, do not rush to train. First learn how to judge: what kind of problem is this exactly?
If you have already learned pretraining and Prompt, then the most natural continuation here is:
- Earlier, you learned where model capability comes from, and how to use the model more stably without changing parameters
- Now, this section answers: when is Prompt not enough, and when do you really need to update parameters?
So what really matters in the finetuning overview is not “whether you can train,” but:
- When should you change parameters?
- Is changing parameters really worth it?
For beginners, the best way to understand this section is not “start training first,” but to first see the decision tree clearly:
flowchart TD A["Model performance is not good"] --> B{"Is it a knowledge problem?"} B -->|Yes| C["Consider RAG / retrieval first"] B -->|No| D{"Is it a format problem?"} D -->|Yes| E["Consider Prompt / structured output first"] D -->|No| F{"Is it a stable behavior problem?"} F -->|Yes| G["Then consider finetuning"]What this section really wants to solve is:
- When should you finetune?
- What kinds of problems does finetuning solve, and what does it not solve?
What Problem Does Finetuning Actually Solve?
Section titled “What Problem Does Finetuning Actually Solve?”You can roughly think of it as:
Making a foundation model perform more stably on a more specific task, style, or domain.
For example:
- Better at a fixed output format
- Better adapted to a certain business response style
- Better at the task patterns of a vertical domain
This means finetuning is more like:
- Shaping capabilities
rather than just:
- Adding knowledge
When you first learn finetuning, what should you focus on first?
Section titled “When you first learn finetuning, what should you focus on first?”What you should focus on first is not method names like LoRA or full finetuning, but this sentence:
Finetuning is more like shaping model behavior, not simply “stuffing knowledge” into the model.
Once this idea is stable, many later judgments become easier:
- Why knowledge updates are often better handled by RAG
- Why format stability problems should sometimes be handled by Prompt first
- Why finetuning is worth considering when behavior is unstable over the long term
Why Should Not Every Problem Start with Finetuning?
Section titled “Why Should Not Every Problem Start with Finetuning?”Many problems are better handled first by considering:
- Prompt
- RAG
- Tool calling
If the problem is “the knowledge is not up to date”
Section titled “If the problem is “the knowledge is not up to date””The more natural first choice is often:
- Retrieval
If the problem is “the output format is unstable”
Section titled “If the problem is “the output format is unstable””The more natural first choice is often:
- Prompt optimization
- Structured output
When is finetuning more worth prioritizing?
Section titled “When is finetuning more worth prioritizing?”When you find the problem is more like:
- Model behavior is unstable over the long term
- Style requirements are fixed
- A certain task appears repeatedly and the pattern is stable
At that point, finetuning becomes more valuable.
Remember this one sentence first:
First determine whether this is a knowledge problem, a format problem, or a behavior problem.
Judgment table for the three types of problems
Section titled “Judgment table for the three types of problems”| Symptom | Problem type | Prioritize |
|---|---|---|
| The model does not know the company’s latest refund policy | Knowledge problem | RAG / retrieval / knowledge base updates |
| The answer content is correct, but the JSON format is often wrong | Format problem | Prompt / structured output / validation retries |
| The model does not consistently follow fixed phrasing and task style | Behavior problem | Finetuning / PEFT |
| The user’s question needs a tool lookup before answering | Action problem | Tool calling / Agent / workflow |
This table is very important because it helps you avoid a common mistake: thinking finetuning is the answer whenever performance is poor. In real projects, many problems are not solved by changing parameters.

The Difference Between Full Finetuning and Parameter-Efficient Finetuning
Section titled “The Difference Between Full Finetuning and Parameter-Efficient Finetuning”Full Finetuning
Section titled “Full Finetuning”Intuitively, this means:
- Most of the model’s parameters are allowed to be updated
Advantages:
- Flexible
Disadvantages:
- High memory usage
- High cost
- Harder to train
Parameter-Efficient Finetuning (PEFT)
Section titled “Parameter-Efficient Finetuning (PEFT)”Intuitively, this means:
- You do not heavily modify the whole model
- You only train a small number of additional parameters
Advantages:
- More resource-efficient
- Easier to reuse
That is why PEFT is becoming more and more common in real projects.
When you first look at PEFT, what is most worth remembering?
Section titled “When you first look at PEFT, what is most worth remembering?”What is most worth remembering is not the details of specific algorithms, but this:
- It solves the real-world problem of “resources and maintenance cost”
In other words, PEFT is not just trendy. It is:
- A more practical adaptation path when you do not want to heavily modify the whole model
A Cost Map for Adaptation Approaches
Section titled “A Cost Map for Adaptation Approaches”flowchart LR A["Foundation model"] --> B{"How do you adapt to the task?"} B --> C["Prompt / RAG<br/>No parameter updates"] B --> D["PEFT<br/>Train only a small number of extra parameters"] B --> E["Full finetuning<br/>Update many parameters"]
C --> C1["Low cost<br/>Good for starting with a baseline"] D --> D1["Moderate cost<br/>Suitable for stable task adaptation"] E --> E1["High cost<br/>Suitable for strong customization but harder to maintain"]This diagram can serve as a reminder when choosing a solution for the first time: the further to the right you go, the deeper the changes, the higher the cost, and the more you need stable data and clear benefits.
A Minimal Parameter-Scale Example
Section titled “A Minimal Parameter-Scale Example”params = { "full_finetune": 100_000_000, "peft": 5_000_000}
for name, count in params.items(): print(name, "trainable_params =", count)Expected output:
full_finetune trainable_params = 100000000peft trainable_params = 5000000What is this code reminding us of?
Section titled “What is this code reminding us of?”It is not telling you a precise number. It is reminding you of this:
The first real-world question in finetuning methods is often: “How many parameters do we actually need to change?”
This directly determines:
- Memory usage
- Training speed
- Storage cost
When Is Finetuning Really Valuable?
Section titled “When Is Finetuning Really Valuable?”When you want the model to form stable behavior
Section titled “When you want the model to form stable behavior”For example:
- A specific response style
- A specific task format
- Specific domain habits
When you have stable, sustainable data
Section titled “When you have stable, sustainable data”If your task data:
- Is large enough
- Has good quality
- Follows relatively stable patterns
then finetuning is usually more meaningful.
When is it not worth it?
Section titled “When is it not worth it?”If the requirements change frequently, or the knowledge updates often, then in many cases finetuning is not the first choice.
The Easiest Place to Overestimate Finetuning
Section titled “The Easiest Place to Overestimate Finetuning”Misconception 1: Thinking finetuning can solve everything
Section titled “Misconception 1: Thinking finetuning can solve everything”It cannot. Many problems are better solved with:
- Retrieval
- Workflows
- Prompt
Misconception 2: Thinking finetuning will make the model “memorize the knowledge base”
Section titled “Misconception 2: Thinking finetuning will make the model “memorize the knowledge base””Finetuning is better for shaping behavior, and is not always suitable for carrying rapidly changing knowledge.
Misconception 3: Thinking that training it means it will definitely get better
Section titled “Misconception 3: Thinking that training it means it will definitely get better”If the data is poor, finetuning may actually make the model worse.
A Very Practical Question
Section titled “A Very Practical Question”Before deciding whether to finetune, ask:
- Is this a knowledge problem or a behavior problem?
- Will this task shape remain stable for a long time?
- Do I have clean and stable data?
- Do I really have the resources to handle training and maintenance?
If these questions are answered clearly, the finetuning decision will usually be much more solid.
The safest order when doing a project for the first time
Section titled “The safest order when doing a project for the first time”If you want to truly ship a task, it is recommended to go in this order:
- First use Prompt to build a baseline
- Then use retrieval or a workflow to build a second-layer baseline
- Only when behavior is still unstable for the long term should you consider finetuning
In this way, it will be easier later to explain:
- What finetuning actually solved
- Whether it was worth it
Evidence to Keep
Section titled “Evidence to Keep”Keep this page’s proof of learning as a small evidence card:
- Problem Type
- behavior adaptation, format, tone, or domain routine
- Not For
- missing facts that RAG should supply
- Cost Map
- full fine-tune vs PEFT vs prompting
- Eval Baseline
- pre-finetune behavior recorded
- Go No Go
- enough quality data and stable evaluation
Summary
Section titled “Summary”The most important thing in this section is not to treat finetuning as the default action, but to understand:
Finetuning is better suited for solving “model behavior and task adaptation” problems, not every problem.
Once this judgment is established, when you later learn LoRA, QLoRA, and engineering practice, you will not rush in blindly.
What You Should Take Away from This Section
Section titled “What You Should Take Away from This Section”- Finetuning is not the default action, but a more expensive adaptation method
- First distinguish knowledge problems, format problems, and behavior problems
- Only when the task is long-term stable, the data is reliable, and the benefits are clear does finetuning become a more worthy priority
Exercises
Section titled “Exercises”- Think of a real project of yours and judge whether its problem is more like a knowledge problem or a behavior problem.
- Explain in your own words: why should not all tasks prioritize finetuning?
- If requirements change frequently, why is finetuning not necessarily the first choice?
- Why do people say that “data quality” often affects finetuning results more than the “method name”?
Project reference and review notes
- A knowledge problem usually means the model lacks fresh or private facts, so RAG or tools may fit better. A behavior problem means the model knows enough but does not follow the desired format, tone, workflow, or decision pattern reliably.
- Finetuning costs data, training time, evaluation effort, deployment work, and maintenance. If prompting, retrieval, or workflow control solves the problem, finetuning may add risk without enough benefit.
- Frequent requirement changes make finetuned behavior go stale quickly. Updating prompts, retrieval rules, or workflow logic is usually faster and easier to audit than retraining.
- Finetuning mostly amplifies patterns in the training data. Clear, consistent, representative examples matter more than whether the method is called full finetuning, LoRA, or another PEFT variant.