8.1.7 Advanced RAG Architectures

Learning Objectives

After completing this section, you will be able to:

Understand why basic RAG is not enough in complex scenarios
Recognize common architectures such as routing-based, multi-hop, and Agentic RAG
Run a toy example of “multi-knowledge-base routing”
Know when to upgrade a RAG architecture, and when not to

Why Does Basic RAG Eventually Hit Its Limits?

Basic RAG Is Suitable for “One Question -> One Retrieval -> One Answer”

This is already enough for many FAQs and simple Q&A tasks. But when the problem gets more complex, bottlenecks start to appear.

For example, a question may require:

Searching across multiple knowledge bases
Checking policies first, then product docs
Breaking the task into multiple sub-questions

Common Complex Scenarios

For example, a user asks:

“Can this learner get a refund? If not, is there an extension option?”

This actually implies multiple actions:

Check the refund policy
Determine whether the current conditions are met
Then check the extension option

At this point, “retrieving only once” is often not enough.

[Figure: Advanced RAG architecture decision map]

The main lesson in this map is simple: use the lightest architecture that actually solves the failure, not the most complicated one you can name.

Routing-Based RAG: Decide Where to Search First

When One Knowledge Base Is Not Enough, Route First

Many systems do not have just one document store. They may have:

Policy knowledge base
Product knowledge base
Technical documentation knowledge base
FAQ knowledge base

If all queries go into the same store, the noise can be very high. A better approach is:

First determine which knowledge base the question belongs to, then retrieve from there.

A Runnable Multi-Store Routing Example

policy_docs = [
    "Refund policy: You can apply for a refund within 7 days after purchasing the course.",
    "Certificate policy: You can get a certificate after passing the test."
]

tech_docs = [
    "If login fails, first check your account password and network connection.",
    "A 401 error from API calls usually indicates authentication failure."
]

def route_query(query):
    query_lower = query.lower()
    if "refund" in query_lower or "certificate" in query_lower:
        return "policy"
    if "login" in query_lower or "api" in query_lower or "401" in query_lower:
        return "tech"
    return "default"

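def retrieve_simple(query, docs):
    # Map terms in the query to the keywords we expect to see in documents.
    query_lower = query.lower()
    keywords = []

    if "refund" in query_lower:
        keywords.append("refund")
    if "certificate" in query_lower:
        keywords.append("certificate")
    # Keep "login" separate from "401"/"api" so that a 401 question
    # does not also pull in the unrelated login document.
    if "login" in query_lower:
        keywords.append("login")
    if "401" in query_lower or "api" in query_lower:
        keywords.extend(["401", "api"])
    # Fall back to the raw query words when no known term is present.
    if not keywords:
        keywords = query_lower.split()

    # Keep every document that contains at least one keyword.
    return [doc for doc in docs if any(keyword in doc.lower() for keyword in keywords)]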

queries = ["how to get a refund", "how to handle a 401 error"]

for q in queries:
    route = route_query(q)
    if route == "policy":
        hits = retrieve_simple(q, policy_docs)
    elif route == "tech":
        hits = retrieve_simple(q, tech_docs)
    else:
        hits = []
    print(q, "-> routed to", route, "->", hits)

Expected output:

how to get a refund -> routed to policy -> ['Refund policy: You can apply for a refund within 7 days after purchasing the course.']
how to handle a 401 error -> routed to tech -> ['A 401 error from API calls usually indicates authentication failure.']

This is the simplest version of “Router RAG.” It is not “smarter retrieval” by itself. Its value is that it reduces the search space before retrieval, so the retriever has less irrelevant material to fight through.

Multi-hop RAG: Break the Problem into Multiple Steps

Some Questions Cannot Be Answered in One Step

For example:

“What conditions has this person completed, and what is still missing for them to get certified?”

This kind of question usually requires:

Check the certification rules
Check the user’s completion status
Compare the two

Multi-hop RAG Is More Like Solving a Problem Step by Step

Instead of finding all the materials at once, it works like this:

Solve the first sub-question first
Then continue retrieving based on the intermediate result

Because each retrieval depends on the result of the previous one, this already starts to resemble how an Agent works.
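The certification question above can be sketched as a two-hop lookup. This is only a toy under stated assumptions: the data (`cert_rules`, `user_progress`) and the helper `multi_hop_answer` are hypothetical, and plain dictionaries stand in for real retrieval. Here the user's record is fetched first, because it tells us which course's rules to fetch next.

```python
# Toy multi-hop retrieval: the result of hop 1 decides what hop 2 looks up.
# All data and names are hypothetical and exist only for illustration.

cert_rules = {
    "python-basics": ["finish all lessons", "pass the final test", "submit the project"],
}

user_progress = {
    "alice": {
        "course": "python-basics",
        "done": ["finish all lessons", "pass the final test"],
    },
}

def multi_hop_answer(user):
    # Hop 1: fetch the user's record to learn which course they are taking.
    record = user_progress[user]
    # Hop 2: use that intermediate result to fetch the matching rules.
    required = cert_rules[record["course"]]
    # Compare the two pieces of evidence.
    completed = [r for r in required if r in record["done"]]
    missing = [r for r in required if r not in record["done"]]
    return {"completed": completed, "missing": missing}

result = multi_hop_answer("alice")
print(result)
```

The key point is that the second lookup cannot be written until the first one has returned, which is what separates multi-hop retrieval from simply retrieving more documents at once.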

Agentic RAG: Retrieval Is No Longer a Fixed Pipeline

What Is the Difference from Normal RAG?

Normal RAG is more like a fixed flow:

Retrieve
Assemble context
Answer

Agentic RAG, on the other hand, may:

Decide whether retrieval is needed
Decide how many times to retrieve
Decide whether to rewrite the query or switch data sources
Then decide whether to continue acting
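The decisions listed above can be sketched as a small loop. This is a minimal sketch, not a real agent: simple rules stand in for the choices an LLM would make, queries are treated as bare keyword phrases, and every name (`SOURCES`, `REWRITES`, `SMALL_TALK`, `agentic_retrieve`) is hypothetical.

```python
# Toy agentic retrieval loop. Hard-coded rules stand in for LLM decisions.
# All names and data are hypothetical.

SOURCES = {
    "tech": ["A 401 error from API calls usually indicates authentication failure."],
    "policy": ["Refund policy: You can apply for a refund within 7 days after purchasing the course."],
}
REWRITES = {"money back": "refund"}   # hypothetical query-rewrite table
SMALL_TALK = {"hi", "hello", "thanks"}

def search(source, keyword):
    # Return the documents in one source that contain the keyword.
    return [d for d in SOURCES[source] if keyword in d.lower()]

def agentic_retrieve(query, max_steps=4):
    trace = []                        # inspectable log of every decision
    keyword = query.lower()
    if keyword in SMALL_TALK:         # decide: is retrieval needed at all?
        trace.append("decide: no retrieval needed")
        return [], trace
    source_order = ["tech", "policy"]
    source_idx = 0
    rewritten = False
    for _ in range(max_steps):        # decide: how many times to retrieve
        source = source_order[source_idx]
        hits = search(source, keyword)
        trace.append(f"search {source!r} for {keyword!r}: {len(hits)} hit(s)")
        if hits:
            return hits, trace        # decide: enough evidence, stop acting
        if not rewritten and keyword in REWRITES:
            keyword = REWRITES[keyword]          # decide: rewrite the query
            rewritten = True
            trace.append(f"rewrite query to {keyword!r}")
        elif source_idx + 1 < len(source_order):
            source_idx += 1                      # decide: switch data source
            trace.append(f"switch source to {source_order[source_idx]!r}")
        else:
            break
    trace.append("give up")
    return [], trace

hits, trace = agentic_retrieve("money back")
for step in trace:
    print(step)
print(hits)
```

For "money back", the loop searches the tech store, rewrites the query to "refund", searches again, switches to the policy store, and only then finds the refund policy. The `trace` list is worth keeping in any real version, since the variable path through the system is exactly what makes agentic pipelines hard to debug.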

Advantages and Trade-offs

Advantages:

More flexible
Can handle complex tasks

Trade-offs:

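Harder to debug, because the path through the system can differ for every query
Slower, because each extra decision or retrieval adds latency
Higher cost, because each extra step usually means more model calls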

So not every RAG system should be made agentic.

Structured Retrieval: Not All Knowledge Should Go into a Pure Text Store

When the Data Itself Has Structure

For example:

Order table
User status
Ticketing system
Grade table

These kinds of data are often better handled by:

SQL queries
API queries
Graph databases

rather than forcing them into plain text and then retrieving from that.
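As a rough illustration, an order table can be queried directly with SQL instead of being flattened into text chunks. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `orders` schema, the sample rows, and the helper `orders_for` are assumptions made up for this example.

```python
import sqlite3

# In-memory database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", "paid"), (2, "alice", "refunded"), (3, "bob", "paid")],
)

def orders_for(user, status):
    # A parameterized query returns an exact answer; there is no
    # similarity search and no risk of retrieving a near-miss chunk.
    rows = conn.execute(
        "SELECT order_id FROM orders WHERE user = ? AND status = ? ORDER BY order_id",
        (user, status),
    ).fetchall()
    return [row[0] for row in rows]

print(orders_for("alice", "paid"))
```

A question like "which of Alice's orders are paid?" has one exact answer, so an exact query fits it better than ranking text passages by similarity.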

A Common Upgrade Path

Real systems may combine:

Unstructured document RAG
Structured database queries
Tool calling

This is also why “advanced RAG” is often closely tied to Agents.

Graph RAG and Knowledge Graph Thinking

What Problem Does It Solve?

When pieces of knowledge are explicitly related to one another, plain text chunking can lose those relationships.

For example:

Relationships between people
Company organizational structure
Product dependency relationships

In these cases, a graph structure makes it easier to express the connections between nodes.

When Is It Worth Considering?

When your questions often require:

Jumping across multiple entities
Following relationship chains
Structured reasoning

you can consider graph-style retrieval.
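Following a relationship chain can be sketched with a plain adjacency dictionary and a breadth-first search. This is a simplified stand-in for graph-style retrieval, not a graph database: the service names, the `edges` structure, and the helper `find_path` are all hypothetical.

```python
from collections import deque

# Hypothetical product dependency graph:
# edges[node] = [(relation, neighbor), ...]
edges = {
    "checkout-service": [("depends_on", "payment-api"), ("depends_on", "cart-service")],
    "payment-api": [("depends_on", "auth-service")],
    "cart-service": [("depends_on", "auth-service")],
    "auth-service": [],
}

def find_path(start, goal):
    # Breadth-first search: returns the shortest chain of nodes
    # from start to goal, or None if the two are not connected.
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for _relation, neighbor in edges.get(path[-1], []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

print(find_path("checkout-service", "auth-service"))
```

A question such as "does checkout ultimately depend on auth?" spans several entities, and no single text chunk may state the answer. Walking the edges makes the chain explicit.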

[Figure: Advanced RAG architecture selection map]

When Should You Upgrade to Advanced RAG?

Signs That It Is Worth Upgrading

If you are already facing these problems:

Multiple knowledge bases interfere with each other
One retrieval is often not enough
Structured data needs to work together with retrieval
The question clearly needs step-by-step reasoning

then it may be time to upgrade the architecture.

Signs That It Is Not Worth Upgrading

If you have not even stabilized basic RAG yet:

The chunking strategy is poorly chosen
There is no evaluation set
You have not tuned top-k

then do not rush into advanced architectures.

Common Beginner Mistakes

Wanting to Use Agentic RAG as Soon as a Task Looks Complex

In many cases, getting routing and retrieval strategies right already solves most of the problem.

Thinking “More Components” Means “More Advanced”

More components do not necessarily mean a better system; they may just make maintenance harder.

Upgrading Architectures Without Evaluation

Without evaluation, you cannot tell whether the upgrade is a real improvement or just “something that looks more complicated.”
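Even a very small evaluation set makes the comparison concrete. The sketch below measures a hit rate (the share of queries for which the expected document is retrieved) for two toy keyword retrievers. The document ids, the `eval_set`, and the helpers `make_retriever` and `hit_rate` are assumptions for illustration; a real evaluation set would be larger and drawn from actual user questions.

```python
# Toy evaluation harness. All ids, queries, and helpers are hypothetical.

docs = {
    "refund_policy": "Refund policy: You can apply for a refund within 7 days after purchasing the course.",
    "login_help": "If login fails, first check your account password and network connection.",
    "api_401": "A 401 error from API calls usually indicates authentication failure.",
}

# Each case pairs a query with the id of the document that should be found.
eval_set = [
    ("how to get a refund", "refund_policy"),
    ("how to handle a 401 error", "api_401"),
    ("my login fails", "login_help"),
]

def make_retriever(keywords):
    # Build a retriever that returns the ids of documents containing
    # any known keyword that also appears in the query.
    def retrieve(query):
        q = query.lower()
        active = [k for k in keywords if k in q]
        return [
            doc_id for doc_id, text in docs.items()
            if any(k in text.lower() for k in active)
        ]
    return retrieve

def hit_rate(retrieve, cases):
    # Fraction of cases whose expected document appears in the results.
    hits = sum(1 for query, expected in cases if expected in retrieve(query))
    return hits / len(cases)

baseline = make_retriever(["refund", "login"])
candidate = make_retriever(["refund", "login", "401"])

print(f"baseline:  {hit_rate(baseline, eval_set):.2f}")
print(f"candidate: {hit_rate(candidate, eval_set):.2f}")
```

The same harness can score a routed pipeline against an unrouted one. Whatever the upgrade is, the comparison only means something when both versions run on the same cases.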

Evidence to Keep

As proof of learning for this page, keep a small evidence card:

Query: one user question or test case
Retrieved Chunks: chunk ids, scores, and source titles
Answer: final response with citation or source note
Failure Check: missing evidence, wrong chunk, stale doc, or unsupported claim
Next Action: chunking, embedding, reranking, prompt, or eval change
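One possible way to keep such a card is as a small data structure that can be saved as JSON. The class names, field names, and every sample value below (including the score) are placeholders invented for illustration, not results from a real run.

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class RetrievedChunk:
    chunk_id: str
    score: float
    source_title: str

@dataclass
class EvidenceCard:
    query: str
    retrieved_chunks: list   # list of RetrievedChunk
    answer: str
    failure_check: str
    next_action: str

# Placeholder values only; replace them with evidence from your own run.
card = EvidenceCard(
    query="how to get a refund",
    retrieved_chunks=[RetrievedChunk("policy-001", 0.82, "Refund policy")],
    answer="You can apply for a refund within 7 days after purchase (source: Refund policy).",
    failure_check="none found",
    next_action="add a test case for refunds requested after 7 days",
)

# asdict() converts nested dataclasses too, so the card serializes cleanly.
print(json.dumps(asdict(card), indent=2))
```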

Summary

The most important takeaway from this section is:

Advanced RAG is not about showing off. It is about giving the system a smarter way to organize retrieval when basic RAG cannot cover complex questions.

Polishing a simple architecture first, and then deciding whether to upgrade, is usually the more mature engineering path.

Exercises

Add a “course content knowledge base” to the routing example and extend the route_query() rules.
Think about your own project: is there any data that would actually be better suited to SQL / API queries rather than pure text retrieval?
Try to come up with a question that can only be answered with multi-hop retrieval.

Project Reference and Review Notes

A good route should send syllabus, lesson, and exercise questions to the course content knowledge base while keeping account, payment, or policy questions on their own routes. Route decisions should be easy to inspect.
Structured facts such as order status, enrollment records, inventory, permissions, grades, and live prices are often better served by SQL or APIs. Text retrieval is better for explanations, policies, manuals, and long-form knowledge.
A multi-hop question needs evidence from more than one place, such as “Which lesson teaches vector databases, and what project later uses that concept?” One retrieval finds the lesson, another finds the project connection.
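For the first exercise, one possible shape of an extended router is sketched below. It is a variant of route_query(), not the version from the routing example: it returns the matched keywords along with the route so that each decision is easy to inspect. The `ROUTES` table and its keyword lists are assumptions.

```python
# Hypothetical routing table: route name -> keywords that trigger it.
# Routes are checked in insertion order, so put the most specific first.
ROUTES = {
    "course": ["syllabus", "lesson", "exercise"],
    "policy": ["refund", "certificate"],
    "tech": ["login", "api", "401"],
}

def route_query(query):
    # Return (route, matched_keywords) so the decision can be inspected.
    query_lower = query.lower()
    for route, keywords in ROUTES.items():
        matched = [k for k in keywords if k in query_lower]
        if matched:
            return route, matched
    return "default", []

for q in [
    "which lesson covers vector databases",
    "how to get a refund",
    "what is the weather today",
]:
    print(q, "->", route_query(q))
```

Substring matching is deliberately crude: "api" also matches inside words such as "rapid". A real router would need word-level matching or a classifier, and its decisions should still be logged the same way.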