
Tutorial 12: Agentic RAG

Agentic RAG gives an LLM agent control over the retrieval process - deciding when, what, and how to retrieve.

Overview

Previous patterns use fixed retrieval flows. Agentic RAG:

  • Agent decides when to retrieve
  • Multiple retrieval rounds possible
  • Query decomposition for complex questions
  • Iterative refinement

Architecture

Retrieval as a Tool

python
from langchain_core.tools import tool

@tool
def search_documents(query: str) -> str:
    """Search the document database for information.

    Args:
        query: The search query.

    Returns:
        Retrieved document contents.
    """
    docs = retriever.retrieve_documents(query, k=3)
    return "\n\n".join([doc.page_content for doc in docs])

tools = [search_documents]
llm_with_tools = llm.bind_tools(tools)

Agent System Prompt

python
SYSTEM_PROMPT = """You are a research assistant with document search.

Strategy:
1. Break complex questions into sub-questions
2. Search for each aspect separately
3. Synthesize information from multiple searches
4. Provide comprehensive answers with sources

You can search multiple times if needed."""

State Definition

python
from typing import Annotated, List, TypedDict

from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages

class AgenticRAGState(TypedDict):
    # add_messages appends rather than overwrites, so history accumulates
    messages: Annotated[List[BaseMessage], add_messages]

Node Functions

Agent Node

python
from langchain_core.messages import SystemMessage

def agent(state: AgenticRAGState) -> dict:
    """Agent decides the next action: answer directly or call a tool."""
    messages = [SystemMessage(content=SYSTEM_PROMPT)] + state["messages"]
    response = llm_with_tools.invoke(messages)
    return {"messages": [response]}

Tool Execution Node

python
from langchain_core.messages import ToolMessage
from langgraph.prebuilt import ToolNode

# Create tool execution node
tool_node = ToolNode(tools)

# Or implement custom execution (this version assumes search_documents is the only tool)
def execute_tools(state: AgenticRAGState) -> dict:
    """Execute tool calls from agent."""
    last_message = state["messages"][-1]
    tool_calls = last_message.tool_calls

    tool_messages = []
    for tool_call in tool_calls:
        tool_result = search_documents.invoke(tool_call["args"])
        tool_messages.append(
            ToolMessage(
                content=tool_result,
                tool_call_id=tool_call["id"]
            )
        )

    return {"messages": tool_messages}

ReAct Loop

python
def should_continue(state: AgenticRAGState) -> str:
    last_message = state["messages"][-1]

    if hasattr(last_message, "tool_calls") and last_message.tool_calls:
        return "tools"
    return "end"

graph.add_conditional_edges(
    "agent",
    should_continue,
    {"tools": "tools", "end": END}
)
graph.add_edge("tools", "agent")  # Loop back

Graph Construction

python
from langgraph.graph import StateGraph, START, END

graph = StateGraph(AgenticRAGState)

# Nodes
graph.add_node("agent", agent)
graph.add_node("tools", execute_tools)

# Edges
graph.add_edge(START, "agent")
graph.add_conditional_edges(
    "agent",
    should_continue,
    {"tools": "tools", "end": END}
)
graph.add_edge("tools", "agent")

agentic_rag = graph.compile()

Usage

python
from langchain_core.messages import HumanMessage

# Simple question - single retrieval
result = agentic_rag.invoke({
    "messages": [HumanMessage(content="What is Self-RAG?")]
})

# Complex question - multiple retrievals
result = agentic_rag.invoke({
    "messages": [HumanMessage(content="Compare Self-RAG and CRAG")]
})

print(result["messages"][-1].content)

Advanced: Multiple Tools

Provide multiple retrieval strategies:

python
@tool
def search_documents(query: str) -> str:
    """Search local documents."""
    docs = retriever.retrieve_documents(query, k=3)
    return format_docs(docs)

@tool
def search_web(query: str) -> str:
    """Search the web for current information."""
    results = web_search_api(query)
    return format_web_results(results)

@tool
def list_available_documents() -> str:
    """List all available documents in the database."""
    return retriever.list_documents()

tools = [search_documents, search_web, list_available_documents]
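
If you keep the custom execute_tools node instead of ToolNode, it must dispatch by tool name once more than one tool is registered. A minimal sketch, reusing the ToolMessage pattern from above (the TOOLS_BY_NAME lookup table is an illustrative addition, not part of the original code):

python
# Map tool names to tool objects; ToolNode performs this dispatch automatically
TOOLS_BY_NAME = {t.name: t for t in tools}

def execute_tools(state: AgenticRAGState) -> dict:
    """Execute every tool call requested by the agent's last message."""
    last_message = state["messages"][-1]

    tool_messages = []
    for tool_call in last_message.tool_calls:
        tool = TOOLS_BY_NAME[tool_call["name"]]
        result = tool.invoke(tool_call["args"])
        tool_messages.append(
            ToolMessage(content=result, tool_call_id=tool_call["id"])
        )

    return {"messages": tool_messages}

In practice, passing the full tools list to ToolNode(tools) gives the same behavior with less code.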

Query Decomposition Example

The agent can break down complex queries:

User: "Compare Self-RAG and CRAG, and explain which is better for current events."

Agent Reasoning:
1. Search for "Self-RAG" → Gets Self-RAG info
2. Search for "CRAG" → Gets CRAG info
3. Search for "current events RAG" → Gets info on temporal queries
4. Synthesizes comparison and recommendation
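
You can confirm this behavior by inspecting the intermediate messages the graph returns: each AIMessage carrying tool_calls records one search the agent chose to run. A quick sketch using the message attributes from the LangChain classes above:

python
result = agentic_rag.invoke({
    "messages": [HumanMessage(
        content="Compare Self-RAG and CRAG, and explain which is better for current events."
    )]
})

# Print each retrieval the agent decided to perform
for message in result["messages"]:
    for tool_call in getattr(message, "tool_calls", []) or []:
        print(tool_call["name"], "->", tool_call["args"])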

Benefits

Aspect         Standard RAG    Agentic RAG
Control        Fixed flow      Agent decides
Queries        Single          Multiple
Complexity     Simple          Complex supported
Adaptability   Predefined      Dynamic

Best Practices

  1. Clear tool descriptions: Help agent choose right tool
  2. Max iterations: Prevent infinite loops (see the sketch after this list)
  3. Cost monitoring: Multiple LLM calls can add up
  4. Tool result formatting: Make results easy for agent to parse
  5. Error handling: Handle tool failures gracefully
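
For the max-iterations practice, LangGraph exposes a per-invocation recursion_limit that caps the total number of graph steps before raising an error. A minimal sketch:

python
# recursion_limit bounds the agent/tool steps, stopping runaway ReAct loops
result = agentic_rag.invoke(
    {"messages": [HumanMessage(content="What is Self-RAG?")]},
    config={"recursion_limit": 10},
)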

Configuration

bash
# Environment variables
AGENTIC_RAG_MAX_ITERATIONS=10
AGENTIC_RAG_AGENT_MODEL=llama3.2:3b
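
These variables are not read automatically; a minimal sketch for wiring them in at startup, assuming an Ollama-served model (which the default llama3.2:3b suggests) via the langchain_ollama integration:

python
import os

from langchain_ollama import ChatOllama

# Iteration cap fed to recursion_limit (see the Best Practices sketch above)
max_iterations = int(os.getenv("AGENTIC_RAG_MAX_ITERATIONS", "10"))

# Agent model; assumes an Ollama backend serving the configured model
llm = ChatOllama(model=os.getenv("AGENTIC_RAG_AGENT_MODEL", "llama3.2:3b"))
llm_with_tools = llm.bind_tools(tools)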

Comparison with Previous Patterns

Pattern        Retrieval Control     Best For
Basic RAG      None                  Simple Q&A
Self-RAG       Quality checks        Accuracy
CRAG           Fallback logic        Coverage
Adaptive RAG   Query routing         Efficiency
Agentic RAG    Full agent control    Complex research

Quiz

Test your understanding of Agentic RAG:

Knowledge Check

What is the key difference between Agentic RAG and previous RAG patterns?

A. Agentic RAG is faster than other patterns
B. The agent decides when, what, and how to retrieve
C. Agentic RAG only uses web search
D. Agentic RAG does not support multiple sources

Knowledge Check

What is the ReAct loop in Agentic RAG?

A. A retrieval optimization technique
B. A reasoning and acting cycle where the agent decides and executes tools
C. A type of neural network architecture
D. A caching mechanism for faster responses

Knowledge Check

Why might Agentic RAG perform multiple retrievals for a single question?

A. To increase response speed
B. To handle complex questions by breaking them into sub-questions
C. To reduce API costs
D. Multiple retrievals are always required

Knowledge Check

How is retrieval implemented in Agentic RAG?

A. As a fixed pipeline step
B. As a tool that the agent can invoke
C. As a background process
D. As a separate microservice

Knowledge Check

What best practice helps prevent infinite loops in Agentic RAG?

A. Using smaller models
B. Setting a max iterations limit
C. Caching all results
D. Disabling web search