
Tutorial 11: Adaptive RAG

Adaptive RAG intelligently routes queries to the optimal retrieval strategy based on question type.

Overview

Not all questions need the same approach:

  • Document questions → Vector store
  • Current events → Web search
  • Simple factual → Direct LLM response

Adaptive RAG classifies and routes accordingly.

Architecture

Query Router

The QueryRouter classifies questions:

python
from langgraph_ollama_local.rag.graders import QueryRouter

router = QueryRouter(llm)

# Route examples
router.route("What is Self-RAG?")         # → "vectorstore"
router.route("Latest AI news today?")     # → "websearch"
router.route("What is 2 + 2?")            # → "direct"
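The LLM-backed router above is the real classifier; for unit tests, or as a deterministic fallback when the model is unavailable, a keyword heuristic can stand in. This is a minimal sketch (`KeywordRouter` is an illustrative name, not part of the library):

```python
import re

class KeywordRouter:
    """Deterministic stand-in for the LLM-backed QueryRouter,
    useful in tests or as a fallback. Illustrative only."""

    WEB_HINTS = ("latest", "today", "news", "current", "recent")
    MATH_PATTERN = re.compile(r"\d+\s*[-+*/]\s*\d+")

    def route(self, question: str) -> str:
        q = question.lower()
        if any(hint in q for hint in self.WEB_HINTS):
            return "websearch"
        if self.MATH_PATTERN.search(q):
            return "direct"
        return "vectorstore"
```

Heuristics like these are brittle; they are meant to mirror the route labels, not replace LLM classification.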

State Definition

python
from typing import List, Literal

from typing_extensions import TypedDict

from langchain_core.documents import Document

class AdaptiveRAGState(TypedDict):
    question: str
    query_type: Literal["vectorstore", "websearch", "direct"]
    documents: List[Document]
    generation: str

Node Functions

Query Classification

python
def classify_query(state: AdaptiveRAGState) -> dict:
    """Classify the query type."""
    query_type = router.route(state["question"])
    return {"query_type": query_type}

Vector Store Retrieval

python
def retrieve_vectorstore(state: AdaptiveRAGState) -> dict:
    """Retrieve from local vector store."""
    docs = retriever.retrieve_documents(state["question"], k=4)
    return {"documents": docs}

Web Search Retrieval

python
def retrieve_websearch(state: AdaptiveRAGState) -> dict:
    """Search the web for current information."""
    web_docs = web_search(state["question"], max_results=3)
    return {"documents": web_docs}

Direct Response

python
def direct_answer(state: AdaptiveRAGState) -> dict:
    """Answer directly without retrieval."""
    response = llm.invoke(state["question"])
    return {"generation": response.content}

Routing Logic

python
def route_query(state: AdaptiveRAGState) -> str:
    """Route based on query classification."""
    return state["query_type"]

# In graph construction
graph.add_conditional_edges(
    "classify",
    route_query,
    {
        "vectorstore": "vectorstore",
        "websearch": "websearch",
        "direct": "direct",
    }
)

Graph Construction

python
from langgraph.graph import StateGraph, START, END

graph = StateGraph(AdaptiveRAGState)

# Nodes
graph.add_node("classify", classify_query)
graph.add_node("vectorstore", retrieve_vectorstore)
graph.add_node("websearch", retrieve_websearch)
graph.add_node("direct", direct_answer)
graph.add_node("generate", generate)

# Edges
graph.add_edge(START, "classify")

# Conditional routing after classification
graph.add_conditional_edges(
    "classify",
    route_query,
    {
        "vectorstore": "vectorstore",
        "websearch": "websearch",
        "direct": "direct",
    }
)

# Routes to generation
graph.add_edge("vectorstore", "generate")
graph.add_edge("websearch", "generate")
graph.add_edge("direct", END)
graph.add_edge("generate", END)

adaptive_rag = graph.compile()
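For intuition, the control flow this graph encodes — classify once, dispatch to exactly one branch — can be simulated in plain Python with stub nodes. Everything here (`run_adaptive`, the stub handlers and classifier) is illustrative, not the tutorial's API:

```python
def run_adaptive(question, classify, handlers):
    """Dispatch a question to exactly one handler based on its
    classification, mirroring the graph's conditional edge."""
    route = classify(question)
    return route, handlers[route](question)

# Stub nodes standing in for the real retrieval/generation chain.
handlers = {
    "vectorstore": lambda q: f"answer from local docs for: {q}",
    "websearch": lambda q: f"answer from the web for: {q}",
    "direct": lambda q: f"direct answer for: {q}",
}

# Trivial stub classifier (the real one is the LLM-backed router).
classify = lambda q: "direct" if any(op in q for op in "+*/") else "vectorstore"

route, answer = run_adaptive("What is 2 + 2?", classify, handlers)
```

The real graph adds what this sketch lacks: shared state, streaming, and a separate generation step after retrieval.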

Usage Examples

python
# Document question - routes to vectorstore
result = adaptive_rag.invoke({
    "question": "What is Self-RAG?"
})

# Current events - routes to web search
result = adaptive_rag.invoke({
    "question": "What are the latest AI breakthroughs in 2024?"
})

# Simple factual - direct answer
result = adaptive_rag.invoke({
    "question": "What is 15 + 27?"
})

Query Router Prompt

The router uses structured output to classify queries:

python
ROUTER_PROMPT = """You are an expert at routing questions.

Route questions as follows:
- vectorstore: Questions about specific documents or domain knowledge
- websearch: Questions about current events or recent information
- direct: Simple factual or math questions that don't need retrieval

Question: {question}

Route to:"""
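When structured output is not available and the model replies in free text, it helps to normalize the completion to one of the three allowed labels, falling back to the vector store on anything unexpected. A defensive sketch (`parse_route` and `VALID_ROUTES` are illustrative names):

```python
VALID_ROUTES = {"vectorstore", "websearch", "direct"}

def parse_route(raw: str) -> str:
    """Normalize a raw LLM completion to one of the allowed route
    labels, defaulting to the vector store when output is unexpected."""
    label = raw.strip().lower().strip('."\'')
    return label if label in VALID_ROUTES else "vectorstore"
```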

Benefits

| Aspect           | Fixed RAG      | Adaptive RAG                    |
| ---------------- | -------------- | ------------------------------- |
| Simple questions | Full retrieval | Direct answer                   |
| Coverage         | Local only     | Multi-source                    |
| Efficiency       | Same for all   | Optimized per query             |
| Latency          | Higher         | Lower (when skipping retrieval) |

Configuration

bash
# Environment variables
ADAPTIVE_RAG_ROUTER_MODEL=llama3.2:3b
ADAPTIVE_RAG_ENABLE_DIRECT=true
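These variables can be read at startup with sensible defaults; a minimal sketch (the Python variable names here are ours, only the environment variable names come from above):

```python
import os

# Model used by the query router; defaults to a small local model.
router_model = os.getenv("ADAPTIVE_RAG_ROUTER_MODEL", "llama3.2:3b")

# Whether the "direct" (no-retrieval) route is enabled at all.
enable_direct = os.getenv("ADAPTIVE_RAG_ENABLE_DIRECT", "true").lower() == "true"
```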

Best Practices

  1. Use small models for routing: Classification is lightweight
  2. Monitor routing decisions: Track which route is chosen
  3. Tune routing prompt: Adjust for your specific use case
  4. Add confidence scores: Route to multiple paths when uncertain
  5. Cache routing decisions: Reuse for similar queries
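Point 5 can be sketched with `functools.lru_cache`; here the expensive LLM call is replaced by a stub so the caching behavior is observable (illustrative only — a real version would call `router.route` inside the cached function):

```python
from functools import lru_cache

llm_calls = []  # records how often the "expensive" routing actually runs

@lru_cache(maxsize=1024)
def cached_route(question: str) -> str:
    """Stub for an expensive LLM routing call."""
    llm_calls.append(question)
    return "websearch" if "latest" in question.lower() else "vectorstore"

cached_route("Latest AI news today?")
cached_route("Latest AI news today?")  # repeat: served from the cache
```

Note that `lru_cache` only reuses exact-match strings; caching for merely *similar* queries needs normalization or embedding-based lookup first.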

Performance Considerations

Adaptive RAG can significantly improve efficiency:

  • Direct route: Saves ~2-3 seconds (no retrieval)
  • Web search route: Fresher information for time-sensitive queries
  • Vectorstore route: Best for domain-specific knowledge

Quiz

Test your understanding of Adaptive RAG:

Knowledge Check

What are the three routing options in Adaptive RAG?

A. Local, Remote, Cache
B. Fast, Medium, Slow
C. Vectorstore, Websearch, Direct
D. Simple, Complex, Hybrid

Knowledge Check

Which type of question would be routed to the 'direct' path?

A. What is Self-RAG?
B. What are the latest AI breakthroughs in 2024?
C. What is 2 + 2?
D. Summarize the uploaded document

Knowledge Check

What is the primary efficiency benefit of Adaptive RAG compared to fixed RAG?

A. Uses less memory
B. Trains faster
C. Optimizes processing per query type, skipping retrieval when not needed
D. Has better accuracy on all questions

Knowledge Check

What component classifies the query type in Adaptive RAG?

A. Document Grader
B. Query Router
C. Hallucination Checker
D. Answer Grader

Knowledge Check (Fill In)

Why is it recommended to use small models for query routing?