# Tutorial 11: Adaptive RAG
Adaptive RAG intelligently routes queries to the optimal retrieval strategy based on question type.
## Overview
Not all questions need the same approach:
- Document questions → Vector store
- Current events → Web search
- Simple factual → Direct LLM response
Adaptive RAG classifies and routes accordingly.
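The routing idea itself fits in a plain function. Before looking at the LLM-based router below, here is a toy keyword heuristic (hypothetical, purely illustrative — the tutorial's real classifier delegates this decision to an LLM):

```python
def route_query_naive(question: str) -> str:
    """Toy keyword-based router illustrating the three-way split.

    Not the tutorial's actual classifier, which uses an LLM.
    """
    q = question.lower()
    if any(kw in q for kw in ("latest", "today", "news", "current")):
        return "websearch"   # time-sensitive -> fresh web results
    if any(ch.isdigit() for ch in q) and any(op in q for op in "+-*/"):
        return "direct"      # simple arithmetic -> answer directly
    return "vectorstore"     # default: look in local documents

route_query_naive("Latest AI news today?")  # → "websearch"
route_query_naive("What is 2 + 2?")         # → "direct"
route_query_naive("What is Self-RAG?")      # → "vectorstore"
```

A real router replaces the keyword rules with an LLM classification, which is what the rest of this tutorial builds.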
## Architecture

### Query Router
The QueryRouter classifies questions:
```python
from langgraph_ollama_local.rag.graders import QueryRouter

router = QueryRouter(llm)

# Route examples
router.route("What is Self-RAG?")      # → "vectorstore"
router.route("Latest AI news today?")  # → "websearch"
router.route("What is 2 + 2?")         # → "direct"
```

### State Definition
```python
from typing import List, Literal, TypedDict

from langchain_core.documents import Document


class AdaptiveRAGState(TypedDict):
    question: str
    query_type: Literal["vectorstore", "websearch", "direct"]
    documents: List[Document]
    generation: str
```

## Node Functions
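Each node below follows the same contract: it receives the full state and returns only the keys it wants to update, which LangGraph merges back into the state. That merge can be sketched in plain Python (a simplification — LangGraph's default behavior, ignoring custom reducers):

```python
def apply_update(state: dict, update: dict) -> dict:
    """Simplified view of LangGraph's default state merge:
    keys returned by a node overwrite those keys in the state."""
    return {**state, **update}

state = {"question": "What is Self-RAG?", "documents": []}
state = apply_update(state, {"query_type": "vectorstore"})
# The state now carries the routing decision alongside the question.
```

This is why each node function below returns a small dict rather than the whole state.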
### Query Classification
```python
def classify_query(state: AdaptiveRAGState) -> dict:
    """Classify the query type."""
    query_type = router.route(state["question"])
    return {"query_type": query_type}
```

### Vector Store Retrieval
```python
def retrieve_vectorstore(state: AdaptiveRAGState) -> dict:
    """Retrieve from local vector store."""
    docs = retriever.retrieve_documents(state["question"], k=4)
    return {"documents": docs}
```

### Web Search
```python
def retrieve_websearch(state: AdaptiveRAGState) -> dict:
    """Search the web for current information."""
    web_docs = web_search(state["question"], max_results=3)
    return {"documents": web_docs}
```

### Direct Response
```python
def direct_answer(state: AdaptiveRAGState) -> dict:
    """Answer directly without retrieval."""
    response = llm.invoke(state["question"])
    return {"generation": response.content}
```

### Routing Logic
```python
def route_query(state: AdaptiveRAGState) -> str:
    """Route based on query classification."""
    return state["query_type"]

# In graph construction
graph.add_conditional_edges(
    "classify",
    route_query,
    {
        "vectorstore": "vectorstore",
        "websearch": "websearch",
        "direct": "direct",
    },
)
```

## Graph Construction
```python
from langgraph.graph import StateGraph, START, END

graph = StateGraph(AdaptiveRAGState)

# Nodes
graph.add_node("classify", classify_query)
graph.add_node("vectorstore", retrieve_vectorstore)
graph.add_node("websearch", retrieve_websearch)
graph.add_node("direct", direct_answer)
graph.add_node("generate", generate)

# Edges
graph.add_edge(START, "classify")

# Conditional routing after classification
graph.add_conditional_edges(
    "classify",
    route_query,
    {
        "vectorstore": "vectorstore",
        "websearch": "websearch",
        "direct": "direct",
    },
)

# Retrieval routes feed generation; the direct route already answered
graph.add_edge("vectorstore", "generate")
graph.add_edge("websearch", "generate")
graph.add_edge("direct", END)
graph.add_edge("generate", END)

adaptive_rag = graph.compile()
```

## Usage Examples
```python
# Document question - routes to vectorstore
result = adaptive_rag.invoke({
    "question": "What is Self-RAG?"
})

# Current events - routes to web search
result = adaptive_rag.invoke({
    "question": "What are the latest AI breakthroughs in 2024?"
})

# Simple factual - direct answer
result = adaptive_rag.invoke({
    "question": "What is 15 + 27?"
})
```

## Query Router Prompt
The router uses structured output to classify queries:
```python
ROUTER_PROMPT = """You are an expert at routing questions.

Route questions as follows:
- vectorstore: Questions about specific documents or domain knowledge
- websearch: Questions about current events or recent information
- direct: Simple factual or math questions that don't need retrieval

Question: {question}
Route to:"""
```

## Benefits
| Aspect | Fixed RAG | Adaptive RAG |
|---|---|---|
| Simple questions | Full retrieval | Direct answer |
| Coverage | Local only | Multi-source |
| Efficiency | Same for all | Optimized per query |
| Latency | Higher | Lower (when skipping retrieval) |
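Whatever the router returns must map exactly onto one of the three labels — an unrecognized value would raise a `KeyError` in the `add_conditional_edges` path map. A small defensive normalizer (a sketch; the `vectorstore` fallback is an assumption, not part of the tutorial) guards against malformed replies:

```python
VALID_ROUTES = ("vectorstore", "websearch", "direct")

def parse_route(raw: str) -> str:
    """Normalize a raw LLM routing answer to one of the three labels.

    Falls back to 'vectorstore' when the reply is unrecognized, so a
    malformed answer degrades to the safest (retrieval) path.
    """
    cleaned = raw.strip().strip('"').lower()
    for label in VALID_ROUTES:
        if label in cleaned:
            return label
    return "vectorstore"

parse_route(" Websearch")  # → "websearch"
parse_route('"direct"')    # → "direct"
```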
## Configuration
```bash
# Environment variables
ADAPTIVE_RAG_ROUTER_MODEL=llama3.2:3b
ADAPTIVE_RAG_ENABLE_DIRECT=true
```

## Best Practices
- Use small models for routing: Classification is lightweight
- Monitor routing decisions: Track which route is chosen
- Tune routing prompt: Adjust for your specific use case
- Add confidence scores: Route to multiple paths when uncertain
- Cache routing decisions: Reuse for similar queries
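The caching suggestion above can be sketched with a dictionary keyed on a normalized question (hypothetical helper; in production you would bound the cache, e.g. with `functools.lru_cache`):

```python
_route_cache: dict[str, str] = {}

def cached_route(question: str, route_fn) -> str:
    """Cache routing decisions keyed on a normalized question.

    route_fn is any classifier (e.g. QueryRouter.route); identical
    questions skip the extra LLM call on repeat lookups.
    """
    key = " ".join(question.lower().split())  # collapse case/whitespace
    if key not in _route_cache:
        _route_cache[key] = route_fn(question)
    return _route_cache[key]

# Usage with a stand-in classifier:
calls = []
def fake_router(q):
    calls.append(q)
    return "vectorstore"

cached_route("What is Self-RAG?", fake_router)   # classifier called
cached_route("what is  self-rag?", fake_router)  # cache hit, no call
```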
## Performance Considerations
Adaptive RAG can significantly improve efficiency:
- Direct route: Saves ~2-3 seconds (no retrieval)
- Web search route: Fresher information for time-sensitive queries
- Vectorstore route: Best for domain-specific knowledge
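The savings from skipping retrieval are easy to measure. A minimal timing harness with stub stages (the stage functions and sleep durations are placeholders, not measurements from this tutorial's pipeline):

```python
import time

def time_route(*stages) -> float:
    """Run a sequence of pipeline stages and return total seconds."""
    start = time.perf_counter()
    for stage in stages:
        stage()
    return time.perf_counter() - start

# Stub stages standing in for real work
retrieve = lambda: time.sleep(0.05)  # stands in for vector search
generate = lambda: time.sleep(0.02)  # stands in for LLM generation

full_rag = time_route(retrieve, generate)
direct = time_route(generate)  # the direct route skips retrieval
assert direct < full_rag
```

Swap the stubs for your real retriever and LLM calls to quantify the gap on your own hardware.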
## Quiz
Test your understanding of Adaptive RAG:
1. What are the three routing options in Adaptive RAG?
2. Which type of question would be routed to the 'direct' path?
3. What is the primary efficiency benefit of Adaptive RAG compared to fixed RAG?
4. What component classifies the query type in Adaptive RAG?
5. (Fill in) Why is it recommended to use small models for query routing?