Setup Guide

This guide will help you set up your development environment for building LangGraph agents with local LLMs via Ollama. Follow these steps to ensure everything works before starting the tutorials.

Prerequisites

Before you begin, make sure you have:

1. Python 3.12 or Higher

Check your Python version:

bash
python --version
# or
python3 --version

If you need to install or upgrade Python:

  • macOS: brew install python@3.12
  • Ubuntu/Debian: sudo apt install python3.12 python3.12-venv python3-pip
  • Windows: Download from python.org

Why Python 3.12+?

This project uses modern Python typing and async syntax and targets Python 3.12 or newer. Python 3.13 is also supported.
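
If you prefer to check from code rather than the shell, here is a minimal sketch using only the standard library:

python
import sys

# Fail fast if the interpreter is too old for this project's syntax and typing features
if sys.version_info < (3, 12):
    raise RuntimeError(f"Python 3.12+ required, found {sys.version.split()[0]}")
print(f"Python {sys.version.split()[0]} is new enough.")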

2. Ollama Setup Options

You have two options for running Ollama with your agents. Choose based on your needs:

Option 1: ollama-local-serve

Recommended for Production & Learning

ollama-local-serve provides production-grade observability with real-time monitoring, request/response logging, GPU metrics, and LangChain integration. Perfect for development, debugging, and production deployments.

ollama-local-serve is a complete Ollama wrapper with built-in monitoring and observability:

Key Features:

  • Real-time monitoring dashboard with GPU metrics
  • Request/response logging and tracing
  • Prometheus metrics for production monitoring
  • Redis-backed response caching
  • Docker and Kubernetes ready
  • Built-in LangChain integration
  • LAN/network deployment support

Quick Start with pip:

bash
# Install with all features
pip install ollama-local-serve[all]

# Initialize configuration and start services
make init && make up

Quick Start with Docker:

bash
# Clone the repository
git clone https://github.com/AbhinaavRamesh/ollama-local-serve.git
cd ollama-local-serve

# Start all services
docker-compose up -d

Access Points:

  • Dashboard: http://localhost:3000 - Real-time monitoring UI
  • API Server: http://localhost:8000 - FastAPI with logging
  • Ollama: http://localhost:11434 - Standard Ollama API

LangChain Integration Example:

python
from langchain_community.llms import Ollama

# Use the monitored endpoint instead of direct Ollama
llm = Ollama(
    base_url="http://localhost:8000",  # ollama-local-serve endpoint
    model="llama3.2:3b"
)

# All requests are now logged and monitored
response = llm.invoke("Explain LangGraph")
# View request/response in dashboard at localhost:3000

Architecture Overview:

See the ollama-local-serve GitHub repository for full documentation.


Option 2: Basic Ollama Setup

Quick Prototyping Only

Basic Ollama setup is great for quick experiments but lacks monitoring, logging, and production features. Consider using ollama-local-serve for serious development.

For minimal setup without monitoring features:

Install Ollama:

macOS and Linux:

bash
curl -fsSL https://ollama.com/install.sh | sh

Windows: Download the installer from ollama.com

Verify Ollama is Running:

bash
ollama --version
# Should output: ollama version 0.x.x

# Check if the service is running
curl http://localhost:11434/api/version
# Should return JSON with version info
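
The same check from Python, if that is more convenient (a minimal sketch using only the standard library, hitting the same /api/version endpoint as the curl command above):

python
import json
import urllib.request

# Query the standard Ollama REST endpoint; adjust host/port if yours differ
url = "http://localhost:11434/api/version"
try:
    with urllib.request.urlopen(url, timeout=5) as resp:
        print("Ollama is up:", json.load(resp))
except OSError as exc:
    print(f"Could not reach Ollama at {url}: {exc}")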

If Ollama isn't running, start it:

  • macOS/Windows: Ollama starts automatically on login
  • Linux: systemctl start ollama or run ollama serve in a terminal

Comparison: When to Use Each Option

| Feature | ollama-local-serve | Basic Ollama |
|---|---|---|
| Setup Time | 2-3 minutes | 1 minute |
| Monitoring Dashboard | ✅ Real-time UI | ❌ None |
| Request Logging | ✅ Full tracing | ❌ None |
| GPU Metrics | ✅ Yes | ❌ No |
| Response Caching | ✅ Redis-backed | ❌ No |
| Production Ready | ✅ Yes | ⚠️ Limited |
| Docker Support | ✅ Full support | ⚠️ Manual |
| Prometheus Metrics | ✅ Built-in | ❌ None |
| LAN Deployment | ✅ Easy | ⚠️ Manual |
| Debugging Tools | ✅ Extensive | ❌ None |

Use ollama-local-serve if:

  • You want to monitor your agent's LLM calls
  • You're building production applications
  • You need request/response debugging
  • You want GPU utilization metrics
  • You're deploying to Docker/Kubernetes
  • You need response caching for performance

Use basic Ollama if:

  • You're doing quick one-off experiments
  • You don't need any monitoring
  • You want the absolute minimal setup

3. Git (for cloning the repository)

bash
git --version

If not installed:

  • macOS: brew install git
  • Ubuntu/Debian: sudo apt install git
  • Windows: Download from git-scm.com

Installation

Option 1: Install from PyPI (Recommended)

The easiest way to get started is to install the package directly from PyPI:

bash
# Create and activate a virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install with all features
pip install langgraph-ollama-local[all]

# Verify installation
langgraph-local check

Install only what you need:

bash
# Core patterns only (tutorials 1-7)
pip install langgraph-ollama-local

# Add RAG dependencies (tutorials 8-13)
pip install langgraph-ollama-local[rag]

# Add notebook support
pip install langgraph-ollama-local[notebooks]

# All features
pip install langgraph-ollama-local[all]

Option 2: Install from Source (For Development)

If you want to modify the code or contribute:

bash
git clone https://github.com/AbhinaavRamesh/langgraph-ollama-tutorial.git
cd langgraph-ollama-tutorial

# Create virtual environment
python -m venv venv

# Activate it
# On macOS/Linux:
source venv/bin/activate

# On Windows:
venv\Scripts\activate

# Install in development mode with all dependencies
pip install -e ".[all]"

You should see (venv) in your terminal prompt when activated.

What Gets Installed

The [all] option installs:

  • Core LangGraph and LangChain dependencies
  • RAG dependencies (ChromaDB, FAISS, embeddings)
  • Persistence backends (SQLite, Redis support)
  • Development tools (pytest, linting)
  • Jupyter notebook support
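
Before running the full CLI check below, you can spot-check that the core packages import cleanly. A minimal sketch (it only probes two of the packages listed above):

python
from importlib.metadata import version

# Importing the top-level packages confirms they resolved inside the active venv
import langgraph
import langchain_core

for dist in ("langgraph", "langchain-core"):
    print(dist, version(dist))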

Verify Installation

bash
langgraph-local check

This command will:

  • Verify Ollama connection
  • Check if required models are available
  • Display your current configuration

Expected output:

Checking Ollama connection...
✓ Connected to Ollama at http://127.0.0.1:11434
✓ Ollama version: 0.x.x

Checking for models...
✗ Recommended model 'llama3.2:3b' not found

Configuration:
  OLLAMA_HOST: 127.0.0.1
  OLLAMA_PORT: 11434
  OLLAMA_MODEL: llama3.2:3b

Ollama Model Setup

For the best balance of speed and quality on consumer hardware, we recommend llama3.2:3b:

bash
ollama pull llama3.2:3b

This model:

  • Size: ~2GB download
  • RAM: Requires ~4GB
  • Speed: Fast responses on CPU
  • Quality: Good for most tutorial tasks

Verify the Model

bash
ollama list

You should see llama3.2:3b in the output:

NAME              ID              SIZE      MODIFIED
llama3.2:3b       a80c4f17acd5    2.0 GB    2 days ago
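
If you would rather verify from Python, the same information is exposed by Ollama's /api/tags endpoint. A minimal sketch using only the standard library:

python
import json
import urllib.request

# /api/tags lists locally available models, the same data `ollama list` prints
with urllib.request.urlopen("http://localhost:11434/api/tags", timeout=5) as resp:
    models = json.load(resp).get("models", [])

print([m["name"] for m in models])  # expect something like ['llama3.2:3b']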

Alternative Models

Depending on your hardware, you might prefer:

| Model | Size | RAM | Use Case |
|---|---|---|---|
| llama3.2:1b | 1.3GB | 2GB | Fastest, limited capability |
| llama3.2:3b | 2.0GB | 4GB | Recommended - best balance |
| llama3.1:8b | 4.7GB | 8GB | Better quality, slower |
| qwen2.5:7b | 4.7GB | 8GB | Good for code tasks |
| llama3.1:70b | 40GB | 64GB | Highest quality, GPU recommended |

To use a different model:

bash
# Pull the model
ollama pull llama3.1:8b

# Set it as default in .env (see Configuration below)
OLLAMA_MODEL=llama3.1:8b
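
You can also override the model for a single client in code instead of changing .env. A sketch reusing the langchain_community client from the integration example above (the model name is just an example):

python
from langchain_community.llms import Ollama

# Per-client override: .env keeps llama3.2:3b as the default, this client uses the 8b model
llm = Ollama(base_url="http://localhost:11434", model="llama3.1:8b")
print(llm.invoke("Summarize LangGraph in one sentence."))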

Model Performance

Smaller models (1b, 3b) work well for learning but may struggle with complex multi-agent scenarios. For production use, consider 7b or larger models.


Configuration

Create Environment File

bash
cp .env.example .env

Edit .env with your settings based on which Ollama option you chose:

For ollama-local-serve (Option 1):

bash
# Ollama Configuration
OLLAMA_HOST=127.0.0.1          # localhost or LAN server IP
OLLAMA_PORT=8000               # ollama-local-serve API port (for monitoring)
OLLAMA_MODEL=llama3.2:3b       # model to use by default

# Monitoring & Caching (ollama-local-serve)
MONITORING_ENABLED=true
REDIS_HOST=localhost
REDIS_PORT=6379

# Optional: Web Search (for CRAG tutorials 10, 13)
TAVILY_API_KEY=                # Get free key at tavily.com

For basic Ollama (Option 2):

bash
# Ollama Configuration
OLLAMA_HOST=127.0.0.1          # localhost or LAN server IP
OLLAMA_PORT=11434              # default Ollama port
OLLAMA_MODEL=llama3.2:3b       # model to use by default

# Optional: Web Search (for CRAG tutorials 10, 13)
TAVILY_API_KEY=                # Get free key at tavily.com

# Monitoring disabled for basic setup
MONITORING_ENABLED=false

Configuration Options

| Variable | Default | Description |
|---|---|---|
| OLLAMA_HOST | 127.0.0.1 | Ollama server IP address |
| OLLAMA_PORT | 11434 or 8000 | Use 8000 for ollama-local-serve, 11434 for basic Ollama |
| OLLAMA_MODEL | llama3.2:3b | Default model for all agents |
| MONITORING_ENABLED | false | Enable monitoring features (ollama-local-serve only) |
| REDIS_HOST | localhost | Redis server for caching (ollama-local-serve only) |
| REDIS_PORT | 6379 | Redis port (ollama-local-serve only) |
| TAVILY_API_KEY | (optional) | Web search API key for CRAG tutorials |
| LOG_LEVEL | INFO | Logging level: DEBUG, INFO, WARNING |
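
If you want to inspect these values directly from Python, here is a minimal sketch (it assumes the python-dotenv package is available; install it separately if your environment does not already include it):

python
import os

from dotenv import load_dotenv

# Load .env from the current directory and assemble the endpoint your clients will use
load_dotenv()
host = os.getenv("OLLAMA_HOST", "127.0.0.1")
port = os.getenv("OLLAMA_PORT", "11434")
model = os.getenv("OLLAMA_MODEL", "llama3.2:3b")

print(f"Ollama endpoint: http://{host}:{port} (default model: {model})")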

Verify Configuration

bash
langgraph-local config

Expected output (with ollama-local-serve):

Current Configuration:
  Ollama Host: http://127.0.0.1:8000
  Default Model: llama3.2:3b
  Web Search: Disabled (no Tavily key)
  Monitoring: Enabled
  Redis: localhost:6379

Expected output (basic Ollama):

Current Configuration:
  Ollama Host: http://127.0.0.1:11434
  Default Model: llama3.2:3b
  Web Search: Disabled (no Tavily key)
  Monitoring: Disabled

Optional: Web Search Setup

Tutorials 10 (CRAG) and 13 (Perplexity Clone) use web search for retrieving external information. This is optional but recommended for the full experience.

Get a Tavily API Key

  1. Sign up at tavily.com
  2. Get your free API key (1000 searches/month)
  3. Add to .env:
bash
TAVILY_API_KEY=tvly-your-key-here

Test that the key works:

python
from tavily import TavilyClient

client = TavilyClient(api_key="tvly-your-key")
results = client.search("What is LangGraph?")
print(results)

Alternative Search Tools

You can also use DuckDuckGo (no API key required) or Google Custom Search. See the CRAG tutorial for alternatives.


Optional: LAN Server Setup

Running Ollama on a GPU-equipped server and accessing it from laptops/workstations? This is a common setup for teams or when you want to centralize GPU resources.

Best for LAN Deployments

ollama-local-serve is specifically designed for LAN deployments with built-in monitoring and multi-client support. See Option 1 above for features and benefits.

On your GPU server:

bash
# Install ollama-local-serve
pip install ollama-local-serve[all]

# Configure for LAN access
# Edit .env to set OLLAMA_HOST=0.0.0.0
make init

# Start all services (Ollama, FastAPI, Dashboard, Redis)
make up

# Access dashboard at http://your-server-ip:3000

On your development machine(s):

Update your project's .env:

bash
OLLAMA_HOST=192.168.1.100      # your GPU server IP
OLLAMA_PORT=8000               # ollama-local-serve API port (for monitoring)
# OR use port 11434 for direct Ollama access (no monitoring)

MONITORING_ENABLED=true
REDIS_HOST=192.168.1.100       # for shared caching

Benefits of LAN Setup with ollama-local-serve:

  • All team members can monitor LLM usage in real-time
  • Shared Redis cache improves response times
  • Centralized logging and metrics
  • GPU utilization visible to everyone
  • No configuration changes needed in your LangGraph code

Manual Ollama LAN Setup

If you prefer the basic Ollama setup without monitoring:

On the server:

bash
# Configure Ollama to bind to all interfaces
export OLLAMA_HOST=0.0.0.0:11434
ollama serve

On your development machine:

bash
# Test connection
curl http://your-server-ip:11434/api/version

# Update .env
OLLAMA_HOST=your-server-ip
OLLAMA_PORT=11434

Security Note

Ollama and ollama-local-serve have no built-in authentication. Only expose on trusted networks or use a VPN/SSH tunnel for remote access. Consider using a reverse proxy with authentication for production deployments.


Verifying Installation

Quick Test

Create a simple test file test_setup.py:

python
from langgraph_ollama_local import LocalAgentConfig
from langchain_core.messages import HumanMessage

# Initialize
config = LocalAgentConfig()
llm = config.create_chat_client()

# Test basic call
response = llm.invoke([HumanMessage(content="Say hello!")])
print(response.content)

Run it:

bash
python test_setup.py

Expected output:

Hello! How can I help you today?

Run Example Notebook

bash
# Start Jupyter
jupyter lab

# Open examples/core_patterns/01_chatbot_basics.ipynb

Run all cells. If everything works, you're ready to start the tutorials!


Troubleshooting

Ollama Connection Issues

Problem: Connection refused to localhost:11434

Solutions:

  • Check if Ollama is running: curl http://localhost:11434/api/version
  • Restart Ollama: ollama serve (Linux) or restart the app (macOS/Windows)
  • Check firewall settings if using a remote server

Model Not Found

Problem: Model 'llama3.2:3b' not found

Solutions:

bash
# List available models
ollama list

# Pull the model
ollama pull llama3.2:3b

# Or use a different model
OLLAMA_MODEL=llama3.1:8b

Import Errors

Problem: ModuleNotFoundError: No module named 'langgraph'

Solutions:

bash
# Make sure virtual environment is activated
source venv/bin/activate  # or venv\Scripts\activate on Windows

# Reinstall
pip install -e ".[all]"

# Or install specific dependencies
pip install langgraph langchain-core

Slow Generation

Problem: Model takes too long to respond

Solutions:

  • Use a smaller model: ollama pull llama3.2:1b
  • Check CPU/memory usage: top or htop
  • Consider using a GPU server with LAN setup
  • Reduce context length in configuration (see the sketch below)
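
For the last point, the context window can be capped via the num_ctx parameter on the langchain_community Ollama client. A sketch (2048 is just an example value, and parameter support may vary by version):

python
from langchain_community.llms import Ollama

# A smaller context window reduces memory use and can speed up CPU-only generation
llm = Ollama(base_url="http://localhost:11434", model="llama3.2:3b", num_ctx=2048)
print(llm.invoke("Give a one-line summary of LangGraph."))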

Memory Errors

Problem: RuntimeError: Out of memory

Solutions:

  • Switch to a smaller model (1b instead of 3b)
  • Close other applications
  • Increase system swap/page file
  • Use quantized models: ollama pull llama3.2:3b-q4_0

Jupyter Kernel Issues

Problem: Jupyter can't find the installed packages

Solutions:

bash
# Install ipykernel in your venv
pip install ipykernel

# Create kernel spec
python -m ipykernel install --user --name=langgraph-env

# Select 'langgraph-env' as kernel in Jupyter

Next Steps

Once everything is installed and verified:

  1. Start with Tutorial 01: Chatbot Basics
  2. Browse all tutorials: Tutorial Index
  3. Join the community: GitHub Discussions

Additional Resources

Need Help?


Ready to start building? Head to Tutorial 01: Chatbot Basics to begin your LangGraph journey!