
LangGraph Tutorial 2026: Build Stateful AI Agents with Ollama

🟡Intermediate

Build production-grade stateful AI agents in 30 minutes: master graph-based workflows, tool nodes, human-in-the-loop checkpoints, and persistent memory with local Ollama. No cloud APIs required.

Author

Kofi Mensah

Inference Economics & Hardware Architect

Published

May 18, 2026

Duration

Reading

20 min

Build

35 min

Key Takeaways

LangGraph represents agent workflows as directed graphs — nodes are Python functions (LLM calls or tool executions), edges route between them based on state conditions, and the StateGraph manages typed state flowing through the graph.
Conditional edges are LangGraph's branching mechanism — 'add_conditional_edges(node, routing_function)' calls routing_function with the current state and routes to the returned node name, enabling dynamic agent loops and decision trees.
Checkpointing with SqliteSaver or PostgresSaver enables conversation persistence across sessions — the same thread_id in the config dict resumes from exactly where the agent left off, with full message history intact.
Human-in-the-loop (HITL) can be implemented by compiling the graph with a checkpointer and 'interrupt_before=['sensitive_node']': the agent pauses before executing that node and returns control to your code, allowing review before high-stakes actions like sending emails or modifying databases.

Key Takeaways

Graphs, not chains: LangGraph uses a directed graph (nodes + edges) instead of sequential chains — enabling loops, conditionals, and parallel branches that linear chains can’t express.
State is typed: Define your state as a TypedDict or Pydantic model — type safety prevents the “wrong key name” bugs that plague untyped agent frameworks.
compile() creates a runnable: After defining nodes and edges, graph.compile() returns an object with .invoke(), .stream(), and .astream(), used like any other LangChain Runnable.
Interrupts for HITL: interrupt_before=["node_name"] pauses the graph before sensitive nodes — the agent shows its plan and waits for human confirmation before acting.

Introduction

Direct Answer: How do I build a stateful AI agent with LangGraph and local Ollama in 2026?

Install with pip install langgraph langchain-ollama langchain. Connect to a local Ollama model: from langchain_ollama import ChatOllama; llm = ChatOllama(model="qwen3:14b"). Import the graph primitives and the message type: from langgraph.graph import StateGraph, MessagesState, END and from langchain_core.messages import HumanMessage. Add a node: graph = StateGraph(MessagesState); graph.add_node("agent", lambda state: {"messages": [llm.invoke(state["messages"])]}). Add edges: graph.set_entry_point("agent"); graph.add_edge("agent", END). Compile and invoke: app = graph.compile(); result = app.invoke({"messages": [HumanMessage("hello")]}). For tool-using agents, use llm.bind_tools(tools) and add a ToolNode that executes tool calls, with a conditional edge that routes from the agent node to the tool node whenever the last message contains tool calls.

LangGraph State Machine Diagram

SIMPLE AGENT (No Tools):
┌─────────────────────┐
│  INPUT STATE        │
│ messages: [...]     │
└──────────┬──────────┘
           │
           ↓
      [ENTRY POINT]
           │
           ↓
    ┌─────────────┐
    │   LLM Node  │  (call_llm function)
    │ - read msgs │
    │ - invoke    │  
    │ - append    │
    └─────────────┘
           │
           ↓
        [END]
           │
           ↓
┌─────────────────────┐
│  OUTPUT STATE       │
│ messages: [...+AI]  │
└─────────────────────┘

---

REACT AGENT (With Tools):
┌─────────────────────┐
│  INPUT STATE        │
│ messages: [...]     │
└──────────┬──────────┘
           │
           ↓
      [ENTRY POINT]
           │
           ↓
    ┌─────────────────┐
    │  AGENT Node     │  (LLM decides: should I use a tool?)
    │  - read msgs    │
    │  - invoke LLM   │
    │  - append AI    │
    └────────┬────────┘
             │
    ┌────────┴────────┐
    │                 │
    ↓                 ↓
Does last msg    No tool calls
have tool_calls?  (just answer)
    │                 │
   YES                ↓
    │              [END] → Output
    ↓
┌──────────────────┐
│  TOOL Node       │  (execute_tools function)
│  - read tool call│
│  - execute tool  │
│  - append result │
└────────┬─────────┘
         │
         ↓
    Loop back to AGENT NODE (with tool result in messages)
         │
         ↓
    (repeat until tool_calls=None)
         │
         ↓
      [END] → Output
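
The ReAct loop in the diagram above can be sketched in plain Python, without LangGraph or a model. This is a toy illustration only: fake_llm, TOOLS, run, and the dict-based messages are hypothetical stand-ins introduced here, not LangGraph APIs. It shows just the control flow: agent, then tools, then agent again until the last message carries no tool calls.

```python
# Toy ReAct loop: plain-Python stand-ins, not LangGraph's implementation.

def fake_llm(messages):
    """Hypothetical model: requests a tool once, then answers from its result."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {"role": "ai", "content": "", "tool_calls": [{"name": "get_load", "args": {}}]}
    return {"role": "ai", "content": f"Load is {tool_results[-1]['content']}", "tool_calls": []}

TOOLS = {"get_load": lambda: "0.42"}  # hypothetical tool registry

def agent_node(state):
    # Append the model's reply to the message history
    return {"messages": state["messages"] + [fake_llm(state["messages"])]}

def tool_node(state):
    # Execute every tool call requested by the last AI message
    last = state["messages"][-1]
    results = [
        {"role": "tool", "content": TOOLS[call["name"]](**call["args"])}
        for call in last["tool_calls"]
    ]
    return {"messages": state["messages"] + results}

def should_continue(state):
    """Conditional edge: go to the tool node while the last message requests tools."""
    return "tools" if state["messages"][-1].get("tool_calls") else "end"

def run(state, max_steps=10):
    for _ in range(max_steps):  # hard cap so a misbehaving model cannot loop forever
        state = agent_node(state)
        if should_continue(state) == "end":
            return state
        state = tool_node(state)
    raise RuntimeError("step limit reached")

final = run({"messages": [{"role": "human", "content": "What's the CPU load?"}]})
roles = [m["role"] for m in final["messages"]]
print(roles)
print(final["messages"][-1]["content"])
```

In the real graph, StateGraph plays the role of run(), and the routing function is attached with add_conditional_edges.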

Setup

pip install langgraph langchain-ollama langchain-core --break-system-packages
python3 -c "from importlib.metadata import version; print('LangGraph:', version('langgraph'))"

Expected output:

LangGraph: 0.3.4

Part 1: Simple Stateful Agent — From Concept to Code

# 01_simple_agent.py: the simplest LangGraph graph, a single LLM node with no loops yet
# Why StateGraph? Linear chains can't loop, conditionally branch, or handle interrupts.
# Graphs enable sophisticated agent patterns: ReAct loops, tool-use agents, human-in-the-loop.

from langchain_ollama import ChatOllama
from langchain_core.messages import HumanMessage, SystemMessage
from langgraph.graph import StateGraph, MessagesState, END

# Connect to local Ollama (zero cloud APIs)
llm = ChatOllama(model="qwen3:14b", temperature=0)

def call_llm(state: MessagesState) -> dict:
    """
    Node function: takes current state → returns updated state.
    
    Why a function? Each node in the graph is a callable that:
      1) Receives state (conversation history)
      2) Does something (LLM call, tool execution, etc.)
      3) Returns updated state (appends new message to history)
    
    The graph framework passes state between nodes automatically.
    """
    # Build message list: system prompt + conversation history
    system = SystemMessage(content="You are a helpful assistant. Be concise.")
    messages = [system] + state["messages"]  # state["messages"] is a list of Message objects
    
    # LLM call: invoke with full message history, returns single response
    response = llm.invoke(messages)
    
    # Return updated state (append LLM response to message history)
    # The graph framework merges this dict into the state automatically
    return {"messages": [response]}

# === Build the graph ===
graph = StateGraph(MessagesState)  # MessagesState: TypedDict with a messages list and an append-style reducer
graph.add_node("llm", call_llm)    # Add node named "llm" that runs call_llm function
graph.set_entry_point("llm")       # Start here when graph is invoked
graph.add_edge("llm", END)         # After LLM node, end (no loops yet)

# Compile = creates runnable execution engine with .invoke(), .stream(), etc.
app = graph.compile()

# === Invoke the graph ===
result = app.invoke({"messages": [HumanMessage("What is LangGraph?")]})
# Returns: {"messages": [HumanMessage(...), AIMessage("LangGraph is...")]}
print(result["messages"][-1].content)  # Print last message (the AI response)

Why this structure?

StateGraph: Provides graph semantics (nodes, edges, state management) that chains can’t offer
MessagesState: A typed dictionary lets type checkers and IDEs flag key typos such as messages vs message_history
Node functions: Each node is a Python function — testable, debuggable, reusable
compile(): Transforms graph definition into an executable object with async support

Expected output:

LangGraph is a library for building stateful, multi-actor AI applications using graph-based workflows, built on top of LangChain.

Part 2: ReAct Agent with Tools — Adding Tool Calling to the Loop

# 02_react_agent.py — Agent that can think (LLM) and act (call tools) in a loop
# ReAct = Reasoning + Acting: LLM decides when to use tools, we execute them, LLM reasons about results

from langchain_ollama import ChatOllama
from langchain_core.tools import tool  # Decorator to mark functions as LLM tools
from langchain_core.messages import HumanMessage
from langgraph.graph import StateGraph, MessagesState, END
from langgraph.prebuilt import ToolNode  # Built-in node that executes tools

# === Define tools ===
# Each @tool becomes callable by the LLM if you call llm.bind_tools(tools)

@tool
def get_server_info() -> str:
    """
    Get current server CPU and memory usage.
    
    LLM will read this docstring and description to decide when to call it.
    Tool description should be: what it does, when to use it, what it returns.
    """
    import subprocess
    
    # Get CPU load average (1-minute load)
    cpu = subprocess.run(
        ["awk", "{print $1}", "/proc/loadavg"],
        capture_output=True, text=True
    ).stdout.strip()
    
    # Get memory usage (used / total)
    mem = subprocess.run(
        ["free", "-h"],
        capture_output=True, text=True
    ).stdout.split("\n")[1].split()
    
    return f"CPU load: {cpu} | Memory: {mem[2]} used of {mem[1]}"

@tool
def count_processes() -> str:
    """Count the number of running processes on this server."""
    import subprocess
    
    # ps aux lists all processes; --no-headers drops the header line
    count = subprocess.run(["ps", "aux", "--no-headers"], capture_output=True, text=True)
    return f"Running processes: {len(count.stdout.strip().splitlines())}"

# List of tools the agent can use
tools = [get_server_info, count_processes]

# LLM that can call tools (bind_tools attaches tool definitions to the model)
llm = ChatOllama(model="llama4:scout", temperature=0)  # Llama 4 Scout: an open-weight model with tool-calling support
llm_with_tools = llm.bind_tools(tools)  # Now this LLM will emit tool_call messages when appropriate

def call_llm(state: MessagesState) -> dict:
    return {"messages": [llm_with_tools.invoke(state["messages"])]}

def should_continue(state: MessagesState) -> str:
    last = state["messages"][-1]
    return "tools" if getattr(last, "tool_calls", None) else END

tool_node = ToolNode(tools)

graph = StateGraph(MessagesState)
graph.add_node("llm", call_llm)
graph.add_node("tools", tool_node)
graph.set_entry_point("llm")
graph.add_conditional_edges("llm", should_continue)
graph.add_edge("tools", "llm")

app = graph.compile()

result = app.invoke({
    "messages": [HumanMessage("Check my server's current load and tell me how many processes are running.")]
})
print(result["messages"][-1].content)
print(f"\nTotal messages: {len(result['messages'])}")

Expected output:

Your server's current CPU load is 0.42 with 347 running processes. The load is low and memory
usage appears normal based on the information retrieved.

Total messages: 5

5 messages: Human → AI (tool calls) → Tool (server info) → Tool (process count) → AI (final answer).

Part 3: Custom State with Typed Dict

# 03_custom_state.py
from typing import TypedDict, List, Optional, Annotated
from langgraph.graph import StateGraph, END
from langchain_ollama import ChatOllama
from langchain_core.messages import BaseMessage, HumanMessage
import operator  # operator.add is the reducer that concatenates message lists

class ResearchState(TypedDict):
    topic: str
    messages: Annotated[List[BaseMessage], operator.add]  # Messages accumulate
    research_notes: List[str]
    final_report: Optional[str]
    iteration: int

llm = ChatOllama(model="qwen3:14b", temperature=0.3)

def research_node(state: ResearchState) -> dict:
    prompt = f"Research the topic: {state['topic']}. Provide 3 key facts. Be specific."
    response = llm.invoke([HumanMessage(prompt)])
    return {
        "messages": [response],
        "research_notes": state["research_notes"] + [response.content],
        "iteration": state["iteration"] + 1
    }

def write_report_node(state: ResearchState) -> dict:
    notes = "\n\n".join(state["research_notes"])
    prompt = f"Based on these research notes, write a 2-paragraph summary:\n\n{notes}"
    response = llm.invoke([HumanMessage(prompt)])
    return {"messages": [response], "final_report": response.content}

def router(state: ResearchState) -> str:
    if state["iteration"] < 2:
        return "research"   # Do more research
    return "write"          # Enough research, write the report

graph = StateGraph(ResearchState)
graph.add_node("research", research_node)
graph.add_node("write", write_report_node)
graph.set_entry_point("research")
graph.add_conditional_edges("research", router)
graph.add_edge("write", END)
app = graph.compile()

result = app.invoke({
    "topic": "pgvector HNSW index performance in PostgreSQL 17",
    "messages": [],
    "research_notes": [],
    "final_report": None,
    "iteration": 0
})

print(f"Research iterations: {result['iteration']}")
print(f"Notes collected: {len(result['research_notes'])}")
print("\nFinal Report:")
print(result["final_report"])

Expected output:

Research iterations: 2
Notes collected: 2

Final Report:
pgvector 0.8 introduced HNSW (Hierarchical Navigable Small World) indexing as the primary
production index type for vector similarity search in PostgreSQL 17. HNSW achieves 95-97%
recall at query latencies of 3-5ms for collections under 1M vectors...
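
The `Annotated[List[BaseMessage], operator.add]` field in ResearchState attaches a reducer to the state key. How such an annotation could be read and applied is sketched below with the standard library only. ToyState and apply_update are hypothetical names for illustration; this is not LangGraph's internal merge code, just the idea that a key with a reducer is combined with the old value while a key without one is overwritten.

```python
import operator
from typing import Annotated, List, TypedDict, get_type_hints

class ToyState(TypedDict):
    notes: Annotated[List[str], operator.add]  # reducer: concatenate lists
    iteration: int                              # no reducer: last write wins

def apply_update(state: dict, update: dict, schema=ToyState) -> dict:
    """Merge a node's partial update into state, honoring reducer annotations."""
    hints = get_type_hints(schema, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        # Annotated[...] types expose their extra arguments via __metadata__
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata and callable(metadata[0]):
            merged[key] = metadata[0](state[key], value)  # old_list + new_list
        else:
            merged[key] = value  # plain overwrite
    return merged

state = {"notes": ["fact A"], "iteration": 1}
state = apply_update(state, {"notes": ["fact B"], "iteration": 2})
print(state)
```

This is also why research_node builds `state["research_notes"] + [response.content]` by hand: research_notes has no reducer, so whatever the node returns replaces the previous list.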

Part 4: Human-in-the-Loop with Checkpointing

# 04_hitl.py
from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.graph import StateGraph, MessagesState, END
from langchain_ollama import ChatOllama
from langchain_core.messages import HumanMessage, AIMessage
from langchain_core.tools import tool

@tool
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email. This action is irreversible."""
    # In production: actually send the email
    return f"Email sent to {to}: '{subject}'"

llm = ChatOllama(model="llama4:scout", temperature=0)
llm_with_tools = llm.bind_tools([send_email])

def agent_node(state: MessagesState) -> dict:
    response = llm_with_tools.invoke(state["messages"])
    return {"messages": [response]}

def should_continue(state: MessagesState) -> str:
    last = state["messages"][-1]
    if getattr(last, "tool_calls", None):
        return "review"   # Human review before tool execution
    return END

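def review_node(state: MessagesState) -> dict:
    """HITL: agent pauses here for human approval."""
    last = state["messages"][-1]
    print("\n[HUMAN REVIEW REQUIRED]")
    for tc in last.tool_calls:
        print(f"  Tool: {tc['name']}")
        print(f"  Args: {tc['args']}")
    approval = input("Approve? (y/n): ").strip().lower()
    if approval != "y":
        # Answer every pending tool call so the message history stays valid
        return {"messages": [
            ToolMessage(content="Action cancelled by user.", tool_call_id=tc["id"])
            for tc in last.tool_calls
        ]}
    return {"messages": []}   # No new messages → the pending tool calls run next

def after_review(state: MessagesState) -> str:
    # Approved: the last message is still the AI message carrying tool_calls.
    # Rejected: the last message is a ToolMessage, so skip tool execution.
    last = state["messages"][-1]
    return "tools" if getattr(last, "tool_calls", None) else "agent"

import sqlite3
from langchain_core.messages import ToolMessage
from langgraph.prebuilt import ToolNode
tool_node = ToolNode([send_email])

# Recent langgraph-checkpoint-sqlite releases return a context manager from
# from_conn_string(), so pass a connection to the constructor for a long-lived saver
memory = SqliteSaver(sqlite3.connect("checkpoints.db", check_same_thread=False))

graph = StateGraph(MessagesState)
graph.add_node("agent",  agent_node)
graph.add_node("review", review_node)
graph.add_node("tools",  tool_node)
graph.set_entry_point("agent")
graph.add_conditional_edges("agent", should_continue)
graph.add_conditional_edges("review", after_review)  # Rejected calls never reach "tools"
graph.add_edge("tools", "agent")

app = graph.compile(checkpointer=memory)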

# Session with persistent memory
config = {"configurable": {"thread_id": "email-session-001"}}

result = app.invoke(
    {"messages": [HumanMessage("Send a welcome email to [email protected]")]},
    config=config
)
# Agent pauses at review_node and prompts for human approval

Expected output:

[HUMAN REVIEW REQUIRED]
  Tool: send_email
  Args: {'to': '[email protected]', 'subject': 'Welcome!', 'body': 'Welcome to our platform...'}
Approve? (y/n): y

[Email sent successfully]
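
The thread_id in the config dict is what ties an invocation to its saved history. The idea can be illustrated with a toy store built on the standard sqlite3 module. ToyCheckpointer and invoke below are hypothetical stand-ins, and the one-row-per-thread table is an assumption made for this sketch; it is not SqliteSaver's actual schema or API.

```python
import json
import sqlite3

class ToyCheckpointer:
    """Hypothetical stand-in for a checkpointer: one JSON blob per thread_id."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints (thread_id TEXT PRIMARY KEY, state TEXT)"
        )

    def load(self, thread_id):
        row = self.conn.execute(
            "SELECT state FROM checkpoints WHERE thread_id = ?", (thread_id,)
        ).fetchone()
        # Unknown thread: start from an empty history
        return json.loads(row[0]) if row else {"messages": []}

    def save(self, thread_id, state):
        self.conn.execute(
            "INSERT OR REPLACE INTO checkpoints (thread_id, state) VALUES (?, ?)",
            (thread_id, json.dumps(state)),
        )
        self.conn.commit()

def invoke(checkpointer, user_text, config):
    """Resume the thread named in config, append one turn, and persist it."""
    thread_id = config["configurable"]["thread_id"]
    state = checkpointer.load(thread_id)
    state["messages"].append(user_text)
    checkpointer.save(thread_id, state)
    return state

store = ToyCheckpointer()
config = {"configurable": {"thread_id": "email-session-001"}}
invoke(store, "Send a welcome email", config)
resumed = invoke(store, "Also CC the team", config)
other = invoke(store, "Hello", {"configurable": {"thread_id": "another-thread"}})
print(len(resumed["messages"]), len(other["messages"]))
```

The second call with the same thread_id sees the first message, while a different thread_id starts from an empty history.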

Troubleshooting

`ValidationError: 1 validation error for MessagesState`

Cause: Adding a non-BaseMessage object to the messages list. Fix: Wrap all messages in HumanMessage(), AIMessage(), or ToolMessage() — never use plain dicts.

Graph runs forever (infinite loop)

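Cause: The routing function on a conditional edge always returns the same node. LangGraph eventually stops such a run with a GraphRecursionError once the recursion limit (25 steps by default) is reached, but it is better to exit deliberately. Fix: Add a safety counter: if state.get("iteration", 0) > 10: return END in your routing function.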
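
A minimal sketch of the safety counter, assuming the state carries an iteration field that the looping node increments (as ResearchState does in Part 3). END is defined locally as a stand-in so the snippet runs without LangGraph installed, and the done flag and MAX_ITERATIONS are hypothetical names for illustration.

```python
END = "__end__"  # local stand-in for langgraph.graph.END
MAX_ITERATIONS = 10

def router(state: dict) -> str:
    """Route back to 'research' until the work is done or the cap is hit."""
    if state.get("iteration", 0) > MAX_ITERATIONS:
        return END  # safety valve: stop even if the normal exit never triggers
    if state.get("done"):
        return END  # normal exit condition
    return "research"

# Simulate a node that never sets done=True
state = {"iteration": 0, "done": False}
steps = 0
while router(state) != END:
    state["iteration"] += 1  # what the node would do on each pass
    steps += 1
print(steps)
```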

`ModuleNotFoundError: langgraph.checkpoint.sqlite`

Fix: pip install langgraph-checkpoint-sqlite --break-system-packages. The SQLite checkpointer ships as a separate package from langgraph itself.

LangGraph vs CrewAI vs AutoGen: Framework Comparison

Feature	LangGraph	CrewAI	AutoGen	LangChain
Graph-based workflows	✅ Native	❌ Role-based only	⚠️ Custom loops	❌ Chains only
Conditional edges	✅ Built-in	⚠️ Via Router	⚠️ Custom code	❌ Not directly
Tool use (function calling)	✅ ToolNode + tool_executor	✅ Agent tools	✅ Tool_use patterns	✅ Via bind_tools()
Checkpointing/memory	✅ PostgreSQL, SQLite	❌ No native checkpointer	⚠️ Custom storage	❌ Not native
Human-in-the-loop	✅ `interrupt_before=[]`	❌ Complex workaround	⚠️ Custom pause logic	❌ No built-in
Local LLM support (Ollama)	✅ via langchain-ollama	✅ Partial	✅ Partial	✅ Native
Learning curve	Moderate	Steep (role semantics)	Steep (AG2 syntax)	Shallow
Production-ready	✅ Yes (0.3.x stable)	✅ Yes	✅ Yes	✅ Yes
Best use case	Complex stateful agents	Multi-role teams	Group conversations	Simple pipelines

Winner for sovereign agents: LangGraph — native graph control + checkpointing + local LLM support = production-grade agentic systems with full data privacy.

Agent Loop in Action — Step-by-Step Walkthrough

When you call app.invoke({"messages": [HumanMessage("What's the CPU load?")]}), here’s what happens:

STEP 1: INVOKE
  Input: {"messages": [HumanMessage("What's the CPU load?")]}
  
STEP 2: AGENT NODE RUNS
  - State message history: [Human("What's the CPU load?")]
  - LLM reads messages
  - LLM sees tool definitions (get_server_info, count_processes)
  - LLM decides: "I should call get_server_info tool"
  - LLM returns: AIMessage with tool_calls=[ToolCall(name="get_server_info", ...)]
  - State updated: [Human(...), AI(tool_calls=[...])]
  
STEP 3: CONDITIONAL EDGE CHECK
  - Does last message have tool_calls?
  - YES → route to TOOL NODE
  - NO → route to END (done)
  
STEP 4: TOOL NODE EXECUTES
  - Read tool_calls from AI message
  - Execute: subprocess.run(...) to get server CPU/memory
  - Return result: "CPU load: 2.45 | Memory: 8GB used of 16GB"
  - Create ToolMessage with result
  - State updated: [Human(...), AI(tool_calls=[...]), Tool(result="...")]
  
STEP 5: LOOP BACK TO AGENT NODE
  - Agent reads updated message history (now includes tool result)
  - Agent responds: AIMessage("The CPU load is 2.45...")
  - Does this new message have tool_calls? NO
  - Route to END
  
STEP 6: END
  - Return final state: [Human(...), AI(tool_calls=[...]), Tool(...), AI(final answer)]
  - User sees: "The CPU load is 2.45..."
  
TOTAL TIME: hardware-dependent (two local Ollama inference calls + one subprocess call)

Tool Execution Error Handling

Real tools fail. Network timeouts, permission denied, subprocess crashes. Here’s how to handle them:

┌──────────────────────┐
│  Tool Call Received  │
│ (get_server_info)    │
└──────────┬───────────┘
           │
           ↓
    ┌─────────────┐
    │ EXECUTE     │
    │ subprocess  │
    └──────┬──────┘
           │
    ┌──────┴─────────┐
    │                │
    ↓                ↓
 SUCCESS        FAILURE
   │                │
   │          ┌─────────────────┐
   │          │ ERROR TYPE?     │
   │          └────┬────────────┘
   │               │
   │        ┌──────┴──────────┐
   │        │                 │
   │        ↓                 ↓
   │     RETRY-ABLE      PERMANENT
   │     (timeout,        (permission,
   │      network)        not found)
   │        │                 │
   │        ↓                 ↓
   │    RETRY x3          FAIL: Return
   │ exponential          error message
   │  backoff             to user
   │        │                 │
   │        ↓                 ↓
   │    ┌────────────────────┐
   └────→ Return result      │
        (to agent) ← FAIL msg
           │
           ↓
        ┌────────────────────┐
        │ Agent reflects on  │
        │ result + continues │
        └────────────────────┘
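
The branching in the diagram above (retry transient errors with backoff, fail fast on permanent ones) can be sketched as follows. The split of exception types into RETRYABLE and PERMANENT, the call_with_policy helper, and the injectable sleep parameter are assumptions made for this illustration; real tools will need their own classification.

```python
import time

RETRYABLE = (TimeoutError, ConnectionError)        # transient: worth retrying
PERMANENT = (PermissionError, FileNotFoundError)   # retrying will not help

def call_with_policy(func, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Return (ok, value_or_error_text) following the retry/fail branches."""
    for attempt in range(max_retries):
        try:
            return True, func()
        except PERMANENT as exc:
            return False, f"permanent failure: {exc}"
        except RETRYABLE as exc:
            if attempt == max_retries - 1:
                return False, f"gave up after {max_retries} attempts: {exc}"
            sleep(base_delay * 2 ** attempt)  # exponential backoff: 1s, 2s, ...

calls = {"n": 0}

def flaky():
    """Fails twice with a transient error, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("slow host")
    return "CPU load: 0.42"

def forbidden():
    raise PermissionError("cannot read /proc/loadavg")

delays = []  # record requested delays instead of actually sleeping
flaky_result = call_with_policy(flaky, sleep=delays.append)
forbidden_result = call_with_policy(forbidden, sleep=delays.append)
print(flaky_result, forbidden_result, delays)
```

Returning the error text instead of raising lets the tool node hand the failure back to the agent as an ordinary message, so the model can explain it or try another approach.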

Part 5: Production Patterns & Error Recovery

Retry Logic with Exponential Backoff

# Production-grade LangGraph with automatic retries
import asyncio
from langgraph.graph import MessagesState
from langchain_core.messages import ToolMessage

async def safe_tool_call_with_retry(tool, input_data, max_retries=3):
    """Call tool with exponential backoff retry."""
    for attempt in range(max_retries):
        try:
            # Run the synchronous tool in a worker thread so the event loop stays free
            return await asyncio.to_thread(tool.invoke, input_data)
        except Exception as e:
            if attempt == max_retries - 1:
                raise  # Final attempt failed
            wait_time = 2 ** attempt  # 1s, then 2s (no wait after the final attempt)
            print(f"Tool error: {e}. Retrying in {wait_time}s...")
            await asyncio.sleep(wait_time)

def create_resilient_tool_node(tools_dict):
    """Tool node that retries failed calls and reports errors back to the LLM."""
    async def resilient_execute_tools(state: MessagesState):
        last_message = state["messages"][-1]
        results = []

        for tool_call in last_message.tool_calls:
            tool_name = tool_call["name"]
            tool = tools_dict.get(tool_name)
            if not tool:
                content = f"Error: unknown tool: {tool_name}"
            else:
                try:
                    content = str(await safe_tool_call_with_retry(tool, tool_call["args"]))
                except Exception as e:
                    content = f"Error calling {tool_name}: {e}"
            # One ToolMessage per tool call, matched to it by tool_call_id
            results.append(
                ToolMessage(content=content, tool_call_id=tool_call["id"], name=tool_name)
            )

        return {"messages": results}

    return resilient_execute_tools

Timeout Enforcement & Circuit Breakers

import time

class CircuitBreaker:
    """Prevent cascading failures when downstream services are down."""
    
    def __init__(self, failure_threshold=5, timeout=60):
        self.failure_count = 0
        self.failure_threshold = failure_threshold
        self.timeout = timeout
        self.last_failure_time = None
        self.is_open = False
    
    def call(self, func, *args, **kwargs):
        """Execute func with circuit breaker protection."""
        if self.is_open:
            # Check if timeout has passed; try to recover
            if time.time() - self.last_failure_time > self.timeout:
                self.is_open = False
                self.failure_count = 0
            else:
                raise Exception("Circuit breaker open: downstream service unavailable")
        
        try:
            result = func(*args, **kwargs)
            self.failure_count = 0  # Reset on success
            return result
        except Exception as e:
            self.failure_count += 1
            self.last_failure_time = time.time()
            if self.failure_count >= self.failure_threshold:
                self.is_open = True
            raise

# Usage: protect database calls
# execute_query is a placeholder for your own database function
db_breaker = CircuitBreaker(failure_threshold=5)

def tool_query_database(query: str):
    return db_breaker.call(execute_query, query)
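
The circuit breaker above handles repeated failures but not a single call that hangs. One way to enforce a timeout is sketched below with concurrent.futures from the standard library. call_with_timeout, slow_tool, and the pool size are hypothetical choices for illustration. Note the limitation stated in the docstring: the caller stops waiting, but Python cannot forcibly kill the worker thread.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

_executor = ThreadPoolExecutor(max_workers=4)

def call_with_timeout(func, timeout_s, *args, **kwargs):
    """Run func in a worker thread and stop waiting after timeout_s seconds.

    The worker thread itself is not killed; only the caller stops waiting.
    """
    future = _executor.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        future.cancel()  # only effective if the call has not started yet
        raise TimeoutError(f"{func.__name__} exceeded {timeout_s}s") from None

def slow_tool():
    time.sleep(0.5)  # simulates a hanging downstream service
    return "done"

fast = call_with_timeout(lambda: "ok", 1.0)
try:
    call_with_timeout(slow_tool, 0.05)
    timed_out = False
except TimeoutError:
    timed_out = True
print(fast, timed_out)
```

A timeout raised this way counts as a failure like any other exception, so wrapping call_with_timeout inside db_breaker.call(...) would let slow calls trip the breaker too.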

Part 6: Monitoring & Observability

Comprehensive Agent Logging

import logging
import time
from datetime import datetime
import json

class AgentLogger:
    """Production logging for agent state and metrics."""
    
    def __init__(self, filepath: str):
        self.filepath = filepath
        self.logger = logging.getLogger("langgraph_agent")
        handler = logging.FileHandler(filepath)
        handler.setFormatter(logging.Formatter(
            '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
        ))
        self.logger.addHandler(handler)
        self.logger.setLevel(logging.DEBUG)
    
    def log_graph_run(self, graph_id: str, initial_input: dict, result: dict, duration_ms: float):
        """Log complete graph execution."""
        self.logger.info(json.dumps({
            "event": "graph_run",
            "graph_id": graph_id,
            "duration_ms": duration_ms,
            "input_length": initial_input.get("prompt_length", 0),  # characters, not tokens
            "output_length": len(str(result)),
            "timestamp": datetime.now().isoformat()
        }))
    
    def log_error(self, graph_id: str, node_name: str, error: Exception):
        """Log node-level errors."""
        self.logger.error(json.dumps({
            "event": "node_error",
            "graph_id": graph_id,
            "node": node_name,
            "error_type": type(error).__name__,
            "error_message": str(error),
            "timestamp": datetime.now().isoformat()
        }))
    
    def log_tool_use(self, graph_id: str, tool_name: str, input_data: dict, output: str, duration_ms: float):
        """Log tool execution."""
        self.logger.info(json.dumps({
            "event": "tool_execution",
            "graph_id": graph_id,
            "tool": tool_name,
            "input_length": len(str(input_data)),
            "output_length": len(output),
            "duration_ms": duration_ms,
            "timestamp": datetime.now().isoformat()
        }))

# Production usage
logger = AgentLogger("/var/log/langgraph_agent.json")

# In your graph (app, prompt, and unique_id come from your own application code):
start_time = time.time()
result = app.invoke({"messages": [...]}, config={"configurable": {"thread_id": unique_id}})
duration_ms = (time.time() - start_time) * 1000
logger.log_graph_run(unique_id, {"prompt_length": len(prompt)}, result, duration_ms)

Part 7: Cost Analysis & Resource Optimization

Token-Level Cost Tracking for Agents

Single LangGraph Agent Query Breakdown:

Local Ollama (Qwen3 14B Quantized):
  System prompt: 100 tokens
  Agent loop iteration 1:
    - LLM call: 200 input tokens → 150 output tokens
    - Tool invocation (no LLM cost)
  Agent loop iteration 2:
    - LLM call: 300 input tokens → 100 output tokens
  Total: 600 input + 250 output = 850 tokens processed
  Cost: $0.00 in API fees (runs locally)
  Latency: roughly 12-25 seconds of generation (250 output tokens ÷ 10-20 tok/sec)

Cloud API (GPT-4o):
  Same 2-iteration loop: 600 input + 250 output tokens
  Cost: (600 × $0.005 + 250 × $0.015) / 1000 = $0.00675
  Latency: 2-3 seconds (faster, but costs accumulate)

Scaling to 1M queries/day:
  Local: $0/day in API fees, ≈ 12.5M seconds of compute at 20 tok/sec
         (≈ 145 servers if each handles one query at a time, @ $50/mo each)
  Cloud: $6,750/day (pay API directly, no servers to run)
  
Annual cost:
  Local: ≈ $87,000/year (infrastructure: 145 servers × $50/mo × 12)
  Cloud: ≈ $2.46M/year (API fees: $6,750/day × 365)
  Savings: ≈ $2.38M/year for sovereign approach
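
The cost arithmetic can be recomputed from the token counts and per-1K-token prices quoted above with a few lines of Python. This is a sketch for checking and adjusting the estimate; the function names are hypothetical, and seconds_per_query = 12.5 is an assumption (250 output tokens at 20 tok/sec, ignoring prompt processing and any batching).

```python
import math

def query_cost_usd(input_tokens, output_tokens, in_price_per_1k, out_price_per_1k):
    """API cost of one agent query, given per-1K-token prices."""
    return (input_tokens * in_price_per_1k + output_tokens * out_price_per_1k) / 1000

def servers_needed(queries_per_day, seconds_per_query):
    """Servers required if each one handles a single query at a time."""
    return math.ceil(queries_per_day * seconds_per_query / 86_400)

per_query = query_cost_usd(600, 250, 0.005, 0.015)
per_day = per_query * 1_000_000
per_year = per_day * 365
servers = servers_needed(1_000_000, 12.5)

print(f"cloud per query: ${per_query:.5f}")
print(f"cloud per day:   ${per_day:,.0f}")
print(f"cloud per year:  ${per_year:,.0f}")
print(f"local servers:   {servers}")
```

Swapping in current API prices or a measured tokens-per-second figure for your own hardware changes the comparison quickly, so it is worth rerunning before relying on the totals.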

Memory Optimization

# Large conversation histories blow up context windows
# Solution: Summarize before checkpointing
# (llm and MessagesState are the objects defined in the earlier parts)
from langchain_core.messages import HumanMessage, RemoveMessage

def summarize_before_checkpoint(state: MessagesState) -> dict:
    """Compress conversation history before saving checkpoint."""
    if len(state["messages"]) > 20:
        # Keep only recent 5 messages + summarize rest
        old_messages = state["messages"][:-5]

        # Summarize old messages (1 LLM call vs storing 1000s of tokens)
        transcript = "\n".join(f"{m.type}: {m.content}" for m in old_messages)
        summary = llm.invoke(f"Summarize this conversation:\n{transcript}")

        # MessagesState appends by default, so old messages must be deleted
        # explicitly with RemoveMessage; the summary lands after the recent ones
        return {
            "messages": [RemoveMessage(id=m.id) for m in old_messages]
            + [HumanMessage(content=f"[Summary] {summary.content}")]
        }
    return {"messages": []}  # Nothing to compress

Part 8: Geo-Specific Deployment Considerations

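EU GDPR Compliance for Agent Checkpoints

# PostgreSQL checkpoint storage with encryption (Kubernetes Secret)
apiVersion: v1
kind: Secret
metadata:
  name: langgraph-postgres-eu
  namespace: agents
type: Opaque
stringData:
  connection_string: "postgresql://user:[email protected]:5432/langgraph_db?sslmode=require"

# In Python:
from langgraph.checkpoint.postgres import PostgresSaver
import os

# Encryption in transit is set by sslmode=require in the connection string itself
# (see the Secret above); from_conn_string() is a context manager that yields the saver

# GDPR: Delete user data on request
def delete_agent_history(thread_id: str):
    """Right to be forgotten: purge all checkpoints stored for one thread."""
    with PostgresSaver.from_conn_string(os.environ["POSTGRES_EU_CONNECTION"]) as eu_checkpoint:
        eu_checkpoint.setup()  # Creates the checkpoint tables on first use
        # delete_thread() exists in recent langgraph-checkpoint releases; verify it
        # against your installed version. Assumes one thread_id per user.
        eu_checkpoint.delete_thread(thread_id)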

Part 9: Distributed Agent Orchestration

For multi-agent systems (10+ agents running simultaneously):

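from concurrent.futures import ThreadPoolExecutor
from langchain_core.messages import HumanMessage  # app below is a compiled graph from the earlier parts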

class AgentPool:
    """Run multiple agents in parallel with resource limits."""
    
    def __init__(self, num_workers: int = 4):
        self.executor = ThreadPoolExecutor(max_workers=num_workers)
        self.results = {}
    
    def submit_agent(self, agent_id: str, app, input_data: dict):
        """Queue agent for execution."""
        future = self.executor.submit(lambda: app.invoke(input_data))
        self.results[agent_id] = future
    
    def get_result(self, agent_id: str, timeout_s: int = 30):
        """Retrieve result with timeout."""
        return self.results[agent_id].result(timeout=timeout_s)

# Production: 5 agents processing queries in parallel
pool = AgentPool(num_workers=5)
for i in range(5):
    pool.submit_agent(f"agent_{i}", app, {"messages": [HumanMessage("analyze this")]})

# Collect results with error handling
results = {}
for agent_id in range(5):
    try:
        results[f"agent_{agent_id}"] = pool.get_result(f"agent_{agent_id}")
    except Exception as e:
        results[f"agent_{agent_id}"] = {"error": str(e)}

Conclusion

LangGraph 0.3 provides the graph primitives (nodes, edges, state, checkpointers, interrupts) to build any agentic workflow, from simple loops to complex multi-agent systems. All examples run against local Ollama models: zero cloud API cost, and no prompts or data sent to third-party APIs.

See AI Agent Design Patterns 2026 for the conceptual framework behind these patterns, and LangChain and LangGraph Sovereign Harness for broader LangChain ecosystem patterns.

AI Agent Design Patterns 2026: Reflection, Tool Use, Planning & Multi-Agent

>_ 13 May | 18 min | Dev Corner

🟡Intermediate

Build sovereign AI agents from first principles. Covers the four agentic design patterns — reflection, tool use, planning, and multi-agent — with Python code examples using local Ollama models.

By Kofi Mensah

CrewAI Tutorial 2026: Multi-Agent Systems with Local Ollama

>_ 15 May | 24 min | Dev Corner

🟡Intermediate

Build sovereign multi-agent crews with CrewAI and local Ollama models. Covers role-based agents, task delegation, crew orchestration, tool integration, and production deployment with zero cloud APIs.

By Kofi Mensah

Build a Sovereign Coding Agent with LangChain Deep Agents: The 2026 Harness

>_ 20 Mar | 18 min | Dev Corner

🟡Intermediate

Build a high-autonomy agent using LangChain Deep Agents with planning, filesystem access, and subagent spawning. 100% local-first with MCP support.

By Divya Prakash
