Key Takeaways
- Agents have personas: Role + goal + backstory shapes how the LLM approaches each task — a “Senior Security Researcher” agent frames its analysis differently from a “Technical Writer” agent. See AI Agent Design Patterns 2026 for the underlying principles.
- Tasks have outputs: Each Task defines `expected_output`, which sets the quality bar and format, and `output_file`, which saves results to disk automatically. Combine with Docker Volumes to persist agent outputs in production.
- Sequential vs hierarchical: Sequential processes run tasks in order, passing outputs between agents. Hierarchical adds a manager LLM that dynamically assigns tasks — more flexible, uses more inference calls. Compare with LangChain and LangGraph Local Agents for graph-based state management.
- Local models work identically: `LLM(model="ollama/llama4:scout")` replaces `LLM(model="gpt-4o")` with no other code changes — same crew, same tasks, zero API cost.
Introduction
Direct Answer: How do I build a multi-agent CrewAI system with local Ollama models in 2026?
Install CrewAI, create agents (Researcher, Writer, Reviewer) with local Ollama LLM configured, define tasks with expected outputs, assemble into a Crew with hierarchical process, and execute with crew.kickoff(). All agent reasoning, tool execution, and task delegation runs locally on Ollama with zero cloud API calls. Requires Llama 4 Scout or Qwen3 14B for reliable agent behavior.
Part 1: Environment Setup
Before building agents, ensure Ollama is running locally with a capable model loaded. CrewAI communicates with Ollama via HTTP on port 11434.
```bash
pip install crewai --break-system-packages

# Verify
python3 -c "import crewai; print('CrewAI:', crewai.__version__)"

# Ensure capable model is available
ollama list | grep -E "llama4|qwen3:14b"
```
Expected output:
```
CrewAI: 0.80.4

NAME          SIZE
llama4:scout  12 GB
```
Part 2: Building Agents with Local LLM
Agents are the workers in a multi-agent crew. Each agent has a role, goal, and backstory that shapes how it approaches tasks. The backstory is critical — it frames the agent’s expertise and decision-making.
```python
# simple_crew.py
from crewai import Agent, Task, Crew, Process, LLM
# Local Ollama model — identical API to OpenAI
local_llm = LLM(
model="ollama/llama4:scout",
base_url="http://localhost:11434",
temperature=0.3,
)
# ── Define Agents ─────────────────────────────────────────────────────────
researcher = Agent(
role="Senior Technical Researcher",
goal="Find accurate, specific, and actionable technical information",
backstory="""You are an expert researcher with 10 years of experience in
software engineering and infrastructure. You always provide specific version
numbers, exact commands, and verified facts. You never guess.""",
llm=local_llm,
verbose=False,
max_iter=5, # Max reasoning iterations before forced answer
)
writer = Agent(
role="Technical Documentation Writer",
goal="Transform research into clear, structured technical documentation",
backstory="""You write developer documentation that is precise, concise,
and immediately actionable. You use bullet points, code examples, and
clear headings. You never use marketing language.""",
llm=local_llm,
verbose=False,
)
# ── Define Tasks ──────────────────────────────────────────────────────────
research_task = Task(
description="""Research the topic: {topic}
Find: what it is, why it matters, and 3 specific use cases with examples.""",
expected_output="""A factual research summary with:
- Clear definition
- 3 concrete use cases with specific examples
- Key technical specifications or version information""",
agent=researcher,
)
writing_task = Task(
description="""Write a developer quickstart guide for: {topic}
Use the research findings provided. Include working code examples.""",
expected_output="""A structured quickstart guide with:
- 1-paragraph introduction
- Prerequisites (numbered list)
- 3-5 step setup guide with code
- One working code example
- Common pitfall to avoid""",
agent=writer,
context=[research_task], # Receives researcher's output as context
)
# ── Assemble Crew ─────────────────────────────────────────────────────────
crew = Crew(
agents=[researcher, writer],
tasks=[research_task, writing_task],
process=Process.sequential,
verbose=False,
)
# ── Run ───────────────────────────────────────────────────────────────────
result = crew.kickoff(inputs={"topic": "Redis pub/sub for real-time notifications in Python"})
print(result.raw)
```
Expected output:
## Redis Pub/Sub Quickstart for Python Developers
Redis Pub/Sub enables real-time message broadcasting between services using a
publisher/subscriber pattern where publishers send to channels and subscribers
receive without direct coupling.
**Prerequisites:**
1. Redis 7.4+ running on localhost:6379
2. Python 3.12+ with redis-py: `pip install redis`
**Setup:**
**Step 1 — Publisher:**
```python
import redis
r = redis.Redis()
r.publish('notifications', '{"user_id": 42, "event": "new_message"}')
…
```
---
Part 3: Sequential Crew with Custom Tools
This crew equips agents with real tools and chains three tasks through context passing under Process.sequential; the final task saves its report to disk via output_file. The hierarchical process, where a manager agent dynamically assigns tasks, is covered in Part 4.
```python
# assessment_crew.py — agents with tools, sequential tasks, auto-saved report
from crewai import Agent, Task, Crew, Process, LLM
from crewai.tools import tool
import subprocess
local_llm = LLM(model="ollama/llama4:scout", base_url="http://localhost:11434")
# Tools available to agents
@tool("Server Health Check")
def check_server_health(host: str) -> str:
"""Check if a server is reachable and return its HTTP status."""
result = subprocess.run(
["curl", "-sI", "--max-time", "3", f"http://{host}"],
capture_output=True, text=True
)
return result.stdout.split('\n')[0] if result.returncode == 0 else f"Unreachable: {result.stderr}"
@tool("List Running Services")
def list_services() -> str:
"""List all active systemd services."""
result = subprocess.run(
["systemctl", "list-units", "--type=service", "--state=running", "--no-pager", "--plain"],
capture_output=True, text=True
)
return result.stdout[:1000] # Truncate long output
# Specialised agents
infra_agent = Agent(
role="Infrastructure Engineer",
goal="Assess server health and identify operational issues",
backstory="Expert in Linux systems, networking, and service management. Uses tools to gather actual system data.",
tools=[check_server_health, list_services],
llm=local_llm,
)
security_agent = Agent(
role="Security Analyst",
goal="Identify security vulnerabilities and misconfigurations",
backstory="Specialises in server hardening, CVE analysis, and security best practices for Ubuntu servers.",
llm=local_llm,
)
report_agent = Agent(
role="Report Writer",
goal="Produce clear, actionable technical reports from findings",
backstory="Transforms technical findings into structured reports with prioritised action items.",
llm=local_llm,
)
# Tasks
health_task = Task(
description="Check the health of localhost and list running services",
expected_output="Server health status with list of running services",
agent=infra_agent,
)
security_task = Task(
description="Analyse the server configuration for security issues. Focus on: exposed ports, unnecessary services, and hardening gaps",
expected_output="Security findings with severity levels (Critical/High/Medium/Low) and specific fixes",
agent=security_agent,
context=[health_task],
)
report_task = Task(
description="Write an executive summary of the server assessment",
expected_output="2-page technical report: Summary, Findings (prioritised), Action Items (numbered)",
agent=report_agent,
context=[health_task, security_task],
output_file="server-assessment-report.md", # Auto-saves to file
)
crew = Crew(
agents=[infra_agent, security_agent, report_agent],
tasks=[health_task, security_task, report_task],
process=Process.sequential,
verbose=True,
)
result = crew.kickoff()
print("\nReport saved to: server-assessment-report.md")
Part 4: Hierarchical Crew with a Manager Agent
A manager agent makes dynamic decisions about which worker agent handles each task:
```python
# hierarchical_with_manager.py — Complete Hierarchical Crew with Manager Agent Decision-Making
# Manager agent evaluates each task and assigns it to the most suitable worker agent
# Use case: complex multi-step projects where task type determines which specialist handles it
from crewai import Agent, Task, Crew, Process, LLM
# ══════════════════════════════════════════════════════════════════════════════════════════════
# Local LLM Configuration — Ollama Integration
# ══════════════════════════════════════════════════════════════════════════════════════════════
# Local Ollama model on http://localhost:11434
# temperature=0.3: Low randomness, deterministic output (good for structured tasks)
# Higher temperature (>0.7): More creative, varied responses (good for brainstorming)
local_llm = LLM(
model="ollama/llama4:scout",
base_url="http://localhost:11434",
temperature=0.3 # Conservative: focus on accuracy, not creativity
)
# ══════════════════════════════════════════════════════════════════════════════════════════════
# Manager Agent — Orchestrates Team Task Assignment
# ══════════════════════════════════════════════════════════════════════════════════════════════
# Manager role: evaluates task requirements and routes to appropriate specialist
# Key difference from sequential crew: manager can make dynamic decisions per task
# Instead of pre-assigning tasks, manager reviews task description and assigns based on expertise
manager = Agent(
role="Project Manager",
goal="Coordinate a team to deliver high-quality technical documentation and code",
# Backstory provides context for decision-making; influences how manager evaluates tasks
# "Experienced project manager" persona helps with task prioritization
backstory="""You are an experienced project manager overseeing a distributed team of
specialists. You decide which team member is best suited for each task based on:
- Task complexity and type (code, documentation, review, testing)
- Individual agent expertise and past performance
- Delivery timeline and quality requirements
You prioritise quality, accuracy, and timely delivery over speed.""",
llm=local_llm,
verbose=True, # Log all manager decisions and reasoning
)
# ══════════════════════════════════════════════════════════════════════════════════════════════
# Worker Agents — Specialised Teams for Different Tasks
# ══════════════════════════════════════════════════════════════════════════════════════════════
# ── Code Expert Agent ──────────────────────────────────────────────────────────────────────
# Assigned to: tasks requiring code implementation, architectural decisions, optimization
code_expert = Agent(
role="Senior Software Engineer",
goal="Provide expert-level code examples and technical deep dives with best practices",
# Backstory emphasises: years of experience, production-quality standards, security mindset
backstory="""You are a principal engineer with 15+ years of experience in distributed systems
and scalable architecture. You write production-grade code with comprehensive error handling,
security considerations, and performance optimizations. Your code includes detailed comments
and follows SOLID principles.""",
llm=local_llm,
)
# ── Technical Writer Agent ────────────────────────────────────────────────────────────────
# Assigned to: explaining concepts, creating tutorials, simplifying complex topics
writer = Agent(
role="Technical Writer",
goal="Translate complex technical concepts into clear, accessible documentation",
# Backstory emphasises: accessibility, junior developer mindset, concrete examples
backstory="""You specialise in making advanced technical topics accessible to junior developers.
You use real-world analogies, step-by-step instructions, code snippets, and concrete examples.
You anticipate common questions and explain the 'why' behind technical decisions.""",
llm=local_llm,
)
# ── Code Reviewer Agent ────────────────────────────────────────────────────────────────
# Assigned to: code review, quality assurance, best practices validation, edge case testing
reviewer = Agent(
role="QA Engineer",
goal="Ensure technical accuracy and catch edge cases",
backstory="""You test everything. You find edge cases others miss. You ensure all
code examples actually work and all claims are technically sound.""",
llm=local_llm,
)
# ── Tasks (default agents set — the manager may reassign) ──────────────────
task1 = Task(
description="""Write production-grade Python code for a rate limiter that uses
token buckets and works with async/await. Include comprehensive error handling.""",
expected_output="Complete, tested Python code with docstrings and type hints",
agent=code_expert, # Manager can reassign, but this is the default
)
task2 = Task(
description="""Explain the token bucket algorithm in simple terms. Use a water
bucket metaphor. Then explain why it's better than fixed-window rate limiting.""",
expected_output="2-paragraph explanation with metaphor and comparison",
agent=writer,
)
task3 = Task(
description="""Review the code from task1 and the explanation from task2. Check for:
1. Code correctness and edge cases
2. Explanation accuracy
3. Any security issues
Return a list of findings and fixes.""",
expected_output="QA report with findings list and specific fixes",
agent=reviewer,
context=[task1, task2], # Depends on other tasks' outputs
)
# ── Hierarchical Crew (manager coordinates workers) ────────────────────────
crew = Crew(
agents=[code_expert, writer, reviewer], # Workers
tasks=[task1, task2, task3],
manager_agent=manager, # Manager makes assignments
process=Process.hierarchical, # Manager delegates dynamically
verbose=True,
)
# ── Run with manager coordination ──────────────────────────────────────────
# Note: kickoff inputs are interpolated only where task descriptions contain
# {placeholders}; the descriptions above are literal, so none are needed here.
result = crew.kickoff()
print("\n" + "="*80)
print("HIERARCHICAL CREW OUTPUT")
print("="*80)
print(result.raw)
```
Key differences from sequential:
- `process=Process.hierarchical` enables manager decision-making
- `manager_agent=manager` specifies who makes assignments
- Tasks need not pre-assign agents — the manager chooses based on task description and agent roles (here defaults are set, which the manager can override)
- Manager can reassign tasks if it thinks a different agent is better suited
Expected workflow:
```
[Manager]
  "I need code — assigning to code_expert"
    → [Code Expert] produces code
  "I need explanation — assigning to writer"
    → [Writer] produces explanation
  "I need review of both — assigning to reviewer"
    → [Reviewer] produces QA report
[Final Output]
  Complete documentation with all three perspectives
```
When to use hierarchical:
- Task requirements are complex and need human-like delegation decisions
- Agent roles can handle overlapping domains (manager must choose)
- Long task sequences where mid-course corrections needed
- Multi-stage workflows (research → design → implementation → review)
Part 5: Custom Tool Integration
```python
# custom_tools.py — add domain-specific tools to your agents
from crewai import Agent, LLM
from crewai.tools import BaseTool
from pydantic import BaseModel, Field
import requests
class WebSearchInput(BaseModel):
query: str = Field(description="The search query")
max_results: int = Field(default=5, description="Number of results to return")
class SovereignWebSearch(BaseTool):
name: str = "Sovereign Web Search"
description: str = "Search the web using a privacy-respecting search engine (SearXNG)"
args_schema: type[BaseModel] = WebSearchInput
def _run(self, query: str, max_results: int = 5) -> str:
# SearXNG is self-hostable — see https://searxng.github.io/searxng/
# For demo, using public instance
try:
response = requests.get(
"https://searx.be/search",
params={"q": query, "format": "json", "categories": "general"},
timeout=5,
headers={"User-Agent": "SovereignBot/1.0"}
)
results = response.json().get("results", [])[:max_results]
return "\n\n".join(
f"Title: {r['title']}\nURL: {r['url']}\nSnippet: {r.get('content', '')[:200]}"
for r in results
)
except Exception as e:
return f"Search failed: {e}"
# Use in an agent
search_tool = SovereignWebSearch()
research_agent = Agent(
role="Web Researcher",
goal="Research current information using web search",
backstory="Expert researcher who finds accurate, current information from the web.",
tools=[search_tool],
llm=LLM(model="ollama/llama4:scout", base_url="http://localhost:11434"),
)
```
Part 6: Sovereignty Audit
To verify that nothing leaves the machine, run a minimal crew while watching for established connections to anything other than localhost:

```bash
python3 - << 'EOF'
import subprocess, threading, time
import crewai
from crewai import Agent, Task, Crew, LLM
external_connections = []
def monitor():
for _ in range(20):
r = subprocess.run(['ss','-tnp','state','established'],
capture_output=True, text=True)
for line in r.stdout.splitlines():
if 'python' in line and '127.0.0.1' not in line and '::1' not in line:
external_connections.append(line)
time.sleep(0.5)
t = threading.Thread(target=monitor, daemon=True)
t.start()
llm = LLM(model="ollama/qwen3:14b", base_url="http://localhost:11434")
agent = Agent(role="Test", goal="Test", backstory="Test agent", llm=llm)
task = Task(description="Say 'hello'", expected_output="The word hello", agent=agent)
crew = Crew(agents=[agent], tasks=[task])
crew.kickoff()
t.join(timeout=3)
if external_connections:
print(f"External connections found: {external_connections}")
else:
print("✓ Zero external connections — CrewAI + Ollama is fully sovereign")
EOF
```
Expected output:
```
✓ Zero external connections — CrewAI + Ollama is fully sovereign
```
Troubleshooting
Agent stopped due to iteration limit
Cause: max_iter reached — model is looping or can’t complete the task.
Fix: Increase `max_iter` (e.g. `max_iter=10`) on the Agent, or simplify the task description, as sketched below. With smaller models (7B), reduce task complexity.
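A minimal sketch of the fix, reusing the `local_llm` from Part 2:

```python
agent = Agent(
    role="Senior Technical Researcher",
    goal="Find accurate, specific, and actionable technical information",
    backstory="Expert researcher who verifies facts and never guesses.",
    llm=local_llm,
    max_iter=10,  # Allow more reasoning iterations before the forced answer
)
```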
Agents produce inconsistent output format
Cause: Smaller models don’t reliably follow the expected_output structure.
Fix: Switch to llama4:scout or qwen3:14b. Add explicit format instructions to `expected_output`, as in the sketch below.
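For example, tightening the security task from Part 3 (key names are illustrative):

```python
security_task = Task(
    description="Analyse the server configuration for security issues",
    expected_output=(
        'Return ONLY a JSON object with keys: "findings", "severity", '
        '"recommendation". No prose before or after the JSON.'
    ),
    agent=security_agent,
)
```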
Connection refused to Ollama
Fix: Ensure Ollama is running: `ollama serve`. Check: `curl http://localhost:11434/api/version`.
Conclusion
CrewAI with local Ollama delivers multi-agent orchestration with zero cloud cost and full data sovereignty. The crew pattern — specialised agents with distinct roles, defined tasks with context passing, and tool integration — handles complex tasks that would overwhelm a single-agent approach.
See AI Agent Design Patterns 2026 for the underlying patterns CrewAI implements, and LangGraph Tutorial 2026 for a more code-level alternative to CrewAI’s higher-level abstraction.
People Also Ask
What is the difference between CrewAI and LangGraph?
CrewAI is a higher-level framework focused on role-based multi-agent collaboration — you define Agent personas, Tasks, and let the Crew handle orchestration. It’s fast to build with but less flexible. LangGraph is a lower-level graph-based framework where you define every node and edge explicitly — more control, steeper learning curve. Use CrewAI for business-process automation where the agent roles map to human roles (researcher, writer, reviewer). Use LangGraph for complex workflows requiring precise control over state, branching, and tool call sequences.
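To make the contrast concrete, here is a minimal LangGraph sketch (node names and the trivial state are illustrative, not CrewAI code) in which every node and edge is wired explicitly:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    text: str

def research(state: State) -> dict:
    return {"text": "research notes"}  # stand-in for an LLM call

def write(state: State) -> dict:
    return {"text": state["text"] + " -> article"}

graph = StateGraph(State)
graph.add_node("research", research)
graph.add_node("write", write)
graph.add_edge(START, "research")  # explicit wiring; no manager agent
graph.add_edge("research", "write")
graph.add_edge("write", END)
app = graph.compile()
print(app.invoke({"text": ""}))
```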
Does CrewAI support async execution?
CrewAI 0.80+ supports async crew execution with `await crew.kickoff_async(inputs={...})`. For parallel execution of independent tasks, set `async_execution=True` on those tasks; they run concurrently, reducing total time, while tasks with `context=[other_task]` dependencies still wait for their dependencies to complete.
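A sketch, assuming the researcher and writer agents from Part 2:

```python
import asyncio
from crewai import Task, Crew, Process

# Two independent research tasks run concurrently
task_a = Task(description="Research Redis pub/sub", expected_output="Summary",
              agent=researcher, async_execution=True)
task_b = Task(description="Research Redis Streams", expected_output="Summary",
              agent=researcher, async_execution=True)
# The comparison task waits for both via context
task_c = Task(description="Compare the two approaches", expected_output="Comparison",
              agent=writer, context=[task_a, task_b])

crew = Crew(agents=[researcher, writer], tasks=[task_a, task_b, task_c],
            process=Process.sequential)

async def main():
    result = await crew.kickoff_async()
    print(result.raw)

asyncio.run(main())
```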
Troubleshooting & Common Issues
Issue: ConnectionError: Failed to connect to Ollama server
Cause: Ollama not running or listening on wrong address.
```bash
# Fix: Start Ollama and verify
ollama serve  # Start Ollama in one terminal
# In another terminal:
curl http://localhost:11434/api/tags  # Should return the model list
```
Issue: RuntimeError: No model specified in LLM config
Cause: Model name doesn’t exist or wrong format.
```python
# Fix: Check available models and use correct name
import subprocess
result = subprocess.run(['ollama', 'list'], capture_output=True, text=True)
print(result.stdout)  # Shows available models

# Correct format:
llm = LLM(model="ollama/llama2:7b", base_url="http://localhost:11434")
```
Issue: Task took too long to execute (>300s timeout)
Cause: LLM too slow or task too complex.
```python
# Fix: Increase timeout or simplify task
task = Task(
    description="Simpler description with less context needed",
    expected_output="Concise output (1 paragraph, not 10)",
    timeout=600  # 10 minutes instead of 5
)
```
Issue: Tool execution failed: Command not found
Cause: Tool uses bash command not available on system.
```python
# Fix: Check command availability first
import subprocess
result = subprocess.run(['which', 'curl'], capture_output=True)
if result.returncode != 0:
    # Install with: apt-get install curl
    print("curl not found")

# Or write a cross-platform tool:
from crewai.tools import tool

@tool("Get URL")
def fetch_url(url: str) -> str:
    """Fetch a URL and return its body."""
    import requests  # Pure Python, platform-agnostic
    return requests.get(url).text
```
Issue: Agent stuck in loop repeating same task
Cause: Tool feedback doesn’t change agent’s reasoning.
```python
# Fix: Make tools return clear pass/fail signals
from crewai.tools import tool

@tool("Check if done")
def is_complete(status: str) -> str:
    """Returns 'SUCCESS' or 'FAILURE: reason', not ambiguous output."""
    if status == "complete":
        return "SUCCESS: Task completed as requested"
    return "FAILURE: Not done yet. Try next step."
```
Quick Reference: Sequential vs Hierarchical vs Parallel
| Mode | When to Use | Pros | Cons |
|---|---|---|---|
| Sequential | Tasks have clear order (research → write → review) | Simple, easy to understand | Slow (tasks must wait for prior completion) |
| Hierarchical | Complex projects with a manager deciding task assignment | Flexible, manager adapts to results | More LLM calls, higher cost |
| Parallel (`async_execution=True` on tasks) | Multiple independent tasks (translate to 3 languages) | Fast, independent tasks run simultaneously | More complex output merging |
Crew Architecture Decision Tree
```
What type of problem are you solving?
├─ Multi-step workflow with clear order
│   └─ Use Process.sequential (research → write → publish)
├─ Dynamic task assignment based on content
│   └─ Use Process.hierarchical (manager decides which agent)
├─ Multiple independent similar tasks
│   └─ Mark tasks async_execution=True (they run simultaneously)
└─ Complex dependencies + custom logic
    └─ Use LangGraph instead (more control)
```
Common Agent Patterns
Pattern 1: Researcher → Writer → Reviewer (Sequential)
```python
# Step 1: Research agent finds information
research_task = Task(
    description="Research topic X",
    expected_output="Factual research summary",
    agent=researcher,
)

# Step 2: Writer creates content (depends on research)
write_task = Task(
    description="Write article from research",
    expected_output="Structured article draft",
    agent=writer,
    context=[research_task],  # Waits for research to complete
)

# Step 3: Reviewer checks quality (depends on writing)
review_task = Task(
    description="Review article for quality",
    expected_output="Review notes with specific fixes",
    agent=reviewer,
    context=[write_task],
)

crew = Crew(
    agents=[researcher, writer, reviewer],
    tasks=[research_task, write_task, review_task],
    process=Process.sequential
)
```
Pattern 2: Manager Assigns Tasks (Hierarchical)
```python
# The manager decides who handles each task, so worker tasks don't
# pre-assign agents; the manager routes them by task type.
task1 = Task(description="Code the API module",
             expected_output="Working module code")
task2 = Task(description="Write API documentation",
             expected_output="API reference docs")
task3 = Task(description="Test API endpoints",
             expected_output="Test results report")

crew = Crew(
    agents=[dev, writer, tester],   # Workers only; the manager is passed separately
    tasks=[task1, task2, task3],
    process=Process.hierarchical,
    manager_agent=manager
)
```
Pattern 3: Tool Feedback Loop (Agentic)
@tool("Execute code")
def run_code(code: str) -> str:
# Return clear success/failure, not just output
try:
result = exec(code)
return f"SUCCESS: Code executed. Output: {result}"
except Exception as e:
return f"FAILURE: {type(e).__name__}: {str(e)}"
# Agent sees clear signal, not confused output
Performance Optimization Tips
Tip 1: Reduce LLM Input Size
```python
# ❌ Bad: Full 100,000-token context
task = Task(
    description=f"Analyze this data: {entire_dataset}"
)

# ✅ Good: Summarized context
task = Task(
    description="Analyze top 10 data points by importance"
)
```
Tip 2: Use Faster Models
```python
# ❌ Slow: 70B parameter model
llm = LLM(model="ollama/llama2:70b")  # 5+ seconds per response

# ✅ Fast: Smaller model for simple tasks
llm = LLM(model="ollama/mistral:7b")  # 500ms per response
```
Tip 3: Cache Tool Results
```python
import requests
from crewai.tools import tool

tool_cache = {}

@tool("Get weather")
def get_weather(city: str) -> str:
    """Return current weather for a city, caching repeat lookups."""
    if city in tool_cache:
        return tool_cache[city]
    result = str(requests.get(...).json())  # '...' = your weather endpoint; stringify for the LLM
    tool_cache[city] = result
    return result
```
Frequently Asked Questions (FAQ)
Q: How much does it cost to run CrewAI vs ChatGPT API?
A: Local Ollama: ~$0 (runs on your hardware). OpenAI API: ~$0.10–$1 per task (depends on tokens used). For 1000 tasks: Ollama = free; OpenAI = $100–$1000.
Q: Can I use CrewAI with GPT-4 or Claude?
A: Yes. Replace the LLM:
```python
import os

llm = LLM(
    model="gpt-4",
    base_url="https://api.openai.com/v1",
    api_key=os.getenv("OPENAI_API_KEY")
)
```
Q: How do I debug why an agent isn’t using a tool?
A: Check verbose logs:
```python
agent = Agent(..., verbose=True)  # Prints all reasoning steps
crew = Crew(..., verbose=True)    # Prints full execution trace
```
Q: Can agents collaborate across multiple machines?
A: CrewAI doesn’t have built-in distributed support. Use message queues (RabbitMQ, Redis) to coordinate agents across machines, or deploy entire crew on one machine and scale horizontally with multiple crew instances.
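A minimal sketch of the queue approach, assuming a Redis broker on localhost and a hypothetical build_crew() helper that assembles the crew from Part 2 (neither is part of CrewAI):

```python
import json
import redis

r = redis.Redis()  # assumed broker at localhost:6379

def worker_loop():
    """Run on each machine: pull a job, run a crew, push the result."""
    while True:
        _, raw = r.blpop("crew:jobs")  # block until a job arrives
        job = json.loads(raw)
        crew = build_crew()            # hypothetical: rebuilds the Part 2 crew
        result = crew.kickoff(inputs=job["inputs"])
        r.rpush("crew:results",
                json.dumps({"id": job["id"], "output": result.raw}))
```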
Q: What’s the maximum number of agents in a crew?
A: Technically unlimited, but practically:
- 3–5 agents: optimal (fast, manageable)
- 10+ agents: slow (context window fills, LLM confusion)
- 50+ agents: likely to fail (token limits, complexity)
Q: How do I prevent an agent from hallucinating?
A: Use tool-grounded workflow:
```python
# ❌ Bad: LLM makes up data
task = Task(description="What is the weather in NYC?")

# ✅ Good: Agent must use weather tool
task = Task(
    description="Use weather tool to find NYC temperature",
    tools=[get_weather]  # Only available tool
)
```
Q: Can CrewAI handle real-time data streams?
A: Not natively. For streaming data, use async execution:
```python
import asyncio

async def stream_loop():
    while True:
        new_data = fetch_latest_data()  # your data source
        result = await crew.kickoff_async(inputs={"data": new_data})
        process_result(result)          # your handler
        await asyncio.sleep(60)         # Check every minute
```
Q: What happens if a tool fails? Does the crew retry?
A: By default, agents see the error and adapt their next step. For automatic retry:
@tool("Unreliable API")
def call_api_with_retry(url: str, retries: int = 3) -> str:
for attempt in range(retries):
try:
return requests.get(url).json()
except:
if attempt == retries - 1:
return "ERROR: API unavailable after 3 retries"
time.sleep(2 ** attempt) # Exponential backoff
Further Reading
Vucense Guides
- AI Agent Design Patterns 2026 — conceptual foundation CrewAI implements
- LangChain and LangGraph Local Agents 2026 — lower-level alternative for complex workflows
- Best Open-Weight AI Models 2026 — choose the right model for CrewAI agents
- How to Install Ollama and Run LLMs Locally — prerequisite: Ollama runtime setup
- MCP Protocol: Build an AI Tool Server in Python — standardise agent tools
Official Documentation
- CrewAI Official Documentation — multi-agent orchestration framework
- CrewAI GitHub Repository — source code and examples
- Ollama Official — local LLM runtime
- Ollama Model Library — 100+ quantised models available
- Python Official Documentation — Python 3.10+ reference
- LangChain Documentation — LLM framework (CrewAI is built on top of it)
- LangGraph Documentation — stateful agent workflows
Agent Frameworks & Tools
- AutoGen Framework — Microsoft’s multi-agent framework
- LlamaIndex (GPT Index) — data indexing for RAG agents
- Haystack by DeepSet — NLP and search framework
- Dify.ai — no-code agent builder
- MCP Protocol Specification — standard tool interface for agents
Performance & Monitoring
- Hugging Face Hub — model repository and API access
- Weights & Biases (W&B) — experiment tracking for agent tuning
- Prometheus Monitoring — metrics collection for agent systems
Tested on: Ubuntu 24.04 LTS (RTX 4090). CrewAI 0.80.4, Ollama 0.5.12, Llama 4 Scout. Last verified: May 16, 2026.