Open WebUI: Install and Configure Your Local ChatGPT Alternative (2026)

🟢Beginner

Install Open WebUI on Ubuntu 24.04 to get a ChatGPT-style interface for local Ollama models. Covers Docker setup, model management, RAG with documents, multi-user config, and HTTPS with Nginx.

Key Takeaways

  • ChatGPT UI, local models: Open WebUI is the browser interface Ollama deserves. Models are pulled and managed through the UI, conversations persist locally, and document RAG runs on your hardware.
  • One Docker command: The entire UI deploys in one docker run. No configuration files needed for the basic setup.
  • RAG without cloud: Upload PDFs, markdown files, or web pages — Open WebUI chunks them, embeds them locally (using Ollama’s embedding models), and retrieves relevant context for each query.
  • SovereignScore 96/100: Open WebUI is fully open-source (MIT). All data stays local. Points are deducted because the default embedding model downloads from Hugging Face on first use.

Introduction

Direct Answer: How do I install Open WebUI as a local ChatGPT alternative on Ubuntu 24.04 in 2026?

Install Ollama first (curl -fsSL https://ollama.com/install.sh | sh), pull a model (ollama pull qwen3:14b), then deploy Open WebUI with Docker: docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://localhost:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main. Access the interface at http://localhost:8080. On first visit, create an admin account. The interface automatically discovers all models available in your Ollama instance. For HTTPS access (required for microphone features), put Nginx in front: proxy localhost:8080 with a Let’s Encrypt certificate. Open WebUI supports multiple users, conversation history, document upload for RAG, image generation via ComfyUI, and voice input — all running locally.

“Open WebUI is what ChatGPT would look like if OpenAI released the frontend as open-source. Same interface, same features, same polish — but your models, your data, your hardware.”


Prerequisites

Ollama must be installed and running with at least one model pulled. If you haven’t done this:

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a capable model (~10 GB download; needs ~10 GB VRAM, or 16 GB+ RAM on CPU)
ollama pull qwen3:14b
# Or for 8GB VRAM (Qwen3 has no 7B variant; 8B is the closest size):
ollama pull qwen3:8b

# Verify Ollama is running
curl -s http://localhost:11434/api/version | python3 -c "import json,sys; print('Ollama:', json.load(sys.stdin)['version'])"

Expected output:

Ollama: 0.5.12

Full installation guide: How to Install Ollama and Run LLMs Locally.
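
The same readiness check can be scripted. The sketch below assumes Ollama's /api/tags endpoint returns a JSON object with a "models" list whose entries carry a "name" field; the helper names installed_models and has_model are hypothetical, and the sample payload is illustrative rather than real output.

```python
import json


def installed_models(tags_payload: str) -> list[str]:
    """Return model names from the JSON body of GET /api/tags."""
    data = json.loads(tags_payload)
    return [m["name"] for m in data.get("models", [])]


def has_model(tags_payload: str, wanted: str) -> bool:
    """True if `wanted` (e.g. 'qwen3:14b') is among the pulled models."""
    return wanted in installed_models(tags_payload)


# Offline sample shaped like an /api/tags response (values are illustrative).
sample = '{"models": [{"name": "qwen3:14b"}, {"name": "nomic-embed-text:v1.5"}]}'
print(has_model(sample, "qwen3:14b"))   # True
print(has_model(sample, "llama3:8b"))   # False

# Against a live instance (assumes Ollama listens on localhost:11434):
# import urllib.request
# body = urllib.request.urlopen("http://localhost:11434/api/tags").read().decode()
# print(installed_models(body))
```

Keeping the parsing separate from the HTTP call means the check can be exercised without a running Ollama instance.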


Part 1: Basic Installation

# Single-command Docker deployment
docker run -d \
  --network=host \
  -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://localhost:11434 \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main

# Wait for startup (~30 seconds)
sleep 30
docker logs open-webui | tail -5

Expected output:

INFO:     Started server process [1]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)

# Verify the UI is accessible
curl -sI http://localhost:8080 | head -3

Expected output:

HTTP/1.1 200 OK
content-type: text/html; charset=utf-8

Open http://localhost:8080 in your browser. The first account registered becomes the admin account, the only one with full access by default. Create it immediately, before anyone else can reach the instance.
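
A fixed 30-second sleep can be too short on a slow disk and wasteful on a fast one. As an alternative, here is a minimal polling sketch. The function names wait_until_ready and http_ok are hypothetical, and the probe assumes the UI answers with HTTP 200 on http://localhost:8080 once it is up, as in the curl check above.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def wait_until_ready(probe: Callable[[], bool], timeout: float = 60.0,
                     interval: float = 2.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call `probe` every `interval` seconds until it returns True or `timeout` elapses."""
    waited = 0.0
    while True:
        if probe():
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval


def http_ok(url: str = "http://localhost:8080") -> bool:
    """Probe: True when the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=3) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


# Offline demonstration with a fake probe that succeeds on the third call.
calls = {"n": 0}


def fake_probe() -> bool:
    calls["n"] += 1
    return calls["n"] >= 3


print(wait_until_ready(fake_probe, timeout=10, interval=1, sleep=lambda s: None))  # True
```

Against a real deployment, wait_until_ready(http_ok) would stand in for the sleep.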


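Part 2: Docker Compose with Nginx

For persistent configuration, HTTPS, and easier updates, use Docker Compose with Nginx. If the Part 1 container is still running, remove it first (docker stop open-webui && docker rm open-webui), because it uses the same container name and port: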

mkdir -p ~/open-webui && cd ~/open-webui

cat > docker-compose.yml << 'EOF'
name: open-webui

services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      - "127.0.0.1:8080:8080"    # Only accessible via Nginx
    volumes:
      - open-webui-data:/app/backend/data
    environment:
      # Connect to Ollama on the host
      - OLLAMA_BASE_URL=http://host.docker.internal:11434
      # Security
      - WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY:-change_me_to_32_random_chars}
      - WEBUI_AUTH=true          # Require login (disable only for trusted local network)
      # Web search (queries leave the machine; remove both lines for a fully offline setup)
      - ENABLE_RAG_WEB_SEARCH=true
      - RAG_WEB_SEARCH_ENGINE=duckduckgo   # Privacy-respecting search
    extra_hosts:
      - "host.docker.internal:host-gateway"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s

volumes:
  open-webui-data:
EOF

# Generate a random 32-character secret (16 random bytes, hex-encoded)
echo "WEBUI_SECRET_KEY=$(openssl rand -hex 16)" > .env
chmod 600 .env

docker compose up -d
docker compose ps

Expected output:

NAME         IMAGE                               STATUS
open-webui   ghcr.io/open-webui/open-webui:main  Up 30 seconds (healthy)
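
The compose file falls back to a placeholder when WEBUI_SECRET_KEY is unset, so it is worth generating a real value. Here is a minimal Python sketch of doing so with the standard library. The helper names new_secret and render_env are hypothetical. The 32-character length follows the placeholder text in the compose file rather than any documented requirement.

```python
import secrets


def new_secret(num_bytes: int = 16) -> str:
    """Hex-encode `num_bytes` random bytes; 16 bytes gives 32 characters."""
    return secrets.token_hex(num_bytes)


def render_env(secret: str) -> str:
    """Return the contents of a .env file holding the secret key."""
    return f"WEBUI_SECRET_KEY={secret}\n"


secret = new_secret()
print(len(secret))                   # 32
print(render_env("example"), end="")  # WEBUI_SECRET_KEY=example

# To write the file next to docker-compose.yml:
# from pathlib import Path
# Path(".env").write_text(render_env(new_secret()))
```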

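Add Nginx for HTTPS (optional but recommended for team use). This assumes Nginx is installed and a Let’s Encrypt certificate for chat.example.com already exists: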

sudo tee /etc/nginx/sites-available/open-webui << 'EOF'
server {
    listen 80;
    server_name chat.example.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl http2;   # Ubuntu 24.04 ships nginx 1.24; the standalone "http2 on;" directive needs 1.25.1+
    server_name chat.example.com;

    ssl_certificate     /etc/letsencrypt/live/chat.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/chat.example.com/privkey.pem;

    client_max_body_size 100M;   # Allow large document uploads

    location / {
        proxy_pass         http://127.0.0.1:8080;
        proxy_set_header   Host              $host;
        proxy_set_header   X-Real-IP         $remote_addr;
        proxy_set_header   X-Forwarded-For   $proxy_add_x_forwarded_for;
        proxy_set_header   X-Forwarded-Proto $scheme;

        # WebSocket support (for streaming responses)
        proxy_http_version 1.1;
        proxy_set_header   Upgrade    $http_upgrade;
        proxy_set_header   Connection "upgrade";
        proxy_read_timeout 300s;
    }
}
EOF

sudo ln -sf /etc/nginx/sites-available/open-webui /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx

Part 3: Model Management

Open WebUI lets you pull, manage, and switch between models from the UI — no terminal needed:

Pull models from the UI:

  1. Click your username → Admin Panel → Models
  2. Click Pull a model from Ollama.com
  3. Type qwen3:14b → Pull Model

From the terminal (same result):

# These are equivalent — Ollama is the backend
docker exec open-webui curl -s -X POST http://host.docker.internal:11434/api/pull \
  -d '{"name":"qwen3:14b"}' | python3 -c "
import json, sys
for line in sys.stdin:
    d = json.loads(line)
    if 'status' in d:
        print(d['status'], d.get('completed', ''))
" | tail -3

Expected output:

pulling manifest
verifying sha256 digest
success
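
The pull endpoint streams one JSON object per line, which is why the shell pipeline above reads line by line. Here is a slightly fuller sketch of the same parsing. It assumes that download events carry "total" and "completed" byte counts and that failures arrive as an object with an "error" field. The function name pull_progress and the sample stream are illustrative.

```python
import json
from typing import Iterable, Iterator


def pull_progress(lines: Iterable[str]) -> Iterator[str]:
    """Turn the line-delimited JSON stream from POST /api/pull into progress lines."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank keep-alive lines
        event = json.loads(raw)
        if "error" in event:
            raise RuntimeError(event["error"])
        status = event.get("status", "")
        total, done = event.get("total"), event.get("completed")
        if total and done is not None:
            # Download events: show how far the current layer has progressed.
            yield f"{status}: {100 * done / total:.0f}%"
        else:
            yield status


# Illustrative stream, not captured output.
sample_stream = [
    '{"status": "pulling manifest"}',
    '{"status": "pulling layer", "total": 200, "completed": 50}',
    '{"status": "verifying sha256 digest"}',
    '{"status": "success"}',
]
for line in pull_progress(sample_stream):
    print(line)
```

Raising on an "error" event makes a failed pull (for example, a misspelled model tag) visible instead of silently printing nothing.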

Set a default model per user:

  1. Open a new chat → click the model selector dropdown at the top
  2. Select your preferred model
  3. Open Settings → General → Default Model → set it permanently

Part 4: Document RAG (Chat with Your Files)

Upload documents and chat with them — all processed locally:

# Test document upload via API
curl -s -X POST http://localhost:8080/api/v1/files/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@/path/to/document.pdf" | python3 -m json.tool | grep '"id"'

From the UI:

  1. Open a new chat
  2. Click the paperclip icon in the message bar
  3. Upload a PDF, DOCX, TXT, or markdown file
  4. Ask questions about the document content

Expected behaviour:

You: What are the main recommendations in this document?
AI:  Based on the uploaded document, the main recommendations are:
     1. [Extracted from your document]
     2. [Extracted from your document]
     ...

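The document is chunked, embedded using a local embedding model, and stored in a local vector database. Out of the box, Open WebUI uses a built-in SentenceTransformers model that is downloaded from Hugging Face on first use; to embed through Ollama instead (for example with nomic-embed-text:v1.5), configure it as shown next. Either way, your document never leaves the machine.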
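
To make the chunk, embed, retrieve sequence concrete, here is a toy sketch. It is not Open WebUI's implementation: the embed function is a bag-of-words stand-in for a real embedding model, chunking is by characters, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


print(chunk_text("abcdefghij", size=4, overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij', 'ij']

chunks = [
    "Backups run nightly at 02:00.",
    "The admin account is created on first visit.",
    "Models are stored on the local disk.",
]
print(retrieve("When do backups run?", chunks))
# ['Backups run nightly at 02:00.']
```

The retrieved chunks are what gets prepended to the prompt as context. Smaller chunks make each retrieved piece more focused, at the cost of less surrounding context per piece.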

Configure the embedding model:

  1. Admin Panel → Settings → Documents
  2. Embedding Model → nomic-embed-text:v1.5 (pull it first: ollama pull nomic-embed-text:v1.5)

Part 5: Multi-User Setup

Open WebUI supports multiple accounts with role-based access:

User roles:

  • Admin: Full access — manage users, models, system settings, see all users’ chats
  • User: Chat with models, upload documents, manage own conversations

Add a new user (admin UI):

  1. Admin Panel → Users → Add User
  2. Set email, name, password, and role
  3. Optionally set a usage limit (max tokens per day)

Disable new user self-registration (for private deployments):

# Environment variable to disable signup
# Add to docker-compose.yml environment:
- WEBUI_AUTH=true
- ENABLE_SIGNUP=false    # No new registrations without admin invitation

API key access for programmatic use:

  1. User Settings → Account → API Keys → Create new secret key
  2. Use with any OpenAI-compatible client:

from openai import OpenAI

client = OpenAI(
    base_url="https://chat.example.com/api",
    api_key="sk-your-open-webui-api-key"
)

response = client.chat.completions.create(
    model="qwen3:14b",
    messages=[{"role": "user", "content": "Hello from Python!"}]
)
print(response.choices[0].message.content)
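
If installing the openai package is not an option, the same call can be sketched with the standard library alone. This assumes the client above resolves to POST <base_url>/chat/completions with a Bearer token, which is how OpenAI-compatible clients normally build the request. The helper name build_chat_request is hypothetical, and the request is built here without being sent.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str,
                       prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the OpenAI-style chat completions route."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("https://chat.example.com/api", "sk-your-open-webui-api-key",
                         "qwen3:14b", "Hello from Python!")
print(req.full_url)      # https://chat.example.com/api/chat/completions
print(req.get_method())  # POST

# To send it against a real deployment:
# with urllib.request.urlopen(req, timeout=120) as resp:
#     reply = json.load(resp)
#     print(reply["choices"][0]["message"]["content"])
```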

Part 6: Sovereignty Verification

echo "=== OPEN WEBUI SOVEREIGNTY AUDIT ==="

echo ""
echo "[ Open WebUI container running ]"
docker ps --filter name=open-webui --format "{{.Names}}: {{.Status}}" | sed 's/^/  /'

echo ""
echo "[ Outbound connections from Open WebUI during chat ]"
# -H drops the header row, so an empty result really means "no connections"
docker exec open-webui ss -H -tnp state established 2>/dev/null | \
  grep -v "127.0.0.1\|172.17\|172.18\|10\." || \
  echo "  ✓ No external connections during inference"

echo ""
echo "[ Data stored locally ]"
# Compose prefixes the volume name with the project name (open-webui)
docker volume inspect open-webui_open-webui-data 2>/dev/null | \
  python3 -c "import json,sys; d=json.load(sys.stdin)[0]; print('  Data path:', d['Mountpoint'])"

echo ""
echo "[ Ollama models stored locally ]"
du -sh ~/.ollama/models/ 2>/dev/null | sed 's/^/  Models: /'

Expected output:

=== OPEN WEBUI SOVEREIGNTY AUDIT ===

[ Open WebUI container running ]
  open-webui: Up 2 hours (healthy)

[ Outbound connections from Open WebUI during chat ]
  ✓ No external connections during inference

[ Data stored locally ]
  Data path: /var/lib/docker/volumes/open-webui_open-webui-data/_data

[ Ollama models stored locally ]
  Models: 28G	/root/.ollama/models/
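
The grep filter in the audit is a coarse text match: the pattern 10\. also hides a public address such as 110.4.5.6. Here is a stricter sketch using Python's ipaddress module. The function names are hypothetical, and it handles only IPv4 peers in the ip:port form that ss prints.

```python
import ipaddress


def is_external(address: str) -> bool:
    """True when `address` is outside loopback, private, and link-local ranges."""
    ip = ipaddress.ip_address(address)
    return not (ip.is_loopback or ip.is_private or ip.is_link_local)


def external_peers(peers: list[str]) -> list[str]:
    """Filter 'ip:port' peer strings (IPv4 only) down to the external ones."""
    result = []
    for peer in peers:
        host, _, _port = peer.rpartition(":")
        if is_external(host):
            result.append(peer)
    return result


# Illustrative peer addresses, not captured output.
sample = ["127.0.0.1:11434", "172.17.0.1:11434", "110.4.5.6:443", "10.0.0.5:5432"]
print(external_peers(sample))  # ['110.4.5.6:443']
```

An empty list from external_peers is the programmatic equivalent of the ✓ line in the audit output.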

Troubleshooting

Open WebUI can’t connect to Ollama

Cause: Ollama is running on the host but the Docker container can’t reach localhost. Fix: Use host.docker.internal as the Ollama URL in Docker Compose, and add extra_hosts: - "host.docker.internal:host-gateway". For --network=host deployments, localhost works directly.

Cannot read properties of undefined in the browser

Cause: Stale browser cache with old JavaScript after an update. Fix: Hard refresh: Ctrl+Shift+R (Linux/Windows) or Cmd+Shift+R (macOS). If that doesn’t work: docker compose pull && docker compose up -d --force-recreate.

Document RAG returns irrelevant results

Cause: The embedding model doesn’t match the document language, or chunk size is too large. Fix: In Admin Panel → Settings → Documents → reduce Chunk Size from 1500 to 500–800 tokens. For non-English documents, use a multilingual embedding model (mxbai-embed-large works well for EU languages).


Conclusion

Open WebUI is running as your sovereign ChatGPT replacement: browser interface, model switching, conversation history, document RAG, and multi-user support — all on your own hardware. The Qwen3 14B model provides capabilities comparable to GPT-3.5 for most tasks, at zero per-query cost after the initial GPU investment.

Connect this to the complete Build a Sovereign Local AI Stack guide for the full Docker Compose stack integrating Ollama, Open WebUI, and pgvector, or see LangChain and LangGraph with Ollama to use the same local models in agent pipelines.


People Also Ask

Is Open WebUI the same as ChatGPT?

Open WebUI replicates ChatGPT’s user interface and core features (conversational chat, model switching, conversation history, document upload, image generation via plugins), but uses local Ollama models instead of OpenAI’s API. The quality of responses depends on the local model — Qwen3 14B is comparable to GPT-3.5/early GPT-4 for most tasks. The key differences: Open WebUI is open-source, runs on your hardware, stores no data on external servers, and has zero per-query cost. It is not a drop-in replacement for GPT-4o’s capabilities on complex reasoning tasks, but for daily use (writing, coding assistance, Q&A), the gap is small for most users.

Can Open WebUI search the web?

Yes. Open WebUI has built-in web search integration. In Admin Panel → Settings → Web Search, enable web search and configure a search engine. Privacy-respecting options include DuckDuckGo (no registration required), Brave Search, and SearXNG (self-hosted). When web search is enabled, the AI can retrieve current information from the web during chat. This is the main trade-off against a fully offline deployment: enable it when you need current information, and disable it when you want guaranteed data isolation.

How do I update Open WebUI when a new version releases?

# Pull the latest image and recreate the container
docker compose pull
docker compose up -d

# Or for the standalone docker run approach:
docker stop open-webui && docker rm open-webui
docker pull ghcr.io/open-webui/open-webui:main
# Run the original docker run command again

Your conversation history and settings persist in the Docker volume (open-webui-data in the Compose file, open-webui for the docker run command), so recreating the container does not affect them.



Tested on: Ubuntu 24.04 LTS (Hetzner CX32 + RTX 4090), macOS Sequoia 15.4 (M3 Max). Open WebUI 0.5.20, Ollama 0.5.12. Last verified: April 28, 2026.

Kofi Mensah

About the Author

Inference Economics & Hardware Architect

Electrical Engineer | Hardware Systems Architect | 8+ Years in GPU/AI Optimization | ARM & x86 Specialist

Kofi Mensah is a hardware architect and AI infrastructure specialist focused on optimizing inference costs for on-device and local-first AI deployments. With expertise in CPU/GPU architectures, Kofi analyzes real-world performance trade-offs between commercial cloud AI services and sovereign, self-hosted models running on consumer and enterprise hardware (Apple Silicon, NVIDIA, AMD, custom ARM systems). He quantifies the total cost of ownership for AI infrastructure and evaluates which deployment models (cloud, hybrid, on-device) make economic sense for different workloads and use cases. Kofi's technical analysis covers model quantization, inference optimization techniques (llama.cpp, vLLM), and hardware acceleration for language models, vision models, and multimodal systems. At Vucense, Kofi provides detailed cost analysis and performance benchmarks to help developers understand the real economics of sovereign AI.
