Dream Server Mode Switch

One-command switching between local, cloud, and hybrid LLM modes.

Quick Start

# Check current mode
dream mode

# Switch to local mode (llama-server, requires GPU)
dream mode local

# Switch to cloud mode (LiteLLM + API keys, no GPU needed)
dream mode cloud

# Switch to hybrid mode (local primary, cloud fallback)
dream mode hybrid

# Restart to apply
dream restart

How It Works

One env var (LLM_API_URL) controls where all services send LLM requests. Three modes are user-selectable via dream mode; a fourth (lemonade) is auto-configured by the installer on AMD hardware — see Lemonade Mode below.

Mode	`LLM_API_URL`	`DREAM_MODE`	LiteLLM config
local	`http://llama-server:8080`	`local`	`config/litellm/local.yaml`
cloud	`http://litellm:4000`	`cloud`	`config/litellm/cloud.yaml`
hybrid	`http://litellm:4000`	`hybrid`	`config/litellm/hybrid.yaml`
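
The mapping in the table can be written as a small lookup. This is a minimal sketch for illustration only: `llm_url_for_mode` is a hypothetical helper name, not a function from `scripts/mode-switch.sh`, and the URLs are copied from the table above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Dream Server): print the LLM_API_URL
# that the table above associates with a user-selectable mode.
llm_url_for_mode() {
  case "$1" in
    local)        echo "http://llama-server:8080" ;;
    cloud|hybrid) echo "http://litellm:4000" ;;
    *)            echo "unknown or non-switchable mode: $1" >&2; return 1 ;;
  esac
}
```

Calling `llm_url_for_mode hybrid` prints `http://litellm:4000`. Cloud and hybrid share one URL because both route through LiteLLM and differ only in which LiteLLM config is loaded.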

All compose files reference ${LLM_API_URL:-http://llama-server:8080}, so installs whose .env does not set LLM_API_URL fall back to the local llama-server endpoint and keep working without changes.
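
The `${VAR:-default}` form behaves the same way in a POSIX shell as in Compose interpolation: the default is used when the variable is unset or empty. The sketch below shows the three cases in plain bash; the `resolved_*` variable names are illustrative only.

```bash
#!/usr/bin/env bash
# Case 1: variable unset, so the default (local llama-server) is used.
unset LLM_API_URL
resolved_unset="${LLM_API_URL:-http://llama-server:8080}"

# Case 2: variable set, so its value wins over the default.
LLM_API_URL="http://litellm:4000"
resolved_set="${LLM_API_URL:-http://llama-server:8080}"

# Case 3: variable set but empty; ":-" still falls back to the default.
LLM_API_URL=""
resolved_empty="${LLM_API_URL:-http://llama-server:8080}"

echo "unset -> $resolved_unset"
echo "set   -> $resolved_set"
echo "empty -> $resolved_empty"
```

One consequence is that an empty `LLM_API_URL=` line in `.env` behaves like a missing one and resolves to the local endpoint.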

Modes

Local Mode (default)

All inference runs on your hardware via llama-server.

Aspect	Details
LLM	llama-server (GGUF models)
Cost	$0 (electricity only)
Requires	GPU or CPU with sufficient RAM
Web Search	via SearXNG

dream mode local

Cloud Mode

LLM requests are routed through LiteLLM to cloud APIs.

Aspect	Details
LLM	Claude, GPT-4o, MiniMax via LiteLLM
Cost	~$0.003-0.06/1K tokens
Requires	Internet, API keys
GPU	Not needed

dream mode cloud

Required .env variables:

ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
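
Before switching, it can help to confirm that these keys are actually filled in. The following is a rough sketch, not a Dream Server command: `missing_cloud_keys` is a hypothetical helper, and it assumes plain `KEY=value` lines in `.env` with no quoting or `export` prefix.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the name of each key listed above that is
# absent from the given env file or has an empty value.
# Limitation: a quoted empty value such as KEY="" is treated as present.
missing_cloud_keys() {
  local env_file="$1" key
  for key in ANTHROPIC_API_KEY OPENAI_API_KEY; do
    # Match KEY=<at least one character>.
    if ! grep -Eq "^${key}=.+" "$env_file" 2>/dev/null; then
      echo "$key"
    fi
  done
}
```

Running `missing_cloud_keys .env` prints nothing when both keys have values, and one key name per line otherwise.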

Hybrid Mode

Local llama-server as primary, cloud APIs as fallback via LiteLLM.

Aspect	Details
LLM	Local first, cloud on failure
Cost	$0 normally, cloud rates on fallback
Requires	GPU; API keys recommended (needed for cloud fallback)

dream mode hybrid

Lemonade Mode (AMD — auto-configured)

Not user-switchable. The installer sets this mode automatically on AMD hardware; dream mode does not accept lemonade as an argument.

All LLM traffic routes through the LiteLLM proxy, which delegates to the Lemonade SDK (lemonade-server). The dashboard API uses a distinct /api/v1 URL prefix in this mode (instead of /v1).

Aspect	Details
LLM	Lemonade SDK via LiteLLM proxy
Cost	$0 (local inference)
Requires	AMD GPU (auto-detected at install time)
Set by	Installer (Phase 06), not `dream mode`

For AMD Strix Halo performance tuning (GRUB, kernel module, sysctl settings), see config/system-tuning/README.md.

Existing Lemonade SDK installs on Linux AMD hosts can be wrapped without letting Dream Server manage the Lemonade runtime. See Lemonade SDK Compatibility.

.env Variables

Variable	Default	Description
`DREAM_MODE`	`local`	Active mode: `local`, `cloud`, or `hybrid`; `lemonade` is auto-set on AMD (not user-switchable)
`LLM_API_URL`	`http://llama-server:8080`	Where services send LLM requests
`ANTHROPIC_API_KEY`	(empty)	Anthropic API key (cloud/hybrid)
`OPENAI_API_KEY`	(empty)	OpenAI API key (cloud/hybrid)
`TOGETHER_API_KEY`	(empty)	Together AI API key (optional)
`MINIMAX_API_KEY`	(empty)	MiniMax API key (optional, cloud/hybrid)

Installer: `--cloud` Flag

Install in cloud mode (skips GPU detection and model download):

./install-core.sh --cloud

This sets DREAM_MODE=cloud, LLM_API_URL=http://litellm:4000, and auto-enables the LiteLLM extension.

Model Management

# Show current model
dream model current

# List available tiers
dream model list

# Swap to a different tier
dream model swap T3

For Dashboard downloads, loading catalog models, and manual GGUF swaps, see MODEL-MANAGEMENT.md.

Architecture

Local Mode

User -> Open WebUI -> llama-server (local) -> Response

Cloud Mode

User -> Open WebUI -> LiteLLM -> Cloud APIs (Claude/GPT-4o) -> Response

Hybrid Mode

User -> Open WebUI -> LiteLLM -> llama-server (local) -> Response
                                      |
                                 [On timeout/error]
                                      |
                                 Cloud APIs (fallback)
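
The control flow in the hybrid diagram is a plain try-primary-then-fallback pattern. In Dream Server that routing is handled by LiteLLM according to `config/litellm/hybrid.yaml`; the sketch below only illustrates the pattern in shell. `try_with_fallback` and the two command arguments are hypothetical stand-ins, not Dream Server or LiteLLM APIs.

```bash
#!/usr/bin/env bash
# Illustration only: run a primary command and, if it exits non-zero,
# discard its output and run a fallback command instead.
# Usage: try_with_fallback PRIMARY_CMD FALLBACK_CMD
try_with_fallback() {
  local out
  # Capture output so a partial response from a failed primary is not printed.
  if out="$("$1")"; then
    printf '%s\n' "$out"
    return 0
  fi
  echo "primary failed, using fallback" >&2
  "$2"
}
```

For example, with two shell functions `ask_local` and `ask_cloud` that each print a response, `try_with_fallback ask_local ask_cloud` prints the local response when `ask_local` succeeds and the cloud response otherwise.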

Files

File	Purpose
`config/litellm/local.yaml`	LiteLLM config for local mode
`config/litellm/cloud.yaml`	LiteLLM config for cloud mode
`config/litellm/hybrid.yaml`	LiteLLM config for hybrid mode
`scripts/mode-switch.sh`	Backend script for mode switching
`.env`	Stores `DREAM_MODE`, `LLM_API_URL`, API keys

Data Safety

All modes share the same data volumes:

./data/open-webui/ -- Conversations, users
./data/qdrant/ -- Vector database
./data/models/ -- Downloaded GGUF models

Switching modes preserves all data. Only the LLM routing changes.

Mode Comparison

Feature	Local	Cloud	Hybrid	Lemonade (AMD)
Internet required	No	Yes	Yes (for fallback)	No
API keys required	No	Yes	Recommended	No
GPU required	Yes	No	Yes	Yes (AMD)
Response quality	Good	Best	Best of both	Good
Cost	$0	$$$	$0 or $$$	$0
Privacy	100% local	Data to cloud	Local unless fallback	100% local

CLI Reference

# Mode commands
dream mode              # Show current mode
dream mode local        # Switch to local mode
dream mode cloud        # Switch to cloud mode
dream mode hybrid       # Switch to hybrid mode

# Model commands
dream model current     # Show current model
dream model list        # List available tiers
dream model swap T2     # Switch model tier

# Shorthand
dream m local           # Shorthand for mode local

Troubleshooting

Cloud mode: "No API keys found"

# Add your API keys to .env
dream config edit
# Add: ANTHROPIC_API_KEY=sk-ant-...
dream restart

Local mode: llama-server won't start

# Check GPU status (NVIDIA GPUs only)
nvidia-smi
# Check model is downloaded
ls -la data/models/*.gguf
# Check logs
dream logs llama-server

Mode switch not taking effect

# Verify .env
grep DREAM_MODE .env
grep LLM_API_URL .env
# Restart all services
dream restart
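
The two grep checks can be combined into one consistency test. This is a sketch under stated assumptions: `check_mode_env` is a hypothetical helper, the expected URLs come from the table in How It Works, missing keys are treated as the documented defaults, and if a key appears more than once the last occurrence is used. It does not cover lemonade mode.

```bash
#!/usr/bin/env bash
# Hypothetical helper: check that LLM_API_URL in an env file matches the
# URL documented for its DREAM_MODE. Returns 0 on match, 1 on mismatch,
# 2 for a mode this sketch does not cover.
check_mode_env() {
  local env_file="$1" mode url expected
  mode="$(sed -n 's/^DREAM_MODE=//p' "$env_file" | tail -n 1)"
  url="$(sed -n 's/^LLM_API_URL=//p' "$env_file" | tail -n 1)"
  # Fall back to the documented defaults when a key is missing or empty.
  mode="${mode:-local}"
  url="${url:-http://llama-server:8080}"
  case "$mode" in
    local)        expected="http://llama-server:8080" ;;
    cloud|hybrid) expected="http://litellm:4000" ;;
    *)            echo "mode '$mode' is not covered by this check" >&2; return 2 ;;
  esac
  if [ "$url" = "$expected" ]; then
    echo "ok: $mode -> $url"
  else
    echo "mismatch: DREAM_MODE=$mode expects $expected, found $url"
    return 1
  fi
}
```

Run it as `check_mode_env .env`. A mismatch usually means one of the two variables was edited by hand without the other.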

Rollback

If anything breaks, restore default behavior:

dream mode local
dream restart

Or manually edit .env:

DREAM_MODE=local
LLM_API_URL=http://llama-server:8080
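
The manual edit can also be scripted. The sketch below is one possible approach, not what `dream mode` does internally: `set_env_var` is a hypothetical helper that assumes plain `KEY=value` lines and rewrites the file in place so its permissions are kept.

```bash
#!/usr/bin/env bash
# Hypothetical helper: set KEY=VALUE in an env file, replacing any existing
# assignment of KEY and leaving all other lines untouched.
# Usage: set_env_var FILE KEY VALUE
set_env_var() {
  local env_file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)"
  # Copy every line except existing assignments of this key.
  # grep -v exits 1 when nothing is left to print, which is fine here.
  grep -v "^${key}=" "$env_file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back through the original path to preserve ownership and mode.
  cat "$tmp" > "$env_file"
  rm -f "$tmp"
}
```

A rollback would then be `set_env_var .env DREAM_MODE local` and `set_env_var .env LLM_API_URL http://llama-server:8080`, followed by `dream restart`. Note that the updated key moves to the end of the file.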
