Gateflow is a local-first LLM gateway for Ollama clusters and agentic clients.
It exposes an OpenAI-compatible chat completions endpoint, resolves client model aliases to real Ollama models, routes traffic across local backends, preserves streaming, records approximate usage, exports Prometheus metrics, and includes a local admin UI for day-to-day gateway operations.
Automated installer for macOS / Linux, arm64 / amd64:
```
curl -fsSL https://raw.githubusercontent.com/dotlabshq/gateflow/main/install.sh | bash
```

Skip confirmation:

```
curl -fsSL https://raw.githubusercontent.com/dotlabshq/gateflow/main/install.sh | bash -s -- --yes
```

Build from source:

```
make build
```

make build builds the Next.js admin UI first, embeds the static export into the Go binary, and then produces a self-contained gateflow binary. End users do not need Node.js, pnpm, or Next.js to run /ui.
Verify:
```
gateflow version
```

On first run, Gateflow creates ~/.gateflow/config.yaml if it does not already exist. The generated config points at local Ollama, enables the UI, enables the LLM gateway, and keeps MCP plus the agent/skill registry disabled by default.

Initialize SQLite control-plane data, sync Ollama models, and create a default credential:

```
gateflow init
```

Start the gateway:

```
gateflow serve
```

Open the admin UI:
http://localhost:8080/ui
Default admin login:
username: admin
password: admin
Send a chat request:
```
curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer gfsk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "code", "messages": [{"role": "user", "content": "hello"}], "stream": true}'
```

Start Gateflow with Prometheus and Grafana:

```
make docker-up
```

The Docker config stores runtime data in the gateflow-data volume and points the default Ollama backend at http://host.docker.internal:11434/v1.
It seeds a local development credential so Prometheus can scrape the protected /metrics endpoint:
gfsk-docker-local-dev-key-change-me
Open:
Gateflow: http://localhost:8080
Admin UI: http://localhost:8080/ui
Prometheus: http://localhost:9090
Grafana: http://localhost:3000
Follow Gateflow logs:
```
make docker-logs
```

Stop the stack:

```
make docker-down
```

CLI commands:

```
gateflow version
gateflow serve
gateflow init
gateflow status
gateflow status --live
gateflow config validate
gateflow org list|add|remove
gateflow project list|add|remove
gateflow app list|add|remove
gateflow credential list|add|remove|explain-key|generate
gateflow model list|sync
gateflow backend list|drain|undrain
gateflow reload
```

All commands default to ~/.gateflow/config.yaml. Use -c <path> to target another config file, or set GATEFLOW_CONFIG:
```
export GATEFLOW_CONFIG=/etc/gateflow/config.yaml
```

For Kubernetes, mount the config file and set the env var instead of passing -c:

```yaml
env:
  - name: GATEFLOW_CONFIG
    value: /etc/gateflow/config.yaml
```

Useful examples:
```
gateflow org add --name "Acme"
gateflow project add --org <organization-id> --name "Backend"
gateflow app add --project <project-id> --name "Code Agent"
gateflow credential add --app <app-id> --name "Code Agent Key"
gateflow credential remove <credential-id>
gateflow credential generate \
  --org acme --project backend --app code-agent \
  --name "Code Agent Key"
gateflow credential explain-key gfsk-YOUR_KEY
gateflow model sync
GATEFLOW_API_KEY=gfsk-YOUR_KEY gateflow reload
gateflow backend drain m3-studio-01
gateflow backend undrain m3-studio-01
```

gateflow reload calls the protected POST /-/reload endpoint. It refreshes runtime identity, model, policy, and registry indexes from the current config and SQLite store without restarting the process. The listener address, backend definitions, routing settings, and usage storage still require a restart.
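The reload call can also be made against the endpoint directly. A minimal Python sketch that builds (but does not send) such a request with the standard library; gfsk-YOUR_KEY is a placeholder and the default listen address from the generated config is assumed:

```python
import urllib.request

# Build a POST to the protected reload endpoint. gfsk-YOUR_KEY is a
# placeholder for a real Gateflow API key; :8080 is the default listen port.
req = urllib.request.Request(
    "http://localhost:8080/-/reload",
    data=b"",  # empty body; the reload is triggered by the POST itself
    method="POST",
)
req.add_header("Authorization", "Bearer gfsk-YOUR_KEY")

# urllib.request.urlopen(req) would send it; omitted so the sketch
# stays runnable without a live gateway.
print(req.get_method(), req.full_url)
```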
Backend drain mode updates the bootstrap config so that, after a restart, the backend stops receiving new requests. Existing streams are not interrupted by the router while the process is running.
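In the bootstrap config, a drain corresponds to flipping the backend's drain flag (the base_url shown here is illustrative):

```yaml
backends:
  - id: m3-studio-01
    type: ollama
    base_url: "http://m3-studio-01:11434/v1"   # illustrative address
    drain: true   # set by: gateflow backend drain m3-studio-01
```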
MVP endpoints:
```
POST /v1/chat/completions
GET  /v1/models
POST /-/reload
GET  /api/admin/*
GET  /metrics
GET  /healthz
GET  /readyz
```
/healthz is public for liveness probes. LLM and runtime endpoints require a bearer API key. Admin API endpoints use the local UI session cookie.
Core behavior:
- Custom streaming proxy for /v1/chat/completions
- API key resolves to credential, app, project, and organization
- Hierarchical model policy and limits
- Model aliases hide real Ollama model names from clients
- Warm-aware least-active backend routing
- Backend health probes, retries before stream start, circuit breaker, and drain flag
- Approximate token usage via character count / 4
- Async usage writes to JSONL or SQLite
- Prometheus metrics
- Guardrail hook points without a heavy rule engine
- Local admin UI for dashboard, models, providers, credentials, and scope configuration
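One plausible reading of the character-count/4 usage heuristic above, as a sketch (Gateflow's actual accounting code may round differently):

```python
def approx_tokens(text: str) -> int:
    # Rough heuristic for approximate usage accounting:
    # about one token per four characters. Not a real tokenizer.
    return len(text) // 4

print(approx_tokens("hello world"))  # 11 chars -> 2
```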
The admin UI is a single Next.js app under ui/. It is intended for local-first operations, not as a SaaS control plane.
Current screens:
- Dashboard with live Prometheus metrics, auto-refresh, and lightweight client-side charts
- Models with Ollama sync metadata and capabilities
- Agents and skills as read-only registry views when the registry module is enabled
- Providers as read-only backend health/load views
- Credentials with app-scoped API key generation
- Apps, projects, and organizations with limits and hierarchical model allowlists
Credential flow remains opaque:
API Key -> Credential -> App -> Project -> Organization -> Policy
The UI only keeps short-lived chart samples in browser memory. Persistent time-series history belongs in Prometheus and Grafana.
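A short-lived sample buffer like the one the UI keeps can be sketched with a bounded deque (illustrative only; the real UI is a TypeScript app running in the browser):

```python
from collections import deque

# Keep only the most recent 60 metric samples; older points fall off,
# mirroring the UI's short-lived in-memory chart history.
samples = deque(maxlen=60)
for i in range(200):
    samples.append({"t": i, "value": i * 0.5})

print(len(samples))     # 60
print(samples[0]["t"])  # 140 (oldest retained sample)
```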
Gateflow uses YAML for bootstrap settings:
```yaml
server:
  listen: ":8080"

storage:
  type: sqlite
  dsn: "~/.gateflow/data/gateflow.db"

usage:
  mode: "sqlite_async"
  path: "~/.gateflow/data/usage.jsonl"

routing:
  default_strategy: "warm_aware_least_active"
  max_retries: 1

backends:
  - id: local-ollama
    type: ollama
    base_url: "http://127.0.0.1:11434/v1"
    drain: false
```

Control-plane data such as organizations, projects, apps, credentials, model aliases, and usage events lives in SQLite when storage.type: sqlite is enabled.
See config.example.yaml and docs/config-schema.md.
Gateflow exports Prometheus metrics at /metrics. The admin UI reads the same metrics through an authenticated admin endpoint for live operational charts.
Start Prometheus and Grafana:
```
make observability-up
```

Open:
Prometheus: http://localhost:9090
Grafana: http://localhost:3000
Stop:
```
make observability-down
```

Frontend:
```
cd ui
pnpm install
pnpm dev
```

During UI development, Next.js proxies /api/admin/* to the Go server. Override the target if needed:

```
GATEFLOW_ADMIN_TARGET=http://127.0.0.1:8080 pnpm dev
```

Backend:
```
make test
make build
make release
```

Direct Go commands:

```
go test ./...
go build -o bin/gateflow ./cmd/gateflow
```

A direct go build is useful for backend-only development. It does not embed the admin UI unless internal/uiassets/out has been prepared and -tags ui_embed is passed. Use make build or make release for user-facing binaries.
Gateflow 0.3.0 is focused on the LLM Gateway, a local admin UI, and read-only registry surfaces. It does not include a SaaS control plane, billing, OAuth/OIDC, agent execution, workflow orchestration, or a full MCP gateway.
The Agent & Skill Registry is read-only and non-executing. MCP Gateway remains a future module, while the runtime stays local-first and the LLM gateway remains the core path.

