
Gateflow

Gateflow is a local-first LLM gateway for Ollama clusters and agentic clients.

It exposes an OpenAI-compatible chat completions endpoint, resolves client model aliases to real Ollama models, routes traffic across local backends, preserves streaming, records approximate usage, exports Prometheus metrics, and includes a local admin UI for day-to-day gateway operations.

Screenshots

Screenshots live in docs/screenshots/:

  • Gateflow dashboard
  • Gateflow models

Installation

Automated installer for macOS / Linux, arm64 / amd64:

curl -fsSL https://raw.githubusercontent.com/dotlabshq/gateflow/main/install.sh | bash

Skip confirmation:

curl -fsSL https://raw.githubusercontent.com/dotlabshq/gateflow/main/install.sh | bash -s -- --yes

Build from source:

make build

make build builds the Next.js admin UI first, embeds the static export into the Go binary, and then produces a self-contained gateflow binary. End users do not need Node.js, pnpm, or Next.js to run /ui.

Verify:

gateflow version

Quick Start

On first run, Gateflow creates ~/.gateflow/config.yaml if it does not already exist. The generated config points at local Ollama, enables the UI, enables the LLM gateway, and keeps MCP plus the agent/skill registry disabled by default.
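
For reference, the generated file looks roughly like this. The server and backend keys match the schema shown under Configuration below; the module-toggle key names are illustrative assumptions, so treat config.example.yaml as the authoritative shape:

server:
  listen: ":8080"

backends:
  - id: local-ollama
    type: ollama
    base_url: "http://127.0.0.1:11434/v1"

# Module toggles (illustrative key names; see config.example.yaml):
ui:
  enabled: true
mcp:
  enabled: false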

Initialize SQLite control-plane data, sync Ollama models, and create a default credential:

gateflow init

Start the gateway:

gateflow serve

Open the admin UI:

http://localhost:8080/ui

Default admin login:

username: admin
password: admin

Send a chat request:

curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer gfsk-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "code", "messages": [{"role": "user", "content": "hello"}], "stream": true}'

Docker

Start Gateflow with Prometheus and Grafana:

make docker-up

The Docker config stores runtime data in the gateflow-data volume and points the default Ollama backend at http://host.docker.internal:11434/v1. It seeds a local development credential so Prometheus can scrape the protected /metrics endpoint:

gfsk-docker-local-dev-key-change-me
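
If you run your own Prometheus instead of the bundled one, a minimal scrape config using that key might look like this (job name and target are illustrative):

scrape_configs:
  - job_name: "gateflow"
    authorization:
      type: Bearer
      credentials: "gfsk-docker-local-dev-key-change-me"
    static_configs:
      - targets: ["gateflow:8080"]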

Open:

Gateflow:   http://localhost:8080
Admin UI:   http://localhost:8080/ui
Prometheus: http://localhost:9090
Grafana:    http://localhost:3000

Follow Gateflow logs:

make docker-logs

Stop the stack:

make docker-down

CLI

gateflow version
gateflow serve
gateflow init
gateflow status
gateflow status --live

gateflow config validate

gateflow org list|add|remove
gateflow project list|add|remove
gateflow app list|add|remove
gateflow credential list|add|remove|explain-key|generate
gateflow model list|sync
gateflow backend list|drain|undrain
gateflow reload

All commands default to ~/.gateflow/config.yaml. Use -c <path> to target another config file, or set GATEFLOW_CONFIG:

export GATEFLOW_CONFIG=/etc/gateflow/config.yaml

For Kubernetes, mount the config file and set the env var instead of passing -c:

env:
  - name: GATEFLOW_CONFIG
    value: /etc/gateflow/config.yaml
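
A fuller sketch, assuming the config ships in a ConfigMap named gateflow-config (all names here are illustrative):

containers:
  - name: gateflow
    image: gateflow:latest   # illustrative image reference
    env:
      - name: GATEFLOW_CONFIG
        value: /etc/gateflow/config.yaml
    volumeMounts:
      - name: config
        mountPath: /etc/gateflow
        readOnly: true
volumes:
  - name: config
    configMap:
      name: gateflow-config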

Useful examples:

gateflow org add --name "Acme"
gateflow project add --org <organization-id> --name "Backend"
gateflow app add --project <project-id> --name "Code Agent"
gateflow credential add --app <app-id> --name "Code Agent Key"
gateflow credential remove <credential-id>

gateflow credential generate \
  --org acme --project backend --app code-agent \
  --name "Code Agent Key"

gateflow credential explain-key gfsk-YOUR_KEY
gateflow model sync
GATEFLOW_API_KEY=gfsk-YOUR_KEY gateflow reload
gateflow backend drain m3-studio-01
gateflow backend undrain m3-studio-01

gateflow reload calls the protected POST /-/reload endpoint. It refreshes runtime identity, model, policy, and registry indexes from the current config and SQLite store without restarting the process. Changes to the listener address, backend definitions, routing settings, and usage storage still require a restart.
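
The same reload can be triggered directly over HTTP with any key that is allowed to reach the endpoint:

curl -X POST http://localhost:8080/-/reload \
  -H "Authorization: Bearer gfsk-YOUR_KEY"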

Backend drain mode updates the bootstrap config so that a drained backend stops receiving new requests after a restart. While the process is running, the router does not interrupt existing streams.

Runtime

MVP endpoints:

POST /v1/chat/completions
GET  /v1/models
POST /-/reload
GET  /api/admin/*
GET  /metrics
GET  /healthz
GET  /readyz

/healthz is public for liveness probes. LLM and runtime endpoints require a bearer API key. Admin API endpoints use the local UI session cookie.
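
For example:

curl http://localhost:8080/healthz

curl http://localhost:8080/v1/models \
  -H "Authorization: Bearer gfsk-YOUR_KEY"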

Core behavior:

  • Custom streaming proxy for /v1/chat/completions
  • API key resolves to credential, app, project, and organization
  • Hierarchical model policy and limits
  • Model aliases hide real Ollama model names from clients
  • Warm-aware least-active backend routing
  • Backend health probes, retries before stream start, circuit breaker, and drain flag
  • Approximate token usage estimated as character count / 4 (for example, a 400-character message is recorded as ~100 tokens)
  • Async usage writes to JSONL or SQLite
  • Prometheus metrics
  • Guardrail hook points without a heavy rule engine
  • Local admin UI for dashboard, models, providers, credentials, and scope configuration

Admin UI

The admin UI is a single Next.js app under ui/. It is intended for local-first operations, not as a SaaS control plane.

Current screens:

  • Dashboard with live Prometheus metrics, auto-refresh, and lightweight client-side charts
  • Models with Ollama sync metadata and capabilities
  • Agents and skills as read-only registry views when the registry module is enabled
  • Providers as read-only backend health/load views
  • Credentials with app-scoped API key generation
  • Apps, projects, and organizations with limits and hierarchical model allowlists

Credential flow remains opaque:

API Key -> Credential -> App -> Project -> Organization -> Policy

The UI only keeps short-lived chart samples in browser memory. Persistent time-series history belongs in Prometheus and Grafana.

Configuration

Gateflow uses YAML for bootstrap settings:

server:
  listen: ":8080"

storage:
  type: sqlite
  dsn: "~/.gateflow/data/gateflow.db"

usage:
  mode: "sqlite_async"
  path: "~/.gateflow/data/usage.jsonl"

routing:
  default_strategy: "warm_aware_least_active"
  max_retries: 1

backends:
  - id: local-ollama
    type: ollama
    base_url: "http://127.0.0.1:11434/v1"
    drain: false

Control-plane data such as organizations, projects, apps, credentials, model aliases, and usage events lives in SQLite when storage.type is set to sqlite.

See config.example.yaml and docs/config-schema.md.

Observability

Gateflow exports Prometheus metrics at /metrics. The admin UI reads the same metrics through an authenticated admin endpoint for live operational charts.
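
Because /metrics is protected, scraping it by hand requires a bearer key:

curl http://localhost:8080/metrics \
  -H "Authorization: Bearer gfsk-YOUR_KEY"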

Start Prometheus and Grafana:

make observability-up

Open:

Prometheus: http://localhost:9090
Grafana:    http://localhost:3000

Stop:

make observability-down

Development

Frontend:

cd ui
pnpm install
pnpm dev

During UI development, Next.js proxies /api/admin/* to the Go server. Override the target if needed:

GATEFLOW_ADMIN_TARGET=http://127.0.0.1:8080 pnpm dev

Backend:

make test
make build
make release

Direct Go commands:

go test ./...
go build -o bin/gateflow ./cmd/gateflow

Direct go build is useful for backend-only development. It does not embed the admin UI unless internal/uiassets/out has been prepared and -tags ui_embed is passed. Use make build or make release for user-facing binaries.
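
For completeness, the embedded variant looks like this, assuming internal/uiassets/out has already been prepared (make build handles both steps for you):

go build -tags ui_embed -o bin/gateflow ./cmd/gateflow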

Scope

Gateflow 0.3.0 is focused on the LLM Gateway, a local admin UI, and read-only registry surfaces. It does not include a SaaS control plane, billing, OAuth/OIDC, agent execution, workflow orchestration, or a full MCP gateway.

The Agent & Skill Registry is read-only and non-executing. MCP Gateway remains a future module, while the runtime stays local-first and the LLM gateway remains the core path.
