Operational reference for running and debugging the stack on a workstation. The Environment variables and Test / skill env files sections below carry binding force — they are the canonical contracts those rules point at.
Bootstrap the infrastructure dependencies once with ./scripts/dev-start.sh: it
brings up PostgreSQL, Valkey, and NATS, runs the migrations, and seeds the
database (--force-reset wipes all local data, including traffic_event). Then
run each service in the foreground:
cd packages/<svc> && go run ./cmd/<svc>/ -config <svc>.dev.yamlLocal ports: Hub 3060, Control Plane 3001, AI Gateway 3050, Compliance
Proxy 3128, and the Control Plane UI on 3000 (npm run dev:control-plane-ui).
Logging is configured per service (LogConfig):
level—trace|debug|info|warn|error(defaultinfo); theLOG_LEVELenvironment variable overrides it.format—json|text(defaultjson).file— optional path to tee logs to (also via theLOG_FILEenv var); otherwise logs go to stdout.stackOnError— attach a goroutine stack on error-level logs (envLOG_STACK_ON_ERROR).
Because local services run in the foreground, their logs stream to the terminal
on stdout as JSON by default; set LOG_FILE to also capture them to a file. The
log level is additionally hot-reloadable through the Hub shadow (the log_level
config key), so verbosity can be raised on a running service without a restart.
A local service is just a foreground process. Restart it by stopping the process
(Ctrl-C, or kill it) and re-running its go run command. ./scripts/dev-start.sh
refreshes the infrastructure dependencies and seed when those need to be reset.
tests/lib/auth.sh provides the Control Plane admin-API helpers, all driven from
the loaded environment:
cp_login— drives the OAuth + PKCE flow once and caches the access token (idempotent):/oauth/authorize→/authserver/password→/oauth/token. It readsNEXUS_CP_URL,NEXUS_ADMIN_EMAIL,NEXUS_ADMIN_PASSWORD,NEXUS_OAUTH_CLIENT_ID(defaultcp-ui), andNEXUS_OAUTH_REDIRECT_URI.cp_curl <path> [args…]— calls<path>with the cached bearer token (GET, or any method via-X).cp_curl_code <path>— returns just the HTTP status code.cp_curl_full <path>— returns the body plus a---HTTP_STATUS---line and the code.cp_token— returns the cached token, logging in first if needed.
When the AI Gateway runs at LOG_LEVEL=debug (or trace), the dispatcher wraps
the upstream response body in a debugBody reader. On close it emits a single
upstream stream body record at DEBUG level — a snapshot of what the provider
actually sent over the wire — with these fields:
format— the wire format being read.bytes_captured— how many bytes were accumulated.capped— whether the capture hit its byte limit (thebodythen carries a(truncated)suffix).body— the captured bytes (or(empty — no bytes read from stream body)).
This is gated behind a DEBUG-enabled check and never runs in production.
Local development loads secrets and configuration from a repo-root .env file
through bootenv (packages/shared/core/bootenv/). The contract:
- Application code reads every secret via
os.Getenvonly; the.envfile is a convenience loader, not something the code reads directly. - Loading is non-overload: a variable already present in the process
environment wins over the
.envvalue. That lets both a shared.envand an explicitMY_VAR=x ./svcinvocation work, with the explicit value taking precedence. - The loader is silent when no
.envexists — production has none (its environment comes from systemd / Kubernetes), so the boot log stays clean; parse errors of an.envthat does exist are surfaced.
Every variable is documented in the repo-root .env.example. Secrets are
env-only: no secret field (auth tokens, HMAC keys, credential-encryption keys,
internal-service tokens, DB passwords) belongs in committed YAML — YAML carries
only non-secret shape and tunings. Cross-service shared secrets are tagged
[MUST MATCH] in .env.example; drift between consumers of a [MUST MATCH]
value is the most common source of inter-service 403s. In production these
variables are delivered via a systemd EnvironmentFile= or a Kubernetes Secret.
Tests and the prod-* skills read their configuration from tests/.env.<target>,
where target is one of local, dev, or prod, loaded by
tests/lib/loadenv.sh:
- Layering. The loader parses
.env.<target>.examplefirst, then.env.<target>on top, with the same non-overload semantics — a value already in the process environment is preserved, and the files only fill gaps. It fails with a copy-from-example hint if neither file exists. - Target selection. The target is the first argument if given, else
$NEXUS_TEST_TARGET, elselocalon an interactive TTY. A non-interactive context (CI, cron) must setNEXUS_TEST_TARGETexplicitly — it will not silently default. - Fail-closed safety guards. With
target=local, everyNEXUS_*_URLmust referencelocalhost/127.0.0.1; a.env.localaccidentally pointing at production fails fast. Withtarget=prod,NEXUS_CP_URLmust not be a loopback address, catching a freshly-copied.env.prodleft with localhost defaults. This keeps a state-mutating test or skill from running against the wrong target. - Adding a new variable. Add it to the committed
.env.<target>.examplewith a placeholder or safe default — that file is what documents the variable — and put the real value in the gitignored.env.<target>. The example-then-user layering means a variable documented in the example is always present.
scripts/dev-start.sh— local stack bootstrappackages/shared/core/bootenv/— repo-root.envloader.env.example— the documented variable catalog and[MUST MATCH]tagstests/lib/loadenv.sh— thetests/.env.<target>loader and safety guardstests/lib/auth.sh— thecp_login/cp_curladmin-API helperspackages/ai-gateway/internal/config/— the service logging configurationpackages/ai-gateway/internal/providers/dispatch/spec_adapter.go— the debug body capture