fix(security): stop ANTHROPIC_BASE_URL settings overrides redirecting the agent off the PostHog gateway by gewenyu99 · Pull Request #703 · PostHog/wizard

gewenyu99 · 2026-06-21T19:10:28Z

What the user saw

A production, interactive npx @posthog/wizard run whose agent sent its traffic to https://api.code-relay.com. The error string isn't in our source; it came from claude-code or the relay, which means the spawned agent was actually pointed there.

Investigation (measured, not assumed)

Shell ANTHROPIC_BASE_URL is NOT the vector. The wizard overrides it at agent-interface.ts:564. Measured: with ANTHROPIC_BASE_URL=https://api.code-relay.com exported, the agent subprocess receives https://gateway.us.posthog.com/wizard. So the "inherited shell env var" theory doesn't hold against current code.
ANTHROPIC_AUTH_TOKEN / ANTHROPIC_API_KEY change auth, not the host → they 401, they don't redirect.
The vector is ANTHROPIC_BASE_URL in a Claude Code settings file (env block). claude-code applies settings-env over the process env, which is why the wizard removes/blocks the file rather than just env-overriding.
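
The precedence measured above can be modelled as a plain merge in which the settings-file env block is applied last. This is only a sketch of the observed behavior, not claude-code's implementation; `resolveAgentEnv` and `EnvBlock` are hypothetical names introduced for illustration.

```typescript
// Hypothetical model of the observed precedence: the settings-file env block
// is applied over the env the wizard passes to the spawned agent.
type EnvBlock = Record<string, string>;

function resolveAgentEnv(spawnEnv: EnvBlock, settingsEnv: EnvBlock): EnvBlock {
  // The later spread wins, mirroring "settings-env over the process env".
  return { ...spawnEnv, ...settingsEnv };
}

// What the wizard sets in the spawn env.
const spawnEnv: EnvBlock = {
  ANTHROPIC_BASE_URL: "https://gateway.us.posthog.com/wizard",
};

// What a Claude Code settings file might carry in its env block.
const settingsEnv: EnvBlock = {
  ANTHROPIC_BASE_URL: "https://api.code-relay.com",
};

const effective = resolveAgentEnv(spawnEnv, settingsEnv);
console.log(effective.ANTHROPIC_BASE_URL); // the settings value, not the gateway
```

Under this model, overriding the shell variable in the spawn env cannot help once a settings file sets the same key, which is why the file itself has to be detected and removed or refused.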

The interactive path already blocks the settings files it detects (SettingsOverrideScreen backs the project file up; ManagedSettingsScreen exits). So the leak is a detection gap, and there are two:

Bug 1 (the interactive/prod leak): managed-settings detection was macOS-only

MANAGED_SETTINGS_PATHS hardcoded only /Library/Application Support/ClaudeCode/managed-settings.json. On Linux (/etc/claude-code/...) or Windows (C:\ProgramData\...), an org/MDM-managed env.ANTHROPIC_BASE_URL (a corporate relay) was never detected → no conflict screen → agent launches → claude-code applies the managed override → every call redirected, even interactively. Fix: check all three platform paths.
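
A minimal sketch of the cross-platform check, assuming a conflict shape like the `checkAllSettingsConflicts` output shown in this PR. `detectManagedConflicts`, `GUARDED_KEYS`, and the injected `readFile` are hypothetical names, not the wizard's code, and the Windows path is left out because it is elided in the description.

```typescript
// Shape modelled on the checkAllSettingsConflicts output shown in this PR.
interface SettingsConflict {
  source: string;
  keys: string[];
  writable: boolean;
}

// Assumption: only the key discussed here is guarded in this sketch.
const GUARDED_KEYS = ["ANTHROPIC_BASE_URL"];

const MANAGED_PATHS = [
  "/Library/Application Support/ClaudeCode/managed-settings.json", // macOS
  "/etc/claude-code/managed-settings.json", // Linux
  // The Windows path is elided in the description, so it is not filled in here.
];

// readFile is injected (hypothetical) and returns undefined for a missing file.
function detectManagedConflicts(
  paths: string[],
  readFile: (path: string) => string | undefined,
): SettingsConflict[] {
  const conflicts: SettingsConflict[] = [];
  for (const p of paths) {
    const raw = readFile(p);
    if (raw === undefined) continue; // another platform's path: simply absent
    let parsed: unknown;
    try {
      parsed = JSON.parse(raw);
    } catch {
      continue; // malformed JSON is ignored in this sketch
    }
    const env = (parsed as { env?: unknown } | null)?.env;
    if (typeof env !== "object" || env === null) continue;
    const keys = GUARDED_KEYS.filter((k) => k in env);
    if (keys.length > 0) {
      conflicts.push({ source: "managed", keys, writable: false });
    }
  }
  return conflicts;
}

// A Linux box with an org-managed override, as in the scenario in this PR.
const linuxBox: Record<string, string> = {
  "/etc/claude-code/managed-settings.json": JSON.stringify({
    env: { ANTHROPIC_BASE_URL: "https://api.code-relay.com" },
  }),
};

const macOnly = detectManagedConflicts([MANAGED_PATHS[0]], (p) => linuxBox[p]);
const allPaths = detectManagedConflicts(MANAGED_PATHS, (p) => linuxBox[p]);
console.log(macOnly.length, allPaths.length);
```

Checking every path unconditionally keeps the logic free of platform branches: a path that belongs to another OS is just a missing file.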

Bug 2 (the `--ci` leak): `LoggingUI.showSettingsOverride` was a no-op

A --ci run detected the conflict, but showSettingsOverride simply returned Promise.resolve(), so the agent launched with the override still in place. Fix: remove the writable (project) override; reject (abort before launch) on anything non-removable.
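
The CI rule can be sketched as a small decision function. This is a hedged illustration, not the actual `LoggingUI.showSettingsOverride` (which returns a promise): `decideCiAction` and `CiDecision` are hypothetical names, the `"project"` source value is an assumption, and aborting whenever any non-writable conflict is present is one reading of "reject on anything non-removable".

```typescript
// Shape modelled on the checkAllSettingsConflicts output shown in this PR.
interface SettingsConflict {
  source: string;
  keys: string[];
  writable: boolean;
}

// Hypothetical result type for the sketch.
type CiDecision =
  | { action: "proceed" }
  | { action: "remove"; removable: SettingsConflict[] }
  | { action: "abort"; blocking: SettingsConflict[] };

function decideCiAction(conflicts: SettingsConflict[]): CiDecision {
  // Anything the wizard cannot remove must stop the run before the agent starts.
  const blocking = conflicts.filter((c) => !c.writable);
  if (blocking.length > 0) return { action: "abort", blocking };
  // Only writable overrides remain: back them up and remove them.
  if (conflicts.length > 0) return { action: "remove", removable: conflicts };
  // Nothing detected: the run is clean.
  return { action: "proceed" };
}

const clean = decideCiAction([]);
const projectOnly = decideCiAction([
  { source: "project", keys: ["ANTHROPIC_BASE_URL"], writable: true },
]);
const managed = decideCiAction([
  { source: "managed", keys: ["ANTHROPIC_BASE_URL"], writable: false },
]);
console.log(clean.action, projectOnly.action, managed.action);
```

In the real method, the "abort" branch would correspond to a rejected promise and the "remove" branch to calling backupAndFix before resolving.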

Before / after — driven through the new e2e control-plane harness

SCENARIO: a Linux dev box with org-managed Claude Code settings:
  /etc/claude-code/managed-settings.json -> env.ANTHROPIC_BASE_URL = https://api.code-relay.com
  (a normal INTERACTIVE `npx @posthog/wizard` run)

===== BEFORE (detection hardcoded to the macOS path) =====
  checkAllSettingsConflicts -> []
  => NOT detected. The wizard shows no conflict screen and launches the agent.
  => claude-code applies the managed override -> ALL model calls go to api.code-relay.com.  LEAK.

===== AFTER (detection checks the platform managed path) =====
  checkAllSettingsConflicts -> [{"source":"managed","keys":["ANTHROPIC_BASE_URL"],"writable":false}]
  => detected (managed, non-writable). The interactive run now shows a blocking screen and refuses:

  driver read_state.currentScreen = managed-settings  (agent never launches)
      ╭──────────────────────────────────────────────────────────────╮
      │                     ⚠ Settings conflict                       │
      │   These Claude Code settings override credentials and prevent │
      │   the Wizard from reaching the PostHog LLM Gateway.           │
      │   Organization-managed settings                               │
      │   /etc/claude-code/managed-settings.json                      │
      │     • ANTHROPIC_BASE_URL                                       │
      │     ▸     Exit [Esc]                                           │
      ╰──────────────────────────────────────────────────────────────╯

The AFTER panel is the real Ink screen, reconstructed from store state and rendered offline by the wizard-ci-tools control plane (renderFrame) — no agent, no network. (Same harness renders the --ci refusal for Bug 2.)

Tests

src/lib/agent/__tests__/managed-settings-crossplatform.test.ts — BEFORE (macOS-only) misses the Linux managed file; AFTER detects it (managed, non-writable); default paths cover Linux+Windows.
src/ui/__tests__/logging-ui-settings-guard.test.ts — CI refuses non-removable overrides, removes the writable one, resolves when clean.

🤖 Generated with Claude Code

…cts the gateway

LoggingUI.showSettingsOverride was a no-op (`return Promise.resolve()`), so a non-interactive `--ci` run that detected a Claude Code settings conflict ignored it and launched the agent anyway. If that settings file carried `env.ANTHROPIC_BASE_URL` (e.g. a third-party relay like api.code-relay.com), every model call was silently redirected off the PostHog LLM Gateway. The TUI screens (SettingsOverride / ManagedSettings) refuse, but CI leaked.

Now CI enforces the same guarantee: remove the writable (project) override via backupAndFix; reject (so the runner aborts before the agent starts) on any override we can't remove (managed / global / project-local). Regression test in src/ui/__tests__/logging-ui-settings-guard.test.ts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

github-actions · 2026-06-21T19:10:37Z

🧙 Wizard CI

Run the Wizard CI and test your changes against wizard-workbench example apps by replying with a GitHub comment using one of the following commands:

Test all apps:

/wizard-ci all

Test all apps in a directory:

/wizard-ci basic-integration
/wizard-ci error-tracking-upload-source-maps
/wizard-ci misc
/wizard-ci revenue

Test an individual app:

/wizard-ci basic-integration/android
/wizard-ci basic-integration/angular
/wizard-ci basic-integration/astro


/wizard-ci basic-integration/django
/wizard-ci basic-integration/fastapi
/wizard-ci basic-integration/flask
/wizard-ci basic-integration/javascript-node
/wizard-ci basic-integration/javascript-web
/wizard-ci basic-integration/laravel
/wizard-ci basic-integration/next-js
/wizard-ci basic-integration/nuxt
/wizard-ci basic-integration/python
/wizard-ci basic-integration/rails
/wizard-ci basic-integration/react-native
/wizard-ci basic-integration/react-router
/wizard-ci basic-integration/sveltekit
/wizard-ci basic-integration/swift
/wizard-ci basic-integration/tanstack-router
/wizard-ci basic-integration/tanstack-start
/wizard-ci basic-integration/vue
/wizard-ci error-tracking-upload-source-maps/android
/wizard-ci error-tracking-upload-source-maps/cicd-docker-node-raw
/wizard-ci error-tracking-upload-source-maps/cicd-github-actions-docker-node-raw
/wizard-ci error-tracking-upload-source-maps/cicd-github-actions-nested-docker-node-raw
/wizard-ci error-tracking-upload-source-maps/cicd-github-actions-node-raw
/wizard-ci error-tracking-upload-source-maps/cicd-gitlab-node-raw
/wizard-ci error-tracking-upload-source-maps/cicd-ssh-vps-node-raw
/wizard-ci error-tracking-upload-source-maps/flutter
/wizard-ci error-tracking-upload-source-maps/ios
/wizard-ci error-tracking-upload-source-maps/next
/wizard-ci error-tracking-upload-source-maps/next-no-posthog
/wizard-ci error-tracking-upload-source-maps/node-raw
/wizard-ci error-tracking-upload-source-maps/node-rollup
/wizard-ci error-tracking-upload-source-maps/node-rollup-typescript-plugin
/wizard-ci error-tracking-upload-source-maps/node-webpack
/wizard-ci error-tracking-upload-source-maps/nuxt-3-6
/wizard-ci error-tracking-upload-source-maps/nuxt-4-3
/wizard-ci error-tracking-upload-source-maps/react-native
/wizard-ci error-tracking-upload-source-maps/react-vite
/wizard-ci error-tracking-upload-source-maps/rust
/wizard-ci misc/quack-quack
/wizard-ci revenue/stripe

Results will be posted here when complete.

…tforms

The interactive (TUI) leak: detection hardcoded the macOS managed-settings path (/Library/Application Support/ClaudeCode/managed-settings.json), so a managed (org/MDM) `env.ANTHROPIC_BASE_URL` on Linux (/etc/claude-code/...) or Windows (C:\ProgramData\...) went undetected. Claude Code applies managed settings regardless of settingSources, so every model call was redirected off the PostHog gateway (e.g. to a relay like api.code-relay.com), even in a normal interactive run, with no conflict screen shown.

Check all three platform managed paths (a non-current-platform path simply won't exist, so this is safe). Now the existing ManagedSettingsScreen / CI guard fire on every OS. Cross-platform regression test added.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ts the spawn env

scripts/precedence.no-jest.ts: two local listeners, a project .claude/settings.json that sets ANTHROPIC_BASE_URL, and a real claude-code query with the GATEWAY in the spawn env (exactly as the wizard passes it). Result: the /v1/messages call goes to the SETTINGS host, not the spawn-env host, confirming env-override alone is insufficient and the wizard must detect/remove the settings file.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

gewenyu99 · 2026-06-21T21:40:35Z

Repro + verification on real production code (`runAgent`)

Companion PR — the new test harness: #702 (the wizard-ci-tools control plane: src/lib/ci-driver/ — WizardCiDriver, renderFrame/replay.ts, recorder). How it was used here, and its honest limits, are spelled out below.

How the new test harness (#702) was used — the interactive-path side

A normal npx @posthog/wizard user is on the interactive (Ink) path. To show what that path does when the wizard detects the override, I used the #702 control plane to render the real Ink screen offline — no agent, no network:

WizardCiDriver.readState().currentScreen  →  "managed-settings"   (the run is blocked)

renderFrame(frame)  →  reconstructs a throwaway store from state and mounts the real screen:
              ╭──────────────────────────────────────────────────────────────────────╮
              │                         ⚠ Settings conflict                          │
              │   These Claude Code settings override credentials and prevent the    │
              │   Wizard from reaching the PostHog LLM Gateway.                      │
              │   Organization-managed settings                                      │
              │   /etc/claude-code/managed-settings.json                             │
              │     • ANTHROPIC_BASE_URL                                             │
              │     ▸     Exit [Esc]                                                 │
              ╰──────────────────────────────────────────────────────────────────────╯

Honest scope of the harness here: it drives the wizard through InkUI, whose conflict screens refuse — so by design it cannot exhibit the leak, only the refusal. The leak lives in the non-interactive LoggingUI path. So the leak/fix repro below uses the real runAgent directly, not the harness. The harness proves the interactive refusal; runAgent proves the leak and the fix.

Leak/fix on the real production path (`runAgent`)

Driven through the wizard's actual runAgent (agent-runner → bootstrap → initializeAgent → the real integration agent) — not a hand-rolled query(). A throwaway project carries a Claude Code settings override; a single localhost listener stands in for the relay (127.0.0.1:9002), so nothing leaves the machine. The wizard sets the real gateway (gateway.us.posthog.com/wizard) in the spawn env. We watch which host the agent's /v1/messages actually hits.

project/.claude/settings.json → { "env": { "ANTHROPIC_BASE_URL": "http://127.0.0.1:9002" } }
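
A stand-in relay of this kind could look roughly like the following. This is a sketch assuming Node's built-in `http` module, not the contents of scripts/relay-prod.no-jest.ts; `recordIfModelCall`, `startRelayStandIn`, and `relayHits` are hypothetical names, and treating a model call as a POST whose path ends in /v1/messages is an assumption.

```typescript
import { createServer, Server } from "node:http";

// Paths of requests that counted as model calls reaching the stand-in relay.
const relayHits: string[] = [];

// Returns true (and records the hit) when the request looks like a model call.
function recordIfModelCall(method: string | undefined, url: string | undefined): boolean {
  const path = (url ?? "").split("?")[0]; // ignore any query string
  const isModelCall = method === "POST" && path.endsWith("/v1/messages");
  if (isModelCall) relayHits.push(path);
  return isModelCall;
}

// Listens on localhost only, so nothing leaves the machine.
function startRelayStandIn(port: number): Server {
  const server = createServer((req, res) => {
    recordIfModelCall(req.method, req.url);
    res.statusCode = 502; // any reply will do; only the arrival matters
    res.end();
  });
  return server.listen(port, "127.0.0.1");
}

// Usage: const relay = startRelayStandIn(9002); ...run the agent...; relay.close();
```

After the run, a non-empty `relayHits` would indicate the leak and an empty one would indicate that no model call reached the relay.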

BEFORE — `LoggingUI.showSettingsOverride` no-op (origin/main)

│  Using provided API key (CI mode - OAuth bypassed)
◇  Initializing Claude agent...
✔  Agent initialized. Let's get cooking!
◌  Writing your PostHog setup with events, error capture and more...

>>> LEAK CONFIRMED: the wizard's agent sent /v1/messages to the RELAY (127.0.0.1:9002)

Verbose agent log proves it's the real prod agent:

Configured LLM gateway: https://gateway.us.posthog.com/wizard
✔  Framework: Next.js 15.3.0
STEP 1: Call load_skill_menu (from the wizard-tools MCP server) ...
STEP 2: Call install_skill (from the wizard-tools MCP server) ...

→ wizard set the gateway in the spawn env, but the settings override redirected the actual model call to the relay. Leak reproduced on prod code.

AFTER — the fix

✔  Agent initialized. Let's get cooking!
◌  Writing your PostHog setup with events, error capture and more...
◌  [0/5] Plan event tracking
◇  Checking project structure.
◇  Verifying PostHog dependencies.
◇  Generating events based on project.

>>> NO LEAK: 75s elapsed with no /v1/messages to the relay — the wizard removed/refused the override.

→ same real agent, now running its integration steps against the gateway, zero relay hits. The fix's backupAndFix() removed the override before the agent launched.

Verdict

| | prod `runAgent` | `/v1/messages` → |
|---|---|---|
| BEFORE (no-op) | real integration agent | RELAY 🔴 leak |
| AFTER (fix) | real integration agent | gateway 🟢 no leak |

Cause of "override not removed" in the wild

Claude Code settings-env beats the spawn env (confirmed directly: scripts/precedence.no-jest.ts — /v1/messages hits the settings host). It goes undetected when it lives in a managed file on a platform the old detection didn't check — MANAGED_SETTINGS_PATHS was macOS-only, so a Linux /etc/claude-code/managed-settings.json returned checkAllSettingsConflicts → []. Fixed here to check all three platform paths.

Reproduce

scripts/relay-prod.no-jest.ts (in this PR):

env -u CLAUDECODE -u CLAUDE_CODE_SDK_HAS_OAUTH_REFRESH -u CLAUDE_CODE_SDK_HAS_HOST_AUTH_REFRESH \
  POSTHOG_PERSONAL_API_KEY=<phx> npx tsx scripts/relay-prod.no-jest.ts

Run on origin/main → LEAK; on this branch → NO LEAK.

🤖 Generated with Claude Code

scripts/relay-prod.no-jest.ts: runs the wizard's REAL runAgent against a project with a Claude Code settings ANTHROPIC_BASE_URL override and a localhost relay listener. On origin/main the integration agent's /v1/messages hits the relay (leak); with the fix it hits the gateway (no leak). scripts/precedence.no-jest.ts isolates the mechanism (settings env beats spawn env).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…nt behavior

- README: add "Explore with an agent" under Running locally → Testing (was wrongly placed in the workbench README).
- scripts/README: drop the cross-PR pointer to the #703 repro scripts.
- Trim header/inline comments across the harness + scripts to concise descriptions of what the code does now: no history, no change-rationale.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

gewenyu99 changed the title ~~fix(ci): refuse to launch when a Claude Code settings override redirects the gateway~~ fix(security): stop ANTHROPIC_BASE_URL settings overrides redirecting the agent off the PostHog gateway Jun 21, 2026

gewenyu99 and others added 3 commits June 21, 2026 17:40

Merge branch 'main' into fix-ci-settings-override-leak

70e75e6

style: prettier-format managed-settings-crossplatform test

1404e0a

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

gewenyu99 wants to merge 6 commits into main from fix-ci-settings-override-leak
