[WEBXP-747] circleci signup: server-side handshake polling#1199
Open
Fab10-CircleCi wants to merge 7 commits intoCircleCI-Public:mainfrom
Open
[WEBXP-747] circleci signup: server-side handshake polling#1199Fab10-CircleCi wants to merge 7 commits intoCircleCI-Public:mainfrom
Fab10-CircleCi wants to merge 7 commits intoCircleCI-Public:mainfrom
Conversation
6db9f76 to
352a958
Compare
Opens browser directly to /cli-auth with CLI params as top-level query params. The frontend handles session detection: authenticated users get instant PAT creation; unauthenticated users are redirected to signup first. Bypasses auth-svc return-to bug entirely. Security hardening: - Strict Origin validation (reject missing/wrong with 403) - Constant-time state comparison via crypto/subtle - CORS pinned to https://app.circleci.com (static, never reflected) - Access-Control-Allow-Private-Network for Chrome PNA - Method validation (GET only on /token) - 127.0.0.1 binding only Features: - Unique PAT label (hostname + timestamp) prevents 422 duplicates - Already-authenticated guard with --force bypass - --no-browser fallback for headless/SSH - JSON responses for structured error handling - Human-readable URL display in terminal - 24 unit tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the localhost HTTP callback server with a client-initiated polling
loop against the auth-svc handshake endpoint. The CLI generates a UUID v4
handshake_id, opens the browser to /cli-auth?handshake_id=..., and polls
GET /api/v1/cli-handshake/{id} every 3 seconds until the token arrives
(200), the handshake expires (404), or the 10-minute timeout elapses.
This eliminates the cross-origin / Private-Network / same-browser / MFA-
interrupt failure modes of the prior localhost approach — the browser and
CLI never communicate directly, so any device can complete auth.
- Remove localhost listener, CORS/PNA middleware, state-nonce machinery,
and the cli_port/cli_state/cli_label URL params.
- Add UUID-based handshake_id, pollHandshake with retry on transient
network errors, ctx-based Ctrl+C cancellation, 10-minute overall
deadline, and a CIRCLECI_APP_URL env override for enterprise/testing.
- --no-browser now prints the URL for cross-device use instead of
prompting for a manual token paste (the handshake makes manual paste
unnecessary).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
352a958 to
d249bc6
Compare
Auth-svc returns 202 on cache miss, not 404 — the dedicated "authentication expired" branch could never fire in normal operation, and its message was misleading since the server has no 404-for-expiry path. A truly unexpected 404 (e.g. routing misconfiguration) still surfaces via the default "unexpected response: 404" handler, which is honest about what happened without inventing a reason. Timeout is the sole expiry path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three PR-review follow-ups: 1. Reuse cfg.HTTPClient instead of constructing a bare http.Client. The configured client carries the custom CA pool built from cfg.TLSCert plus cfg.TLSInsecure — without it, self-hosted / enterprise users who point CIRCLECI_APP_URL at their own app host would hit TLS verification failures. Per-request deadlines now come from context.WithTimeout(ctx, requestTimeout), so the shared client's Timeout field stays untouched. 2. De-globalize handshakePollWait. The poll interval and per-request timeout are now function parameters on pollHandshake; production passes the package constants, tests pass shortened durations. Removes the var + restore-defer pattern that would have raced under `ginkgo -p` parallel execution. 3. Guard against non-JSON 200 bodies (e.g. Cloudflare / gateway HTML error pages that return 200 on the happy-path URL). handshakePoll now checks Content-Type on a 200 and returns a readable error with a truncated body snippet instead of surfacing the stdlib's "invalid character '<'" JSON error. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The CLI's 10-minute poll deadline is coupled to auth-svc's handshake cache TTL (CLI_HANDSHAKE_TTL, default 10m, authentication-service PR #1711). Document the constraint so a future change to either value can't silently regress the laptop-sleep / Docker-pause case where the server evicts the entry before the CLI finishes polling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two changes following the auth-svc decision to revert the Redis handshake TTL from 10 min back to 60s (authentication-service PR #1711 commit ddfcd807): 1. Rewrite the handshakeTimeout inline comment. The prior version asserted that the CLI timeout must stay ≤ the auth-svc TTL — that invariant no longer holds and never really did. The only structural requirement is that the server TTL exceed handshakePollWait; the client-side timeout is an independent UX knob. 2. Treat 429 Too Many Requests and 5xx responses as transient rather than hard-failing on the default "unexpected response" branch. Rate limiting is expected once WEBXP-751 lands Gubernator on the unauthenticated GET, and 5xx covers backend blips. They share the existing handshakeMaxNetErrs budget (3 consecutive) and the counter still resets on any 202. Two new specs cover the retry-then-succeed and sustained-429 paths; the existing "unexpected status" spec is switched from 500 → 418 now that 5xx is transient. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Rewrites
circleci signupto use the server-side handshake pattern. Instead of running a localhost HTTP server, the CLI generates a UUID v4handshake_id, opens the browser to/cli-auth?handshake_id=UUID, and polls the backend until the token is delivered.Key changes
handshake_idis a UUID v4 generated client-sideapp.circleci.com/cli-auth?handshake_id=UUID(respectsCIRCLECI_APP_URLenv override for enterprise / testing)GET /api/v1/cli-handshake/{handshake_id}every 3s, 10-minute overall client-side timeout (independent from the server-side Redis TTL — see "Server coupling" below)cfg.HTTPClient: preserves the custom CA bundle (cfg.TLSCert) and transport config set up at startup — required for self-hosted installs. Per-request deadlines come fromcontext.WithTimeoutso the shared client'sTimeoutfield is never mutatedinvalid character '<'ginkgo -pcontext.CancelFuncConstants
handshakeTimeout= 10 minutes (user-facing wait budget; independent from server TTL)handshakePollWait= 3 secondshandshakeHTTPTO= 10 seconds (per-request deadline, applied viacontext.WithTimeout)handshakeMaxNetErrs= 3Server coupling
The CLI's 10-minute timeout and the auth-svc Redis TTL are independent contracts:
The only structural requirement is that the server TTL comfortably exceed
handshakePollWait(3s) so a transient blip + one retry still finds the entry. The shorter server TTL minimizes at-rest exposure in Redis; the longer client timeout accommodates slow user journeys (email verification, MFA).Known edge case accepted by tech leads: if the CLI is fully suspended (laptop lid closed, Docker paused) at the exact moment the browser POSTs, the 60s server TTL may expire before the CLI resumes. The CLI will see 202s until
handshakeTimeoutfires; the user re-runscircleci signup. Low-frequency, no code change required.Files changed
cmd/signup.go— full rewrite of signup commandcmd/signup_unit_test.go— 16 specs covering polling (202→200, 429/5xx retry, sustained 429, unexpected status, cancellation, timeout, sustained network errors, non-JSON 200 body), base URL resolution, already-authenticated guard, save-token, telemetry eventRelated
/cli-authpage)CLI_HANDSHAKE_ENABLED=trueenv flip → web-ui#6629 → this CLI release (last)Test plan
go test ./cmd -ginkgo.focus "Signup")go vet ./...cleango tool golangci-lint run ./...cleango build ./cmd/...succeedscircleci signup, verify browser opens withhandshake_idin URL~/.circleci/cli.ymlCIRCLECI_APP_URLto an enterprise host with a custom CA — verify TLS succeeds (exercises thecfg.HTTPClientreuse)🤖 Generated with Claude Code