fix(security): pre-release hardening batch — 8 SDD-specced fixes#584
Merged
Conversation
Pre-release security review finding (HIGH, anonymous-exploitable): GetAuditEvents had ZERO authorization — any unauthenticated caller could read the entire audit trail (actor ids, IPs, resource ids, action detail) with a single GET. The endpoint forgot the auth.EnforcePermission call every sibling data handler makes, and no spec AC required it, so it shipped. SDD fix: - spec api-audit-events-query v1.1.0: add C-06 (MUST require audit:read; anonymous/unauthorized -> 403, no events) + AC-11 (deny-path AC). - handler: EnforcePermission(AuditRead) as the first statement. - tests: existing AC-01..10 now authenticate as auditor (they relied on the endpoint being open); new AC-11 asserts anonymous -> 403 with no list leaked, auditor -> 200.
…hin caller's grant) Pre-release finding (HIGH): POST /tokens gated only on token:write and passed role_id straight through, so a security_admin (holds token:write, not the admin grant) could mint a role_id=admin bearer token and act as full admin. No subset check existed anywhere. SDD fix: - spec api-tokens v1.1.0: C-03 (requested role's permissions MUST be a subset of the caller's; else 403, no token) + AC-05. - internal/auth.RoleGrantsWithin(caller, requested): reusable anti- escalation primitive (also applies to role assignment + custom roles). - PostAPIToken: 403 authz.role_exceeds_grant when the requested role exceeds the caller's permissions. - test AC-05: security_admin→admin denied 403; admin→security_admin 201.
Pre-release finding (Medium/High): POST /users/{id}/roles:assign gated on
role:assign but never checked that the assigned role was within the
granter's own permissions. Only admin holds role:assign today (and admin
covers everything), so this is defense-in-depth now — but it becomes a live
escalation path the moment any lower/custom role is granted role:assign.
SDD fix:
- spec api-users v1.1.0: C-05 + AC-13 (assigned role's permissions MUST be
a subset of the caller's; else 403, no row).
- handler: reuse auth.RoleGrantsWithin; 403 on exceeds-grant.
- AC-13 test asserts the helper denies security_admin->admin and allows
admin->security_admin (within-grant 204 covered by AC-08).
…me-deny) Pre-release finding (HIGH): the server set NO security headers. On an origin serving both the SPA and the API with no edge proxy, that means clickjacking (no X-Frame-Options/frame-ancestors), SSL-strip on first contact (no HSTS), MIME-sniffing (no nosniff), and no CSP defense-in-depth. SDD fix: - spec system-http-server v1.1.0: C-12 + AC-17 (HSTS, CSP w/ frame-ancestors 'none' + default-src 'self', nosniff, X-Frame-Options DENY, Referrer-Policy no-referrer; /docs gets a Swagger-compatible CSP that still denies framing). C-11/AC-15/16 left for the open static-delivery PR. - securityHeaders middleware mounted right after correlation. - test AC-17 asserts all five headers + the /docs CSP relaxation. NOTE: the app CSP is SPA-safe in theory (Vite external bundles + MUI inline styles) but should be smoke-tested against the running SPA and /docs before release; it's a single tunable const.
…l corpus) Pre-release finding (HIGH): every production users.NewService(pool, nil) passed a nil breach corpus, and ValidatePassword skips the breach check when the corpus is nil — so the HaveIBeenPwned apparatus was dead code in prod. Users/admins could set known-compromised passwords (password123, etc.). SDD fix: - spec system-auth-identity v1.3.0: C-15 (prod MUST validate against a non-nil corpus) + AC-27. - identity.DefaultBreachCorpus(): an always-on embedded baseline of the most common compromised passwords (internal/identity/common_passwords.txt, 129 entries, SHA-1'd at load). Airgap-safe — no external file needed; operators can extend via a full HIBP dump (OPENWATCH_BREACH_CORPUS_FILE). - wired all 3 prod sites (server.New, newHandlers, admin-create CLI) to DefaultBreachCorpus() instead of nil. - test AC-27 rejects password123/qwerty123/Password1, accepts a strong pw. - .gitignore: negate the blanket *password* rule for the embedded corpus (it's a public list, not a secret, and go:embed requires it tracked).
Follow-up to the audit:read gate (finding #1): api_echo_test and api_signoff_test also queried GET /audit/events via the unauthenticated doGet helper and now correctly 403. Convert them to authenticate as auditor (a role holding audit:read).
…ict) Consolidated, risk-ranked review behind PR #584's fixes: the 5 fixed findings, the 3 open release blockers (CSRF, login rate-limiting, SSH host-key persistence), the medium/low backlog, and the verified strengths. Names per the *_SECURITY_REVIEW_*.md convention (gitignore-allowlisted).
Pre-release blocker (HIGH): POST /auth/login and /auth/mfa:verify had no throttle, no lockout — unlimited online password/OTP guessing and credential stuffing (flagged by both the auth and web audits). SDD fix: - spec system-http-server v1.2.0: C-13 + AC-18 (per-IP rate limit on the auth endpoints; 429 + Retry-After over the limit; IP from RemoteAddr, not a client-supplied forwarded header). - dependency-free sliding-window rateLimiter + rateLimitAuth middleware (20/min/IP), mounted after securityHeaders. Per-server-instance so it's test-isolated. - tests: limiter allows-to-limit/denies-limit+1/keys-independent, and an end-to-end POST /auth/login returns 429 after the budget. Follow-up: per-username lockout + trusted-proxy client-IP extraction (the X-Forwarded-For trust finding) are deferred.
Pre-release blocker (HIGH): state-changing endpoints authenticate via the openwatch_session cookie, and the frontend advertised a double-submit CSRF scheme — but NO server code set an XSRF-TOKEN cookie or validated X-CSRF-Token. Protection reduced to SameSite=Lax; the client-side half was theater. SDD fix: - spec system-http-server v1.3.0: C-14 + AC-19 (double-submit; issue non-HttpOnly XSRF-TOKEN at login/refresh; require matching X-CSRF-Token on unsafe cookie-auth requests; exempt Bearer/token, /api/v1/auth/*, and requests with no session cookie). - csrfProtect middleware (constant-time compare) mounted after the rate limiter; setCSRFCookie at login + refresh-cookie. - the frontend already echoes XSRF-TOKEN -> X-CSRF-Token (client.ts), so no frontend change was needed — only the server half. - tests: AC-19 deny (no/mismatched token -> 403 authz.csrf_invalid) + proceed with a matching token + no-session-cookie exemption. doReq now simulates the browser double-submit centrally so the suite stays green; the deny test bypasses it deliberately.
Pre-release blocker (Med-High): every production dial used ModeTOFU + NewMemoryStore(), so host keys were never persisted — TOFU re-trusted every host on each daemon restart. A network-positioned attacker could MITM the first scan after every restart and harvest the credentials presented to each host. SDD fix: - spec system-ssh-connectivity v1.3.0: C-13 (production TOFU MUST be durable across restarts) + AC-22. - migration 0036: ssh_known_hosts table (hostname PK, public_key, first/last seen). - internal/knownhosts.Store: a pgx-backed ssh.KnownHostsStore (kept out of internal/ssh so the dial layer stays DB-free). Get degrades to not-found on a DB blip (first-seen TOFU), never fails closed; a CHANGED key is still rejected by the dial layer (ErrHostKeyMismatch). - wired at all 4 production dial sites (scan worker + serve, discovery x2) in place of NewMemoryStore(). - test AC-22: Put/Get round-trip, absent→not-found, and a fresh Store on the same DB sees the key (durable TOFU across 'restart'). Follow-up (documented in the review): an optional ModeStrict deployment path (pre-provisioned keys) and folding internal/sshprivilege's InsecureIgnoreHostKey probe onto this same store.
remyluslosius
added a commit
that referenced
this pull request
Jun 17, 2026
…ides, future-proof upgrade test (#585) * docs(changelog): spec a human-readable changelog + populate [Unreleased] Add specs/release/changelog.spec.yaml (release-changelog): every entry is a user-facing sentence (>=5 words, no commit-subject prefix), Keep a Changelog categories, dated version sections, no emoji. Enforced by packaging/tests/changelog_test.go (source-inspection, AC-01..05), scoped to the actively-edited [Unreleased] section so history is grandfathered. Populate [Unreleased] with the post-rc.7 work in that style: Settings activation (users, notifications, security/SSO), per-host SSH auth/sudo learning, one-command safe upgrade, airgap + fresh-install fixes, and the pre-release security-hardening batch (#584). * docs: refresh hardening + install guides; de-hardcode upgrade test - SECURITY_HARDENING.md: drop the 'not yet implemented' rate-limit/headers callout (shipped in #584); document auth rate limiting, CSRF, and security headers as live; note always-on breach screening; add an outbound-SSH persistent known-hosts subsection; version line to 0.2.0 rc series. - install_guide.md: fix the stale DEB version example (rc.5 -> rc.7) and add an 'Upgrading' section documenting the one-command auto-migrate + backup + fail-safe path and /etc/openwatch/upgrade.conf. - upgrade-container-test.sh / run-upgrade-container-test.sh: stop hardcoding migration 34->35 and the host_connection_profile table. The host driver now derives the head migration's goose version_id and its -- +goose Down SQL and passes them in; the test reverses exactly the head migration and asserts a generic prior->head advance, so it survives every future migration. (CLAUDE.md was also refreshed locally for AI sessions but is intentionally gitignored per commit 7a73353, so it is not part of this commit.) * test(packaging): pin upgrade-test driver to the host RPM arch dist/ can hold a leftover cross-built arm64 RPM (the packaging Go suite cross-builds it for AC-14), and the rockylinux:9 container runs the host platform, so the previous `cp dist/openwatch-*-1.*.rpm` swept in both arches and `rpm -i` collided on /usr/bin/openwatch. Glob a single host-arch RPM. Validated end-to-end: the container test now passes — real `rpm -U` runs the %post helper ($1=2), which stops the service, takes a pg_dump restore point, applies the head migration (35 -> 36), and restarts.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Pre-release security hardening — 8 SDD-specced fixes
A complete, six-dimension security review (auth/session, authZ/RBAC,
crypto/secrets, injection/SSH/SSRF, web/HTTP/audit, supply-chain) was run
against the Go backend, the React frontend, and packaging/CI. Every
High/Critical finding was re-verified by hand against the source. All eight
actionable findings are fixed here under spec→test→code discipline (one commit
per finding); the full review is in
docs/PRE_RELEASE_SECURITY_REVIEW_2026-06-16.md.Fixes (each: spec bumped, test with @spec/@ac tokens, then code)
GET /api/v1/audit/eventshad zero authorization (anonymous audit-trail disclosure) → requireaudit:readapi-audit-events-queryC-06/AC-11token:writeholder could mint anadminbearer) →auth.RoleGrantsWithinsubset checkapi-tokensC-03/AC-05roles:assignhad no subset guard → sameRoleGrantsWithincheckapi-usersC-05/AC-13securityHeadersmiddlewaresystem-http-serverC-12/AC-17NewServicegot anilcorpus) → always-on embeddedDefaultBreachCorpus(airgap-safe) + HIBP file overridesystem-auth-identityC-15/AC-27csrfProtectconstant-time double-submit, gated on the session cookiesystem-http-serverC-14/AC-19/auth/login+/auth/mfa:verify(429 + Retry-After)system-http-serverC-13/AC-18KnownHostsStore(migration 0036) at all 4 dial sitessystem-ssh-connectivityC-13/AC-22Items 6/7/8 were the three release blockers; they are now closed.
Verification
gofmtclean ·go vet ./...clean ·go build ./...OKspecter check— all specs passinternal/serversuite green (CSRF + rate-limit middlewares exercised end-to-end); newinternal/knownhostsstore test passes against isolated PGNotes for review
/api/v1/auth/*, and cookieless requests are exempt — so the API-token path is unaffected.RemoteAddr(notX-Forwarded-For) so an attacker cannot rotate the bucket.go:embed), keeping airgap builds self-contained; operators can pointOPENWATCH_BREACH_CORPUS_FILEat a full HIBP list.🤖 Generated with Claude Code