fix(security): pre-release hardening batch — 8 SDD-specced fixes by remyluslosius · Pull Request #584 · Hanalyx/OpenWatch

remyluslosius · 2026-06-16T23:37:42Z

Pre-release security hardening — 8 SDD-specced fixes

A complete, six-dimension security review (auth/session, authZ/RBAC,
crypto/secrets, injection/SSH/SSRF, web/HTTP/audit, supply-chain) was run
against the Go backend, the React frontend, and packaging/CI. Every
High/Critical finding was re-verified by hand against the source. All eight
actionable findings are fixed here under spec→test→code discipline (one commit
per finding); the full review is in
docs/PRE_RELEASE_SECURITY_REVIEW_2026-06-16.md.

Fixes (each: spec bumped, test with @spec/@ac tokens, then code)

#	Sev	Finding	Spec
1	High	`GET /api/v1/audit/events` had zero authorization (anonymous audit-trail disclosure) → require `audit:read`	`api-audit-events-query` C-06/AC-11
2	High	API-token privilege escalation (`token:write` holder could mint an `admin` bearer) → `auth.RoleGrantsWithin` subset check	`api-tokens` C-03/AC-05
3	Med-High	`roles:assign` had no subset guard → same `RoleGrantsWithin` check	`api-users` C-05/AC-13
4	High	No security headers (HSTS/CSP/nosniff/frame-deny) on a one-origin SPA+API → `securityHeaders` middleware	`system-http-server` C-12/AC-17
5	High	Breach-password check dead (every `NewService` got a `nil` corpus) → always-on embedded `DefaultBreachCorpus` (airgap-safe) + HIBP file override	`system-auth-identity` C-15/AC-27
6	High	No CSRF enforcement (frontend double-submit was theater) → `csrfProtect` constant-time double-submit, gated on the session cookie	`system-http-server` C-14/AC-19
7	High	No login rate-limiting → per-IP sliding-window limiter on `/auth/login` + `/auth/mfa:verify` (429 + Retry-After)	`system-http-server` C-13/AC-18
8	Med-High	SSH host-key in-memory per-process TOFU (MITM on first scan after each restart) → PostgreSQL-backed `KnownHostsStore` (migration 0036) at all 4 dial sites	`system-ssh-connectivity` C-13/AC-22

Items 6/7/8 were the three release blockers; they are now closed.

Verification

gofmt clean · go vet ./... clean · go build ./... OK
specter check — all specs pass
Full internal/server suite green (CSRF + rate-limit middlewares exercised end-to-end); new internal/knownhosts store test passes against isolated PG

Notes for review

CSRF is gated on ambient authority: Bearer/token requests, /api/v1/auth/*, and cookieless requests are exempt — so the API-token path is unaffected.
Rate-limit keys on RemoteAddr (not X-Forwarded-For) so an attacker cannot rotate the bucket.
The breach corpus is embedded (go:embed), keeping airgap builds self-contained; operators can point OPENWATCH_BREACH_CORPUS_FILE at a full HIBP list.
Open medium/low findings (revocable JWTs, demo-cert handling, body-size limits, etc.) are catalogued in the review doc as fast-follow, not blockers.

🤖 Generated with Claude Code

Pre-release security review finding (HIGH, anonymous-exploitable): GetAuditEvents had ZERO authorization — any unauthenticated caller could read the entire audit trail (actor ids, IPs, resource ids, action detail) with a single GET. The endpoint forgot the auth.EnforcePermission call every sibling data handler makes, and no spec AC required it, so it shipped. SDD fix: - spec api-audit-events-query v1.1.0: add C-06 (MUST require audit:read; anonymous/unauthorized -> 403, no events) + AC-11 (deny-path AC). - handler: EnforcePermission(AuditRead) as the first statement. - tests: existing AC-01..10 now authenticate as auditor (they relied on the endpoint being open); new AC-11 asserts anonymous -> 403 with no list leaked, auditor -> 200.

…hin caller's grant) Pre-release finding (HIGH): POST /tokens gated only on token:write and passed role_id straight through, so a security_admin (holds token:write, not the admin grant) could mint a role_id=admin bearer token and act as full admin. No subset check existed anywhere. SDD fix: - spec api-tokens v1.1.0: C-03 (requested role's permissions MUST be a subset of the caller's; else 403, no token) + AC-05. - internal/auth.RoleGrantsWithin(caller, requested): reusable anti- escalation primitive (also applies to role assignment + custom roles). - PostAPIToken: 403 authz.role_exceeds_grant when the requested role exceeds the caller's permissions. - test AC-05: security_admin→admin denied 403; admin→security_admin 201.

Pre-release finding (Medium/High): POST /users/{id}/roles:assign gated on role:assign but never checked that the assigned role was within the granter's own permissions. Only admin holds role:assign today (and admin covers everything), so this is defense-in-depth now — but it becomes a live escalation path the moment any lower/custom role is granted role:assign. SDD fix: - spec api-users v1.1.0: C-05 + AC-13 (assigned role's permissions MUST be a subset of the caller's; else 403, no row). - handler: reuse auth.RoleGrantsWithin; 403 on exceeds-grant. - AC-13 test asserts the helper denies security_admin->admin and allows admin->security_admin (within-grant 204 covered by AC-08).

…me-deny) Pre-release finding (HIGH): the server set NO security headers. On an origin serving both the SPA and the API with no edge proxy, that means clickjacking (no X-Frame-Options/frame-ancestors), SSL-strip on first contact (no HSTS), MIME-sniffing (no nosniff), and no CSP defense-in-depth. SDD fix: - spec system-http-server v1.1.0: C-12 + AC-17 (HSTS, CSP w/ frame-ancestors 'none' + default-src 'self', nosniff, X-Frame-Options DENY, Referrer-Policy no-referrer; /docs gets a Swagger-compatible CSP that still denies framing). C-11/AC-15/16 left for the open static-delivery PR. - securityHeaders middleware mounted right after correlation. - test AC-17 asserts all five headers + the /docs CSP relaxation. NOTE: the app CSP is SPA-safe in theory (Vite external bundles + MUI inline styles) but should be smoke-tested against the running SPA and /docs before release; it's a single tunable const.

…l corpus) Pre-release finding (HIGH): every production users.NewService(pool, nil) passed a nil breach corpus, and ValidatePassword skips the breach check when the corpus is nil — so the HaveIBeenPwned apparatus was dead code in prod. Users/admins could set known-compromised passwords (password123, etc.). SDD fix: - spec system-auth-identity v1.3.0: C-15 (prod MUST validate against a non-nil corpus) + AC-27. - identity.DefaultBreachCorpus(): an always-on embedded baseline of the most common compromised passwords (internal/identity/common_passwords.txt, 129 entries, SHA-1'd at load). Airgap-safe — no external file needed; operators can extend via a full HIBP dump (OPENWATCH_BREACH_CORPUS_FILE). - wired all 3 prod sites (server.New, newHandlers, admin-create CLI) to DefaultBreachCorpus() instead of nil. - test AC-27 rejects password123/qwerty123/Password1, accepts a strong pw. - .gitignore: negate the blanket *password* rule for the embedded corpus (it's a public list, not a secret, and go:embed requires it tracked).

Follow-up to the audit:read gate (finding #1): api_echo_test and api_signoff_test also queried GET /audit/events via the unauthenticated doGet helper and now correctly 403. Convert them to authenticate as auditor (a role holding audit:read).

…ict) Consolidated, risk-ranked review behind PR #584's fixes: the 5 fixed findings, the 3 open release blockers (CSRF, login rate-limiting, SSH host-key persistence), the medium/low backlog, and the verified strengths. Names per the *_SECURITY_REVIEW_*.md convention (gitignore-allowlisted).

Pre-release blocker (HIGH): POST /auth/login and /auth/mfa:verify had no throttle, no lockout — unlimited online password/OTP guessing and credential stuffing (flagged by both the auth and web audits). SDD fix: - spec system-http-server v1.2.0: C-13 + AC-18 (per-IP rate limit on the auth endpoints; 429 + Retry-After over the limit; IP from RemoteAddr, not a client-supplied forwarded header). - dependency-free sliding-window rateLimiter + rateLimitAuth middleware (20/min/IP), mounted after securityHeaders. Per-server-instance so it's test-isolated. - tests: limiter allows-to-limit/denies-limit+1/keys-independent, and an end-to-end POST /auth/login returns 429 after the budget. Follow-up: per-username lockout + trusted-proxy client-IP extraction (the X-Forwarded-For trust finding) are deferred.

Pre-release blocker (HIGH): state-changing endpoints authenticate via the openwatch_session cookie, and the frontend advertised a double-submit CSRF scheme — but NO server code set an XSRF-TOKEN cookie or validated X-CSRF-Token. Protection reduced to SameSite=Lax; the client-side half was theater. SDD fix: - spec system-http-server v1.3.0: C-14 + AC-19 (double-submit; issue non-HttpOnly XSRF-TOKEN at login/refresh; require matching X-CSRF-Token on unsafe cookie-auth requests; exempt Bearer/token, /api/v1/auth/*, and requests with no session cookie). - csrfProtect middleware (constant-time compare) mounted after the rate limiter; setCSRFCookie at login + refresh-cookie. - the frontend already echoes XSRF-TOKEN -> X-CSRF-Token (client.ts), so no frontend change was needed — only the server half. - tests: AC-19 deny (no/mismatched token -> 403 authz.csrf_invalid) + proceed with a matching token + no-session-cookie exemption. doReq now simulates the browser double-submit centrally so the suite stays green; the deny test bypasses it deliberately.

Pre-release blocker (Med-High): every production dial used ModeTOFU + NewMemoryStore(), so host keys were never persisted — TOFU re-trusted every host on each daemon restart. A network-positioned attacker could MITM the first scan after every restart and harvest the credentials presented to each host. SDD fix: - spec system-ssh-connectivity v1.3.0: C-13 (production TOFU MUST be durable across restarts) + AC-22. - migration 0036: ssh_known_hosts table (hostname PK, public_key, first/last seen). - internal/knownhosts.Store: a pgx-backed ssh.KnownHostsStore (kept out of internal/ssh so the dial layer stays DB-free). Get degrades to not-found on a DB blip (first-seen TOFU), never fails closed; a CHANGED key is still rejected by the dial layer (ErrHostKeyMismatch). - wired at all 4 production dial sites (scan worker + serve, discovery x2) in place of NewMemoryStore(). - test AC-22: Put/Get round-trip, absent→not-found, and a fresh Store on the same DB sees the key (durable TOFU across 'restart'). Follow-up (documented in the review): an optional ModeStrict deployment path (pre-provisioned keys) and folding internal/sshprivilege's InsecureIgnoreHostKey probe onto this same store.

…closed)

…ides, future-proof upgrade test (#585) * docs(changelog): spec a human-readable changelog + populate [Unreleased] Add specs/release/changelog.spec.yaml (release-changelog): every entry is a user-facing sentence (>=5 words, no commit-subject prefix), Keep a Changelog categories, dated version sections, no emoji. Enforced by packaging/tests/changelog_test.go (source-inspection, AC-01..05), scoped to the actively-edited [Unreleased] section so history is grandfathered. Populate [Unreleased] with the post-rc.7 work in that style: Settings activation (users, notifications, security/SSO), per-host SSH auth/sudo learning, one-command safe upgrade, airgap + fresh-install fixes, and the pre-release security-hardening batch (#584). * docs: refresh hardening + install guides; de-hardcode upgrade test - SECURITY_HARDENING.md: drop the 'not yet implemented' rate-limit/headers callout (shipped in #584); document auth rate limiting, CSRF, and security headers as live; note always-on breach screening; add an outbound-SSH persistent known-hosts subsection; version line to 0.2.0 rc series. - install_guide.md: fix the stale DEB version example (rc.5 -> rc.7) and add an 'Upgrading' section documenting the one-command auto-migrate + backup + fail-safe path and /etc/openwatch/upgrade.conf. - upgrade-container-test.sh / run-upgrade-container-test.sh: stop hardcoding migration 34->35 and the host_connection_profile table. The host driver now derives the head migration's goose version_id and its -- +goose Down SQL and passes them in; the test reverses exactly the head migration and asserts a generic prior->head advance, so it survives every future migration. (CLAUDE.md was also refreshed locally for AI sessions but is intentionally gitignored per commit 7a73353, so it is not part of this commit.) * test(packaging): pin upgrade-test driver to the host RPM arch dist/ can hold a leftover cross-built arm64 RPM (the packaging Go suite cross-builds it for AC-14), and the rockylinux:9 container runs the host platform, so the previous `cp dist/openwatch-*-1.*.rpm` swept in both arches and `rpm -i` collided on /usr/bin/openwatch. Glob a single host-arch RPM. Validated end-to-end: the container test now passes — real `rpm -U` runs the %post helper ($1=2), which stops the service, takes a pg_dump restore point, applies the head migration (35 -> 36), and restarts.

remyluslosius added 6 commits June 16, 2026 19:13

github-actions Bot added security size/XL labels Jun 16, 2026

remyluslosius added 5 commits June 16, 2026 19:39

docs(security): mark B-1/B-2/B-3 fixed in the review (all 8 findings …

25b6bc2

…closed)

remyluslosius changed the title ~~fix(security): pre-release hardening batch — 5 SDD-specced fixes~~ fix(security): pre-release hardening batch — 8 SDD-specced fixes Jun 17, 2026

remyluslosius merged commit 1f6e72d into main Jun 17, 2026
19 checks passed

remyluslosius mentioned this pull request Jun 17, 2026

docs: release-readiness sweep — spec + populate changelog, refresh guides, future-proof upgrade test #585

Merged

remyluslosius deleted the feat/security-hardening branch June 17, 2026 03:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(security): pre-release hardening batch — 8 SDD-specced fixes#584

fix(security): pre-release hardening batch — 8 SDD-specced fixes#584
remyluslosius merged 11 commits into
mainfrom
feat/security-hardening

remyluslosius commented Jun 16, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

remyluslosius commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pre-release security hardening — 8 SDD-specced fixes

Fixes (each: spec bumped, test with @spec/@ac tokens, then code)

Verification

Notes for review

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

remyluslosius commented Jun 16, 2026 •

edited

Loading