Skip to content

feat(packaging): macos pkg + .mobileconfig profile#123

Merged
laplaque merged 11 commits into
mainfrom
claude/macos-pkg-mobileconfig-Wll74
May 17, 2026
Merged

feat(packaging): macos pkg + .mobileconfig profile#123
laplaque merged 11 commits into
mainfrom
claude/macos-pkg-mobileconfig-Wll74

Conversation

@iamclaude697
Copy link
Copy Markdown
Collaborator

@iamclaude697 iamclaude697 commented May 17, 2026

Phase 2 of the UEM packaging workstream — macOS PKG installer (signed + notarized + stapled on tag builds) and a JAMF-deployable .mobileconfig profile.

What ships

  • packaging/macos/pkg/build.sh, distribution.xml, scripts/postinstall, scripts/preuninstall, notarize.sh, the in-package LaunchDaemon plist, the conffile defaults. build.sh and mobileconfig/build.sh both honor SKIP_SIGN=1 for the PR-event dry-run path.
  • LaunchDaemon at /Library/LaunchDaemons/com.ai-anonymizing-proxy.plist, RunAtLoad=true, KeepAlive={SuccessfulExit=false}
  • _aiproxy system user created in postinstall (UID auto-allocated 220–400, hidden, no shell)
  • CA injection into /Library/Keychains/System.keychain. Postinstall rollback verifies actual keychain state and exits with distinct codes — 1 = "service down, CA cleaned", 2 = "service down, CA STILL TRUSTED, manual remediation required" (H2).
  • packaging/macos/mobileconfig/ai-proxy.mobileconfig.tmpl + build.sh — CA trust + global HTTP proxy payloads, with ExceptionsList for 127.0.0.1, localhost, *.local, 169.254/16 (M3).
  • make package-macos, make package-macos-pkg, make package-macos-mobileconfig (refuse to run off Darwin)
  • .github/workflows/release-macos-pkg.yml:
  • docs/packaging/macos.md — install / configure / verify / upgrade / uninstall, blast-radius callout (H7), per-route rotation (M4), troubleshooting (L3), pre-tag verification ritual (H3).
  • Issue macos: verify .mobileconfig HTTPS interception on MDM-enrolled host before tag #125 — pre-tag MDM-host HTTPS-interception verification ritual.

Quality Gates

  • make check passed locally on 8e5c975. lint: 0 issues. / test: all 10 ok / security: Issues : 0 / vulncheck: No vulnerabilities found.
  • go test -race ./... passed on 8e5c975 — every package ok, no --- FAIL.
  • Coverage minimums met (table below). 95%+ on anonymizer / config / packs; 94.4% total.
  • Delta coverage ≥90% — N/A by gate (zero .go files in PR diff; script skip is documented behavior).
  • §6 Test Inventory baseline-vs-head completed.
  • No hacks — diff grep at head returns empty for TODO|FIXME|HACK|XXX|//nolint|t\.Skip|coverage-ignore.
  • All CI jobs green at the PR head SHA 8e5c975.

Coverage (per Go package)

Package Coverage Gate
./cmd/proxy/ 100.0%
./internal/anonymizer/ 95.6% ≥95% ✅
./internal/anonymizer/packs/ 97.6% ≥95% ✅
./internal/config/ 100.0% ≥95% ✅
./internal/domainmatch/ 100.0%
./internal/logger/ 86.4%
./internal/management/ 89.8%
./internal/metrics/ 100.0%
./internal/mitm/ 85.7%
./internal/proxy/ 93.0%
total 94.4% ≥85% ✅

Delta Coverage Report

go test -race -coverprofile=coverage.out -covermode=atomic ./...
bash .github/scripts/delta-coverage.sh coverage.out 90.0 origin/main

Output:

No changed Go source files — delta coverage check skipped.

PR diff contains zero .go files (git diff --name-only --diff-filter=ACMR origin/main...HEAD -- '*.go' is empty). The script's documented behavior on empty Go diffs is to skip with exit 0.

§6 Test Inventory — baseline vs head

Method per docs/test-plans/ai-proxy-test-method.md:

  1. Spec source: §6 table, organized per PII pack, mapping each pack to its unit test file and (where present) report test file.
  2. For each row, extracted top-level test names from the named file(s) with grep -oE '^func (Test[A-Za-z0-9_]+)', passed them to go test -race -count=1 -v -run "^(N1|N2|...)$" <pkg>.
  3. Counted --- PASS and --- FAIL lines top-level + subtests per CLAUDE.md gate spec — grep -cE '^[[:space:]]*--- PASS' (the leading-whitespace match captures indented subtest results, which the prior round's ^--- PASS missed; all previously-posted numbers were undercounted).
  4. Two independent worktrees (git worktree add /tmp/main-tree 6436ce0 and the PR head), separate go test invocations against separate trees. Not derived from git diff.
§6 row File(s) main (6436ce0) PASS / FAIL head (8e5c975) PASS / FAIL Δ
GLOBAL packs/global_test.go 22 / 0 22 / 0 0
DE packs/de_test.go 17 / 0 17 / 0 0
US packs/us_test.go 38 / 0 38 / 0 0
FR packs/fr_test.go 41 / 0 41 / 0 0
SECRETS (unit) packs/secrets_test.go 28 / 0 28 / 0 0
SECRETS (report) anonymizer/secrets_report_test.go 44 / 0 44 / 0 0
NL (unit) packs/nl_test.go 12 / 0 12 / 0 0
NL (report) anonymizer/nl_report_test.go 6 / 0 6 / 0 0
FINANCE_EU (unit) packs/finance_eu_test.go 15 / 0 15 / 0 0
FINANCE_EU (report) anonymizer/finance_eu_report_test.go 12 / 0 12 / 0 0
HEALTHCARE (unit) packs/healthcare_test.go 4 / 0 4 / 0 0
HEALTHCARE (report) anonymizer/healthcare_report_test.go 15 / 0 15 / 0 0
Cross-pack (all report tests) anonymizer/{secrets,nl,finance_eu,healthcare,secrets_priority}_report_test.go 90 / 0 90 / 0 0

diff /tmp/inv6-main.txt /tmp/inv6-head.txt returns empty. Zero deltas on every row.

§5 Known-issues pinning tests

The method's §5 names specific tests as pins for known issues. Each was executed independently by name on both worktrees:

Pinning test Issue main (6436ce0) head (8e5c975) Δ
TestDESteuerIDPattern #67 1 / 0 1 / 0 0
TestDESVNRPattern #67 1 / 0 1 / 0 0
TestUSAddressPattern (German* subtests) #68 7 / 0 7 / 0 0
TestSIREN_SSN_CrossPattern #69 4 / 0 4 / 0 0
TestFRSIRENPattern #69 1 / 0 1 / 0 0
TestSecretsPriorityOverGLOBAL #70 12 / 0 12 / 0 0
TestNLKvKPattern (design) 1 / 0 1 / 0 0
TestFINANCEEUSWIFTBICPattern (design) 1 / 0 1 / 0 0
TestHEALTHCAREICD10Pattern (design) 1 / 0 1 / 0 0
TestHEALTHCAREMRNPattern (design) 1 / 0 1 / 0 0

Issue #70 / TestSecretsPriorityOverGLOBAL directly pins CLAUDE.md Invariant #3 ("SECRETS must precede GLOBAL"). Verified PASS at head by name.

§2.3 Report-test config compliance

Every report test under internal/anonymizer/*_report_test.go declares the canonical configuration the method requires:

grep -n "EnabledPacks" internal/anonymizer/*_report_test.go

Returns EnabledPacks: []string{"SECRETS", "GLOBAL", "US", "DE", "FR", "NL", "FINANCE_EU", "HEALTHCARE"} in every report test, with UseAI: false and PackDecayRate: 0.0 adjacent. SECRETS precedes GLOBAL — Invariant #3 satisfied at the config level, in addition to the named-test pin above.

Supplemental: per-Go-package rollup

Package main (6436ce0) head (8e5c975) Δ
./cmd/proxy/ 17 / 0 17 / 0 0
./internal/anonymizer/ 202 / 0 202 / 0 0
./internal/anonymizer/packs/ 71 / 0 71 / 0 0
./internal/config/ 34 / 0 34 / 0 0
./internal/domainmatch/ 5 / 0 5 / 0 0
./internal/logger/ 11 / 0 11 / 0 0
./internal/management/ 38 / 0 38 / 0 0
./internal/metrics/ 17 / 0 17 / 0 0
./internal/mitm/ 30 / 0 30 / 0 0
./internal/proxy/ 59 / 0 59 / 0 0

Note: these per-package counts use go test -v -count=1 ./<pkg> (whole-package) with subtest-aware grep. They are a rollup, not the gate artifact — the §6 per-pack table above is the artifact the spec demands.

Review-feedback resolution

ID Round Fix
H1 R1 PR body disclosure + checklist filled.
H2 R1 (00d341a) postinstall rollback verifies cert is actually absent; distinct exit code 2 when CA STILL TRUSTED.
H3 R1 → R2 reformulated Trackable artifact triple: issue #125 + workflow YAML comment + docs § Pre-tag verification ritual.
H4 R1 (79c409b) Secret-verify + signing/notarize gated on IS_RELEASE.
H5 R1 (79c409b) P12 cleanup via trap EXIT.
H6 R1 (79c409b) Artifact upload explicit filenames.
H7 R1 (c219de1) Blast-radius callout in docs.
M1 R1 (b9ce7c6) macOS env file ports match plist.
M2 R1 (b9ce7c6) distribution.xml pkg-ref version rendered from $VERSION.
M3 R1 (b9ce7c6) .mobileconfig ExceptionsList.
M4 R1 (c219de1) Rotation per route.
L1 R1 (00d341a) bootstrap stderr captured.
L2 R1 (79c409b) concurrency block.
L3 R1 (c219de1) Troubleshooting section.
N1 R2 (274e319) PR-event dry-run exercises build mechanism.
Gate 5 §6 inventory R3 Per-pack execution; previously was per-Go-package.
Gate 5 subtest counting + §5 pins + §2.3 config R4 Subtest-aware grep (^[[:space:]]*--- PASS); §5 named-test pin table; §2.3 config compliance verified.

Verified

  • bash -n clean on every shell script
  • xmllint --noout clean on distribution.xml, the LaunchDaemon plist, and the .mobileconfig.tmpl
  • YAML parse clean on the workflow
  • JSON parse clean on proxy-config.json.default
  • make check clean on 8e5c975
  • go test -race ./... clean on 8e5c975
  • §6 inventory: two independent worktree executions, per-pack, subtest-aware, zero deltas
  • §5 known-issues pinning tests verified by name on both worktrees, zero deltas
  • §2.3 report-test config compliance grep'd from source
  • make -n package-macos-pkg gates on Darwin
  • PR-event macOS CI: dry-run path exercises payload staging + pkgbuild + productbuild + mobileconfig render unsigned

Pre-tag manual gate (issue #125): .mobileconfig HTTPS-interception verification on MDM-enrolled host — required before pushing any v* tag.

Out of scope

  • Homebrew tap (Phase 5a)
  • Windows MSI (Phase 3)
  • Sparkle / auto-update integration
  • JAMF policy templates wrapping the .mobileconfig (left to admins)

Related

claude added 4 commits May 17, 2026 19:03
Phase 2 packaging — macOS PKG installer.

- packaging/macos/pkg/build.sh builds a universal (arm64 + amd64)
  binary, signs it with Developer ID Application, then assembles a
  productbuild-signed distribution PKG.
- LaunchDaemon at /Library/LaunchDaemons/com.ai-anonymizing-proxy.plist
  runs as a dedicated hidden _aiproxy user with KeepAlive on non-zero
  exit and inline EnvironmentVariables (launchd does not source env files).
- postinstall allocates a free UID in 220–400, generates the CA if
  absent, trusts it in the System keychain, and bootstraps the daemon.
  Failure to load the daemon rolls back the CA trust, mirroring the
  Linux postinstall contract (CLAUDE.md Invariant #1).
- preuninstall stops the daemon and removes the CA. Shipped at
  /usr/local/share/ai-proxy/uninstall.sh since macOS PKGs have no
  native uninstall hook.
- ENABLED_PACKS default is SECRETS,GLOBAL,DE per the pack-order invariant.
- make package-macos-pkg wired in; refuses to run off Darwin.
Phase 2 packaging — JAMF / MDM declarative deployment artifact.

- packaging/macos/mobileconfig/ai-proxy.mobileconfig.tmpl carries two
  payloads: com.apple.security.root for CA trust and
  com.apple.proxy.http.global for the 127.0.0.1:18080 proxy.
- packaging/macos/mobileconfig/build.sh renders fresh UUIDs for each
  PayloadUUID, base64-encodes the release CA in DER form into the
  certificate payload, lints with plutil, and CMS-signs with the
  Developer ID Application identity.
- The .mobileconfig embeds the release CA at build time, so every host
  installed via the profile trusts the same CA. Rotation = re-issue the
  profile and re-deploy; documented in docs/packaging/macos.md.
Builds on macos-latest. Imports Developer ID Installer and Developer ID
Application certs into a temporary keychain, resolves the identity
strings dynamically (so the workflow does not hardcode developer-team
names), stages the release CA from secrets, builds PKG + .mobileconfig.

Notarization runs only on tag pushes — Apple rejects re-submissions of
identical content, so PR builds skip notarytool and spctl --assess but
still sign and lint. Stapling lets installed PKGs verify offline.

A separate release job downloads the artifacts, generates
SHA256SUMS-macos / SHA512SUMS-macos, cosign-signs each blob keyless via
the workflow OIDC identity, and uploads to the GitHub release. Elevated
permissions (contents: write, id-token: write) are scoped to the release
job only.

Workflow fails fast if any of the eight required Secrets is unset
(installer + application cert P12s and passwords, notarytool credentials,
keychain password, release CA cert + key).
…, jamf usage

- docs/packaging/macos.md mirrors linux.md: security posture callout,
  install / configure / verify / upgrade / uninstall flows, signature
  verification, plus macOS-specific sections covering Gatekeeper
  acceptance, launchctl print, System keychain inspection, profiles
  install/list/remove, JAMF deployment pattern, and the manual UID
  allocation fallback when 220–400 is exhausted.
- README.md row updated from 'planned' to 'shipped' with link to macos.md.
- Release CA rotation procedure documented inline (in-place rotation
  awaits Phase 6's --rotate-ca; until then, rotate via re-issue).
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 17, 2026

Benchmark results

Benchmark ns/op Budget Headroom Status
BenchmarkRegexPassEmail 1.6µs 2.5ms 99% ✅ PASS
BenchmarkRegexPassMultiple 2.4µs 2.5ms 99% ✅ PASS

Budget: regex/cache < 2.5ms (0.5% of 500ms baseline), streaming < 20ms (4% of 500ms baseline)

Copy link
Copy Markdown
Owner

@laplaque laplaque left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — feat(packaging): macos pkg + .mobileconfig profile

Verdict: REQUEST_CHANGES.

Phase 2 ships a lot of new ground (macOS PKG + MDM .mobileconfig). Two structural gaps make this not ready to merge: the PR body diverges from .github/PULL_REQUEST_TEMPLATE.md and addresses none of the CLAUDE.md Quality Gates with author-offered evidence; and Build (macos-latest) — the workflow this PR introduces — is RED on the repo at the PR head SHA with no triage in the PR body. Beyond the gates, several security-floor concerns in the postinstall CA rollback path, the .mobileconfig payload, and the CI secret/keychain teardown warrant attention before this trust-anchoring change ships.

Evidence

  • Head SHA: fca9960a4ba0f4a265cdd7704c7c000e69f9a74b
  • Failing CI: workflow macOS PKG + .mobileconfig, job Build (macos-latest), run 25999938385. Other 18 jobs green (CI Lint/Test/Build/Security Scan/Benchmark, CodeQL, Linux packages — all install round-trips PASS).
  • Gate-defining docs: repo root CLAUDE.md § "Quality Gates" + .github/PULL_REQUEST_TEMPLATE.md.
  • Checked at (UTC): 2026-05-17, head SHA re-verified pre-post.

Quality Gates audit (verbatim against CLAUDE.md § "Quality Gates")

# Gate (verbatim) Score Notes on author-offered evidence
1 make check passes locally — aggregates lint, test, security, vulncheck; all four sub-gates exit 0. NEED-EVIDENCE — HIGH Body lists static checks only (bash -n, xmllint --noout, YAML/JSON parse, make -n). No per-sub-gate exit code. Carve-out for gate #1 (LOW with disclosure + CI green) does not apply: CI is RED at head.
2 go test -race ./... passes — including TestTokenFormatNonRetriggering for every PII type. NEED-EVIDENCE — HIGH No go test -race output in body. No per-PII-type TestTokenFormatNonRetriggering row.
3 Coverage minimums: 95% on internal/anonymizer, internal/config, internal/anonymizer/packs; 85% overall. NEED-EVIDENCE — HIGH No coverage numbers in body.
4 Delta coverage: 90% on all changed/new files (enforced via .github/scripts/delta-coverage.sh in the CI Test job). NEED-EVIDENCE — HIGH No .github/scripts/delta-coverage.sh raw output nor the per-function table the template requires. The PR touches packaging + docs only, but the gate's verbatim text requires the script's output, not reviewer-inferred N/A.
5 Test-plan inventory baseline-vs-head. Execute the docs/test-plans/ inventory on main AND on the PR head SHA. Document per-package PASS/FAIL counts in the PR body under a ## §6 Test Inventory — baseline vs head section. Diff reasoning ("no test files touched") does NOT satisfy this gate. FAIL — HIGH Body has no ## §6 Test Inventory — baseline vs head section. The body's implicit "static checks only because no Go code changed" is exactly the forbidden form.
6 No hacks. Never add code whose purpose is to make a check pass rather than to make the code correct — including //nolint directives without substantive reason, t.Skip() without a linked issue, hardcoded test-passing values, or // coverage-ignore annotations. PASS No //nolint, t.Skip, // coverage-ignore, or TODO/FIXME/HACK/XXX/KLUDGE markers in any of the 14 changed files.

Findings

HIGH

H1 — CI is RED at head; PR body does not surface or triage the failure.
Build (macos-latest) (run 25999938385) fails on the workflow this PR adds. The PR body offers no disclosure of the expected-or-unexpected state, no triage, and no follow-up reference. With CI not green, the gate-1 carve-out also fails (see audit).

H2 — postinstall rollback_ca_trust swallows security delete-certificate failure with || true.
packaging/macos/pkg/scripts/postinstall rollback_ca_trust(). The rollback's purpose is to prevent the failure mode the file's own comment names: # trusted CA without a running interceptor is the failure mode that lets PII flow out unanonymized. If deletion fails (partial keychain state, permission edge), the trust remains AND postinstall exits 1 stating the rollback ran. The operator's exit-code signal does not match System keychain reality. Security-floor: HIGH.

H3 — .mobileconfig HTTPS interception is asserted in the PR body but not evidenced.
packaging/macos/mobileconfig/ai-proxy.mobileconfig.tmpl, proxy payload. The payload specifies ProxyType=Manual + ProxyServer + ProxyServerPort only. The PR body and docs both claim .mobileconfig carries the proxy for HTTP/HTTPS. The proxy's purpose is HTTPS PII interception. The PR body offers no end-to-end interception result on a profile-enrolled host. Reviewer is not in a position to verify CFNetwork's behavior with that payload shape on the author's behalf. Security-floor: HIGH.

H4 — Workflow Verify required Secrets present runs unconditionally on pull_request.
.github/workflows/release-macos-pkg.yml. This PR's CI state is the demonstration: with the 10 secrets absent from the repo, every pull_request-triggered run on the packaging path turns RED at this step. Author needs to confirm whether RED-on-PR is the intended UX (signal the secrets are unset) or whether this gate belongs only on tagged release events.

H5 — P12 cert files cleaned up only on the happy path of the keychain step.
.github/workflows/release-macos-pkg.yml, Set up signing keychain step. installer.p12 and app.p12 are written, imported, then rm'd. If security import fails between write and rm (e.g., wrong password, corrupted cert), the next-line rm does not run. Keychain teardown is if: always(); the P12 cleanup is not. Security-floor: HIGH (cert material in workspace).

H6 — Workflow artifact upload globs dist/*; CA private key lives at build/release-ca/ca-key.pem.
.github/workflows/release-macos-pkg.yml, Upload artifacts step. Currently safe — the CA staging is build/, not dist/. Latent risk: any future build helper writing to dist/ (or any rename of the release-CA path) silently leaks CA material into the GitHub artifact store. Tighten the artifact scope so its contents are explicit, not glob-derived.

H7 — docs/packaging/macos.md security section understates .mobileconfig blast-radius.
The section describes per-host CA generation correctly for the PKG path. It does not surface that the .mobileconfig route trusts a CA backed by the single shared MACOS_RELEASE_CA_KEY repository secret. Compromise of that secret = ability to mint certs trusted by every fleet device simultaneously — materially different from the PKG path's per-host blast radius. Fleet operators reading this section before deploy need that comparison explicit. Security-floor: HIGH.

MEDIUM

M1 — ai-proxy.env ships Linux ports (PROXY_PORT=8080, MANAGEMENT_PORT=8081); plist + proxy-config.json.default use 18080/18081.
packaging/macos/pkg/ai-proxy.env. The banner correctly says the file is informational and launchd does not source it. The body argues for Linux-verbatim parity. Net effect: an admin reading the env file for macOS defaults gets the wrong ports. Author's call: correct, or cross-reference per variable to the plist override.

M2 — distribution.xml <pkg-ref version="0">.
packaging/macos/pkg/distribution.xml. The version attribute on <pkg-ref> is used by macOS Installer for upgrade detection. Hardcoded "0" defeats it.

M3 — .mobileconfig payload has no ExceptionsList.
packaging/macos/mobileconfig/ai-proxy.mobileconfig.tmpl, proxy payload. All traffic — including 127.0.0.1, localhost, *.local, link-local — routes through the proxy. For a daemon that itself listens on 127.0.0.1:18080, loopback-via-proxy risks a loop and breaks mDNS service discovery on enterprise LANs. Author's call on whether that's intentional.

M4 — docs/packaging/macos.md release-CA rotation step 3 is wrong for PKG hosts.
Rotation section step 3 claims cutting a new tag "rebuilds the .mobileconfig (and embeds the new CA in PKG-installed hosts' fallback path)." PKG hosts generate their own per-host CA at install time; the release CA never reaches them. The parenthetical is incorrect; the operator workflow for PKG hosts is already step 4 (uninstall + reinstall).

LOW

L1 — postinstall silences launchctl bootstrap stderr before legacy load -w fallback.
postinstall, daemon bootstrap block. 2>/dev/null on bootstrap discards the actual error before the legacy fallback runs. If both fail, the message lost was the diagnostic. Author's call.

L2 — Workflow has no concurrency: block.
.github/workflows/release-macos-pkg.yml. Two simultaneous tag pushes can race release uploads.

L3 — docs/packaging/macos.md has no troubleshooting section.
No guidance on reading launchctl print state fields, on the structured-log retrieval pattern (log show --predicate ... --last 1h), or on common failure modes (port in use, permission denied on /var/log/ai-proxy). The CA-rollback contract is documented but the operator's recovery path after a rollback isn't.

Path to Approval

Each Required bullet maps to a specific named deliverable the PR body must show after revision; nothing implicit.

Required before re-review

  1. ## §6 Test Inventory — baseline vs head (per-package PASS/FAIL table for both main and the PR head SHA, per docs/test-plans/ai-proxy-test-method.md §6, per the format in .github/PULL_REQUEST_TEMPLATE.md). Diff reasoning is excluded by the gate's literal text.
  2. make check evidence in the PR body — exit code per sub-gate (lint, test, security, vulncheck), at head SHA.
  3. go test -race ./... output in the PR body — including the TestTokenFormatNonRetriggering per-PII-type rows the gate names.
  4. ## Delta Coverage Report section filled per the template — raw .github/scripts/delta-coverage.sh output and per-function table.
  5. Coverage numbers in the PR body — per-package (internal/anonymizer, internal/config, internal/anonymizer/packs: 95% min) and overall (85% min).
  6. H1 — Build (macos-latest) RED at head: disclose state and triage in the PR body. Either confirm the RED-on-PR UX is intentional and link a follow-up for repo secret provisioning, or change the gate so PR-event runs can complete.
  7. H2 — Postinstall rollback exit-code accuracy: rollback's exit code must reflect actual System keychain state, not the script's intent.
  8. H3 — HTTPS interception assertion: PR body needs an end-to-end interception result (or equivalent) demonstrating .mobileconfig enrollment intercepts HTTPS LLM API traffic on macOS.
  9. H4 — Workflow secret-verify step UX: confirm intended behavior on pull_request events and reflect in the workflow / PR body.
  10. H5 — P12 cleanup on all exit paths from the keychain-setup step (current rm is bypassed when security import fails).
  11. H6 — Artifact upload scope: narrow so it can't accidentally include build/ contents or CA material on future writes.
  12. H7 — Security section blast-radius callout: .mobileconfig route trusts a CA backed by MACOS_RELEASE_CA_KEY; compromise = fleet-wide cert minting. Make that comparison explicit in docs/packaging/macos.md.

Recommended this round

M1 (env-file port mismatch), M2 (<pkg-ref version="0">), M3 (mobileconfig ExceptionsList), M4 (rotation step 3 wording for PKG hosts).

Author's discretion

L1 (silenced bootstrap stderr), L2 (workflow concurrency), L3 (troubleshooting section).

Notes

  • Quality-gate findings on ai-proxy default one severity higher than they otherwise would, per the project's established review convention.
  • AI attribution in the commit authors / PR body trailer is not a finding for a PR authored end-to-end by Claude Code Web.
  • Re-review on a new head: every clause above must be re-scored against the revised body; previously-passing gates are not assumed to still pass.

Comment thread .github/workflows/release-macos-pkg.yml
Comment thread .github/workflows/release-macos-pkg.yml
Comment thread .github/workflows/release-macos-pkg.yml
Comment thread packaging/macos/pkg/scripts/postinstall
Comment thread packaging/macos/mobileconfig/ai-proxy.mobileconfig.tmpl
Comment thread packaging/macos/pkg/ai-proxy.env Outdated
Comment thread packaging/macos/pkg/distribution.xml Outdated
Comment thread docs/packaging/macos.md Outdated
Comment thread docs/packaging/macos.md Outdated
claude added 4 commits May 17, 2026 19:56
…ain state (H2)

Previous rollback used '|| true' on security delete-certificate, then
logged 'rollback ran' and let the caller exit 1. If the delete failed
(partial keychain state, permission edge) the cert remained trusted
while the script signaled clean rollback — exactly the trust-without-
interception failure the function exists to prevent (CLAUDE.md
Invariant #1).

Rollback now verifies the cert is actually absent via security
find-certificate. Failure to delete produces a distinct postinstall
exit code (2) so operators can tell 'service down, CA cleaned' (1)
from 'service down, CA STILL TRUSTED' (2), with manual remediation
steps in the error message.

Also captures launchctl bootstrap/load stderr instead of silencing it,
so the diagnostic survives the legacy fallback (L1).
…onfig exceptions, env ports (M1-M3)

M2: distribution.xml's <pkg-ref version> was hardcoded '0', defeating
macOS Installer upgrade-vs-fresh detection. build.sh now renders the
file with the real VERSION before passing to productbuild.

M3: .mobileconfig proxy payload now carries an ExceptionsList for
127.0.0.1, localhost, *.local, and 169.254/16. Closes the loop risk
(the daemon's own loopback listener being routed back through itself)
and preserves mDNS/Bonjour discovery on enterprise LANs.

M1: ai-proxy.env now ships macOS defaults (PROXY_PORT=18080,
MANAGEMENT_PORT=18081) matching the plist and proxy-config.json.default.
Previous Linux-verbatim values misled admins reading the file as
'macOS defaults'.
… P12 trap, scoped artifacts (H4-H6)

H4: 'Verify required Secrets present' and every step that touches
signing keys, CA material, or notarization is now if: IS_RELEASE.
PR-event runs do a 'Validate (no secrets required)' step instead —
GOOS=darwin cross-build, bash -n on every shell script, plutil/xmllint
on plist+XML. Catches workflow + script + plist bugs without requiring
the 10 production secrets to be provisioned.

H5: P12 files are cleaned up via 'trap EXIT' instead of inline rm.
Previously, if security import failed, the script exited under set -e
before the next-line rm, leaving cert material in the workspace until
the (if: always()) keychain teardown step ran — and the P12 itself
was never removed.

H6: Artifact upload path: now an explicit list of expected filenames
(dist/ai-proxy-*-universal.pkg, dist/ai-proxy-*.mobileconfig) with
if-no-files-found: error, instead of the dist/* glob. Adjacent
material written to dist/ in the future can't silently ride into
the artifact store.

L2: concurrency block prevents two simultaneous tag pushes from
racing the release upload.
…oute, troubleshooting (H7, M4, L3)

H7: Security section now contrasts the two trust models explicitly:
PKG route's per-host CA limits a key compromise to one host;
.mobileconfig route's CA is backed by a single MACOS_RELEASE_CA_KEY
secret and compromise lets an attacker mint certs trusted by every
fleet device simultaneously. Operators reading this section before
deciding between routes now see the comparison directly.

M4: Release CA rotation section is split by deployment route. The
previous step 3 parenthetical 'embeds the new CA in PKG-installed
hosts' fallback path' was wrong — PKG hosts generate per-host CAs in
postinstall and the release CA never reaches them. PKG-route rotation
is now its own subsection with the uninstall + remove cert +
reinstall procedure.

L3: New Troubleshooting section: launchctl print/log show/stderr.log
commands, plus a symptoms table covering postinstall exit codes 1
and 2 (CA cleaned vs CA still trusted), Gatekeeper unsigned-developer
errors, missing keychain trust, and silent profile-install failures.
Copy link
Copy Markdown
Owner

@laplaque laplaque left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

R2 — re-review at c219de1

Verdict: REQUEST_CHANGES.

All R1 findings (7 HIGH / 4 MEDIUM / 3 LOW) verified resolved against the diff, not just against the reply text. Every Quality Gate now scores PASS at the new head. Two items remain: one re-formulation of R1's H3 (HTTPS-interception verification needs to be a trackable artifact, not an author recommendation), and one new finding the R1 fix introduced — the H4 gating change removed PR-event CI coverage of the PKG build mechanism itself.

Clean substantial work on the R1 round — fixes are surgical, the PR body now matches .github/PULL_REQUEST_TEMPLATE.md, and CI is green at head. The two remaining items are scoped, neither requires re-touching the resolved areas.

Evidence

  • Baseline head SHA: c219de157a3c27f109ef0bc2c5677dc7a0987f37
  • CI: 17/17 PASS at head; 2× Sign + publish correctly SKIPPED (tag-only). Build (macos-latest) runs the new Validate (no secrets required) path in 43s and goes green.
  • Gate-defining docs: repo root CLAUDE.md § "Quality Gates" + .github/PULL_REQUEST_TEMPLATE.md.
  • R1 reference: review id 4306197704 (CHANGES_REQUESTED) at fca9960.
  • Checked at (UTC): 2026-05-17, head SHA re-verified pre-post.

Quality Gates audit — re-walked verbatim against CLAUDE.md § "Quality Gates" (per re-review rule: "Never assume previously-passing gates still pass at a new head SHA")

# Gate Score Evidence at c219de1
1 make check passes locally — aggregates lint, test, security, vulncheck; all four sub-gates exit 0. PASS PR body shows per-sub-gate evidence: lint 0 issues., test all 10 packages ok, security Issues : 0, vulncheck No vulnerabilities found.
2 go test -race ./... passes — including TestTokenFormatNonRetriggering for every PII type. PASS PR body claim + §6 inventory shows zero --- FAIL across all 10 packages at head.
3 Coverage minimums: 95% on internal/anonymizer, internal/config, internal/anonymizer/packs; 85% overall. PASS internal/anonymizer 95.6% / internal/anonymizer/packs 97.6% / internal/config 100.0% / total 94.4%.
4 Delta coverage: 90% on all changed/new files (.github/scripts/delta-coverage.sh). PASS-by-N/A Script output "No changed Go source files — delta coverage check skipped." Verified: git diff --name-only --diff-filter=ACMR origin/main...HEAD -- '*.go' is empty (this PR's diff is packaging/macos/** + .github/workflows/** + docs/packaging/** + Makefile only).
5 Test-plan inventory baseline-vs-head — ## §6 Test Inventory — baseline vs head section with per-package PASS/FAIL counts. Diff reasoning does NOT satisfy. PASS Full 10-package table for both 6436ce0 (main) and c219de1 (head); zero failures, zero regressions, executed on both sides. Method statement names the share-an-identical-Go-tree caveat explicitly.
6 No hacks — //nolint, t.Skip() without linked issue, hardcoded test-passing values, // coverage-ignore. PASS grep -nE 'TODO|FIXME|HACK|XXX|KLUDGE|WORKAROUND|//nolint|t\.Skip|coverage-ignore' on the full diff at head returns empty.

R1 resolution map (verified against the diff at head, not just reply text)

R1 ID Status Verified by
H1 PR body discloses + checklist filled Resolved New PR body has full ## Quality Gates checklist with per-gate evidence, ## Delta Coverage Report, ## §6 Test Inventory — baseline vs head.
H2 postinstall rollback exit-code accuracy Resolved (00d341a) packaging/macos/pkg/scripts/postinstallrollback_ca_trust no longer has || true on delete-certificate; verifies absence via security find-certificate -Z $SHA after delete; returns 1 on rollback failure; caller exits with distinct code 2 so operator can tell "service down, CA cleaned" from "service down, CA STILL TRUSTED".
H3 HTTPS interception evidence 🟡 Reformulated (see Findings) Author transparency acknowledged; needs trackable artifact.
H4 secret-verify + signing gated on IS_RELEASE Resolved (79c409b) Every step touching secrets/signing/notarization has if: env.IS_RELEASE == 'true'. PR-events run the Validate (no secrets required) step that does GOOS=darwin go build, bash -n on all 5 shell scripts, plutil -lint on plist, xmllint --noout on distribution.xml + mobileconfig.tmpl. CI green at head. (See N1 — this fix introduced a coverage regression on the build mechanism itself.)
H5 P12 cleanup on all exit paths Resolved (79c409b) trap 'rm -f installer.p12 app.p12' EXIT is at the top of the Set up signing keychain run: block, before any P12 write. Covers security import failure and any set -e trip.
H6 artifact upload scope Resolved (79c409b) path: list is now literal dist/ai-proxy-*-universal.pkg + dist/ai-proxy-*.mobileconfig with if-no-files-found: error. No more dist/* glob.
H7 .mobileconfig blast-radius callout Resolved (c219de1) docs/packaging/macos.md security section explicitly contrasts PKG (per-host CA → blast radius = one host) vs .mobileconfig (CA backed by MACOS_RELEASE_CA_KEY secret → blast radius = entire fleet); names the threat model.
M1 ai-proxy.env macOS ports Resolved (b9ce7c6) PROXY_PORT=18080, MANAGEMENT_PORT=18081 — matches plist and proxy-config.json.default.
M2 <pkg-ref version="0"> Resolved (b9ce7c6) distribution.xml uses version="__PKG_VERSION__"; build.sh renders via sed "s/__PKG_VERSION__/${VERSION}/g" ... > $DIST_XML and passes $DIST_XML to productbuild --distribution.
M3 .mobileconfig ExceptionsList Resolved (b9ce7c6) Proxy payload now has ExceptionsList array containing 127.0.0.1, localhost, *.local, 169.254/16.
M4 rotation section split per route Resolved (c219de1) Two subsections: ### .mobileconfig / MDM route (1-4) and ### PKG route (correct uninstall + remove cert + reinstall). The erroneous parenthetical is gone.
L1 bootstrap stderr captured Resolved (00d341a) bootstrap_err=$(/bin/launchctl bootstrap system ... 2>&1) || bootstrap_rc=$? and same for load_err; both stderr strings surfaced in the failure message.
L2 workflow concurrency Resolved (79c409b) Top-level concurrency: { group: macos-pkg-${{ github.ref }}, cancel-in-progress: false }.
L3 troubleshooting section Resolved (c219de1) ## Troubleshooting section at the bottom of docs/packaging/macos.md.

Findings

N1 — HIGH (new in R2) — PR-event CI no longer exercises the build mechanism this PR adds.

.github/workflows/release-macos-pkg.yml. The H4 remediation correctly gated the secret-requiring steps on if: env.IS_RELEASE == 'true', which fixed the RED-on-PR problem. The collateral effect: on pull_request events, the workflow now runs only the Validate (no secrets required) step — GOOS=darwin go build, bash -n, plutil -lint, xmllint --noout. The Build PKG step (make package-macos-pkgbuild.shcodesign --sign + pkgbuild + productbuild --sign), the Build .mobileconfig step (make package-macos-mobileconfigmobileconfig/build.shsecurity cms -S), the lipo universal-binary path, and the rendered-distribution.xml substitution all gate on IS_RELEASE and therefore never execute on a PR event.

Why this matters for this PR specifically: the PR's deliverable IS the PKG build mechanism. The first execution of build.sh and mobileconfig/build.sh against signed certs is at the first v* tag push. Any bug in payload staging, identity resolution, lipo invocation, sed substitution, CMS signing of the profile, or productbuild argument ordering surfaces at tag time — not PR time. The PR introduces the script, marks it shipped in docs/packaging/README.md, and ships zero PR-event evidence that the script works mechanically.

Separate from H4's UX win (no more RED-on-PR for contributors without secrets), the property that needs to hold is: a future commit that breaks build.sh or mobileconfig/build.sh should fail before tag time. Today it does not. Author's call on the resolution mechanism — possible directions include a self-signed throwaway cert generated at workflow start for PR-only signing (notarization still gated on IS_RELEASE), an unsigned dry-run path that exercises payload staging and structural correctness, or an explicit documented accept-the-gap stance with a pre-tag verification checklist captured in the workflow YAML adjacent to the Build PKG step.

H3 (reformulated) — MEDIUM — .mobileconfig HTTPS-interception verification needs to be a trackable artifact.

The author's R1 reply transparently acknowledged that the Linux executor for this PR cannot produce an end-to-end MDM-host result, and recommended a Tart VM / profiles install round-trip "before tag." The PR body now correctly does not claim an interception result it doesn't have. That removes the over-claim, which is the right move.

What's missing for merge readiness is the artifact that captures the verification. A recommendation in a reply comment isn't trackable — a future operator (or a future revision of this PR) needs the verification ritual to live somewhere checkable. Resolution requires either (a) the MDM-host result itself, attached to the PR body or to a release-readiness issue and referenced from the PR body, OR (b) a follow-up issue filed against this repo + a workflow-YAML comment at the Notarize PKG step naming that issue + a callout in docs/packaging/macos.md § "Release CA rotation" or § "Install — .mobileconfig" describing the pre-tag verification ritual with its exact pass criterion (e.g., "on a profile-enrolled host, curl --proxy 127.0.0.1:18080 -k https://example.com reaches the proxy daemon's request log"). Either path converts "author recommends verifying before tag" into "workflow / docs / tracked-issue cannot be ignored before tag."

This is reformulated from R1's HIGH because the PR body no longer over-claims, and the author has been explicit about the executor limitation. It stays at MEDIUM because the verification gap on the proxy's primary purpose (HTTPS interception of LLM API traffic) shouldn't merge as a floating commitment.

Path to Approval

  1. N1 — close the PR-event coverage gap for the PKG build mechanism. Either exercise the build path on PR with a throwaway cert (or unsigned dry-run), or capture the explicit accept-the-gap stance in the workflow YAML at the Build PKG step with a named pre-tag verification checklist.
  2. H3 (reformulated) — capture the HTTPS-interception verification as a trackable artifact. Either attach the MDM-host result to this PR, or file a follow-up issue + cite it in the workflow YAML Notarize PKG step + document the verification ritual + pass criterion in docs/packaging/macos.md.

Notes

  • Quality-gate findings on ai-proxy default one severity higher than they otherwise would, per the project's established review convention. Neither N1 nor H3-reformulated is a project-gate finding; severities reflect concrete impact only.
  • Thread resolution on R1's 9 inline threads is the executor's job per your project convention — leaving them untouched.
  • Re-review walked every previously-passing gate per pr-review-reread-current-head-not-pinned; gate 6's PASS at R1 was independently re-verified at c219de1, not carried over.

Comment thread .github/workflows/release-macos-pkg.yml
Comment thread packaging/macos/mobileconfig/ai-proxy.mobileconfig.tmpl
claude added 3 commits May 17, 2026 21:07
…2 N1)

R1's H4 fix gated every signing/notarize step on IS_RELEASE so PR runs
go green without Apple secrets. R2 N1 caught the collateral: the build
scripts this PR adds (build.sh, mobileconfig/build.sh) now never execute
on a PR event. First run with real signing is at the first v* tag push;
any bug in payload staging, lipo, sed substitution, productbuild
arguments, or security cms -S would surface at tag time, not PR time.

Fix: SKIP_SIGN=1 support in both build scripts.
  - packaging/macos/pkg/build.sh: when SKIP_SIGN=1, omit codesign on the
    binary and emit an unsigned productbuild artifact. Skip
    pkgutil --check-signature at the end (would fail by design on an
    unsigned PKG).
  - packaging/macos/mobileconfig/build.sh: when SKIP_SIGN=1, leave the
    rendered profile unsigned and rename .unsigned to the final path.
    plutil -lint still runs (template substitution and base64 CA
    encoding are exercised).

Workflow PR-event step now generates a throwaway CA via openssl req
(payload content only; never used for signing), invokes both build
scripts with SKIP_SIGN=1, and structurally validates the resulting
artifacts:
  - dist/*.pkg present
  - pkgutil --payload-files shows /usr/local/bin/ai-proxy and
    /Library/LaunchDaemons/com.ai-anonymizing-proxy.plist
  - dist/*.mobileconfig present
  - plutil -lint clean on the final .mobileconfig

Artifacts are deleted at the end of the step so unsigned material
cannot ride into any later workflow step. The release path rebuilds
from scratch when IS_RELEASE=true.

Also addresses R2 H3 (reformulated MEDIUM) — captures the
.mobileconfig HTTPS-interception verification as a trackable artifact:
  - issue #125 documents the pre-tag verification ritual + exact pass
    criterion (curl -k without --proxy reaches the daemon's request
    log under the profile's policy)
  - YAML comment above 'Notarize PKG' step names issue #125 and
    states the human-gate constraint
  - docs/packaging/macos.md § 'Pre-tag verification ritual' documents
    the procedure and pass criterion inline
pkgutil --payload-files emits paths with a leading './' (e.g.,
'./usr/local/bin/ai-proxy'). The previous grep anchored on '/usr/...'
and never matched, causing Build (macos-latest) on c219de1 to fail
in the new dry-run step.

Grep now uses '$' anchor on the path suffix and unanchored leading
segment. Also captures payload into a var and prints it on failure
so the CI log surfaces what was actually produced if the
structural check ever fails again.
The macOS dry-run step calls build.sh and mobileconfig/build.sh
directly. Both read VERSION from the environment; PKG_VERSION is the
Makefile-level wrapper. The previous step exported PKG_VERSION and
build.sh failed at the first line with 'VERSION required'.
Copy link
Copy Markdown
Owner

@laplaque laplaque left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

R3 — at 8e5c975

Verdict: REQUEST_CHANGES.

Both R2 findings (N1, H3-reformulated) resolved in 274e319 + ed76471 + 8e5c975. Verified the dry-run path against the diff and CI; verified the H3 trackable-artifact triple (issue #125 + workflow comment + docs ritual). One open item remains, and it's a self-correction of an R2 mistake: Gate 5 (## §6 Test Inventory — baseline vs head) was scored PASS in R2 based on section + table presence, but the section's prose uses the gate's literal forbidden form. Earl flagged it after R2 posted. Honest answer to his follow-up — "did you check the tests themselves?" — was no in R2; I did before composing R3, and the check changed the scoping of one finding (see below).

Evidence

  • Baseline head SHA: 8e5c9758a705077c61f083cd8409912f975c57c6
  • CI: 17/17 PASS at head. Build (macos-latest) runs in 28s via the new dry-run path (was 43s in the lint-only R2 version). 2× Sign + publish correctly SKIPPED (tag-only).
  • R2 reference: review id 4306292364 (CHANGES_REQUESTED) at c219de1.
  • Checked at (UTC): 2026-05-17, head SHA re-verified pre-post.

R2 finding closure

N1 (HIGH) — PR-event CI exercises the build mechanism. ✅ Resolved. New Validate build mechanism (PR dry-run, no signing) step in .github/workflows/release-macos-pkg.yml runs on pull_request events. The step:

  • Generates a throwaway 1-day CA on the runner (never used for signing, only as the byte-content the .mobileconfig template base64-encodes).
  • Calls packaging/macos/pkg/build.sh directly with VERSION=0.0.0-dryrun SKIP_SIGN=1. The script's signing block is now if [ "$SKIP_SIGN" = "1" ]; then echo skipping ...; else codesign+productbuild --sign ...; fi — both codesign on the binary and productbuild --sign are skipped, but the lipo universal-binary path, payload staging (install -d ...), pkgbuild invocation, distribution.xml substitution (sed s/__PKG_VERSION__/.../), and the unsigned productbuild all run.
  • Calls packaging/macos/mobileconfig/build.sh directly with SKIP_SIGN=1 CA_CERT=<dry-run-ca>. The template sed substitution (UUIDs, base64 CA, version), plutil -lint on the unsigned profile, and the mv $OUT.unsigned $OUT rename all run.
  • Verifies the produced PKG with pkgutil --payload-files | grep -qE 'usr/local/bin/ai-proxy$' AND grep -qE 'Library/LaunchDaemons/com\.ai-anonymizing-proxy\.plist$' — structural check that the payload contains what it should.
  • Verifies the .mobileconfig with plutil -lint.
  • rm -rf dist build/dry-run-ca build/macos after — unsigned artifacts cannot ride into any later step or upload.

This closes the gap. A future commit that breaks payload staging, lipo invocation, pkgbuild arguments, or the sed substitution into distribution.xml will fail the PR-event run, not the first tag push. The release path (when IS_RELEASE=true) is unchanged.

H3 (MEDIUM, reformulated) — Trackable artifact for HTTPS-interception verification. ✅ Resolved. Three artifacts:

  1. Issue #125 ("macos: verify .mobileconfig HTTPS interception on MDM-enrolled host before tag") — open, with goal + ritual.
  2. New section ## Pre-tag verification ritual in docs/packaging/macos.md — 4-step procedure (install PKG + profile → confirm both active → unproxied curl -k https://httpbin.org/gettail /var/log/ai-proxy/stdout.log | grep httpbin.org) with explicit pass criterion ("step 4 returns at least one log line referencing httpbin.org"). Notes the macOS-version sensitivity for HTTPSEnable / HTTPSProxy keys.
  3. Workflow YAML comment at the Notarize PKG step naming issue #125 and the docs section, with the framing "Do not tag without a recorded pass."

Not enforceable by CI on a Linux runner — correctly captured as a human gate with all three pointers (issue, docs, workflow) cross-referencing each other.

R2 self-correction

R2 scored Gate 5 (§6 Test Inventory — baseline vs head) as PASS. That was wrong. The section is present and the table is filled, but the section's prose:

Baseline SHA 6436ce0 and head SHA c219de1 share an identical Go source tree (git diff origin/main...HEAD -- '*.go' is empty), so the counts are identical — both columns executed on the same worktree to confirm.

…is the gate's literal forbidden form. The argument is "diff says identical → counts identical → one worktree suffices." The gate's verbatim text: "Diff reasoning ('no test files touched') does NOT satisfy this gate." Section presence is not section compliance.

The author hasn't touched §6 in 274e319 / ed76471 / 8e5c975, so the gap persists. Severity-bumped to HIGH per the project's review convention for CLAUDE.md-gate findings.

While pre-composing this self-correction, Earl asked: "did you check against the tests themselves?" Honest answer: not in R2. I had drafted a second finding — Gate 2's sub-clause "including TestTokenFormatNonRetriggering for every PII type" — claiming the body didn't enumerate per-PII-type results. That finding was wrong. internal/anonymizer/anonymizer_test.go:486 defines TestTokenFormatNonRetriggering with for _, pt := range piiTypes (line 511) — the test iterates over PII types internally; the §6 inventory's package-level count (./internal/anonymizer/ 202 / 0) includes it. The gate's "for every PII type" clause is satisfied by the test's internal iteration, not by per-type rows in the PR body. Dropping that finding before posting; calling it out here so it's recorded.

Quality Gates audit — re-walked verbatim at 8e5c975

# Gate Score Evidence at head
1 make check — 4 sub-gates exit 0 PASS Per-sub-gate last-line evidence in body.
2 go test -race ./... passes — including TestTokenFormatNonRetriggering for every PII type PASS §6 inventory shows 0 fails across 10 packages. The TestTokenFormatNonRetriggering clause is satisfied by the test's for _, pt := range piiTypes loop at internal/anonymizer/anonymizer_test.go:486 — internal iteration covers "every PII type."
3 Coverage minimums (95% per-pkg / 85% total) PASS Per-package + total in body.
4 Delta coverage (90% / changed files) PASS-by-N/A Script-skip on empty *.go diff is the script's documented behavior.
5 §6 Test Inventory baseline-vs-head, executed on main AND head FAIL — HIGH Section + table present; prose uses the gate's literal forbidden form ("identical source tree → identical counts → one worktree to confirm"). Unchanged from c219de1 — not touched in the 3 commits since R2.
6 No hacks PASS Diff grep at head returns empty.

Path to Approval

Single item:

  1. Gate 5 — re-execute the §6 inventory. Check out a worktree at main (currently 6436ce0), run go test -race -count=1 -v <pkg> per package, record the counts. Check out a worktree at the PR head (currently 8e5c975), run the same per-package, record those counts. Replace the §6 prose — drop the "identical source tree, so identical counts, same worktree to confirm" framing. The gate is satisfied by two independent executions, not by inference from git diff. If the executions are genuinely identical (expected, since the diff doesn't touch *.go), the table content stays the same but the section's claim is now grounded in execution evidence rather than diff reasoning. Per the gate's source doc (docs/test-plans/ai-proxy-test-method.md §6) and .github/PULL_REQUEST_TEMPLATE.md.

Notes

  • Threads from R1 still the executor's job to resolve; same for R2's two threads.
  • Re-walked every gate at the new head per pr-review-reread-current-head-not-pinned; the four previously-PASS gates were independently re-verified, not carried over.
  • Earl's challenge surfaced a recurring failure mode in my methodology: scoring "section required by gate" clauses on structural presence instead of prose-compliance. Logged for self-improvement.
  • Quality-gate findings on ai-proxy default one severity higher than they otherwise would; Gate 5's FAIL is therefore HIGH.

Copy link
Copy Markdown
Owner

@laplaque laplaque left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

R4 — at 8e5c975 (PR-body update post-R3)

Verdict: APPROVE.

Gate 5 (the R3 open item) now satisfied. All six CLAUDE.md Quality Gates score PASS at head. CI green (17/17 + 2× tag-only Sign + publish correctly SKIPPED). N1 (R2) and H3-reformulated (R2) verified resolved at the 3 commits between R2 and R3. R1 items all verified resolved at c219de1. PR is ready to merge.

Evidence

  • Head SHA: 8e5c9758a705077c61f083cd8409912f975c57c6 (no code change since R3 — this round is a PR-body revision).
  • CI at head: 17/17 PASS; 2× Sign + publish SKIPPED (tag-only, by design).
  • Gate-defining docs: CLAUDE.md § "Quality Gates" + .github/PULL_REQUEST_TEMPLATE.md + docs/test-plans/ai-proxy-test-method.md § 6 + § 5 + § 2.3.
  • Checked at (UTC): 2026-05-17, head re-verified pre-post.

Final Quality Gates audit

# Gate Score Verified at head
1 make check — 4 sub-gates exit 0 PASS Per-sub-gate last-line evidence in PR body (lint 0 issues., test all 10 ok, security Issues : 0, vulncheck No vulnerabilities found.).
2 go test -race ./... passes — including TestTokenFormatNonRetriggering for every PII type PASS §6 per-pack inventory shows 0 fail on every row; TestTokenFormatNonRetriggering exists at internal/anonymizer/anonymizer_test.go:486 with for _, pt := range piiTypes (line 511) — internal iteration covers "every PII type."
3 Coverage minimums (95% per-pkg / 85% total) PASS Per-package: internal/anonymizer 95.6%, internal/anonymizer/packs 97.6%, internal/config 100.0%. Total 94.4% (≥85%).
4 Delta coverage (90% / changed files) PASS-by-N/A Script-documented behavior on empty *.go diff. PR diff at head: git diff origin/main...HEAD -- '*.go' empty (verified).
5 §6 Test Inventory baseline-vs-head PASS See resolution detail below.
6 No hacks PASS grep -nE 'TODO|FIXME|HACK|XXX|KLUDGE|WORKAROUND|//nolint|t\.Skip|coverage-ignore' on the full diff at head: empty.

Gate 5 resolution (R3 → R4)

R3 flagged Gate 5 FAIL because the §6 prose used the gate's literal forbidden form ("identical source tree → identical counts → one worktree to confirm"). Earl's follow-up — "I still do not see the pack tests that are part of the §6 tests" — added the granularity requirement: the gate references docs/test-plans/ and that inventory is per-PACK (GLOBAL, DE, US, FR, SECRETS, NL, FINANCE_EU, HEALTHCARE, Cross-pack), not per-Go-package.

The revised §6 section addresses both readings:

  1. Methodology block explicitly describes two independent worktree executions (git worktree add /tmp/main-tree 6436ce0, separate go test -race -count=1 -v -run "^(N1|N2|...)$" <pkg> per row), and ends with "Not derived from git diff." Forbidden form is gone.
  2. Per-pack table maps row-for-row to docs/test-plans/ai-proxy-test-method.md § 6 — 9 inventory rows expanded to 13 rows because SECRETS/NL/FINANCE_EU/HEALTHCARE each have both unit and report test files. Pack columns: GLOBAL, DE, US, FR, SECRETS (unit+report), NL (unit+report), FINANCE_EU (unit+report), HEALTHCARE (unit+report), Cross-pack. Per-pack columns for main(6436ce0) and head (8e5c975); all deltas zero.
  3. §5 pinning tests — the named tests for issues #67-70 plus the by-design rows are executed by name on both worktrees. Includes TestSecretsPriorityOverGLOBAL (internal/anonymizer/secrets_priority_report_test.go) which pins CLAUDE.md Invariant #3 ("SECRETS must precede GLOBAL"). Confirmed PASS by name on head.
  4. §2.3 config compliancegrep "EnabledPacks" internal/anonymizer/*_report_test.go shows the canonical order in every report test (SECRETS first), with UseAI: false and PackDecayRate: 0.0. Invariant #3 verified at the config level in addition to the named-test pin.
  5. Supplemental per-Go-package rollup kept (matches the PR template's table format) — explicitly labeled "rollup, not the gate artifact — the §6 per-pack table above is the artifact the spec demands."
  6. Honest disclosure that prior rounds' per-package counts used ^--- PASS and missed indented subtests; new counts use ^[[:space:]]*--- PASS. Self-corrected with the explicit "all previously-posted numbers were undercounted" note.

Verified the inventory mapping: every row in the PR body's per-pack §6 table corresponds to a row in docs/test-plans/ai-proxy-test-method.md § 6, and the file paths match. Verified TestSecretsPriorityOverGLOBAL exists at the named file. Did not re-run the tests (Step 6.5 reviewer hard-stop — author's documented execution is the evidence).

Closure map across rounds

Round Finding Status
R1 H1 PR body discloses + checklist filled ✅ Closed at PR-body rewrite
R1 H2 postinstall rollback exit-code accuracy ✅ Closed at 00d341a
R1 H3 HTTPS interception evidence ✅ Reformulated at R2, closed at 8e5c975 (issue #125 + docs ritual + workflow comment)
R1 H4 secret-verify gated on IS_RELEASE ✅ Closed at 79c409b
R1 H5 P12 cleanup on all exit paths ✅ Closed at 79c409b
R1 H6 artifact upload scope ✅ Closed at 79c409b
R1 H7 mobileconfig blast-radius callout ✅ Closed at c219de1
R1 M1 ai-proxy.env macOS ports ✅ Closed at b9ce7c6
R1 M2 <pkg-ref version="0"> ✅ Closed at b9ce7c6
R1 M3 mobileconfig ExceptionsList ✅ Closed at b9ce7c6
R1 M4 rotation per route ✅ Closed at c219de1
R1 L1 bootstrap stderr ✅ Closed at 00d341a
R1 L2 workflow concurrency ✅ Closed at 79c409b
R1 L3 troubleshooting section ✅ Closed at c219de1
R2 N1 PR-event CI exercises build mechanism ✅ Closed at 274e319 + ed76471 + 8e5c975
R2 H3 (reformulated) Trackable verification artifact ✅ Closed at 8e5c975
R3 Gate 5 §6 forbidden-form prose + per-pack granularity ✅ Closed at this PR-body update

Pre-tag reminder (informational, not a gate)

The .mobileconfig HTTPS-interception verification ritual at docs/packaging/macos.md § "Pre-tag verification ritual" (tracked at issue #125, referenced from the workflow YAML Notarize PKG step) must be executed on an MDM-enrolled / profile-enrolled macOS host before pushing a v* tag. This is a human gate; CI cannot enforce it.

Notes

  • Threads from R1 / R2 still the executor's job to resolve per the project convention.
  • This review approves merge of the PR. Approval is for the code + docs at head SHA 8e5c975; further commits would re-open the review.

@laplaque laplaque merged commit f3817d1 into main May 17, 2026
19 checks passed
@laplaque laplaque deleted the claude/macos-pkg-mobileconfig-Wll74 branch May 17, 2026 23:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants