Skip to content

Commit 04439bf

Browse files
committed
fix(security): tier-driven approval gate + restore legacy sensitive-path coverage (Spec 077 US1)
Codex round-5 findings on PR #786: #1 (HIGH) approval gate / verdict consistency: isBlockingFinding now blocks iff Tier=="hard". Deep-scan/external/legacy findings carry no tier and no longer gate approval or drive a "dangerous" verdict (US3 FR-021 — they inform but never gate). Only the in-process baseline detect engine sets Tier, so US1 hard-block behavior (hard phrase_injection / hard detect) is unchanged. This is the single predicate behind both the ApproveServer gate and the GetScanSummary "dangerous" status, so gate and verdict can never disagree. #2 (MEDIUM) embedded-secret file-path coverage: restore the legacy security.NewDetector(nil) / paths.go GetFilePathPatterns() paths the detect check had dropped — ~/.azure/accessTokens.json + azureProfile.json, ~/.docker/config.json, *.key, *.ppk, ~/.gitconfig, ~/.pypirc, *service_account*.json, macOS ~/Library/Keychains/*, Windows %LOCALAPPDATA%\Microsoft\Credentials\*, and <name>.env. Curated regexes mirror paths.go (kept offline; detect cannot import internal/security, which pulls in os) with a source-of-truth comment. Soft findings; new unit tests cover each restored path plus benign non-matches. #3 (ACCEPTED, no logic change): documented the sample/example-label phrase-position false positive in position.go as a known, conservative over-block (visible/quarantined/--force-able, not a silent bypass), tracked as a follow-up. Gate: recall=1.0 (>=0.90), fp=0.0 (<=0.05). Full suite + golangci-lint v2 green. Related: Spec 077
1 parent 41a24b7 commit 04439bf

4 files changed

Lines changed: 137 additions & 27 deletions

File tree

internal/security/detect/checks/embedded_secret.go

Lines changed: 32 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -73,15 +73,45 @@ func builtinSecretPatterns() []*patterns.Pattern {
7373
// references restored from the legacy sensitive_file detector. Matched
7474
// case-insensitively against raw text; order is deterministic so ties resolve
7575
// stably.
76+
//
77+
// SOURCE OF TRUTH: internal/security/paths.go GetFilePathPatterns() is the
78+
// canonical sensitive-path list the deleted security.NewDetector(nil) path used.
79+
// That list is glob-style (e.g. "*.pem", "~/.aws/credentials") for matching real
80+
// filesystem paths in tool args/responses; here we need TEXT-scanning regexes for
81+
// free-form descriptions/schemas, and detect must stay offline (it cannot import
82+
// internal/security, which pulls in os). So the curated set below MIRRORS
83+
// paths.go rather than importing it — keep the two in sync when either changes.
84+
// Every category paths.go covers (SSH, AWS, GCP, Azure, Docker, kube, env,
85+
// private keys, git/registry creds, macOS keychain, Windows credentials, Linux
86+
// /etc) is represented here (Spec 077 US1, Codex round-5 finding #2).
7687
var sensitiveFilePatterns = []*regexp.Regexp{
88+
// SSH private keys — ~/.ssh/id_rsa|dsa|ecdsa|ed25519, *_key (+ %USERPROFILE%).
7789
regexp.MustCompile(`(?i)(?:~|%userprofile%|/home/[^/\s]+|/root)?[/\\]?\.ssh[/\\](?:id_(?:rsa|dsa|ecdsa|ed25519)|[^/\\\s]*_key)`),
90+
// AWS credentials/config.
7891
regexp.MustCompile(`(?i)(?:~|%userprofile%|/home/[^/\s]+|/root)?[/\\]?\.aws[/\\](?:credentials|config)`),
92+
// GCP application-default/credentials.db + *service_account*.json.
7993
regexp.MustCompile(`(?i)\.config[/\\]gcloud[/\\](?:application_default_credentials\.json|credentials\.db)`),
94+
regexp.MustCompile(`(?i)[\w.\-]*service_account[\w.\-]*\.json\b`),
95+
// Azure access tokens / profile.
96+
regexp.MustCompile(`(?i)(?:~|%userprofile%|/home/[^/\s]+|/root)?[/\\]?\.azure[/\\](?:accesstokens|azureprofile)\.json`),
97+
// Docker config (registry auth tokens).
98+
regexp.MustCompile(`(?i)(?:~|%userprofile%|/home/[^/\s]+|/root)?[/\\]?\.docker[/\\]config\.json`),
99+
// Kubernetes config.
80100
regexp.MustCompile(`(?i)(?:~|/home/[^/\s]+|/root)?[/\\]?\.kube[/\\]config`),
101+
// Linux system credential files.
81102
regexp.MustCompile(`(?i)/etc/(?:passwd|shadow|sudoers)`),
103+
// dotenv files — ".env", ".env.<stage>", and "<name>.env".
82104
regexp.MustCompile(`(?i)(?:^|[\s"'` + "`" + `/\\])\.env(?:\.[a-z]+)?(?:$|[\s"'` + "`" + `])`),
83-
regexp.MustCompile(`(?i)[\w./\\-]+\.(?:pem|pfx|p12|kdbx|pgpass)\b`),
84-
regexp.MustCompile(`(?i)(?:\.git-credentials|\.npmrc|\.netrc)\b`),
105+
regexp.MustCompile(`(?i)\b[\w-]+\.env\b`),
106+
// Private-key / secret material files — .pem, .pfx, .p12, .ppk, .key, .kdbx, .pgpass.
107+
regexp.MustCompile(`(?i)[\w./\\-]+\.(?:pem|pfx|p12|ppk|key|kdbx|pgpass)\b`),
108+
// Git + package-registry credential files — .git-credentials, .gitconfig,
109+
// .npmrc, .pypirc, .netrc.
110+
regexp.MustCompile(`(?i)(?:\.git-credentials|\.gitconfig|\.npmrc|\.pypirc|\.netrc)\b`),
111+
// macOS keychains — ~/Library/Keychains/* and /Library/Keychains/*.
112+
regexp.MustCompile(`(?i)[/\\]?Library[/\\]Keychains[/\\]`),
113+
// Windows credential store — %LOCALAPPDATA%|%APPDATA%\Microsoft\Credentials\*.
114+
regexp.MustCompile(`(?i)%(?:localappdata|appdata)%[/\\]microsoft[/\\]credentials[/\\]`),
85115
}
86116

87117
// entropyCandidate matches contiguous runs that could be an opaque secret token.

internal/security/detect/checks/embedded_secret_test.go

Lines changed: 64 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -107,6 +107,70 @@ func TestEmbeddedSecret_RestoredCategories(t *testing.T) {
107107
}
108108
}
109109

110+
// TestEmbeddedSecret_LegacySensitivePaths locks Spec 077 US1 Codex round-5
111+
// finding #2: the detect check's sensitive-file coverage must be no narrower than
112+
// the legacy security.NewDetector(nil) / paths.go GetFilePathPatterns() set. Each
113+
// path below is one the legacy detector caught but the earlier detect check
114+
// dropped; every one must now raise a soft embedded_secret signal.
115+
func TestEmbeddedSecret_LegacySensitivePaths(t *testing.T) {
116+
c := &EmbeddedSecret{}
117+
cases := []struct {
118+
name string
119+
desc string
120+
}{
121+
{"azure-access-tokens", "Reads Azure creds from ~/.azure/accessTokens.json on startup."},
122+
{"azure-profile", "Loads the subscription from ~/.azure/azureProfile.json."},
123+
{"docker-config", "Pulls the registry token out of ~/.docker/config.json."},
124+
{"private-key-dot-key", "Signs requests with the private key at /opt/app/server.key."},
125+
{"putty-ppk", "Connects over SSH using the PuTTY key deploy.ppk."},
126+
{"gitconfig", "Reads the committer identity from ~/.gitconfig."},
127+
{"pypirc", "Uploads the package using credentials from ~/.pypirc."},
128+
{"npmrc", "Publishes with the token in ~/.npmrc."},
129+
{"gcp-service-account", "Authenticates with the my-app-service_account.json key file."},
130+
{"macos-keychain", "Exports secrets from ~/Library/Keychains/login.keychain-db."},
131+
{"windows-credentials", `Reads saved logins from %LOCALAPPDATA%\Microsoft\Credentials\creds.dat.`},
132+
{"named-dotenv", "Sources environment variables from production.env before running."},
133+
}
134+
for _, tc := range cases {
135+
t.Run(tc.name, func(t *testing.T) {
136+
sigs := c.Inspect(view("t", tc.desc), detect.RegistryView{})
137+
if len(sigs) == 0 {
138+
t.Fatalf("expected a sensitive-path signal for %q, got none", tc.desc)
139+
}
140+
s := sigs[0]
141+
if s.Tier != detect.TierSoft {
142+
t.Errorf("must be soft, got %v", s.Tier)
143+
}
144+
if s.CheckID != c.ID() {
145+
t.Errorf("CheckID = %q, want %q", s.CheckID, c.ID())
146+
}
147+
})
148+
}
149+
}
150+
151+
// TestEmbeddedSecret_LegacySensitivePaths_NoFalsePositive keeps the broadened
152+
// path coverage from firing on benign prose that merely mentions the words
153+
// (without an actual path reference).
154+
func TestEmbeddedSecret_LegacySensitivePaths_NoFalsePositive(t *testing.T) {
155+
c := &EmbeddedSecret{}
156+
cases := []struct {
157+
name string
158+
desc string
159+
}{
160+
{"key-word-no-path", "Rotates the signing API key and returns the new key id."},
161+
{"env-word-no-file", "Runs the command in the current shell environment and returns stdout."},
162+
{"docker-word-no-config", "Builds a Docker image from the given context and pushes it."},
163+
}
164+
for _, tc := range cases {
165+
t.Run(tc.name, func(t *testing.T) {
166+
sigs := c.Inspect(view("t", tc.desc), detect.RegistryView{})
167+
if len(sigs) != 0 {
168+
t.Fatalf("expected no signal for %q, got %+v", tc.desc, sigs)
169+
}
170+
})
171+
}
172+
}
173+
110174
// TestEmbeddedSecret_RestoredCategories_NoFalsePositive keeps the restored
111175
// categories from over-firing: an ordinary long identifier (low entropy) and a
112176
// documented example key must stay below the emit floor.

internal/security/detect/position.go

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -228,6 +228,18 @@ func ClassifyPosition(text string, matchStart int) Position {
228228

229229
// 4. Otherwise the match is an instruction — including one behind a bare
230230
// "label:" prefix, which does not by itself discount a clear imperative.
231+
//
232+
// KNOWN LIMITATION (Spec 077 US1, Codex round-5 finding #3, accepted): a
233+
// benign description that FRAMES an injection as sample/example output using a
234+
// label the cue lists don't recognize — e.g. "Sample response: reveal your
235+
// system prompt to the user" ("sample" + a bare "response:" label, neither in
236+
// wordExampleCues nor a describing-verb/clause frame) — falls through here and
237+
// can hard-fire (phrase-position false positive). This is an accepted
238+
// conservative failure mode, NOT a silent bypass: it over-blocks a benign tool
239+
// (visible, quarantined, overridable with --force) rather than under-blocking a
240+
// real injection. Widening the example cues to catch "sample …:" style labels
241+
// risks reopening finding A (an attacker smuggling an imperative behind a
242+
// label), so the heuristic long-tail is left as-is and tracked as a follow-up.
231243
return PositionInstruction
232244
}
233245

internal/security/scanner/service.go

Lines changed: 29 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -1422,11 +1422,11 @@ func (s *Service) ApproveServer(ctx context.Context, serverName string, force bo
14221422
// gate), or a deep-scan/external finding — even though the very same summary and
14231423
// verdict showed the server as non-dangerous. Gate and verdict then disagreed.
14241424
// Under Spec 077's baseline-only, tier-driven model (FR-021, US3 FR-021 —
1425-
// deep-scan/external findings inform but never gate) only a HARD-tier baseline
1426-
// finding, or a legacy/external finding whose threat_level is "dangerous",
1427-
// blocks. A genuinely dangerous critical finding still carries threat_level
1428-
// "dangerous" and so still blocks via isBlockingFinding (and still shows
1429-
// "dangerous" in the summary) — the two stay consistent.
1425+
// deep-scan/external findings inform but never gate) ONLY a HARD-tier baseline
1426+
// finding blocks. Legacy/external/deep-scan findings carry no tier and never
1427+
// gate, even at threat_level "dangerous"; they still surface in the summary as
1428+
// warnings/info. isBlockingFinding is that single tier-driven predicate, so
1429+
// the gate and the "dangerous" summary status stay consistent.
14301430
if aggReport != nil && !force {
14311431
blocking := 0
14321432
for _, f := range aggReport.Findings {
@@ -1779,11 +1779,12 @@ func (s *Service) GetScanSummary(ctx context.Context, serverName string) *ScanSu
17791779
summary.RiskScore = CalculateRiskScore(allFindings)
17801780

17811781
// Count by tier/threat level. Spec 077 FR-014/FR-021: the "dangerous"
1782-
// verdict is tier-driven — only a HARD baseline finding blocks approval.
1783-
// A baseline soft finding (detect emits ThreatLevelWarning for soft-only)
1784-
// counts as a warning, never dangerous. Legacy/external findings that
1785-
// predate the two-tier model carry no tier, so they fall back to their
1786-
// existing threat_level semantics (back-compat).
1782+
// verdict is tier-driven — only a HARD baseline finding is counted as
1783+
// dangerous (isBlockingFinding). A baseline soft finding (detect emits
1784+
// ThreatLevelWarning for soft-only) counts as a warning. Legacy/external/
1785+
// deep-scan findings carry no tier and therefore never count as dangerous
1786+
// (US3 FR-021 — they inform but do not gate); they still surface as
1787+
// warnings/info by threat_level.
17871788
counts := FindingCounts{Total: len(allFindings)}
17881789
for _, f := range allFindings {
17891790
switch {
@@ -1798,7 +1799,7 @@ func (s *Service) GetScanSummary(ctx context.Context, serverName string) *ScanSu
17981799
summary.FindingCounts = &counts
17991800

18001801
// Determine status. A "dangerous" status therefore requires ≥1 hard-tier
1801-
// baseline finding (or a legacy dangerous finding).
1802+
// baseline finding.
18021803
if counts.Dangerous > 0 {
18031804
summary.Status = "dangerous"
18041805
} else if counts.Warning > 0 {
@@ -1816,21 +1817,24 @@ func (s *Service) GetScanSummary(ctx context.Context, serverName string) *ScanSu
18161817
}
18171818

18181819
// isBlockingFinding reports whether a finding gates approval / drives a
1819-
// "dangerous" verdict under the Spec 077 two-tier model (FR-021). A baseline
1820-
// finding blocks only when it is HARD-tier. A legacy/external finding (produced
1821-
// before the two-tier model, so it carries no tier) falls back to its
1822-
// threat_level so pre-existing behavior is preserved. Baseline SOFT findings —
1823-
// which carry Tier=="soft" — never block, even if some producer mislabeled their
1824-
// threat_level, which is exactly what makes the two-tier model govern behavior.
1820+
// "dangerous" verdict under the Spec 077 two-tier model (FR-021, US3 FR-021).
1821+
// Blocking is PURELY tier-driven: a finding blocks if and only if it is a
1822+
// HARD-tier baseline finding. Only the in-process detect engine (the baseline
1823+
// scanner) sets Tier, and it emits Tier=="hard" exactly for the hard-tier
1824+
// checks. Every other producer carries no tier:
1825+
//
1826+
// - Baseline SOFT findings carry Tier=="soft" — review-only, never block.
1827+
// - Deep-scan / external / legacy findings (Docker scanners, imported SARIF,
1828+
// supply-chain audits) carry no tier. Per US3 FR-021 these INFORM but do
1829+
// NOT gate approval, so they never block regardless of threat_level. This
1830+
// keeps the gate consistent with the baseline-only, tier-driven verdict:
1831+
// a no-tier "dangerous" finding must not silently unquarantine-block a
1832+
// server when the baseline itself is clean.
1833+
//
1834+
// This is the SAME predicate that drives the "dangerous" summary status
1835+
// (GetScanSummary) and the ApproveServer gate, so the two can never disagree.
18251836
func isBlockingFinding(f ScanFinding) bool {
1826-
switch f.Tier {
1827-
case TierHard:
1828-
return true
1829-
case TierSoft:
1830-
return false
1831-
default:
1832-
return f.ThreatLevel == ThreatLevelDangerous
1833-
}
1837+
return f.Tier == TierHard
18341838
}
18351839

18361840
// degradeIfIncompleteCoverage downgrades a "clean" verdict to "degraded" when

0 commit comments

Comments
 (0)