fix(builtins): match sensitive-data keywords on filename, not full path #110

Open
punk-dev-robot wants to merge 1 commit into eqtylab:main from punk-dev-robot:fix/sensitive-data-basename-match

Problem

The sensitive_data_protection global builtin matched its keyword set
("private", "auth", "secret", "token", …) with contains() against the
entire canonical path:

is_sensitive_file(path) if {
	lower_path := lower(path)
	sensitive_keywords := { ..., "private", "auth" }
	some keyword in sensitive_keywords
	contains(lower_path, keyword)   # <-- whole path
}

On macOS the kernel canonicalizes /tmp to /private/tmp and /var to
/private/var. Rust preprocessing passes the canonical path as
resolved_file_path, so every file read under /tmp or /var contains the
substring private and is denied:

$ echo hi > /tmp/notes.txt   # canonical: /private/tmp/notes.txt
Read /tmp/notes.txt  ->  Deny "Blocked access to potentially sensitive file: [SENSITIVE-FILE: *.txt]"

Any path with a private/auth component hits the same footgun
(/Users/<author>/…, oauth/…, etc.). The [SENSITIVE-FILE: *.txt] label is
just the masking helper printing the extension; the real trigger is the keyword
clause.
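
To make the failure concrete, here is a minimal Python model of the old clause. It is a sketch, not the shipped Rego: the keyword set holds only the four keywords quoted above (the rest are elided in this PR), and matches_full_path is a hypothetical name.

```python
# Partial keyword set: only the keywords quoted in this PR; the real set is longer.
SENSITIVE_KEYWORDS = {"private", "auth", "secret", "token"}


def matches_full_path(path: str) -> bool:
    """Model of the old clause: substring match against the whole path."""
    lower_path = path.lower()
    return any(keyword in lower_path for keyword in SENSITIVE_KEYWORDS)


# /tmp/notes.txt as the policy sees it on macOS, after canonicalization.
print(matches_full_path("/private/tmp/notes.txt"))  # True: "private" is a directory name
print(matches_full_path("/home/x/notes.txt"))       # False: no keyword anywhere in the path
```

The file name notes.txt is harmless in both calls; only the directory prefix differs, which is why the deny depends on where the file lives rather than on what it is.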

Fix

Scope the keyword match to the file basename — the documented intent of this
clause ("Files with sensitive keywords"). Directory-scoped secrets remain covered
by the dedicated ssh / cloud / package / vcs clauses, which match on full path
prefixes and are unchanged.

Applied to the claude, factory, and opencode fixtures. cursor ships a
different ruleset (is_sensitive_path) without this clause and is unaffected.
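
The basename scoping can be sketched the same way in Python. This models only the keyword clause, under the assumption that the basename is the last "/"-separated path component; the ssh / cloud / package / vcs clauses are not modeled, matches_basename is a hypothetical name, and the keyword set is again limited to the four keywords quoted in this PR.

```python
import posixpath

# Partial keyword set: only the keywords quoted in this PR; the real set is longer.
SENSITIVE_KEYWORDS = {"private", "auth", "secret", "token"}


def matches_basename(path: str) -> bool:
    """Model of the fixed clause: substring match against the file name only."""
    basename = posixpath.basename(path).lower()
    return any(keyword in basename for keyword in SENSITIVE_KEYWORDS)


for path in ("/private/tmp/x/greet.txt", "/Users/x/private/report.txt", "secret.txt"):
    print(path, matches_basename(path))
# /private/tmp/x/greet.txt False
# /Users/x/private/report.txt False
# secret.txt True
```

Paths such as .env, ~/.ssh/id_rsa, and my-passwords.kdbx depend on keywords or clauses not shown in this PR, so they are left out of the model.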

Verification (opa 1.7.1)

path                          before   after
/private/tmp/x/greet.txt      denied   allowed
/Users/x/private/report.txt   denied   allowed
secret.txt                    denied   denied
.env                          denied   denied
~/.ssh/id_rsa                 denied   denied
my-passwords.kdbx             denied   denied

Notes / follow-ups

  • "auth" remains a broad keyword even at basename scope (matches oauth*); left
    as-is to keep this change minimal — worth a separate look.
  • The existing global-builtins integration tests use stub policies rather than the
    shipped fixtures, so this logic isn't covered by cargo test. Adding fixture-level
    coverage is a reasonable follow-up.
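
The first note can be illustrated with a small Python model of a basename-scoped substring match. The helper name and the example paths are hypothetical, chosen only to show where "auth" still matches and where it no longer does.

```python
import posixpath


def keyword_in_basename(path: str, keyword: str) -> bool:
    """Substring match of one keyword against the file name only."""
    return keyword in posixpath.basename(path).lower()


# An oauth* file name still contains "auth", so it is still flagged.
print(keyword_in_basename("/docs/oauth-setup.md", "auth"))  # True

# An oauth directory no longer matters once only the basename is checked.
print(keyword_in_basename("/srv/oauth/readme.md", "auth"))  # False
```

So the change removes the directory-level false positives but leaves the file-name-level ones for the separate look mentioned above.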

is_sensitive_file matched its keyword set ("private", "auth", "secret",
"token", ...) with contains() against the entire canonical path. On macOS
the kernel canonicalizes /tmp and /var under /private, so the "private"
keyword flagged EVERY file beneath /tmp and /var as sensitive and denied
reads of plainly benign files. Any path with a "private"/"auth" component
(e.g. /Users/<author>, oauth dirs) hit the same footgun.

Scope the keyword match to the file basename, which is the documented
intent ("Files with sensitive keywords"). Directory-scoped secrets remain
covered by the dedicated ssh/cloud/package/vcs clauses that match on full
path prefixes. Cursor ships a different ruleset (is_sensitive_path) without
this clause and is unaffected.

Verified with opa 1.7.1:
  /private/tmp/x/greet.txt     -> allowed (was denied)
  /Users/x/private/report.txt  -> allowed (was denied)
  secret.txt, .env, id_rsa,
  my-passwords.kdbx            -> still denied