fix: redact code evidence in scanner findings#1495
Conversation
|
@codex review |
Performance BenchmarksCompared Top regressions:
Top improvements:
|
|
Codex Review: Didn't find any major issues. Can't wait for the next one! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1d441b08a2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
This looks materially heavier than the leak it is fixing; the benchmark comment shows a sizable warm-cache-rescan regression, so I am holding approval for now. |
|
Holding this for now: the shared redaction helper still has open review gaps, and the benchmark bot reports a +47.2% warm-cache regression on . |
|
Holding this for now: the shared redaction helper still has open review gaps, and the benchmark bot reports a +47.2% warm-cache regression on test_scan_warm_cached_repository_rescan. |
…de-evidence-redaction-c166 # Conflicts: # modelaudit/scanners/jax_checkpoint_scanner.py
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8d45fb8ce5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 174dc0f46e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ae11704d6b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3851c7c1ef
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…de-evidence-redaction-c166 # Conflicts: # modelaudit/scanners/_evidence_redaction.py # tests/scanners/test_evidence_redaction.py
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cd587b4e46
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a21815c65e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b573438660
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
modelaudit/modelaudit/scanners/_evidence_redaction.py
Lines 262 to 263 in 24182a7
When a sensitive query value itself contains a raw semicolon followed by =, for example https://example.com/hook?token=FIRSTSECRET;STILL=SECONDSECRET&ok=1, this normalization splits STILL=SECONDSECRET into a separate non-sensitive parameter before the token value is redacted. The serialized URL evidence then still contains SECONDSECRET; please avoid treating semicolons inside already-sensitive values as independent safe parameters, or fail closed on the remainder.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ed82d6be32
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 47adbc4d8b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…de-evidence-redaction-c166 # Conflicts: # modelaudit/scanners/_evidence_redaction.py # tests/scanners/test_evidence_redaction.py
|
Reviewed, rebased onto current QA fixes added during review:
Validation:
CI is running on the updated head; moving on to the next review and will circle back. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9dd0028e08
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…de-evidence-redaction-c166
Preserve safe context across preview boundaries, close indented and structured credential leaks, and keep adversarial redaction scans bounded without hiding benign code.
|
QA update after syncing with current main and pushing 2156afc:\n\n- Fixed the long R rightward raw-assignment preview regression that failed Linux, Windows, and Python 3.13 CI.\n- Addressed all four new review threads: indented Python routing, sensitive identifier subscripts (scoped to key-bearing containers), embedded name/value dicts, and generic name/value calls.\n- Added adversarial coverage for quoted/backticked targets, exact lookahead boundaries, reversed arguments, duplicate/case-variant and byte-string fields, declaration/dynamic-index false positives, and bounded malformed inputs.\n- Removed two quadratic scans; stress QA is ~1.05s for 16,000 ordinary assignments, ~1ms for a long sensitive call, and ~0.22s for a long benign call.\n\nValidation:\n- 224 passed: evidence redaction + R serialized scanner modules\n- 45 passed, 1778 deselected: redaction-specific JIT/network/JAX/PyTorch consumer tests\n- Ruff lint and format clean\n- mypy clean for the changed helper and associated tests\n- git diff --check clean\n\nAll review threads are resolved. Fresh CI is now running; moving on without waiting per the review loop. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2156afca4a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…de-evidence-redaction-c166
|
QA update for 9fd5e4f:
Focused validation:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9fd5e4f1cc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
Reviewed and pushed Addressed all four open review findings:
Also corrected the failing Keras CI expectation to the actual bounded 12-character value ( Validation:
CI is running on the pushed head; moving on without waiting. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7fcf2eec10
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…-pr-1495-c166 # Conflicts: # tests/detectors/test_jit_script_detector.py
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 311d47cf60
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 68aea213c1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| default_tokens = next( | ||
| (argument_value_tokens for keyword, argument_value_tokens in arguments if keyword == "default"), | ||
| None, |
There was a problem hiding this comment.
Redact const values for sensitive CLI options
When a sensitive argparse option uses const= instead of default=, for example parser.add_argument("--api-key", const="SECRETKEY1234567890", nargs="?"); eval("1"), this branch never selects that value because it only looks for the default keyword. The option name still identifies the literal as credential evidence, so the raw const can be serialized in JIT/Orbax code snippets; include argparse const in the value keywords for these sensitive option calls.
Useful? React with 👍 / 👎.
| for left_token in _significant_tokens(tokens[target_start_index:index]): | ||
| if left_token.type == tokenize.STRING: |
There was a problem hiding this comment.
Redact numeric literals before right-side sensitive targets
When the sensitive value is written on the right side of a comparison or membership check, this branch only replaces left-hand STRING tokens. A snippet such as 123456789012345 == api_key; eval("1") or [123456789012345] in api_keys therefore keeps the numeric credential in serialized code_snippet/restore_fn evidence, even though the sensitive-left case already redacts numeric operands; include NUMBER tokens in this right-target path as well.
Useful? React with 👍 / 👎.
| if token.type == tokenize.OP and ( | ||
| token.string == ";" or (token.string == "," and (operator_depth > 0 or stop_at_comma)) | ||
| ): | ||
| return index + 1 |
There was a problem hiding this comment.
Stop lambda default targets at the parameter name
When a lambda restore hook or code snippet has a detail-only sensitive default like lambda credentials="SECRETKEY1234567890": eval("1"), _is_lambda_default_operator() is true but this backward scan never stops at the lambda keyword, so the target becomes lambda credentials instead of credentials. That misses _is_sensitive_detail_key() and leaves the raw default serialized in restore_fn/code_snippet; stop the target scan at lambda for lambda defaults before classifying the parameter name.
Useful? React with 👍 / 👎.
Summary
Validation