fix: detect newline-separated picklescan calls#1481
Conversation
|
@codex review |
Performance BenchmarksCompared
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0247e2e2dc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 441ca7729a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| return false; | ||
| } | ||
| } | ||
| _ => return false, |
There was a problem hiding this comment.
Treat subscripted follow-up calls as code
When the newline-separated statement starts with a subscripted callable, e.g. eval(payload)\ncallbacks[0](), this branch returns false after the first identifier because only ( and . are recognized. That sends control back to call_like_has_prose_suffix, which treats the alphabetic suffix as prose and skips the dangerous eval match, so a valid two-line code string can still scan clean.
Useful? React with 👍 / 👎.
| while chars.peek().is_some_and(|ch| is_python_word_char(*ch)) { | ||
| chars.next(); | ||
| } |
There was a problem hiding this comment.
Bound suffix identifier lookahead
For a literal such as eval(x)\n followed by a long alphanumeric run that is not a call, the new suffix parser consumes the entire remaining identifier before returning false. Because this runs inside contains_call_like for each candidate/pattern and string scanning can pass multi-megabyte windows, a crafted pickle can force repeated full-window scans just to classify the suffix as prose; capping this lookahead would preserve the existing bounded-resource behavior.
Useful? React with 👍 / 👎.
|
Potential regression: the new newline heuristic appears to widen false positives for prose-like parenthetical text, and I did not see coverage for that case. |
Summary
Validation