fix: avoid pickle meta-path source probing#1493
Conversation
|
@codex review |
Performance BenchmarksCompared Top regressions:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f0633495cf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
ianw-oai
left a comment
There was a problem hiding this comment.
Reviewed; this looks acceptable.
…ckle-import-source-probing-c112
…ckle-import-source-probing-c112
|
QA/fix pass pushed in Key hardening:
Validation:
The prior review threads were already resolved. A broader pre-final API run was not used as validation because the main-branch merge left the local compiled Rust extension stale relative to its source, producing unrelated Unicode/nested-scan failures; per the PR loop policy I did not rebuild or run the full suite. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 618bf709e5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…ckle-import-source-probing-c112
|
Reviewed and pushed 8c83361. Findings and fixes:
Validation: all seven focused bypass tests passed; the complete call-graph module passed with 310 passed and 1 skipped under xdist; the associated API module passed in the combined targeted run; Ruff format/check and mypy passed for the affected files. CI is running on the pushed head. |
There was a problem hiding this comment.
💡 Codex Review
For dotted imports, the resolver reuses a package's submodule_search_locations but _has_untrusted_path_hook() only inspected entries currently in sys.path. If sys.path_importer_cache already contains a custom finder for the package directory (which is normally not itself in sys.path), Python will use that cached finder for pkg.sub, while this code constructs a fresh FileFinder for the same location and can analyze a benign sub.py as clean instead of failing closed for the custom import path.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
|
|
||
| def _is_standard_path_hook(hook: object) -> bool: | ||
| return hook is zipimporter or any(hook is trusted_hook for trusted_hook in _TRUSTED_PATH_HOOKS) |
There was a problem hiding this comment.
Reject preexisting custom path hooks
If an application installs a custom sys.path_hooks entry before importing modelaudit_picklescan, this identity check permanently treats that hook as standard. When a later sys.path entry handled by that hook also contains a benign .py file, _has_untrusted_path_hook() returns false and the resolver reads the benign file with a fresh FileFinder, even though Python's import machinery would use the custom hook for the pickle-selected module; this can turn an unanalyzable reference into a clean call-graph result instead of failing closed.
Useful? React with 👍 / 👎.
| for entry in sys.path: | ||
| cache_key = entry or os.getcwd() | ||
| finder = sys.path_importer_cache.get(cache_key) | ||
| if finder is not None and not isinstance(finder, (FileFinder, zipimporter)): |
There was a problem hiding this comment.
Reject cached FileFinder subclasses
When sys.path_importer_cache already contains a custom finder that subclasses FileFinder, isinstance(..., FileFinder) treats it as trusted, but real imports will dispatch to that cached finder while this resolver ignores it and constructs a new plain FileFinder for the same path. In a path entry ahead of a benign same-named .py, that lets the scan analyze the benign file and return no source_unavailable notice even though the actual import would be controlled by the cached custom finder.
Useful? React with 👍 / 👎.
| if loaded_spec.origin.endswith(tuple(SOURCE_SUFFIXES)): | ||
| loaded_source_path = Path(loaded_spec.origin) | ||
| if loaded_source_path.is_file(): | ||
| return loaded_source_path |
There was a problem hiding this comment.
Fail closed for preloaded module specs
When the scan process already has a module in sys.modules, this path trusts that module's __spec__.origin as the source to analyze. A plugin or custom loader can preload the pickle-selected module and set origin to a benign .py file; _resolve_module_source() then parses that file and can report the reference clean, while pickle.loads resolves to the already-loaded in-memory module rather than the benign source. Loaded specs from non-standard loaders need the same fail-closed treatment as custom finders.
Useful? React with 👍 / 👎.
| elif _untrusted_meta_path_finder_precedes(PathFinder, module_name) or _has_untrusted_path_hook(): | ||
| return None |
There was a problem hiding this comment.
Invalidate shared caches when import hooks change
_resolve_module_source() is cached and now depends on sys.meta_path/sys.path_hooks, but the shared-cache stability check still only tracks sys.path and source fingerprints. If a caller uses the public shared_source_sensitive_caches() context across scans, a module resolved cleanly before a custom finder is inserted remains cached as analyzable after the hook change, so the later scan can miss the required source_unavailable fail-closed notice until caches are manually cleared.
Useful? React with 👍 / 👎.
Summary
sys.meta_pathand path-entry finder execution from pickle call-graph source probingsource_unavailablewhen attacker-selected module source cannot be safely inspectedValidation
origin/main(372a72ab)676 passed, 1 skippedruff format --check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/ruff check modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/mypy modelaudit/ packages/modelaudit-picklescan/src packages/modelaudit-picklescan/tests tests/(454source files)99%with no failures twice, then reproduced an idle xdist teardown stall locally; fresh GitHub CI is running on the pushed headgit diff --check