fix: classify PyTorch binary code patterns as findings #1497

Draft
mldangelo-oai wants to merge 1 commit into main from mdangelo/codex/fix-pytorch-binary-security-decision-c132

Conversation

@mldangelo-oai
Contributor

Summary

  • emit active PyTorch binary code-pattern detections as warning findings instead of info-only checks
  • make aggregate security/exit logic return exit 1 for raw binary eval-like payloads
  • preserve benign float tensor-like .bin handling
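The behavior change in the summary can be sketched as follows. This is a minimal illustration, not the scanner's actual code: `Severity`, `Finding`, `CODE_PATTERNS`, `scan_bytes`, and `exit_code` are hypothetical names, and the pattern list is an assumption made for the example.

```python
import struct
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    INFO = 0
    WARNING = 1


@dataclass(frozen=True)
class Finding:
    message: str
    severity: Severity


# Hypothetical eval-like byte patterns; the real scanner's list is not shown in this PR.
CODE_PATTERNS = (b"eval(", b"exec(", b"__import__(")


def scan_bytes(data: bytes) -> list[Finding]:
    """Emit a WARNING finding (rather than an info-only check) per matched pattern."""
    return [
        Finding(f"code pattern {pattern!r} found in binary", Severity.WARNING)
        for pattern in CODE_PATTERNS
        if pattern in data
    ]


def exit_code(findings: list[Finding]) -> int:
    """Aggregate decision: any finding at WARNING or above yields exit code 1."""
    return 1 if any(f.severity >= Severity.WARNING for f in findings) else 0


# A raw binary carrying an eval-like payload now fails the scan.
malicious = b"\x00\x01\x02eval(compile(src, 'x', 'exec'))\x00"
# A benign float tensor-like blob produces no findings.
benign = struct.pack("<4f", 0.5, 1.0, -2.0, 3.25)

malicious_exit = exit_code(scan_bytes(malicious))
benign_exit = exit_code(scan_bytes(benign))
```

Under these assumptions, the eval-like payload maps to exit code 1 and the float-only blob to exit code 0, mirroring the before/after probe results listed under Validation.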

Validation

  • original probe before fix: severity info, exit 0
  • original probe after fix: severity warning, exit 1
  • PYTHONPATH=/private/tmp/modelaudit-c132 PROMPTFOO_DISABLE_TELEMETRY=1 /Users/mdangelo/code/modelaudit/.venv/bin/pytest tests/scanners/test_pytorch_binary_scanner.py -q: 21 passed, 2 skipped
  • ruff format/check for touched files
  • mypy modelaudit/scanners/pytorch_binary_scanner.py tests/scanners/test_pytorch_binary_scanner.py
  • git diff --check

@mldangelo-oai
Contributor Author

@codex review

@github-actions
Contributor

Workflow run and artifacts

Performance Benchmarks

Compared 12 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 12 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 579.78ms -> 580.07ms (+0.1%).

| Workload | Benchmark | Target | Size | Files | Baseline | Current | Change | Status |
|---|---|---|---|---|---|---|---|---|
| clean-training-checkpoint | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_clean_training_checkpoint | safe_large | 278.2 KiB | 1 | 13.29ms | 13.85ms | +4.2% | stable |
| chunked-upload-stream | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_upload_stream | chunked_stream | 278.2 KiB | 1 | 15.69ms | 16.30ms | +3.9% | stable |
| suspicious-pickle-intake | tests/benchmarks/test_scan_benchmarks.py::test_scan_suspicious_pickle_intake | suspicious-intake | 183.8 KiB | 4 | 79.41ms | 76.69ms | -3.4% | stable |
| nested-payload-review | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_base64] | nested_base64 | 98 B | 1 | 319.1us | 311.7us | -2.3% | stable |
| nested-payload-review | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_raw] | nested_raw | 78 B | 1 | 314.0us | 307.5us | -2.1% | stable |
| direct-malicious-upload | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_direct_malicious_upload | malicious_reduce | 52 B | 1 | 1.13ms | 1.11ms | -1.5% | stable |
| single-checkpoint-preflight | tests/benchmarks/test_scan_benchmarks.py::test_scan_single_checkpoint_before_load | single_checkpoint.pkl | 183.0 KiB | 1 | 30.44ms | 30.89ms | +1.5% | stable |
| nested-payload-review | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_hex] | nested_hex | 130 B | 1 | 320.3us | 315.8us | -1.4% | stable |
| warm-cache-rescan | tests/benchmarks/test_scan_benchmarks.py::test_scan_warm_cached_repository_rescan | release-candidate | 547.3 KiB | 32 | 50.26ms | 49.68ms | -1.2% | stable |
| mixed-model-repository | tests/benchmarks/test_scan_benchmarks.py::test_scan_release_candidate_repository | release-candidate | 547.3 KiB | 32 | 222.96ms | 225.16ms | +1.0% | stable |
| padded-multi-stream-upload | tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_padded_multi_stream_upload | multi_stream_padded | 4.1 KiB | 1 | 1.18ms | 1.17ms | -0.9% | stable |
| duplicate-heavy-registry | tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_registry_snapshot | registry-snapshot | 915.2 KiB | 13 | 164.47ms | 164.29ms | -0.1% | stable |
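
For readers unfamiliar with the Status column, the sketch below shows one plausible way a 15% regression threshold could map a baseline/current pair to a status. The function name `classify_change` and the symmetric treatment of improvements are assumptions; the benchmark workflow's actual rule is not shown in this PR.

```python
def classify_change(baseline_ms: float, current_ms: float, threshold: float = 0.15) -> str:
    """Label a benchmark by its relative change against a threshold.

    Assumption: a slowdown beyond +threshold is a regression, a speedup
    beyond -threshold is an improvement, and anything in between is stable.
    """
    change = (current_ms - baseline_ms) / baseline_ms
    if change > threshold:
        return "regression"
    if change < -threshold:
        return "improved"
    return "stable"


# The largest slowdown reported above (13.29ms -> 13.85ms, about +4.2%) stays within the threshold.
checkpoint_status = classify_change(13.29, 13.85)
```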

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Breezy!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
