fix: fail closed on embedded Python JIT budget gaps by mldangelo-oai · Pull Request #1502 · promptfoo/modelaudit

mldangelo-oai · 2026-05-31T05:14:05Z

Summary:
- emit analysis_incomplete findings when embedded Python/JIT byte caps or snippet budgets are exhausted
- mark scanner bridge outcomes inconclusive for those detector findings
- add over-budget byte/snippet regressions plus benign within-budget guards

Validation:
- targeted new tests: 4 passed
- focused JIT/PyTorch ZIP suites: 362 passed, 5 warnings
- ruff format/check, mypy, git diff --check
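For readers unfamiliar with the "fail closed" pattern named in the title, here is a minimal sketch of the idea, not the modelaudit implementation. The limits `MAX_BYTES` and `MAX_SNIPPETS`, the `ScanResult` type, and the `suspicious_call` check are all hypothetical; only the `analysis_incomplete` finding name comes from the summary above.

```python
from dataclasses import dataclass, field

# Hypothetical limits for illustration; the real caps are not shown in this PR thread.
MAX_BYTES = 64
MAX_SNIPPETS = 2


@dataclass
class ScanResult:
    findings: list[str] = field(default_factory=list)
    inconclusive: bool = False


def scan_snippets(snippets: list[bytes]) -> ScanResult:
    """Scan snippets in order and fail closed when a budget runs out."""
    result = ScanResult()
    consumed = 0
    for index, snippet in enumerate(snippets):
        over_snippet_budget = index >= MAX_SNIPPETS
        over_byte_budget = consumed + len(snippet) > MAX_BYTES
        if over_snippet_budget or over_byte_budget:
            # Fail closed: unscanned content is reported instead of silently skipped.
            result.findings.append("analysis_incomplete")
            result.inconclusive = True
            break
        consumed += len(snippet)
        # Placeholder detector; a real scanner would parse the snippet.
        if b"os.system" in snippet:
            result.findings.append("suspicious_call")
    return result
```

A benign input that stays within both budgets yields no findings, which corresponds to the "benign within-budget guards" mentioned in the summary.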

mldangelo-oai · 2026-05-31T05:14:12Z

@codex review

github-actions · 2026-05-31T05:15:46Z

Workflow run and artifacts

Performance Benchmarks

Compared 12 shared benchmarks with a regression threshold of 15%.
Status: 0 regressions, 0 improved, 12 stable, 0 new, 0 missing.
Aggregate shared-benchmark median: 767.77ms -> 777.09ms (+1.2%).

| Workload | Benchmark | Target | Size | Files | Baseline | Current | Change | Status |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| `direct-malicious-upload` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_direct_malicious_upload` | `malicious_reduce` | 52 B | 1 | 1.49ms | 1.44ms | -3.1% | stable |
| `suspicious-pickle-intake` | `tests/benchmarks/test_scan_benchmarks.py::test_scan_suspicious_pickle_intake` | `suspicious-intake` | 183.8 KiB | 4 | 101.74ms | 104.44ms | +2.6% | stable |
| `mixed-model-repository` | `tests/benchmarks/test_scan_benchmarks.py::test_scan_release_candidate_repository` | `release-candidate` | 547.3 KiB | 32 | 294.08ms | 299.45ms | +1.8% | stable |
| `chunked-upload-stream` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_chunked_upload_stream` | `chunked_stream` | 278.2 KiB | 1 | 22.88ms | 22.58ms | -1.3% | stable |
| `nested-payload-review` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_hex]` | `nested_hex` | 130 B | 1 | 414.8us | 419.8us | +1.2% | stable |
| `padded-multi-stream-upload` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_padded_multi_stream_upload` | `multi_stream_padded` | 4.1 KiB | 1 | 1.54ms | 1.52ms | -1.2% | stable |
| `nested-payload-review` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_raw]` | `nested_raw` | 78 B | 1 | 408.8us | 404.3us | -1.1% | stable |
| `single-checkpoint-preflight` | `tests/benchmarks/test_scan_benchmarks.py::test_scan_single_checkpoint_before_load` | `single_checkpoint.pkl` | 183.0 KiB | 1 | 40.71ms | 41.05ms | +0.8% | stable |
| `duplicate-heavy-registry` | `tests/benchmarks/test_scan_benchmarks.py::test_scan_duplicate_registry_snapshot` | `registry-snapshot` | 915.2 KiB | 13 | 218.47ms | 220.06ms | +0.7% | stable |
| `clean-training-checkpoint` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_clean_training_checkpoint` | `safe_large` | 278.2 KiB | 1 | 19.90ms | 19.76ms | -0.7% | stable |
| `warm-cache-rescan` | `tests/benchmarks/test_scan_benchmarks.py::test_scan_warm_cached_repository_rescan` | `release-candidate` | 547.3 KiB | 32 | 65.73ms | 65.56ms | -0.3% | stable |
| `nested-payload-review` | `tests/benchmarks/test_picklescan_benchmarks.py::test_picklescan_nested_payload_review[nested_base64]` | `nested_base64` | 98 B | 1 | 408.9us | 408.0us | -0.2% | stable |
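
As a reading aid for the report above, the sketch below shows one plausible way a relative change could be computed and compared against the 15% threshold. The function names are hypothetical, and the symmetric rule for "improved" is an assumption; the benchmark tooling's actual logic is not shown in this thread, and the per-row percentages in the report may be derived from unrounded timings.

```python
def percent_change(baseline_ms: float, current_ms: float) -> float:
    """Relative change of current versus baseline, in percent."""
    return (current_ms - baseline_ms) / baseline_ms * 100.0


def classify(baseline_ms: float, current_ms: float, threshold_pct: float = 15.0) -> str:
    """Label a benchmark; the symmetric 'improved' cutoff is an assumption."""
    change = percent_change(baseline_ms, current_ms)
    if change >= threshold_pct:
        return "regression"
    if change <= -threshold_pct:
        return "improved"
    return "stable"


# Aggregate medians quoted in the report: 767.77ms -> 777.09ms.
aggregate_change = round(percent_change(767.77, 777.09), 1)  # 1.2
```

Under these assumptions the aggregate change of +1.2% falls well inside the stable band.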

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 080407ef58


chatgpt-codex-connector · 2026-05-31T05:19:55Z

        )
        if index >= _MAX_DEFAULT_EMBEDDED_PYTHON_SNIPPETS:
            if not has_priority_marker:
+                omitted_budgeted_candidates += 1


Count only uncovered snippets as omitted

When a normal source-like member has more than 10 def/class/import starts, the first selected candidate often spans from the first marker through the rest of the file, so the later candidates are already covered by a parsed selected span. Incrementing the omitted counter here still emits an analysis_incomplete finding, which makes fully covered files inconclusive and can turn an otherwise clean scan into a failed/exit-1 result. Please only count candidates whose spans are not covered by an already selected span.
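
The suggestion above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: spans are modeled as half-open `(start, end)` byte offsets, `is_covered` and `count_uncovered_omissions` are hypothetical helpers, the default limit of 10 is inferred from the "more than 10" wording in the comment, and the `has_priority_marker` branch from the diff is left out for brevity.

```python
Span = tuple[int, int]  # half-open (start, end) offsets; an assumed representation

# Stand-in for _MAX_DEFAULT_EMBEDDED_PYTHON_SNIPPETS; the value 10 is inferred, not confirmed.
MAX_SNIPPETS = 10


def is_covered(span: Span, selected: list[Span]) -> bool:
    """True if some already selected span fully contains this span."""
    start, end = span
    return any(s <= start and end <= e for s, e in selected)


def count_uncovered_omissions(candidates: list[Span], max_snippets: int = MAX_SNIPPETS) -> int:
    """Count over-budget candidates that no selected span already covers."""
    selected: list[Span] = []
    omitted = 0
    for index, span in enumerate(candidates):
        if index < max_snippets:
            selected.append(span)
        elif not is_covered(span, selected):
            # Only genuinely unscanned content should trigger analysis_incomplete.
            omitted += 1
    return omitted
```

With this rule, a file whose first selected span runs to the end of the file reports zero omissions no matter how many later markers it contains, so a fully parsed file would not be marked inconclusive.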


fix: fail closed on embedded python budget exhaustion

080407e

chatgpt-codex-connector Bot reviewed May 31, 2026

mldangelo-oai wants to merge 1 commit into main from mdangelo/codex/fix-embedded-python-snippet-budgets-c159
