Add early-exit option for RDKit benchmarks that operates on a specific time budget by scal444 · Pull Request #180 · NVIDIA-Digital-Bio/nvMolKit

scal444 · 2026-05-26T19:46:07Z

Calculates throughput after running for up to X amount of time.

Refactor of substruct rdkit code to reduce churn, but it's still unfortunately verbose due to indentation issues.

- Add --rdkit_max_seconds to etkdg/ff_optimize/substruct benches. When set, the RDKit comparison stops once wall-clock elapsed exceeds the cap; the reported timing is over molecules/pairs actually processed. - Replace 'speedup = baseline_ms / avg_ms' with throughput-based ratios (mols * confs / elapsed) so methods that processed different amounts of work are comparable. - CSV gains mols_processed/pairs_processed, rdkit_max_seconds, confs_per_second/pairs_per_second, vs_rdkit_throughput_ratio columns. - substruct: --rdkit_match_mode and --rdkit_threads become nargs='+'; main runs the cartesian product and picks the best-throughput variant as the validation/throughput baseline.

greptile-apps · 2026-05-26T19:54:21Z

Greptile Summary

This PR adds a --rdkit_max_seconds early-exit option to all three RDKit benchmarks (ETKDG, FF-optimize, substruct), backed by a new Deadline class and time_it_bounded helper in bench_utils/timing.py. It also refactors substruct_bench to support sweeping multiple --rdkit_match_mode and --rdkit_threads values in a single run, and adds unit tests for the new timing primitives.

time_it_bounded drives early exit in substruct_bench; etkdg_bench and ff_optimize_bench instead embed a per-run Deadline inside their run() closure and still call the existing time_it, so total benchmark time with max_seconds set is runs × max_seconds rather than approximately max_seconds.
CSV output now includes mols_processed, rdkit_max_seconds, throughput columns, and a vs_rdkit_throughput_ratio; rdkit_max_seconds is correctly gated to rdkit-only rows in all three scripts.

Confidence Score: 4/5

Safe to merge with one bug fix: time_it_bounded can return an inconsistent (avg_ms, last_progress) pair when complete runs are followed by a partial run.

time_it_bounded returns last_progress from a partial run paired with avg_ms computed from the prior complete runs. Any throughput calculation that divides last_progress by avg_ms will underestimate actual throughput, and the RDKit hit max_seconds budget diagnostic will fire spuriously even though complete runs occurred. The fix is a one-line change on the return statement.

benchmarks/bench_utils/timing.py — the time_it_bounded mixed complete+partial return path

Important Files Changed

Filename	Overview
benchmarks/bench_utils/timing.py	Adds Deadline, throughput_per_s, time_it_bounded, and add_rdkit_max_seconds_arg. time_it_bounded has a bug in the mixed complete+partial run path: it returns last_progress from the partial run paired with avg_ms from complete runs, causing underestimated throughput at the call site.
benchmarks/etkdg_bench.py	Adds per-run Deadline inside bench_rdkit and exposes --rdkit_max_seconds; uses time_it (not time_it_bounded), so all runs execute even when max_seconds is set, making total time runs×max_seconds rather than ~max_seconds.
benchmarks/ff_optimize_bench.py	Mirrors etkdg_bench changes: adds per-run Deadline, rdkit_max_seconds gating, and throughput columns. Same time_it-vs-time_it_bounded inconsistency as etkdg_bench.
benchmarks/substruct_bench.py	Significant refactor: supports multi-value --rdkit_match_mode/--rdkit_threads sweeps, extracts _validate_matches, uses time_it_bounded for early exit, and correctly gates rdkit_max_seconds to rdkit-only CSV rows.
benchmarks/tests/test_timing.py	New unit tests for Deadline, throughput_per_s, and time_it_bounded. Test comment for the budget-exhaustion test now correctly describes the run-2-completes / deadline-before-run-3 behaviour.
benchmarks/tests/conftest.py	New conftest that inserts the benchmarks/ directory into sys.path so bench_utils is importable from the tests/ subdirectory.
benchmarks/bench_utils/init.py	Re-exports the five new timing symbols (Deadline, add_rdkit_max_seconds_arg, throughput_per_s, time_it_bounded) and adds them to all.

_{Reviews (2): Last reviewed commit: "Address greptile comments" | Re-trigger Greptile}

greptile-apps · 2026-05-26T19:54:27Z

+                    pairs_done,
+                    args.rdkit_max_seconds,
+                    throughput,


rdkit_max_seconds is written as args.rdkit_max_seconds for every CSV row, including nvmolkit. Every other rdkit-specific field (rdkit_threads, rdkit_match_mode) is correctly gated to "N/A" for non-rdkit rows, so a consumer parsing the CSV will misread the nvmolkit row as if it was also bounded by the time budget.

Suggested change

pairs_done,

args.rdkit_max_seconds,

throughput,

pairs_done,

args.rdkit_max_seconds if is_rdkit else "N/A",

throughput,

@greptileai, this is fixed, please recheck

Confirmed — the fix is in place. rdkit_max_seconds is now assigned inside the if is_rdkit: ... else: rdkit_max_seconds = "N/A" block, consistent with how rdkit_match_mode and rdkit_threads are gated. Non-rdkit rows (e.g., nvmolkit) will correctly show N/A in the CSV column. This is good.

greptile-apps · 2026-05-26T19:54:28Z

+    # 5 runs * 50ms = 250ms total, but budget is only 60ms.
+    # First run completes; before run 2 the deadline check kicks in.


The comment says the deadline fires "before run 2", but with a 50 ms sleep and a 60 ms budget, run 2 will normally start (only ~50 ms elapsed at the pre-run-2 check), and the deadline fires between run 2 and run 3. The assertion 1 <= call_count[0] < 5 is intentionally loose and passes either way, but the incorrect comment may lead future readers to misunderstand when the outer deadline check fires relative to per-run inner deadlines.

Suggested change

# 5 runs * 50ms = 250ms total, but budget is only 60ms.

# First run completes; before run 2 the deadline check kicks in.

# 5 runs * 50ms = 250ms total, but budget is only 60ms.

# First run completes in ~50ms; the deadline check between runs may fire

# after run 1 or run 2 depending on system load (50ms vs 60ms is close),

# so call_count ends up between 1 and 4.

scal444 added 3 commits May 26, 2026 15:20

Consolidate testing

b4f07c8

Refactor substruct to reduce churn

5553ea1

greptile-apps Bot reviewed May 26, 2026

View reviewed changes

Address greptile comments

5d243aa

scal444 requested a review from evasnow1992 May 27, 2026 12:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add early-exit option for RDKit benchmarks that operates on a specific time budget#180

Add early-exit option for RDKit benchmarks that operates on a specific time budget#180
scal444 wants to merge 4 commits into
NVIDIA-Digital-Bio:mainfrom
scal444:split/throughput

scal444 commented May 26, 2026

Uh oh!

greptile-apps Bot commented May 26, 2026 •

edited

Loading

Greptile Summary

Uh oh!

greptile-apps Bot May 26, 2026

Uh oh!

scal444 May 27, 2026

Uh oh!

greptile-apps Bot May 27, 2026

Uh oh!

greptile-apps Bot May 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		# 5 runs * 50ms = 250ms total, but budget is only 60ms.
		# First run completes; before run 2 the deadline check kicks in.

Conversation

scal444 commented May 26, 2026

Uh oh!

greptile-apps Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Uh oh!

greptile-apps Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

scal444 May 27, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 27, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 26, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

greptile-apps Bot commented May 26, 2026 •

edited

Loading