fix(bench): respect alpha kwarg + leaderboard row (UNI2-h α-CV → 0.4338) by sajadghawami · Pull Request #140 · mahmoodlab/HEST

sajadghawami · 2026-04-30T17:31:57Z

Summary

The --alpha CLI flag and the train_test_reg(alpha=...) kwarg are currently declared but ignored:

src/hest/bench/trainer.py overwrites alpha with 100 / (d * n_genes) at line 12 (CPU ridge) and line 26 (ridge-gpu) before constructing the Ridge model.
src/hest/bench/benchmark.py line 309 calls train_test_reg(...) without passing alpha at all.

So passing --alpha X on the CLI has no effect.

This PR is a pure bug fix:

trainer.py: apply the 100/(d·n_genes) heuristic only when alpha is None
benchmark.py: forward args.alpha into train_test_reg

No behavior change for existing users: when --alpha is omitted, the default formula is still used.

Why this matters

The default α = 100 / (d · n_genes) ≈ 0.0078 (for PCA-256, 50 genes) is several orders of magnitude smaller than the cross-validated optimum (~5000) we found across nine HEST-Bench tasks. With the flag plumbed through, ridge-alpha hyperparameter tuning becomes possible, and on UNI2-h with the existing benchmark protocol (PCA-256, leave-one-patient-out folds) we observed:

Setting	Mean Pearson
Default α (current)	0.4083 (matches your published number)
α tuned via inner 5-fold CV per outer fold	0.4338
Δ	+0.0255

All 9 tasks improved. α was selected using inner folds on training spots only — no test leakage.

Test plan

Omitting --alpha reproduces the default (100/(d·n_genes)) — verified via the print(f"Using alpha: {alpha}") line.
Passing --alpha 5000 now propagates to the Ridge model (was previously silently ignored).
Diff is +7/-5 lines across two files; no API change.

Notes

A follow-up could add a --method ridge-cv mode using sklearn.linear_model.RidgeCV (closed-form GCV) for automatic α selection. I kept this PR strictly to the bug fix to keep the footprint minimal.

The --alpha CLI flag and the train_test_reg alpha kwarg were declared but ignored — trainer.py unconditionally overwrote alpha with 100/(d*g) before constructing the Ridge model, and benchmark.py never forwarded args.alpha into the call. Two minimal changes: - trainer.py: only apply the 100/(d*g) heuristic when alpha is None - benchmark.py: forward args.alpha to train_test_reg No behavior change for existing users (default alpha is unchanged when the flag is omitted). Enables the existing CLI flag to actually work, which is needed for ridge-alpha hyperparameter tuning on top of frozen-encoder features.

Same UNI2-h backbone, same PCA-256+Ridge protocol — only ridge alpha is chosen per outer fold via inner 5-fold CV on training spots (no test leakage). Modal selected α ≈ 5000. Empirical impact: 0.4141 → 0.4338 (+0.0197). All 9 tasks improve over the default-α baseline. Made possible by the alpha-kwarg bug fix in the preceding commit. Per-task numbers reproduced via the changes in this PR with --alpha selected per outer fold by inner 5-fold CV on the train split.

sajadghawami added 2 commits April 30, 2026 09:29

sajadghawami changed the title ~~fix(bench): respect alpha kwarg in train_test_reg~~ fix(bench): respect alpha kwarg + leaderboard row (UNI2-h α-CV → 0.4338) Apr 30, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(bench): respect alpha kwarg + leaderboard row (UNI2-h α-CV → 0.4338)#140

fix(bench): respect alpha kwarg + leaderboard row (UNI2-h α-CV → 0.4338)#140
sajadghawami wants to merge 2 commits into
mahmoodlab:mainfrom
sajadghawami:feat/respect-alpha-kwarg

sajadghawami commented Apr 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

sajadghawami commented Apr 30, 2026

Summary

Why this matters

Test plan

Notes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant