fix: use full test dataset for support overview training table by vishpillai123 · Pull Request #120 · datakind/edvise

vishpillai123 · 2026-03-03T23:10:54Z

No description provided.

- Updated run_predictions() to accept test_sample_cap=None, which runs on the full dataset
- Modified training_h2o.py to generate support_overview from the full test dataset
- Ensures support_overview matches the ROC table by using the complete test data
- Previously used a 200-row sample; now uses all test rows for an accurate distribution
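For context, the capping behavior described above could look roughly like the sketch below. This is not the code from this PR: `select_test_rows`, its `seed` argument, and the default of 200 are hypothetical stand-ins used only to illustrate how a `test_sample_cap=None` argument can switch from a sampled subset to the full test set.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def select_test_rows(
    rows: Sequence[T],
    test_sample_cap: Optional[int] = 200,  # assumed default, mirroring the old 200-row sample
    seed: int = 0,  # hypothetical parameter, added here for reproducibility
) -> List[T]:
    """Return every row when test_sample_cap is None, else at most that many sampled rows."""
    # None means "no cap": use the full test dataset.
    if test_sample_cap is None or len(rows) <= test_sample_cap:
        return list(rows)
    # Otherwise draw a reproducible random sample without replacement.
    rng = random.Random(seed)
    return rng.sample(list(rows), test_sample_cap)


all_rows = list(range(1000))
sampled = select_test_rows(all_rows)                      # capped sample
full = select_test_rows(all_rows, test_sample_cap=None)   # full test set
```

With a cap, a distribution table is computed over a subset and can drift from metrics (such as the ROC table) computed on every test row; passing `None` removes that mismatch.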

kaylawilding · 2026-05-01T03:10:53Z

@vishpillai123 Is this ready for review? If yes, can you add a comment for context and explanation? Then can you tag Jonathan?

vishpillai123 · 2026-05-01T14:15:42Z

> @vishpillai123 Is this ready for review? If yes, can you add a comment for context and explanation? Then can you tag Jonathan?

Not ready yet since we need to backfill old support score tables for schools. The code itself is fine, I just need to backfill.

jnolendata

The change makes sense. Using the full test set for the support overview ensures the numbers reflect the actual test distribution. I noticed run_predictions is called twice: the first call does the usual predictions and SHAP, and the second call is mainly for generating support_score_distribution. The second call repeats model loading, imputation, prediction, and SHAP, which is probably more work than needed since we only want the support table. A lighter approach for the second call could work, but overall the logic is clear.
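One possible shape for the lighter approach, as a sketch only: if predicted scores for the full test set are already available, the distribution table can be built by binning them directly, with no SHAP step. The function below is hypothetical; the real support_score_distribution schema, bin count, and score range in this repository are not shown in the thread, so the 0 to 1 score range, ten equal-width bins, and column names are all assumptions.

```python
from typing import Dict, List, Sequence


def bin_support_scores(
    scores: Sequence[float],
    n_bins: int = 10,  # assumed bin count
) -> List[Dict[str, float]]:
    """Bin scores in [0, 1] into equal-width bins and report counts and percentages."""
    counts = [0] * n_bins
    for score in scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range [0, 1]: {score}")
        # A score of exactly 1.0 falls into the last bin rather than overflowing.
        index = min(int(score * n_bins), n_bins - 1)
        counts[index] += 1

    total = len(scores)
    return [
        {
            "bin_lower": i / n_bins,
            "bin_upper": (i + 1) / n_bins,
            "count": count,
            "pct": 100.0 * count / total if total else 0.0,
        }
        for i, count in enumerate(counts)
    ]


# Illustrative scores only, not data from this PR.
table = bin_support_scores([0.05, 0.15, 0.15, 1.0])
```

This still requires one prediction pass over the full test set; the saving would come from skipping the repeated SHAP computation, and possibly the repeated model loading if the loaded model can be reused.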

vishpillai123 changed the base branch from main to develop March 3, 2026 23:11

vishpillai123 and others added 2 commits March 3, 2026 18:11

Merge branch 'develop' into feat/support-overview-full-test-dataset

dbc4366

Merge branch 'develop' into feat/support-overview-full-test-dataset

679d6af

jnolendata reviewed May 5, 2026

vishpillai123 wants to merge 3 commits into develop from feat/support-overview-full-test-dataset

3 participants
