fix: GroupBy(ValueDrift) widget title now reflects group-by context#1881
Open
SuryaSunil1326 wants to merge 2 commits into
Open
fix: GroupBy(ValueDrift) widget title now reflects group-by context#1881SuryaSunil1326 wants to merge 2 commits into
SuryaSunil1326 wants to merge 2 commits into
Conversation
The hardcoded fill value (0.0001 or min/1e6) for zero-probability bins in get_binned_data caused two problems: 1. When min non-zero percent <= 0.0001, filling with min/1e6 produced an astronomically small value that inflated KL divergence scores. 2. Reference and current distributions received different fill values, introducing asymmetry in divergence metrics. Fix: compute a single fill value as min_nonzero_both / 10, where min_nonzero_both is the smallest non-zero probability across both distributions. This guarantees the fill is always smaller than any genuine probability and is applied symmetrically. Fixes evidentlyai#334
When ValueDrift is wrapped in GroupBy, the counter widget label was
hardcoded as "Drift in column '<col>'" regardless of the GroupBy
label. Thread display_name() through _render() so the patched name
("... group by '<col>' for label: '<label>'") appears in the widget.
Fixes evidentlyai#1706
AI assistance: Written with Claude Code. All code reviewed, tested, and
verified by the author.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
When
ValueDriftis wrapped inGroupBy, the counter widget in the HTML report always shows the hardcoded label"Drift in column '<col>'"— it never includes the group-by context (group by '<col>' for label: '<label>'). Reported in #1706.Root cause
ValueDriftCalculation._render()hardcoded theCounterDatalabel string independently ofdisplay_name(). TheGroupByMetricCalculationpatchesdisplay_nameon the inner calculation before callingcalculate(), but_render()ignored it.Fix
Add an optional
titleparameter to_render()and passself.display_name()when calling it fromcalculate(). Sincedisplay_nameis already patched to the full group-by string in theGroupBycontext, the widget label picks it up automatically.Standalone
ValueDrift: title ="Value drift for col1"GroupBy(ValueDrift): title ="Value drift for col1 group by 'group' for label: 'a'"Testing
Added
tests/future/metrics/test_group_by_metric.pywith two tests:test_group_by_value_drift_widget_title_includes_group_context— asserts the widget label contains"group by"and"for label:"when wrapped inGroupBytest_standalone_value_drift_widget_title— asserts standaloneValueDriftlabel contains the column name and no"group by"textAll 1414 existing future metrics tests pass.
AI assistance: Written with Claude Code. All code reviewed, tested, and verified by the author.