OddSHAP approximator by Sara-ne · Pull Request #522 · mmschlk/shapiq

Sara-ne · 2026-05-20T18:01:48Z

Motivation and Context

This PR adds a new OddSHAP approximator for estimating first-order Shapley values.
OddSHAP is based on the method by Fumagalli et al. (2026). The main idea is to estimate Shapley values through odd Fourier terms.

Implemented

OddSHAP-specific sampling weights
Regression weights for the weighted least-squares problem
Paired coalition sampling through the existing sampler
Used InterventionalTreeExplainer to select relevant higher-order odd interactions
Built the active odd Fourier support from:
- the empty interaction,
- all single-player interactions,
- selected higher-order odd interactions
Solved the constrained odd Fourier regression problem
Converted the fitted odd Fourier coefficients into first-order Shapley values
Added a ValueError when the budget is too small
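
The last conversion step can be illustrated with a minimal sketch. It assumes the Fourier basis convention v(T) = sum_S c_S * (-1)^{|S ∩ T|}, which may differ from the PR's code in sign or normalization. Under that assumption each basis function with odd |S| contributes -2 * c_S / |S| to every player in S, and even-cardinality terms contribute nothing. The function names and the toy coefficients below are hypothetical and are not taken from the PR.

```python
from itertools import combinations
from math import factorial


def value(coeffs, coalition):
    """Evaluate v(T) = sum_S c_S * (-1)^{|S & T|} for a coalition T."""
    members = set(coalition)
    return sum(c * (-1) ** len(set(s) & members) for s, c in coeffs.items())


def shapley_from_odd_fourier(coeffs, n):
    """Closed form under the assumed convention: only odd |S| contributes."""
    phi = [0.0] * n
    for s, c in coeffs.items():
        if len(s) % 2 == 1:
            for i in s:
                phi[i] += -2.0 * c / len(s)
    return phi


def shapley_brute_force(coeffs, n):
    """Reference Shapley values from the marginal-contribution definition."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            w = factorial(size) * factorial(n - size - 1) / factorial(n)
            for t in combinations(others, size):
                phi[i] += w * (value(coeffs, t + (i,)) - value(coeffs, t))
    return phi


# Toy coefficients (hypothetical): empty set, singletons, one even and one odd term.
coeffs = {(): 0.5, (0,): -1.0, (1,): 0.25, (0, 1): 2.0, (0, 1, 2): -0.75}
n = 3
fast = shapley_from_odd_fourier(coeffs, n)
slow = shapley_brute_force(coeffs, n)
```

In this toy game the even term (0, 1) changes v but not the Shapley values, and the sum of `fast` equals v(N) - v(empty), which is the efficiency property.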

Notes

This implementation doesn't add a separate ProxySPEX-style adapter. Instead, it reuses the existing tree interaction code through InterventionalTreeExplainer.
The current implementation only supports odd_only=True, because the final Shapley value computation is based on odd-cardinality Fourier terms.

Public API Changes

No Public API changes
Yes, Public API changes (Details below)

How Has This Been Tested?

Checklist

The changes have been tested locally.
Documentation has been updated (if the public API or usage changes).
An entry has been added to CHANGELOG.md (if relevant for users).
The code follows the project's style guidelines.
I have considered the impact of these changes on the public API.

Two free-function helpers used inside OddSHAP.approximate:
- lgboost_to_fourier(model_dict): converts a fitted LightGBM model to its aggregated Fourier representation via per-tree DFS recursion (Gorji et al., arXiv:2410.06300).
- top_k_interactions(coeffs, k, odd=True): selects the top-k interactions by |coefficient|, optionally restricted to odd cardinality (per the OddSHAP Theorem 3.2 restriction).

Mirrors the interface that the OddSHAP paper code imports as `from oddshap.proxyspex import lgboost_to_fourier, top_k_interactions`.

14 unit tests cover:
- top-k selection logic (odd filter, magnitude sort, k-limit, edge cases);
- single-leaf / one-split / two-level-split tree recursions with hand-computed expected coefficients;
- end-to-end on fitted LightGBM (constant, linear, XOR targets);
- odd singletons recovery via the full pipeline.
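
The selection rule described for top_k_interactions (filter by odd cardinality, sort by magnitude, keep k) can be sketched as follows. This is a hypothetical re-implementation for illustration only: the name `select_top_k`, the dict-of-tuples input format, and the tie-breaking behaviour are assumptions, not the PR's actual code.

```python
def select_top_k(coeffs, k, odd=True):
    """Keep the k interactions with the largest |coefficient|.

    coeffs maps an interaction (tuple of player indices) to its coefficient.
    With odd=True, only interactions of odd cardinality are considered.
    """
    candidates = [
        (interaction, c)
        for interaction, c in coeffs.items()
        if not odd or len(interaction) % 2 == 1
    ]
    # Stable sort by descending magnitude; ties keep insertion order.
    candidates.sort(key=lambda item: abs(item[1]), reverse=True)
    return dict(candidates[: max(k, 0)])


# Hypothetical coefficients for demonstration.
example = {(0,): 0.1, (0, 1): 5.0, (0, 1, 2): -3.0, (1,): 2.0}
odd_top2 = select_top_k(example, 2)
any_top1 = select_top_k(example, 1, odd=False)
```

With the odd filter on, the even interaction (0, 1) is dropped even though it has the largest magnitude.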

Replaces Sara's TODO stub with a ProxySPEX-style screening pass:
1. lgboost_to_fourier(surrogate.booster_.dump_model()) converts the fitted LightGBM surrogate to its sparse Fourier representation (DFS recursion, one entry per encountered interaction).
2. top_k_interactions(coeffs, k=n_candidate_interactions, odd=False) keeps the top-k entries by |coefficient|. Pre-filtered to cardinality >= 3 odd interactions so the budget is not spent on singletons (those are added unconditionally by _build_support).

Smoke test on SOUM(n in {6, 8, 10}) at full budget reaches the regression branch and returns sensible odd higher-order interactions (e.g. (1, 4, 5), (0, 3, 6), ...). All 14 existing adapter unit tests still pass.

50 tests covering OddSHAP's algorithmic guarantees:
* Init contract: defaults, custom kwargs, attribute exposure
* Coalition-size sampling weights: shape, sum=1, zero boundaries, symmetry, paper formula 1/((n-1)*C(n-2,k-1))
* approximate() return-value contract: InteractionValues fields, baseline equals v(empty), estimation_budget recorded, estimated flag
* Constraint-system identities (exact, enforced by construction):
  - efficiency axiom: sum_i phi_i = v(N) - v(empty)
  - baseline: phi_empty = v(empty)
* Determinism: same seed -> bit-identical output; sub-budget seeds differ
* Branch routing via runtime_last_approximate_run keys
* ProxySPEX adapter integration: output is higher-order odd only, respects k limit, handles zero budget and missing surrogate
* _build_support invariants: empty + all singletons always present, even/singleton inputs dropped, unsorted tuples normalized
* Game-property tests on DummyGame: symmetry / efficiency on a closed-form game
* Convergence vs ExactComputer: xfail(strict=False) since OddSHAP is a sparse-recovery method (n=6 currently xpasses)
* Efficiency persists at sub-budget: by construction

Two xfails documented inline:
- low-budget fallback path raises IndexError in shapiq.tree.explainer on constant LightGBM surrogates (tracked separately)
- convergence on dense SOUM at full budget for n=8 (sparse-recovery method; tightens once SG-41 paired-sampling lands)

Results: 47 passed, 2 xfailed, 1 xpassed (3.3 s).

… changes

After Sara registered OddSHAP in shapiq.approximator.regression.__init__, the top-level import of TreeExplainer in oddshap.py triggers a circular import (regression -> oddshap -> tree.explainer -> explainer -> tree). Moved that import inside _approximate_via_fallback where it is actually used.

Test alignment with Sara's API changes:
* odd_only=False is now explicitly rejected -- split that into test_init_rejects_odd_only_false and dropped the kwarg from test_init_custom_kwargs.
* _select_odd_interactions now takes a budget keyword and bypasses the top-k truncation when budget >= 2**n. Updated the 4 existing call sites to pass budget=, and added test_select_odd_interactions_full_budget_returns_all_higher_order_odd to document the new full-budget short-circuit.

Result: 63 passed, 2 xfailed, 1 xpassed (no regressions; same 2 xfails as before -- TreeExplainer fallback crash and n=8 dense SOUM convergence).

Cast the parity matrix to float before applying the Fourier sign transform. This prevents uint8 underflow where -1 became 255 and restores full-budget consistency against ExactComputer. Also removes obsolete xfail markers for the fallback and full-budget convergence tests.
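
The underflow described above can be reproduced in a few lines of NumPy. This is a sketch under assumptions: the expression `1 - 2 * parity` is one common way to map parities {0, 1} to signs {+1, -1} and is used here for illustration; the PR's actual sign transform and array names may differ.

```python
import numpy as np

# Parity of |S & T| for a few (interaction, coalition) pairs, stored compactly.
parity = np.array([[0, 1], [1, 1]], dtype=np.uint8)

# Without a cast the arithmetic stays in uint8, so 1 - 2 wraps around to 255.
wrapped = 1 - 2 * parity

# Casting to float first yields the intended +1 / -1 signs.
signs = 1.0 - 2.0 * parity.astype(float)
```

Because unsigned integer arrays wrap silently, a bug of this kind shows up only as wrong regression results, which is consistent with it being caught by the comparison against ExactComputer.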

Correct the OddSHAP candidate interaction budget to follow the paper’s ceil(m / eta) rule and add coverage for the regression threshold boundary. Clean up OddSHAP implementation style and integration details, including stale comments, docstrings, unused compatibility kwargs, optional LightGBM import handling, and public approximator export ordering.

…r into oddshap.py, add method to separate sampling weights from kernel weights

Sara's latest commits on oddshap_approximator implement Max's feedback:
- removes runtime_last_approximate_run measurement
- replaces the low-budget fallback with an explicit ValueError
- default interaction_detection switches from ProxySPEX to ProxySHAP
- sampling weights are now uniform over non-boundary sizes (paper's 1/((n-1)C(n-2,k-1)) formula moved to the new _init_regression_kernel_weights_static and is used as the LSQ kernel weight, equivalent to KernelSHAP weights up to a global scale)

Test updates:
* test_init_defaults: drop runtime_last_approximate_run assertion; expect interaction_detection == 'ProxySHAP'.
* test_sampling_weights_match_paper_formula renamed into test_sampling_weights_uniform_over_non_boundary_sizes (new behaviour) plus a separate test_regression_kernel_weights_match_paper_formula that pins the paper formula on the LSQ kernel where it actually lives.
* test_high_budget_takes_regression_path / test_low_budget_takes_fallback_path removed (runtime tracking gone, fallback path gone) and replaced by test_low_budget_raises_value_error.
* Sara's added test_boundary_budget_takes_regression_path_... was still asserting on the deleted runtime dict: kept the n_candidate assertion, dropped the runtime check.

Result: 59 passed, 0 failed, 0 xfailed in 1.2 s. All adapter tests (14) still pass. The convergence test that was previously xfailed on n=8 now passes cleanly thanks to Sara's Fourier sign fix.
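
The claim that the paper's kernel weight matches the KernelSHAP weight up to a global scale can be checked numerically. The sketch below is standalone and does not call shapiq; the function names are hypothetical, and the KernelSHAP weight is taken in its usual form (n-1) / (C(n,k) * k * (n-k)). The uniform size distribution at the end is one reading of "uniform over non-boundary sizes" and is likewise an assumption.

```python
from math import comb


def paper_kernel_weight(n, k):
    """Per-coalition weight 1 / ((n-1) * C(n-2, k-1)) for coalition size k."""
    return 1.0 / ((n - 1) * comb(n - 2, k - 1))


def kernelshap_weight(n, k):
    """Standard KernelSHAP per-coalition weight for coalition size k."""
    return (n - 1) / (comb(n, k) * k * (n - k))


n = 7
# The ratio is the same for every non-boundary size, namely (n-1)/n.
ratios = [kernelshap_weight(n, k) / paper_kernel_weight(n, k) for k in range(1, n)]

# Assumed sampling distribution over sizes 0..n: zero at the boundaries,
# uniform over the n-1 sizes in between.
uniform_size_probs = [0.0] + [1.0 / (n - 1)] * (n - 1) + [0.0]
```

A constant ratio means both weightings give the same weighted least-squares solution, since a global rescaling of the weights does not change the minimizer.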

Sara-ne · 2026-05-20T18:08:52Z

Hi @mmschlk, this PR is ready for review now. I added the approximator and updated the implementation based on the feedback. Thanks

Sara-ne and others added 16 commits May 10, 2026 16:21

push oddshap approximator

cd95ad4

cleanup

6718476

bug fixes and add oddshap to init

0499083

Align OddSHAP candidate budget with the paper

e538367

remove runtime measurement

a5e3f66

remove fallback logic and raise ValueError instead

26a873b

polish docstrings, add external helpers from _oddshap_proxyspe_adapte…

4c03938

…r into oddshap.py, add method to separate sampling weights from kernel weights

use ProxySHAP-style tree interactions to select odd higher-order support

7f7268b

delete unused helpers from fallback logic

965af77

github-project-automation Bot added this to shapiq development May 20, 2026

Sara-ne marked this pull request as ready for review May 20, 2026 18:06

Sara-ne wants to merge 16 commits into mmschlk:main from FabianK-Dev:oddshap_approximator


2 participants
