Enable Coefficients for DiscreteSumConstraint and from_simplex#786
Open
Scienfitz wants to merge 10 commits into
Open
Enable Coefficients for DiscreteSumConstraint and from_simplex#786Scienfitz wants to merge 10 commits into
DiscreteSumConstraint and from_simplex#786Scienfitz wants to merge 10 commits into
Conversation
Follows the ContinuousLinearConstraint pattern: coefficients default to all-ones (preserving existing behavior), are validated for length parity with parameters, and the weighted sum is evaluated via a single numpy matrix-vector product to avoid intermediate DataFrame copies.
Reworks the signature to make all optional arguments keyword-only (via *). Adds simplex_coefficients for a weighted simplex sum constraint. The incremental early-pruning algorithm is generalised to handle negative coefficients correctly by computing per-parameter weighted min/max contributions rather than assuming monotonicity, and by keeping nonzero cardinality tracking separate (raw parameter values, coefficient-sign independent). The weighted row-sum uses a single numpy matrix-vector product to avoid intermediate DataFrame copies.
…plex_coefficients Weighted-sum filtering correctness (default and custom coefficients) added to the existing discrete constraint test file, parametrized across all-ones, scaled, negative, and equality operator cases. Simplex coefficient tests (brute-force equivalence, mixed-sign, boundary_only, and equivalence with from_product+DiscreteSumConstraint) added to the existing from_simplex test file. Validation error tests for length mismatch added to the constraint validation test file.
… sum The previous approach (to_numpy() @ np.asarray(coefficients)) consolidates all referenced columns into a contiguous (N, k) array before computing the dot product. When the constraint parameters are non-adjacent columns in the DataFrame this forces a full (N, k) memory copy regardless. For the typical use case of sum constraints (k < 10 parameters), a column-by-column accumulation avoids this: each data[p].to_numpy() is a zero-copy view of a single contiguous column, the scalar multiply produces one (N,) temporary, and the built-in sum accumulates in-place. No (N, k) consolidation allocation is needed. Also removes the now-unused numpy import.
Replaces the pandas-based inner loop (pd.merge cross-join, pd.DataFrame, df.drop inplace) with raw numpy operations (np.repeat + np.tile + np.column_stack for cross-joins, boolean indexing for pruning). The DataFrame is created once at the end. This avoids per-iteration pandas overhead (index management, BlockManager, merge machinery) and reduces peak memory by eliminating duplicate DataFrame+numpy representations.
f8b3f17 to
cce898a
Compare
Contributor
There was a problem hiding this comment.
Pull request overview
This PR extends BayBE’s discrete constraint and search space construction capabilities by adding weighted-sum support to DiscreteSumConstraint and enabling weighted simplex construction via SubspaceDiscrete.from_simplex(simplex_coefficients=...). It also refactors the hot loop in from_simplex to use NumPy arrays for improved performance and memory usage.
Changes:
- Add
coefficientstoDiscreteSumConstraint(defaulting to all ones) and apply weighting in both pandas and polars evaluation paths. - Add keyword-only
simplex_coefficientstoSubspaceDiscrete.from_simplex(defaulting to all ones) and adjust pruning logic to work with weighted sums. - Expand test coverage for the new weighted behaviors and length-mismatch validation.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
baybe/constraints/discrete.py |
Adds DiscreteSumConstraint.coefficients with validation + weighted evaluation (pandas/polars). |
baybe/searchspace/discrete.py |
Adds simplex_coefficients, makes args keyword-only, and rewrites from_simplex construction loop using NumPy with weighted pruning. |
CHANGELOG.md |
Documents the new weighted features and the keyword-only breaking change for from_simplex. |
tests/validation/test_constraint_validation.py |
Adds validation test for DiscreteSumConstraint coefficients length mismatch. |
tests/hypothesis_strategies/constraints.py |
Updates Hypothesis discrete-constraint strategy generation to optionally include coefficients. |
tests/hypothesis_strategies/alternative_creation/test_searchspace.py |
Adds brute-force parity tests for weighted simplex generation and mismatch validation. |
tests/constraints/test_constraints_polars.py |
Adds parity test to ensure polars vs pandas agree for weighted sum constraints. |
tests/constraints/test_constraints_discrete.py |
Adds behavioral tests for weighted sum constraints in the discrete constraint suite. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
466cf21 to
a232085
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
use-case motivated addition:
DiscreteSumConstraintgets acoefficientsthat works akin to whats been done in the continuous constraintfrom_simplexgets asimplex_coefficientskeyword that allows specifying coefficients for the simplex parameters. This is possible by changing the way the max/min incoming sums are assessed@forfrom_simplexbecause we ensure by construciton that the incoming array is contiguous and does not have to be copied for a reshape and multiplication. This contiguousness is not guaranteed fordata[params]inget_invalidinDiscreteSumConstraintso its more memory efficient to use per-column approaches in the assumption of a small countable amount of parameters (usually the case)Unrelated optimization
I also replaced the inner loop of
from_simplexto use numpy and not pandas. This is also motivated by memory and time efficiency. This commit can be dropped tho if undesired, but there are singificant time and mem savings:(p = simplex parameters, v = values)