Commit 4c3391d

Speed up MatrixExpr.add.reduce via quicksum (#1157)
* Speed up `np.add.reduce`. Implements the `__array_ufunc__` method in MatrixExpr to handle numpy ufuncs, specifically enabling correct behavior for reductions like `np.add.reduce` by delegating to the sum method. This improves compatibility with numpy operations.
* BUG: `MatrixExpr.mean(axis=1)` crashes the kernel. Moved the sum computation logic from `MatrixExpr.sum` and `__array_ufunc__` into a new `_core_sum` function for better code reuse and maintainability.
* Add tests for matrix mean performance and type. Introduces tests that compare the performance of the mean operation on matrix variables and check the return types of mean with and without the axis argument.
* Update changelog for `MatrixExpr.add.reduce` optimization. Documented the speed improvement for `MatrixExpr.add.reduce` using quicksum in the changelog.
* Remove `MatrixExpr.sum` method and update `_core_sum` docstring. The sum method was removed from the MatrixExpr class, consolidating summation logic in the `_core_sum` function. The docstring for `_core_sum` was expanded with detailed parameter and return value descriptions, improving code clarity and maintainability.
* Remove sum method from MatrixExpr stub. Deleted the type stub for the sum method in MatrixExpr to reflect the changed implementation and correct the type information.
* Improve MatrixExpr ufunc handling and remove unused include. Enhanced the `__array_ufunc__` method in MatrixExpr to ensure proper array conversion and consistent return types. Added the `_ensure_array` helper for argument handling. Also removed an unused include of matrix.pxi from expr.pxi.
* Fix MatrixExpr matmul return type and update tests. Updated `MatrixExpr.__matmul__` to return the correct type when the result is not an ndarray. Adjusted tests to reflect the expected return type for 1D matrix multiplication and improved performance-test timing logic.
* Refactor matrix sum performance tests timing logic. Simplifies timing measurement by directly calculating elapsed time instead of storing start and end times separately, improving readability and reducing variable usage.
* Update matmul return type assertion in test. Changed the expected type of 1D @ 1D matrix multiplication from MatrixExpr to Expr in `test_matrix_matmul_return_type` to reflect the updated behavior.
* Refactor matrix variable tests to use view casting. Updated tests to use `x.view(MatrixExpr)` and `x.view(np.ndarray)` instead of direct subclass method calls. This clarifies the intent and ensures the correct method resolution for sum and mean operations in performance and result-comparison tests.
* Define the variable.
* Refactor `__array_ufunc__` in MatrixExpr for clarity. Reorganizes the logic so the 'reduce' method and argument conversion are handled more clearly, ensuring correct handling of the 'out' keyword and converting arguments only when necessary.
* Add type hints and docstring to `__array_ufunc__` in MatrixExpr. Enhanced the method with detailed type annotations and a comprehensive docstring, improving clarity when working with custom NumPy ufunc behavior.
* Add comment clarifying 'reduce' method handling. A comment was added to explain that the 'reduce' method handles reductions like `np.sum(a)`.
* Refactor matrix sum test for clarity and type consistency. Updated `test_matrix_sum_result` to extract and cast the result of the MatrixExpr sum operation before the assertions, ensuring type consistency between the expected and actual results.
* Fix handling of 'out' kwarg in `__array_ufunc__`. Ensures all elements of the 'out' tuple are unboxed with `_ensure_array`, preventing recursion issues when 'out' is provided as a tuple.
* Fix comments to reference mean instead of sum. Updated comments in `test_matrix_mean_performance` to correctly refer to 'mean' instead of 'sum', reflecting the operations actually performed.
* Pass 'False' to `_ensure_array`.
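The core mechanism of this change can be sketched with a toy `np.ndarray` subclass. The class name `FastSumArray` and the delegation target are illustrative only (PySCIPOpt's real code delegates to `_core_sum`/`quicksum`); the point is how `__array_ufunc__` intercepts `method == "reduce"` for `np.add`, which is the path `np.sum`/`arr.sum()` takes:

```python
import numpy as np


class FastSumArray(np.ndarray):
    """Toy ndarray subclass intercepting np.add.reduce (the ufunc
    behind np.sum / arr.sum())."""

    def __array_ufunc__(self, ufunc, method, *args, **kwargs):
        # Unbox subclass instances so the base implementation does not
        # dispatch back into this method (mirrors PySCIPOpt's _ensure_array).
        args = tuple(
            a.view(np.ndarray) if isinstance(a, FastSumArray) else a
            for a in args
        )
        if method == "reduce" and ufunc is np.add:
            kwargs.pop("out", None)  # 'out' is irrelevant for this sketch
            # Delegate the whole reduction in one call; in PySCIPOpt this
            # is where _core_sum / quicksum takes over.
            return np.add.reduce(args[0], **kwargs)
        res = super().__array_ufunc__(ufunc, method, *args, **kwargs)
        return res.view(FastSumArray) if isinstance(res, np.ndarray) else res


a = np.arange(6).reshape(2, 3).view(FastSumArray)
total = np.sum(a)  # routed through the "reduce" branch above
```

Elementwise calls (`method == "__call__"`) fall through to the base implementation and are re-wrapped via `.view(...)`, matching the pattern in the MatrixExpr diff below.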
1 parent 149b057 commit 4c3391d

File tree

4 files changed (+116 / -82 lines)


CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@
 - Fixed incorrect getVal() result when _bestSol.sol was outdated
 ### Changed
 - changed default value of enablepricing flag to True
+- Speed up MatrixExpr.add.reduce via quicksum
 - Speed up np.ndarray(..., dtype=np.float64) @ MatrixExpr
 - Minimum numpy version increased from 1.16.0 to 1.19.0
 ### Removed

src/pyscipopt/matrix.pxi

Lines changed: 71 additions & 61 deletions
@@ -2,7 +2,6 @@
 # TODO Cythonize things. Improve performance.
 # TODO Add tests
 """
-
 from typing import Literal, Optional, Tuple, Union
 import numpy as np
 try:
@@ -71,7 +70,7 @@ class MatrixExpr(np.ndarray):
             The ufunc object that was called.

         method : {"__call__", "reduce", "reduceat", "accumulate", "outer", "at"}
-            A string indicating which UFunc method was called.
+            A string indicating which ufunc method was called.

         *args : tuple
             The input arguments to the ufunc.
@@ -81,77 +80,28 @@ class MatrixExpr(np.ndarray):

         Returns
         -------
-        Expr, GenExpr, MatrixExpr
+        Expr, MatrixExpr
            The result of the ufunc operation is wrapped back into a MatrixExpr if
            applicable.

        """
        res = NotImplemented
+        # Unboxing MatrixExpr to stop __array_ufunc__ recursion
+        args = tuple(_ensure_array(arg) for arg in args)
+        if method == "reduce":  # Handle reduction operations, e.g., np.sum(a)
+            if ufunc is np.add:
+                res = _core_sum(args[0], **kwargs)
+
        if method == "__call__":  # Standard ufunc call, e.g., np.add(a, b)
            if ufunc in {np.matmul, np.dot}:
-                res = _core_dot(_ensure_array(args[0]), _ensure_array(args[1]))
+                res = _core_dot(args[0], args[1])

        if res is NotImplemented:
-            # Unboxing MatrixExpr to stop __array_ufunc__ recursion
-            args = tuple(_ensure_array(arg) for arg in args)
+            if "out" in kwargs:  # Unboxing MatrixExpr to stop __array_ufunc__ recursion
+                kwargs["out"] = tuple(_ensure_array(arg, False) for arg in kwargs["out"])
            res = super().__array_ufunc__(ufunc, method, *args, **kwargs)
        return res.view(MatrixExpr) if isinstance(res, np.ndarray) else res

-    def sum(
-        self,
-        axis: Optional[Union[int, Tuple[int, ...]]] = None,
-        keepdims: bool = False,
-        **kwargs,
-    ) -> Union[Expr, MatrixExpr]:
-        """
-        Return the sum of the array elements over the given axis.
-
-        Parameters
-        ----------
-        axis : None or int or tuple of ints, optional
-            Axis or axes along which a sum is performed. The default, axis=None, will
-            sum all of the elements of the input array. If axis is negative it counts
-            from the last to the first axis. If axis is a tuple of ints, a sum is
-            performed on all of the axes specified in the tuple instead of a single axis
-            or all the axes as before.
-
-        keepdims : bool, optional
-            If this is set to True, the axes which are reduced are left in the result as
-            dimensions with size one. With this option, the result will broadcast
-            correctly against the input array.
-
-        **kwargs : ignored
-            Additional keyword arguments are ignored. They exist for compatibility
-            with `numpy.ndarray.sum`.
-
-        Returns
-        -------
-        Expr or MatrixExpr
-            If the sum is performed over all axes, return an Expr, otherwise return
-            a MatrixExpr.
-
-        """
-        axis: Tuple[int, ...] = normalize_axis_tuple(
-            range(self.ndim) if axis is None else axis, self.ndim
-        )
-        if len(axis) == self.ndim:
-            res = quicksum(self.flat)
-            return (
-                np.array([res], dtype=object).reshape([1] * self.ndim).view(MatrixExpr)
-                if keepdims
-                else res
-            )
-
-        keep_axes = tuple(i for i in range(self.ndim) if i not in axis)
-        shape = (
-            tuple(1 if i in axis else self.shape[i] for i in range(self.ndim))
-            if keepdims
-            else tuple(self.shape[i] for i in keep_axes)
-        )
-        return np.apply_along_axis(
-            quicksum, -1, self.transpose(keep_axes + axis).reshape(shape + (-1,))
-        ).view(MatrixExpr)
-
     def __le__(self, other: Union[float, int, "Expr", np.ndarray, "MatrixExpr"]) -> MatrixExprCons:
         return _matrixexpr_richcmp(self, other, 1)

@@ -195,6 +145,7 @@ class MatrixExpr(np.ndarray):
 class MatrixGenExpr(MatrixExpr):
     pass

+
 class MatrixExprCons(np.ndarray):

     def __le__(self, other: Union[float, int, np.ndarray]) -> MatrixExprCons:
@@ -288,3 +239,62 @@ def _core_dot_2d(cnp.ndarray a, cnp.ndarray x) -> np.ndarray:
             res[i, j] = quicksum(a_view[i, idx] * x[idx, j] for idx in nonzero)

     return res
+
+
+def _core_sum(
+    cnp.ndarray a,
+    axis: Optional[Union[int, Tuple[int, ...]]] = None,
+    keepdims: bool = False,
+    **kwargs,
+) -> Union[Expr, np.ndarray]:
+    """
+    Return the sum of the array elements over the given axis.
+
+    Parameters
+    ----------
+    a : cnp.ndarray
+        A `np.ndarray` of type `object` and containing `Expr` objects.
+
+    axis : None or int or tuple of ints, optional
+        Axis or axes along which a sum is performed. The default, axis=None, will
+        sum all of the elements of the input array. If axis is negative it counts
+        from the last to the first axis. If axis is a tuple of ints, a sum is
+        performed on all of the axes specified in the tuple instead of a single axis
+        or all the axes as before.
+
+    keepdims : bool, optional
+        If this is set to True, the axes which are reduced are left in the result as
+        dimensions with size one. With this option, the result will broadcast
+        correctly against the input array.
+
+    **kwargs : ignored
+        Additional keyword arguments are ignored. They exist for compatibility
+        with `numpy.ndarray.sum`.
+
+    Returns
+    -------
+    Expr or np.ndarray
+        If the sum is performed over all axes, return an Expr, otherwise return
+        a np.ndarray.
+
+    """
+    axis: Tuple[int, ...] = normalize_axis_tuple(
+        range(a.ndim) if axis is None else axis, a.ndim
+    )
+    if len(axis) == a.ndim:
+        res = quicksum(a.flat)
+        return (
+            np.array([res], dtype=object).reshape([1] * a.ndim)
+            if keepdims
+            else res
+        )
+
+    keep_axes = tuple(i for i in range(a.ndim) if i not in axis)
+    shape = (
+        tuple(1 if i in axis else a.shape[i] for i in range(a.ndim))
+        if keepdims
+        else tuple(a.shape[i] for i in keep_axes)
+    )
+    return np.apply_along_axis(
+        quicksum, -1, a.transpose(keep_axes + axis).reshape(shape + (-1,))
+    )
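The axis handling in `_core_sum` can be exercised in pure NumPy. In the sketch below, Python's built-in `sum` stands in for `quicksum`, and `_normalize_axis_tuple` is a local stand-in for NumPy's `normalize_axis_tuple` (whose import path varies across NumPy versions); the names `core_sum` and `reducer` are illustrative, not the library's API:

```python
import numpy as np


def _normalize_axis_tuple(axis, ndim):
    # Local stand-in for numpy's normalize_axis_tuple.
    if axis is None:
        axis = tuple(range(ndim))
    elif isinstance(axis, int):
        axis = (axis,)
    return tuple(a % ndim for a in axis)  # resolve negative axes


def core_sum(a, axis=None, keepdims=False, reducer=sum):
    """Sketch of _core_sum's strategy: `reducer` stands in for quicksum."""
    axis = _normalize_axis_tuple(axis, a.ndim)
    if len(axis) == a.ndim:  # full reduction: one reducer call over all elements
        res = reducer(a.flat)
        return np.array([res]).reshape([1] * a.ndim) if keepdims else res

    # Partial reduction: move reduced axes last, flatten them into one
    # trailing axis, then make one reducer call per output element.
    keep_axes = tuple(i for i in range(a.ndim) if i not in axis)
    shape = (
        tuple(1 if i in axis else a.shape[i] for i in range(a.ndim))
        if keepdims
        else tuple(a.shape[i] for i in keep_axes)
    )
    return np.apply_along_axis(
        reducer, -1, a.transpose(keep_axes + axis).reshape(shape + (-1,))
    )


a = np.arange(24).reshape(2, 3, 4)
assert core_sum(a) == a.sum()
assert (core_sum(a, axis=(0, 2)) == a.sum(axis=(0, 2))).all()
assert core_sum(a, axis=1, keepdims=True).shape == (2, 1, 4)
```

The payoff of this layout is that the number of (potentially expensive) reducer calls equals the number of output elements, rather than one call per pairwise addition as `np.add.reduce` over an object array would perform.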

src/pyscipopt/scip.pyi

Lines changed: 0 additions & 3 deletions
@@ -509,9 +509,6 @@ class MatrixConstraint(numpy.ndarray):
     def isStickingAtNode(self) -> Incomplete: ...

 class MatrixExpr(numpy.ndarray):
-    def sum(  # type: ignore[override]
-        self, axis: Incomplete = ..., keepdims: Incomplete = ..., **kwargs: Incomplete
-    ) -> Incomplete: ...
     def __add__(self, other: Incomplete) -> Incomplete: ...
     def __eq__(self, other: Incomplete) -> Incomplete: ...
     def __ge__(self, other: Incomplete) -> MatrixExprCons: ...

tests/test_matrix_variable.py

Lines changed: 44 additions & 18 deletions
@@ -248,49 +248,75 @@ def test_matrix_sum_axis():
 )
 def test_matrix_sum_result(axis, keepdims):
     # directly compare the result of np.sum and MatrixExpr.sum
-    _getVal = np.vectorize(lambda e: e.terms[CONST])
+    _getVal = np.vectorize(lambda e: e[CONST])
     a = np.arange(6).reshape((1, 2, 3))

     np_res = a.sum(axis, keepdims=keepdims)
-    scip_res = MatrixExpr.sum(a, axis, keepdims=keepdims)
-    assert (np_res == _getVal(scip_res)).all()
-    assert np_res.shape == _getVal(scip_res).shape
+    scip_res = _getVal(a.view(MatrixExpr).sum(axis, keepdims=keepdims)).view(np.ndarray)
+    assert (np_res == scip_res).all()
+    assert np_res.shape == scip_res.shape


 @pytest.mark.parametrize("n", [50, 100])
 def test_matrix_sum_axis_is_none_performance(n):
     model = Model()
     x = model.addMatrixVar((n, n))

-    # Original sum via `np.ndarray.sum`, `np.sum` will call subclass method
-    start_orig = time()
-    np.ndarray.sum(x)
-    end_orig = time()
+    # Original sum via `np.ndarray.sum`
+    start = time()
+    x.view(np.ndarray).sum()
+    orig = time() - start

     # Optimized sum via `quicksum`
-    start_matrix = time()
+    start = time()
     x.sum()
-    end_matrix = time()
+    matrix = time() - start

-    assert model.isGT(end_orig - start_orig, end_matrix - start_matrix)
+    assert model.isGT(orig, matrix)


 @pytest.mark.parametrize("n", [50, 100])
 def test_matrix_sum_axis_not_none_performance(n):
     model = Model()
     x = model.addMatrixVar((n, n))

-    # Original sum via `np.ndarray.sum`, `np.sum` will call subclass method
-    start_orig = time()
-    np.ndarray.sum(x, axis=0)
-    end_orig = time()
+    # Original sum via `np.ndarray.sum`
+    start = time()
+    x.view(np.ndarray).sum(axis=0)
+    orig = time() - start

     # Optimized sum via `quicksum`
-    start_matrix = time()
+    start = time()
     x.sum(axis=0)
-    end_matrix = time()
+    matrix = time() - start
+
+    assert model.isGT(orig, matrix)
+
+
+@pytest.mark.parametrize("n", [50, 100])
+def test_matrix_mean_performance(n):
+    model = Model()
+    x = model.addMatrixVar((n, n))
+
+    # Original mean via `np.ndarray.mean`
+    start = time()
+    x.view(np.ndarray).mean(axis=0)
+    orig = time() - start
+
+    # Optimized mean via `quicksum`
+    start = time()
+    x.mean(axis=0)
+    matrix = time() - start
+
+    assert model.isGT(orig, matrix)
+
+
+def test_matrix_mean():
+    model = Model()
+    x = model.addMatrixVar((2, 2))

-    assert model.isGT(end_orig - start_orig, end_matrix - start_matrix)
+    assert isinstance(x.mean(), Expr)
+    assert isinstance(x.mean(1), MatrixExpr)


 @pytest.mark.parametrize("n", [50, 100])

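The tests above rely on view casting to select which `sum`/`mean` implementation runs. A minimal illustration of that mechanism (the `LoudSum` class is a toy, not part of PySCIPOpt):

```python
import numpy as np


class LoudSum(np.ndarray):
    """Toy subclass with an overridden sum, to show what view casting does."""

    def sum(self, *args, **kwargs):
        # Tag the result so we can tell which implementation ran.
        return ("subclass", np.ndarray.sum(self.view(np.ndarray), *args, **kwargs))


x = np.ones((2, 2)).view(LoudSum)

# x.sum() resolves to the subclass method ...
tagged = x.sum()
# ... while view-casting to the base class picks plain ndarray.sum,
# which is why the tests time x.view(np.ndarray).sum() as the baseline.
plain = x.view(np.ndarray).sum()
```

Calling `np.ndarray.sum(x)` directly (the old style) would still dispatch through the subclass's `__array_ufunc__`, so `x.view(np.ndarray)` is the reliable way to benchmark the unoptimized path.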