Speed up `MatrixExpr.sum(axis=...)` via `quicksum`

**Is your feature request related to a problem? Please describe.**

`np.ndarray.sum(axis=...)` will use `Expr.__add__` in it inner.
And `__add__` will return a new Expr instance each time.
If the matrix is a large dataset, so `.sum(axis=...)` will become much slower.
`quicksum` use `__iadd__` to avoid creating instance.

**Describe the solution you'd like**
A clear and concise description of what you want to happen.

Inspired by [`np.apply_along_axis`](https://numpy.org/doc/stable/reference/generated/numpy.apply_along_axis.html).
We can apply `quicksum` to `.sum(axis=...)` too via `np.apply_along_axis`.

**Additional context**
Add any other context or screenshots about the feature request here.

A simple demo to show this feature.
It sppeds up **5.2x** than before in the shape of `(200, 200)`.

```python
import time

import numpy as np

from pyscipopt import MatrixExpr, Model, quicksum


def speed_up_sum(self, axis=None):
    """Implementation using np.apply_along_axis"""
    if axis is None:
        return quicksum(self.flat)
    return np.apply_along_axis(quicksum, axis, self).view(MatrixExpr)


if __name__ == "__main__":
    m = Model()
    n = 200
    x = m.addMatrixVar((n, n))

    print(f"Benchmarking sum(axis=1) on {n * n} elements...")
    start = time.time()
    speed_up_sum(x, axis=1)
    end = time.time()
    print(f"np.apply_along_axis: {end - start:.4f} s")
    # np.apply_along_axis: 0.1769 s

    start = time.time()
    x.sum(axis=1)
    end = time.time()
    print(f"np.ndarray.sum: {end - start:.4f} s")
    # np.ndarray.sum: 0.9329 s
```

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Speed up `MatrixExpr.sum(axis=...)` via `quicksum` #1133

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Speed up MatrixExpr.sum(axis=...) via quicksum #1133

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Speed up `MatrixExpr.sum(axis=...)` via `quicksum` #1133