
Adds triu and tril methods directly to ArrayBase #1386

Merged
merged 4 commits into rust-ndarray:master on May 27, 2024

Conversation

@akern40 (Contributor) commented May 16, 2024

The implementation seems to work, but I'm unsure whether this should be a free function, a method bound directly to ArrayBase, or a trait. Closes #1323.

Review threads on src/tri.rs: 4 resolved (3 marked outdated)
@akern40 changed the title from "Adds triu and tril methods directly to ArrayBase to fix #1323" to "Adds triu and tril methods directly to ArrayBase" on May 18, 2024
@akern40 (Contributor, Author) commented May 18, 2024

Hold off on any merge for a moment; tril panics when given a 0D array. I'm working on a fix.

@akern40 (Contributor, Author) commented May 18, 2024

Ok, fixed.

Not to move conversations, but two thoughts on the (now resolved) discussion about order of the input array:

  1. Would there be appetite for {zeros|ones|full}_like public functions? That way users don't have to do the rather-convoluted self.raw_dim().set_f(is_layout_f(&self.dim, &self.strides)) that I'm doing (see the sketch after the benchmark results below for what that might look like). Happy to make a tracking issue and put in the PRs for that.
  2. I did a quick benchmark of tri[ul]'s performance on c-order vs f-order, and it's not great; results are below. The slowdown is quite apparent at larger array sizes, as expected. The via_t lines refer to the quick-and-dirty workaround I could come up with, which is x_tril = x.t().triu(0).t().to_owned().
running 3 tests (64 x 64)
test triu_c       ... bench:       1,681 ns/iter (+/- 45)
test triu_f       ... bench:       2,011 ns/iter (+/- 32)
test triu_f_via_t ... bench:       2,133 ns/iter (+/- 29)

running 3 tests (2048 x 2048)
test triu_c       ... bench:   2,418,649 ns/iter (+/- 320,729)
test triu_f       ... bench:  22,184,991 ns/iter (+/- 4,532,682)
test triu_f_via_t ... bench:   4,259,493 ns/iter (+/- 218,849)
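
For reference, here's a rough sketch of what a zeros_like free function could look like inside the crate; the name, bounds, and exact placement are just illustrative:

// Hypothetical free function, not in the crate today. Uses the crate-internal
// is_layout_f helper; Zero comes from num-traits, and ShapeBuilder provides set_f.
pub fn zeros_like<A, S, D>(arr: &ArrayBase<S, D>) -> Array<A, D>
where
    A: Clone + Zero,
    S: Data<Elem = A>,
    D: Dimension,
{
    // Allocate the same shape, mirroring the input's memory order.
    Array::zeros(arr.raw_dim().set_f(is_layout_f(&arr.dim, &arr.strides)))
}

ones_like and full_like would follow the same pattern, just swapping the constructor (and taking a fill value for full_like).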

@nilgoyette (Collaborator) commented

1. Would there be appetite for `{zeros|ones|full}_like` public functions? That way users don't have to do the rather-convoluted `self.raw_dim().set_f(is_layout_f(&self.dim, &self.strides))` that I'm doing. Happy to make a tracking issue and put in the PRs for that.

Well, I know that I'm interested in those functions, but it may not be what @bluss had in mind. The more methods we have, the harder it is to find them.

As for the benchmarks, it's kinda normal because those larger arrays don't fit in the cache and you're iterating in the wrong order. Seeing those results, I think adding a condition on the memory order is worth it. Some people (like me) often work in f-order :)

@akern40 (Contributor, Author) commented May 20, 2024

Well, I know that I'm interested in those functions, but it may not be what @bluss had in mind. The more methods we have, the harder it is to find them.

This is true; I don't want to bloat the API. That being said, I think some documentation work (difficult and time-consuming, I don't mean to imply it's trivial) could make the docs easier to navigate and alleviate concerns about adding more functions.

Some people (like me) often work in f-order :)

Me too :) too many column-vector algorithms and math. I was interested in seeing how well NumPy does on the transposed version, and it looks like they didn't write a version that dispatches on order:

In [1]: import numpy as np

In [2]: x = np.ones((2048, 2048))

In [3]: %timeit np.triu(x)
4.34 ms ± 53.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [4]: x = np.ones((2048, 2048), order="F")

In [5]: %timeit np.triu(x)
19 ms ± 557 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

(Fun to see that ndarray is faster than NumPy here, though! Although that extra bit of time could just be Python overhead, rather than raw NumPy performance).

If we want to create a specialization, I think we can do it via the transpose-transpose method I mentioned above, with a slight tweak to avoid the copy: (EDITED)

// Assumes `use std::cmp::{max, min};` plus crate-internal items
// (`is_layout_f`, `Zip`, `s!`, `IntoDimension`) are in scope, with bounds
// along the lines of `A: Clone + Zero, D: Dimension` on the impl.
pub fn triu(&self, k: isize) -> Array<A, D> {
    match self.ndim() > 1 && is_layout_f(&self.dim, &self.strides) {
        // F-order fast path: triu(A, k) == tril(A^T, -k)^T, so transpose a
        // view, defer to `tril` (which then iterates in memory order), and
        // transpose the result back.
        true => {
            let n = self.ndim();
            let mut x = self.view();
            x.swap_axes(n - 2, n - 1);
            let mut tril = x.tril(-k).into_shared();
            tril.swap_axes(n - 2, n - 1);
            // The shared handle is uniquely held here, so no copy is made.
            tril.into_owned()
        }
        // Fallback path: zero-initialize the result and copy the kept slice
        // of each row.
        false => {
            let mut res = Array::zeros(self.raw_dim());
            let ncols = self.shape().last().copied().unwrap_or(0) as isize;
            Zip::indexed(self.rows())
                .and(res.rows_mut())
                .for_each(|i, src, mut dst| {
                    let row_num = i.into_dimension().last_elem();
                    // Keep columns j >= row + k; clamp so the slice start
                    // stays within bounds for large |k|.
                    let lower = min(max(row_num as isize + k, 0), ncols);
                    dst.slice_mut(s![lower..]).assign(&src.slice(s![lower..]));
                });
            res
        }
    }
}

which gets us the following performance: (EDITED)

running 3 tests (64 x 64)
test triu_c       ... bench:       1,709 ns/iter (+/- 58)
test triu_f       ... bench:       1,723 ns/iter (+/- 43)

running 3 tests (2048 x 2048)
test triu_c       ... bench:   2,429,621 ns/iter (+/- 171,843)
test triu_f       ... bench:   2,484,196 ns/iter (+/- 159,421)

If this gets some thumbs up I'll add it to the PR (and maybe squash at this point to reduce commits?)

Glad I got to explore the _owned methods; it seems like it should be possible to do a non-copy _owned on mutable views, but that's an improvement for another day / PR.
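
For completeness, a small usage sketch of how the methods would read once this lands (expected outputs assume NumPy-compatible semantics for k):

use ndarray::array;

fn main() {
    let x = array![
        [1., 2., 3.],
        [4., 5., 6.],
        [7., 8., 9.]
    ];

    // k = 0 keeps the main diagonal and everything above it.
    let upper = x.triu(0);
    assert_eq!(upper, array![[1., 2., 3.], [0., 5., 6.], [0., 0., 9.]]);

    // k = -1 keeps only the elements strictly below the main diagonal.
    let lower = x.tril(-1);
    assert_eq!(lower, array![[0., 0., 0.], [4., 0., 0.], [7., 8., 0.]]);
}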

@akern40 force-pushed the triangular branch 2 times, most recently from 7d24c54 to 6316fbe on May 22, 2024 at 03:13
Includes branched implementations for f- and c-order arrays.
Review thread on src/tri.rs: resolved (outdated)
@nilgoyette (Collaborator) commented

Thank you for your contribution, @akern40

@nilgoyette added this pull request to the merge queue on May 27, 2024
Merged via the queue into rust-ndarray:master with commit 72b0d09 on May 27, 2024
10 checks passed
Successfully merging this pull request may close this issue: Implement numpy's triu and tril methods (#1323)