ENH: stats: add array-API support to kstat/kstatvar #20634

j-bowhay · 2024-05-03T12:02:30Z

towards #20544

[skip ci]

scipy/stats/_morestats.py

mdhaber

Oops, I'm realizing that I probably shouldn't have put these on the initial list - they really should get native support for an axis argument before we add array API support. Would you add keyword argument axis with default None (for backward compatibility) to the signature and use it as appropriate?
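
As a rough illustration of the request, here is a minimal NumPy-only sketch of an `axis` keyword that defaults to `None`. It uses the second k-statistic (the unbiased sample variance) as the example. The name `k2_sketch` is hypothetical, and treating `axis=None` as "ravel the input first" is an assumption made for the sketch; this is not the SciPy implementation.

```python
import numpy as np


def k2_sketch(data, axis=None):
    """Second k-statistic along `axis` (hypothetical helper, not SciPy's code)."""
    data = np.asarray(data, dtype=np.float64)
    if axis is None:
        # Assumed backward-compatible default: reduce over the flattened input.
        data = data.ravel()
        axis = 0
    N = data.shape[axis]
    # Power sums along the requested axis.
    S1 = np.sum(data, axis=axis)
    S2 = np.sum(data**2, axis=axis)
    # k2 = (N*S2 - S1**2) / (N*(N - 1)), the unbiased variance estimator.
    return (N * S2 - S1**2) / (N * (N - 1))


x = np.arange(12.0).reshape(3, 4)
whole = k2_sketch(x)            # one value for the whole array
per_row = k2_sketch(x, axis=1)  # one value per row
```

With the default, existing calls that pass only `data` keep returning a single value, while new callers can reduce along a chosen axis.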

mdhaber · 2024-05-05T18:41:53Z

Do you plan to add native axis support to these or should I do that?

j-bowhay · 2024-05-05T18:43:12Z

> Do you plan to add native axis support to these or should I do that?

I was going to go for some of the lower-hanging fruit first, so feel free to go for it in the meantime.

mdhaber · 2024-05-06T07:43:58Z

Added in gh-20651. Sorry for the merge conflicts!

mdhaber · 2024-05-06T17:06:20Z

The conflicts are actually not as bad as I expected. It probably is worth it to merge and fix them rather than starting over. Please also convert the new tests added in gh-20651.

[skip ci]

j-bowhay · 2024-05-07T00:57:10Z

OK, almost there; I'm just down to the final failures of this style:

scipy/stats/tests/test_morestats.py:1703: in test_nan_input
    xp_assert_equal(stats.kstat(data), xp.asarray(xp.nan))
        data       = Array([ 0.,  1.,  2.,  3.,  4.,  5., nan,
        7.,  8.,  9.], dtype=array_api_strict.float64)
        self       = <scipy.stats.tests.test_morestats.TestKstat object at 0x7f85c7677850>
        xp         = <module 'array_api_strict' from '/home/jakeb/miniconda3/envs/scipy-dev-pytorch/lib/python3.11/site-packages/array_api_strict/__init__.py'>
scipy/stats/_axis_nan_policy.py:405: in axis_nan_policy_wrapper
    return hypotest_fun_in(*args, **kwds)
        _no_deco   = False
        args       = (Array([ 0.,  1.,  2.,  3.,  4.,  5., nan,
        7.,  8.,  9.], dtype=array_api_strict.float64),)
        default_axis = None
        hypotest_fun_in = <function kstat at 0x7f8671774400>
        is_too_small = <function _axis_nan_policy_factory.<locals>.is_too_small at 0x7f86717742c0>
        kwd_samples = []
        kwds       = {}
        msg        = 'Use of `nan_policy` and `keepdims` is incompatible with non-NumPy arrays.'
        n_outputs  = 1
        n_samples  = 1
        override   = {'nan_propagation': True, 'vectorization': False}
        paired     = False
        result_to_tuple = <function <lambda> at 0x7f8671774220>
        temp       = Array([ 0.,  1.,  2.,  3.,  4.,  5., nan,
        7.,  8.,  9.], dtype=array_api_strict.float64)
        tuple_to_result = <function <lambda> at 0x7f8671774180>
scipy/stats/_morestats.py:307: in kstat
    S = [None] + [xp.sum(data**k, axis=axis) for k in range(1, n + 1)]
        N          = 10
        axis       = 0
        data       = array([ 0.,  1.,  2.,  3.,  4.,  5., nan,  7.,  8.,  9.])
        n          = 2
        xp         = <module 'array_api_strict' from '/home/jakeb/miniconda3/envs/scipy-dev-pytorch/lib/python3.11/site-packages/array_api_strict/__init__.py'>
scipy/stats/_morestats.py:307: in <listcomp>
    S = [None] + [xp.sum(data**k, axis=axis) for k in range(1, n + 1)]
        .0         = <range_iterator object at 0x7f85c75be580>
        axis       = 0
        data       = array([ 0.,  1.,  2.,  3.,  4.,  5., nan,  7.,  8.,  9.])
        k          = 1
        xp         = <module 'array_api_strict' from '/home/jakeb/miniconda3/envs/scipy-dev-pytorch/lib/python3.11/site-packages/array_api_strict/__init__.py'>
/home/jakeb/miniconda3/envs/scipy-dev-pytorch/lib/python3.11/site-packages/array_api_strict/_statistical_functions.py:100: in sum
    if x.dtype not in _numeric_dtypes:
        axis       = 0
        dtype      = None
        keepdims   = False
        x          = array([ 0.,  1.,  2.,  3.,  4.,  5., nan,  7.,  8.,  9.])
/home/jakeb/miniconda3/envs/scipy-dev-pytorch/lib/python3.11/site-packages/array_api_strict/_dtypes.py:24: in __eq__
    warnings.warn(
E   UserWarning: You are comparing a array_api_strict dtype against a NumPy native dtype object, but you probably don't want to do this. array_api_strict dtype objects compare unequal to their NumPy equivalents. Such cross-library comparison is not supported by the standard.
        other      = dtype('float64')
        self       = array_api_strict.float32
============================================================================================== short test summary info ==============================================================================================
FAILED scipy/stats/tests/test_morestats.py::TestKstat::test_moments_normal_distribution[array_api_strict] - UserWarning: You are comparing a array_api_strict dtype against a NumPy native dtype object, but you probably don't want to do this. array_api_strict dtype objects compare unequal to their NumPy equivalents. ...
FAILED scipy/stats/tests/test_morestats.py::TestKstat::test_nan_input[array_api_strict] - UserWarning: You are comparing a array_api_strict dtype against a NumPy native dtype object, but you probably don't want to do this. array_api_strict dtype objects compare unequal to their NumPy equivalents. ...

Hopefully it will be obvious when it isn't so late!

mdhaber · 2024-05-07T02:07:30Z

Line 296 and the next few don't look like they've been converted.
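
For context, the traceback above shows `data` as a NumPy array inside `kstat` while `xp` is `array_api_strict`, which is what triggers the cross-library dtype comparison. Here is a minimal sketch of keeping the conversion and the reductions in one namespace. The helper name `power_sums` and its signature are hypothetical, and NumPy stands in for `xp` so the example runs without other backends.

```python
import numpy as np


def power_sums(data, n, xp=np, axis=0):
    """Return [sum(data**1), ..., sum(data**n)] using only the `xp` namespace."""
    # Convert with the caller's namespace instead of hard-coding np.asarray,
    # so the array and the functions applied to it come from the same library.
    data = xp.asarray(data)
    return [xp.sum(data**k, axis=axis) for k in range(1, n + 1)]


sums = power_sums(np.arange(4.0), n=2)
```

Passing a different standard-compliant namespace as `xp`, together with an array from that namespace, should then avoid mixing array types.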

j-bowhay · 2024-05-07T08:54:15Z

> Line 296 and the next few don't look like they've been converted.

Ha, yes, thanks. Temporary post-coursework-submission blindness...

mdhaber

The statistic conversion looks pretty good, but the tests still need to become @array_api_compatible. We can add tests with the intent of checking the behavior for NumPy lists, but we also need to test behavior with backends other than NumPy.

scipy/stats/tests/test_morestats.py

mdhaber

After this, would you be willing to submit another PR that cleans up the notes of these? For example:

worse is that these correspond with:

but the implementation is:

mdhaber · 2024-05-08T05:15:37Z

scipy/stats/tests/test_morestats.py

        np.random.seed(32149)
-        data = np.random.randn(12345)
-        moments = [stats.kstat(data, n) for n in [1, 2, 3, 4]]
+        data = xp.asarray(np.random.randn(12345), dtype=xp.float64)


Even torch will produce a float64 Tensor if it is generated from a NumPy float64 array, right?

Yes. I can't quite remember why I did this; I'll have a look in the follow-up.

mdhaber · 2024-05-08T05:17:10Z

scipy/stats/tests/test_morestats.py

-        data[6] = np.nan
+    def test_nan_input(self, xp):
+        data = xp.arange(10.)
+        data[6] = xp.nan


Fine for now, but it would be nice to save @lucascolley some trouble by generating these without mutation as we go forward. Is there a better way than this?

Suggested change

-        data[6] = xp.nan
+        data = xp.where(data==6, xp.nan, data)

Obviously this is fine here, but I do wonder: if we encountered mutations in the inner loop of a solver, for example, all the new allocations might be a bit painful performance-wise.

Oh, I wouldn't change existing code like that all the time. Just easy cases like these tests, where it could make the difference between the test running with JAX or not. Cases do need to be considered individually.
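
A minimal sketch of the non-mutating pattern discussed in this thread, with NumPy standing in for `xp`. The helper name `with_nan_at` is hypothetical. Wrapping the NaN in a 0-d array rather than passing a bare Python scalar to `where` is a conservative assumption, since not every backend may accept scalars there.

```python
import numpy as np


def with_nan_at(data, value, xp=np):
    """Return a copy of `data` with entries equal to `value` replaced by NaN."""
    mask = data == value
    # 0-d NaN array with the same dtype as the input.
    nan = xp.asarray(xp.nan, dtype=data.dtype)
    # `where` builds a new array, so the input is never written to.
    return xp.where(mask, nan, data)


data = np.arange(10.0)
result = with_nan_at(data, 6.0)
```

The input array is left untouched, which matters for backends with immutable arrays such as JAX; the cost is one extra allocation per call, which is the trade-off raised above for hot loops.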

ENH: stats: add array-API support to kstat/kstatvar

d7f7f03

[skip ci]

github-actions bot added the scipy.stats, CI, and enhancement labels May 3, 2024

j-bowhay commented May 3, 2024

j-bowhay mentioned this pull request May 3, 2024

ENH: stats: add array API-support #20544

Open

68 tasks

j-bowhay added 2 commits May 3, 2024 15:55

Merge branch 'main' into xp_kstat

3dc78dd

MAINT: use _get_nan + a few tidy ups

318f998

j-bowhay marked this pull request as ready for review May 3, 2024 14:58

mdhaber reviewed May 3, 2024

j-bowhay marked this pull request as draft May 3, 2024 16:57

Merge branch 'main' into xp_kstat

47fbe5a

put axis addition changes

812f5e6

[skip ci]

finish converting

5737033

j-bowhay marked this pull request as ready for review May 7, 2024 08:54

fix typos

8180862

mdhaber reviewed May 7, 2024

j-bowhay added 3 commits May 7, 2024 18:48

Merge branch 'main' into xp_kstat

a3848ac

TST: add variation to TestCommonAxis

747e654

address review comments

1166d1c

mdhaber approved these changes May 8, 2024

mdhaber reviewed May 8, 2024

mdhaber merged commit 5b7e5a0 into scipy:main May 8, 2024
30 checks passed

dschmitz89 added this to the 1.14.0 milestone May 8, 2024
