ENH: stats.moment: add array API support #20292
[skip ci]
Add a new pytest call in the array API CI workflow.
just a few comments for now
@lucascolley Thanks for the quick reviews - I'm glad you're doing that! Just know that you will find a lot of things if you get to it before I am able to review my own code on GitHub. Seeing just the diff highlights these things, which I might not be so careful about when drafting the code. Re: importing I think this will be common enough that it does make sense to have a custom mark like the original
@rgommers you originally wanted this marker removed, but I think it is the best way to deal with tests that involve masked/object/
I'm thinking about what will happen if and when we make the behavior proposed in gh-18286 the default. If something is simply skipped now, it may silently stop being tested or be blocking for changing the default behavior. I see that I already anticipated the issue with masked array support in
For the current behavior, yes it seems like there should be a way to skip tests that don't comply with the above. Ideally in a way that makes it easy to move over to the new defaults without anything going wrong silently.
I think I'm not seeing the problem. Suppose there were a decorator or mark Immediately, when the function is called with array API incompatible input:
Sometime before the behavior proposed in gh-18286 becomes the default, functions that currently accept array API invalid input need to begin emitting deprecation warnings when they receive array API invalid input. This can be a separate effort from adding the array API functionality, and it can be done after adding array API functionality. To be honest, At this point, when the function is called with array API incompatible input:
When we decide to pull the trigger and make the behavior proposed in gh-18286 (or something similar) the default and only behavior, we remove the decorator. Tests to which the decorator has been applied will begin to fail, but they are no longer relevant, so we can remove them. If there is something in these tests that is still relevant, we move those parts into separate tests. Maybe that should be done when the decorator is applied as I've already done here with the I don't necessarily mean for this to be a real proposal, but hopefully it's explicit enough about a potential path forward so we can identify problems with having a Footnotes
I think what Matt has written makes sense.
+1, I think we can word the warnings appropriately to avoid the concern in your footnote.
[skip cirrus] [skip circle]
if xp.isdtype(a.dtype, 'integral'):
    a = xp.asarray(a, dtype=xp.float64)
When skew, kurtosis, and any other functions that rely on _moment are updated, this can be removed, since it's already in the public function.
Thanks Matt for thinking it through. I agree, that should work as a general strategy. The proposed name should work, but some hints as to why would be helpful, at least in the first usage in a set of functions. If we don't want specific decorators, then something like:
@skip_if_array_api  # uses masked arrays
It's so much easier than reading the code to deduce what the reason for a skip is.
Thanks @rgommers @lucascolley. Does the reason need to be more specific than that the tested function is passed an array API invalid input, or do all the different types of invalid inputs used need to be specified? If the former, would And just to double check, should this be a decorator that applies a mark so that there is potential for the decorator to do more in the future (e.g. look for deprecation warnings, if that's possible)?
That sounds good to me. In most cases it'll be clear enough with the test code right below. In case it's non-obvious (e.g. function generates object array internally), an extra comment as to why may be helpful.
I think so - that's the most efficient way of doing it anyway, right?
Definitely. The only question is whether we would want to be more fine grained when looking for deprecation warnings. Some tests will have multiple calls to the function, and the decorator would not distinguish between the deprecation warning being emitted once or multiple times. But I think it would be safe enough to assume that all of them emit the warning if one of them does, especially if the warning is actually emitted by something central like
Agreed, that seems safe enough to assume.
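For readers following the discussion above, here is a minimal sketch of what a decorator that applies a pytest mark might look like. It is not the code in this PR: the flag name ARRAY_API_ENABLED, the way it is read from the SCIPY_ARRAY_API environment variable, and the `reason` keyword are all assumptions made for illustration. Only `pytest.mark.skipif` is an actual pytest API.

```python
import os

import pytest

# Assumption: array API mode is toggled by an environment variable.
# The parsing below is illustrative, not SciPy's actual logic.
ARRAY_API_ENABLED = os.environ.get("SCIPY_ARRAY_API", "0") not in ("", "0")


def skip_if_array_api(reason="array API invalid input"):
    """Hypothetical decorator: skip a test when array API mode is on.

    Because it wraps a pytest mark, it could later be extended to do
    more (for example, check for deprecation warnings) without having
    to touch every decorated test.
    """
    def decorator(test_func):
        # Attach a standard skipif mark carrying the human-readable reason.
        return pytest.mark.skipif(ARRAY_API_ENABLED, reason=reason)(test_func)
    return decorator


@skip_if_array_api(reason="uses masked arrays")
def test_masked_input():
    # Placeholder body; a real test would call the function under test.
    assert True
```

Passing the reason as an argument keeps the hint next to the test, which is the point raised above about not having to read the test body to work out why it is skipped.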
inexact = (xp.isdtype(a.dtype, "real floating")
           or xp.isdtype(a.dtype, "complex floating"))
if inexact:
    # The summation method avoids creating another (potentially huge) array
However, it can introduce NaNs where none existed before (see gh-20386). I propose getting rid of it and always using the other method: contains_nan = xp.isnan(xp.sum(a)).
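As an illustration of how a sum can produce a NaN from NaN-free input, the sketch below compares a summation-based check with an element-wise check on an array holding both +inf and -inf. It uses plain NumPy rather than the `xp` namespace, the helper names are hypothetical, and whether this is exactly the case reported in gh-20386 is an assumption.

```python
import numpy as np


def contains_nan_by_sum(a):
    # Summation check: one reduction, no temporary boolean array.
    # inf + (-inf) is NaN under IEEE 754, so this can report a NaN
    # even though no element of `a` is NaN.
    with np.errstate(invalid="ignore", over="ignore"):
        return bool(np.isnan(np.sum(a)))


def contains_nan_elementwise(a):
    # Element-wise check: allocates a boolean array the size of `a`,
    # but only reports NaNs that are actually present.
    return bool(np.any(np.isnan(a)))


no_nan = np.array([np.inf, -np.inf, 1.0])  # infinities, but no NaN
has_nan = np.array([1.0, np.nan, 2.0])     # a genuine NaN

sum_on_no_nan = contains_nan_by_sum(no_nan)
elementwise_on_no_nan = contains_nan_elementwise(no_nan)
sum_on_has_nan = contains_nan_by_sum(has_nan)
elementwise_on_has_nan = contains_nan_elementwise(has_nan)
```

The two checks agree on `has_nan` and disagree on `no_nan`, where only the summation check reports a NaN.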
@lucascolley besides incorporating a fix for gh-20386, is there anything else you'd like to see here?
almost there!
error_type = TypeError if SCIPY_ARRAY_API else ValueError
assert_raises(error_type, stats.shapiro, np.array([[], [2]], dtype=object))
👍 I think this is the right pattern to use
scipy/_lib/tests/test__util.py
- from scipy._lib._array_api import xp_assert_equal
+ from scipy._lib._array_api import xp_assert_equal, is_numpy, copy as xp_copy
I'm not opposed to changing the names of the helpers themselves to xp_copy and xp_size if that would be easier than having these aliases everywhere. size is just exposed from array_api_compat (no wrapper), and copy was written by Andrew.
I would not push for it across the board, but it doesn't sound like a bad idea.
OK, I think the last commit addresses the comments. I will change #20292 (review) in a separate PR once this is merged, as that issue really is orthogonal.
looks good from my POV, thanks Matt!
Thanks very much @lucascolley!
This looks great 👍 The implementation looks correct, as do the tests and the Array API changes, and Lucas checked that part thoroughly 🙌
Feel free to go ahead @lucascolley
Thanks @lucascolley @rgommers @tupui!
Reference issue
Towards gh-18867
What does this implement/fix?
Adds array API support to stats.moment.