Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT/TST: fft: remove xp backend skips, test fftfreq device #19900

Merged
merged 10 commits into from
May 18, 2024

Conversation

lucascolley
Copy link
Member

@lucascolley lucascolley commented Jan 17, 2024

Reference issue

Towards gh-17278 and gh-19257

What does this implement/fix?

numpy/numpy#25317 has allowed us to remove some skips.

Additional information

@github-actions github-actions bot added scipy.cluster scipy.fft maintenance Items related to regular maintenance tasks labels Jan 17, 2024
@rgommers
Copy link
Member

I'm pretty sure we get new fft failures with numpy main, and the latest released numpy doesn't contain array_api.fft yet. So this seems a little premature.

@lucascolley lucascolley marked this pull request as draft January 17, 2024 17:26
@lucascolley
Copy link
Member Author

lucascolley commented Jan 17, 2024

Fair - with NumPy nightly all tests pass for python dev.py test -s cluster -s fft -b all with these skips removed (and a workaround for the api_version=None thing).

Edit: let's keep this as draft until NumPy 2.0 is out.

@lucascolley lucascolley added the array types Items related to array API support and input array validation (see gh-18286) label Jan 17, 2024
@lucascolley lucascolley removed the request for review from peterbell10 January 21, 2024 12:29
@lucascolley lucascolley changed the title MAINT: fft: remove numpy.array_api skips MAINT: fft: remove array_api_strict skips Mar 5, 2024
@lucascolley lucascolley marked this pull request as ready for review March 5, 2024 21:29
@lucascolley
Copy link
Member Author

lucascolley commented Mar 6, 2024

Brainstorming how we should go about testing the device parameter. If we explicitly set SCIPY_DEVICE, then we're probably fine to just test with that, and the onus is on the person using the CLI / writing the workflow to use a device which the backends can use.

The question is what to do when SCIPY_DEVICE is not set. NumPy wants 'cpu', while array-api-strict wants 'CPU_DEVICE' can only accept None I think. Maybe SCIPY_DEVICE should remain None if not explicitly set, then we handle that case in setting the default device.

@tylerjereddy
Copy link
Contributor

Brainstorming how we should go about testing the device parameter. If we explicitly set SCIPY_DEVICE, then we're probably fine to just test with that, and the onus is on the person using the CLI / writing the workflow to use a device which the backends can use.

I'm trying to wrap my head around that. Maybe I'm missing something, but given that not all backends are guaranteed to have a mechanism for setting a global default device, how does the current diff here not pose a challenge to handling parametrized device scenarios across backends that do and do not support that mechanism? Or are we just deferring that until it actually happens with a backend we support?

@lucascolley
Copy link
Member Author

the diff here is just deferring yes in a sense. It should probably be reverted, along with adding actual testing of the device parameter, once we figure out how to make it work. (The reason for removal for now is that array-api-strict is not happy with device='cpu')

@lucascolley
Copy link
Member Author

I'll return to this after gh-20085 is in

@lucascolley lucascolley marked this pull request as draft March 22, 2024 15:13
@lucascolley lucascolley changed the title MAINT: fft: remove array_api_strict skips MAINT: fft: remove xp backend skips, test fftfreq device Mar 26, 2024
@lucascolley
Copy link
Member Author

lucascolley commented Mar 26, 2024

Update here:

  • I've revamped the {r}fftfreq tests - introducing a new helper to get available devices, and testing with all of them.
  • I've removed a bunch of skips that are no longer necessary thanks to work on the array-api-compat side.
  • I've made the multithreading tests CPU-only. Let's try to get your test runs down to 0 fails in this PR @tylerjereddy :)

@lucascolley lucascolley marked this pull request as ready for review March 26, 2024 22:14
Comment on lines 210 to 211
def is_array_api_strict(xp):
return xp.__name__ == 'array_api_strict'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this during development but didn't end up using it. Happy to keep or remove it.

scipy/_lib/_array_api.py Outdated Show resolved Hide resolved
@lucascolley lucascolley added this to the 1.14.0 milestone Mar 26, 2024
Copy link
Contributor

@tylerjereddy tylerjereddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did check that full suite is passing for me locally on this branch:

SCIPY_DEVICE=cuda python dev.py test -j 32 -b all

48971 passed, 10621 skipped, 153 xfailed, 13 xpassed in 53.57s

vs. this on main: 4 failed, 48925 passed, 10658 skipped, 150 xfailed, 13 xpassed in 58.39s

I wasn't sure about some of the shims in test_helper though

x2 = xp.asarray([0, 1, 2, 3, 4, 5], dtype=xp.float64)

y = 9 * fft.rfftfreq(9, xp=xp_test)
assert array_namespace(y) is xp_test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's going on with the new xp_test thing above and below and this separate call instead of using the namespace checks we built in to xp_assert_.. functions? We had a way to disable the dtype checks as well I thought, so I guess I'm trying to wrap my head around the extra activity/coercion/checks.

If CuPy device handling is the issue should we just skip it for now instead of this dance?

 class TestRFFTFreq:
 
+    @skip_xp_backends("cupy",
+                      reasons=["no device arg support"])
     def test_definition(self, xp):
-        xp_test = array_namespace(xp.empty(0))
-
         x = xp.asarray([0, 1, 2, 3, 4], dtype=xp.float64)
         x2 = xp.asarray([0, 1, 2, 3, 4, 5], dtype=xp.float64)
 
-        y = 9 * fft.rfftfreq(9, xp=xp_test)
-        assert array_namespace(y) is xp_test
-        y = xp_test.astype(y, xp.float64)
-        xp_assert_close(y, x)
+        y = 9 * fft.rfftfreq(9, xp=xp)
+        xp_assert_close(y, x, check_dtype=False, check_namespace=True)
 
-        y = xp_test.astype(9 * xp.pi * fft.rfftfreq(9, xp.pi, xp=xp_test), xp.float64)
-        xp_assert_close(y, x)
+        y = 9 * xp.pi * fft.rfftfreq(9, xp.pi, xp=xp)
+        xp_assert_close(y, x, check_dtype=False)
 
-        y = xp_test.astype(10 * fft.rfftfreq(10, xp=xp_test), xp.float64)
-        xp_assert_close(y, x2)
+        y = 10 * fft.rfftfreq(10, xp=xp)
+        xp_assert_close(y, x2, check_dtype=False)
 
-        y = xp_test.astype(10 * xp.pi * fft.rfftfreq(10, xp.pi, xp=xp_test), xp.float64)
-        xp_assert_close(y, x2)
+        y = 10 * xp.pi * fft.rfftfreq(10, xp.pi, xp=xp)
+        xp_assert_close(y, x2, check_dtype=False)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's going on with the new xp_test thing above and below and this separate call instead of using the namespace checks we built in to xp_assert_.. functions? We had a way to disable the dtype checks as well I thought, so I guess I'm trying to wrap my head around the extra activity/coercion/checks.

I suppose we can just turn off the dtype check yeah. I was trying to check both the namespace and the dtype at once, but maybe we just introduce a separate test_dtypes.

I don't think we need to skip this for CuPy - we may need to skip the new test_device tests, if we decide to include CuPy support in the new helper.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, in local testing, CuPy was failing because of device pass-through I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I'm away from my GPU, so feel free to post the errors. Or I can check later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the modified version of the test above, I see the failure below with the skip removed. My assumption was that all the extra shims in the test were guarding against the CuPy device stuff, so I think that's what I'm getting confused about re: performing testing gymnastics vs. actually just providing a support shim in the source proper if we want to test that functionality.

scipy/fft/tests/test_helper.py:431: in test_definition
    y = 9 * fft.rfftfreq(9, xp=xp)
        self       = <scipy.fft.tests.test_helper.TestRFFTFreq object at 0x7fc490f214d0>
        x          = array([0., 1., 2., 3., 4.])
        x2         = array([0., 1., 2., 3., 4., 5.])
        xp         = <module 'cupy' from '/home/treddy/python_venvs/py_311_scipy_dev/lib/python3.11/site-packages/cupy/__init__.py'>
scipy/fft/_helper.py:216: in rfftfreq
    return xp.fft.rfftfreq(n, d=d, device=device)
E   TypeError: rfftfreq() got an unexpected keyword argument 'device'
        d          = 1.0
        device     = None
        n          = 9
        xp         = <module 'cupy' from '/home/treddy/python_venvs/py_311_scipy_dev/lib/python3.11/site-packages/cupy/__init__.py'>
===================================================================================== short test summary info =====================================================================================
FAILED scipy/fft/tests/test_helper.py::TestRFFTFreq::test_definition[cupy] - TypeError: rfftfreq() got an unexpected keyword argument 'device'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, we need to use xp=xp_test, see the comment I posted over in the JAX PR:

It's exposed that how we handle creation functions like fftfreq still isn't quite right. We either need to:

  1. require that the namespace passed to the xp kwarg is array API compatible, or
  2. generate a compatible namespace in fftfreq based on the passed namespace.

I've opted for (1), since that is what we want eventually. I figure it's better to have the temporary shim in the tests (where if it stays the inefficiency doesn't matter), rather than in the public function where it would have to be changed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the part I think is confusing/debatable, and I'm not sure I agree just yet. If CuPy isn't compatible with the API here, why don't we skip until it is?

If other folks actually prefer the approach you've used here, I think comment(s) are needed to explain, because these test shims are messy to understand.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I suppose that is option (3) then. I think I was tunnel visioned on getting the test to pass (to at least check that our implementation is correct), but now that I'm confident that what we have is correct, I think you're right that we should just skip.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reworked based on your comments, let me know what you think

scipy/fft/tests/test_helper.py Show resolved Hide resolved
@lucascolley
Copy link
Member Author

lucascolley commented Apr 11, 2024

I haven't had any luck with help from the CuPy gitter on how to include CuPy in get_xp_devices. Otherwise, this should be ready.

EDIT: all done now

[skip circle] [skip cirrus]
@lucascolley lucascolley force-pushed the npx-fft branch 2 times, most recently from ee37195 to eff29d7 Compare April 22, 2024 17:52
@lucascolley lucascolley changed the title MAINT: fft: remove xp backend skips, test fftfreq device MAINT/TST: fft: remove xp backend skips, test fftfreq device May 18, 2024
[skip cirrus] [skip circle]
@lucascolley
Copy link
Member Author

JAX is happy now

[skip circle] [skip cirrus]
Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. It resolves some failures with CuPy for the threadsafe tests.

I pushed a few tweaks to get_xp_devices. Adding type annotations to functions in this file is quite useful to declare intent and catch potential issues.

JAX is happy now

Did you not add JAX support in get_xp_devices on purpose or by accident?

@lucascolley
Copy link
Member Author

Did you not add JAX support in get_xp_devices on purpose or by accident?

Ah, that remains to be done.

scipy/_lib/_array_api.py Outdated Show resolved Hide resolved
[skip cirrus] [skip circle]
@lucascolley
Copy link
Member Author

I've added JAX to get_xp_devices, but I haven't tested whether it works. Please test 🙏

@rgommers
Copy link
Member

Ah, CI is unhappy though:

FAILED scipy/fft/tests/test_helper.py::TestFFTFreq::test_device[torch] - RuntimeError: Expected one of cpu, cuda, ipu, xpu, mkldnn, opengl, opencl, ideep, hip, ve, fpga, ort, xla, lazy, vulkan, mps, meta, hpu, mtia, privateuseone device type at start of device string: c

@lucascolley
Copy link
Member Author

Ah, CI is unhappy though:

yep, rookie error, should be fixed now

scipy/_lib/_array_api.py Outdated Show resolved Hide resolved
Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything is happy now. In it goes, thanks @lucascolley and @tylerjereddy

@rgommers rgommers merged commit de48c74 into scipy:main May 18, 2024
@lucascolley lucascolley deleted the npx-fft branch May 18, 2024 20:46
@lucascolley
Copy link
Member Author

thanks both!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
array types Items related to array API support and input array validation (see gh-18286) maintenance Items related to regular maintenance tasks scipy.fft
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants