
Test nightly wheel build with NumPy 2.0 #7288

Merged: 45 commits, Feb 22, 2024

Conversation

lagru (Member) commented Jan 11, 2024

Description

Concerns #7282. matplotlib pulls in contourpy and contourpy pins numpy<2. So for now, remove matplotlib as a testing dependency and assert that NumPy 2.0 is used during the tests for the nightly wheels.

I'm currently also piling on a few fixes that we will eventually need for compatibility with NumPy 2.0.

Modules passing locally for NumPy 1 & 2 (skipping some dependencies):

  • skimage._shared
  • skimage.color
  • skimage.data
  • skimage.draw
  • skimage.exposure
  • skimage.feature
  • skimage.filters
  • skimage.future
  • skimage.graph
  • skimage.io
  • skimage.measure
  • skimage.metrics
  • skimage.morphology
  • skimage.registration
  • skimage.restoration
  • skimage.segmentation
  • skimage.transform
  • skimage.util

References:

Checklist

Release note

Summarize the introduced changes in the code block below in one or a few sentences. The
summary will be included in the next release notes automatically:

...

matplotlib pulls in contourpy and contourpy pins numpy<2. So for now,
remove matplotlib as a testing dependency and assert that NumPy 2.0
is used during the tests for the nightly wheels.
@lagru lagru added 🔧 type: Maintenance Refactoring and maintenance of internals ⬆️ Upstream Needs help from or involves an upstream project labels Jan 11, 2024
skimage/draw/draw.py (outdated, resolved)
Unfortunately we relied on NumPy's lookfor which is promptly removed in
NumPy 2.0. Technically, if we don't want to make a breaking release we
need to deprecate the function in a deprecation cycle for which we need
to vendor NumPy's lookfor.

Though, skimage.lookfor is an interactive function, so we might get away
with not vendoring NumPy's lookfor, and just making it return a
deprecation warning...
instead of deprecated `sctype2char`. Also np.core is flat out
deprecated. Seems weird that I can just remove obj2sctype in _convert
while it still keeps working. See also
numpy/numpy#25580.
import sys

from .._shared.utils import deprecate_func
from .._vendored.numpy_lookfor import lookfor as _lookfor
Member Author:

Highlighting this; see the commit message above for context.
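For illustration, a minimal sketch of the non-vendoring fallback floated in the commit message above: keep the name around but only emit a deprecation warning. The message text and warning category here are assumptions, not skimage's actual implementation.

```python
import warnings

def lookfor(what):
    """Deprecated stub; sketch only, not the actual skimage code."""
    warnings.warn(
        "`skimage.lookfor` is deprecated because NumPy 2.0 removed "
        "`np.lookfor`; please use the online documentation search instead.",
        FutureWarning,
        stacklevel=2,
    )
```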

Contributor:

If it's annoying, I guess we could also vendor it back. Does anyone still use lookfor? As an interactive function, it's probably only used by a few old-school power users (I never used it), hence the feeling that we can probably get away with removing it (and no actual program can fail).

Member:

Is there an alternative for "find a function with x in the docstring"? I guess you'd need the sphinx docs, but those aren't always there on the airplane.

Contributor:

Fair, I am not sure if some IDE has search features, which would be nice. It still seems odd in NumPy to me and probably very rarely used, but if you prefer keeping it for now, I am fine with that too.

Member Author:

I think most IDEs only take the function / method / class name into account when matching for a search string. Or you can do a full search that includes more than just docstrings. @seberg do you have an opinion on how much maintenance work this has been?

Of all discussed options, I prefer NumPy to just keep it around, as that's the simplest option from our side. However, I'd totally get why you might want to remove lookfor. In that case I'd vote for the Scientific Python package option or removing it altogether. I intend vendoring to be a temporary solution only.

Contributor:

I am not sure it has been maintenance work, so if there is a PR to restore it I won't mind merging it.
To me it was mostly about namespace clutter, as I don't think typical users use it. IIRC it also returns some nonsense results (i.e., there are things that are covered much better in the HTML docs, but it doesn't find those, while it did find docs belonging to just-deleted functionality).

Member Author:

Hmm, that it's returning nonsense results doesn't give me a lot of confidence. I still think it's better placed in a small stand-alone tool, so I probably won't make a PR adding it back to NumPy. ;)

Member:

Realizing @Carreau may also have thought about this problem.

Contributor:

I don't have any particular solution.
I plan to have full documentation search with papyri at some point; we can even do some indexing, but it's not ready yet.

Member Author:

Then let's keep it until the status quo changes.

As of NumPy 2.0, can_cast no longer supports Python scalars. I think
`np.result_type` [1] still takes values into account when determining
the resulting type because `np.min_scalar_type` is used under the hood.
So we should be able to get away with comparing whether the resulting
type is the same as the target type `arr.dtype`.

[1] https://numpy.org/devdocs/reference/generated/numpy.result_type.html
`np.finfo(dtype).eps` will return a scalar with the same `dtype`.
if alpha is None:
    alpha = alpha_max

if not np.can_cast(alpha, arr.dtype):
Member Author:

Highlighting this bit. As of NumPy 2.0, np.can_cast no longer supports Python scalars. I think np.result_type still takes values into account when determining the resulting type because np.min_scalar_type is used under the hood. So we should be able to get away with comparing whether the resulting type is the same as the target type arr.dtype.

@seberg, taking the liberty to Cc you here. 🙏 I'm sure I haven't grokked NEP 50 in its entirety yet.

Contributor:

No, that isn't correct; result_type of course does not take the value into account.

It ignores the values (same as if they were 0) for Python complex/float/ints. For out-of-bound integers that would mean the np.full call below should fail if the value is out of range, though. result_type, I guess, still rejects floats if the array is an integer one.

Member Author (lagru, Jan 14, 2024):

Oh, you seem to be right, thanks for the feedback! I think I was led astray by this bit from the docstring

Otherwise, min_scalar_type is called on each scalar, and the resulting data types are all combined with promote_types to produce the return value.

Concerning np.full, do I misunderstand you? It seems like np.full happily overflows out-of-bounds integers. E.g.

np.full(1, -1, dtype=np.uint8)
# returns array([255], dtype=uint8)
np.full(1, 300, dtype=np.uint8)
# returns array([44], dtype=uint8)

Is there an alternative solution in NumPy for what np.can_cast formerly did here? I guess I could just check if the cast value is equal to the original value (54a901f). I imagine the performance impact will be low, as I expect alpha to be a scalar usually.
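To sketch what such a round-trip check might look like (the helper name is made up for illustration; this is not the actual diff in 54a901f):

```python
import numpy as np

def value_fits_dtype(value, dtype):
    # Stand-in for the removed value-based `np.can_cast(value, dtype)`:
    # cast the scalar and check that the round trip preserved the value.
    casted = np.asarray(value).astype(dtype)
    return bool(casted == value)

print(value_fits_dtype(200, np.uint8))  # True
print(value_fits_dtype(300, np.uint8))  # False, wraps to 44
print(value_fits_dtype(1.5, np.uint8))  # False, truncates to 1
```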

Member Author:

Also curious about np.min_scalar_type(np.iinfo(np.int64).max) returning dtype('uint64'). I would have expected np.int64 but I guess the promotion path reaches np.uint64 first?

Contributor:

Argh, you are right, full is weird. np.array(value, dtype=dtype) would work, or assignment, but full unfortunately does np.asarray(value) and then assigns.
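The asymmetry can be demonstrated directly (the OverflowError branch reflects NumPy 2 behavior; older NumPy only warned or wrapped):

```python
import numpy as np

# np.full goes through np.asarray(value) and then assigns, so an
# out-of-range Python int wraps silently:
print(np.full(1, 300, dtype=np.uint8))  # [44]

# np.array(value, dtype=dtype) checks the value; NumPy 2 raises here:
try:
    np.array(300, dtype=np.uint8)
    print("wrapped (older NumPy behavior)")
except OverflowError:
    print("OverflowError (NumPy 2 behavior)")
```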

stefanv (Member) commented Jan 16, 2024

I wonder if it is worth refactoring lookfor into a package under SP?

@lagru lagru mentioned this pull request Jan 17, 2024
2 tasks
Fixing this problem for NumPy 1 & 2 has been a bit tricky but I think
the current solution should catch and deal with every numeric dtype
that a user may use for `tolerance`.
Comment on lines 275 to 287
tolerance = abs(tolerance)
# Account for over- & underflow problems with seed_value ± tolerance
# in a way that works with NumPy 1 & 2
min_value, max_value = numeric_dtype_min_max(seed_value.dtype)
with np.errstate(over="raise", under="raise"):
    try:
        low_tol = max(min_value, seed_value - tolerance)
    except (OverflowError, FloatingPointError):
        low_tol = min_value
    try:
        high_tol = min(max_value, seed_value + tolerance)
    except (OverflowError, FloatingPointError):
        high_tol = max_value
Member Author (lagru, Jan 18, 2024):

Some explanation: in NumPy 2.0 we can no longer rely on

high_tol = min(max_value, seed_value + tolerance)
low_tol = max(min_value, seed_value - tolerance)

to correctly bound to the min and max of the dtype. seed_value may be a np.uint8 in which case adding e.g. 379 to it will result in an OverflowError.

with np.errstate(over="raise", under="raise") is necessary because in NumPy 2 cases like

low_tol = max(0, np.uint8(2) - 3)

now underflow to np.uint8(255) because the 3 will be cast to np.uint8 before the subtraction. In NumPy 1, the right side would result in np.int64(-1).

Though, I may be wrong. I find it very tricky to think of and test for all edge cases...

Member:

Yes, but you will get a RuntimeWarning, which should make our test suite error out.

Easiest in this case may be max(0, int(x) - 3).
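A quick illustration of why going through a Python int sidesteps the NEP 50 change:

```python
import numpy as np

seed_value = np.uint8(2)

# NumPy 2 (NEP 50): np.uint8(2) - 3 stays uint8 and wraps to 255 (with a
# RuntimeWarning); NumPy 1 promoted it to int64(-1). Casting to a Python
# int first gives the same answer on both:
low_tol = max(0, int(seed_value) - 3)
print(low_tol)  # 0
```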

Member Author:

Casting to the appropriate Python scalar seems indeed like the easiest fix. 👍

Forgot to do this in a previous commit.
This option should have no bearing on the precision with which results
are compared by `assert_array_almost_equal`. The parameter decimal=10
should be used for that purpose.

This option also had unwanted side effects, because it changed the
global display option globally, leading to problems in later doctest
comparisons.

Note that changing the comparisons in this test to use
`assert_array_almost_equal(..., decimal=10)` fails the tests.
Previously the returned `unique_inverse` array was always flattened
which seems to be no longer the case for NumPy 2. However, keep the
output flattened as before.

This supersedes commit 8e693ab which
provided a fix that didn't work on NumPy<2.
For a correct dtype comparison, we must specify the "endianness" as
well. Comparing with img.dtype.type == np.uint16 would also work.

This updates commit 2e2deb9 which
provided a fix that didn't work on NumPy<2.
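A small demonstration of the byte-order pitfall described above, constructing a non-native dtype explicitly so it behaves the same on any machine:

```python
import sys

import numpy as np

# Pick the byte order that is non-native on this machine:
non_native = ">u2" if sys.byteorder == "little" else "<u2"
img = np.zeros(3, dtype=non_native)

print(img.dtype == np.uint16)       # False: same type, different byte order
print(img.dtype.type == np.uint16)  # True: compares the scalar type only
```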
lagru (Member Author) commented Feb 16, 2024

Can someone tell me why

np.uintc, # 16 or 32 or 64 bits

is used instead of np.int32 to build the list of _supported_types? It seems like on AMD64 np.intc isn't np.int32, which leads to failing tests that try to use our convert machinery with np.int32. Not sure if this failure is new with NumPy 2 or something we didn't notice before that was already a problem with NumPy<2.

This function emits its own warning if the given alpha isn't compatible
with the given image dtype. So ignore the one raised by NumPy.
@lagru lagru marked this pull request as ready for review February 16, 2024 20:51
stefanv (Member) commented Feb 17, 2024

Can someone tell me why

np.uintc, # 16 or 32 or 64 bits

is used instead of np.int32 to build the list of _supported_types?

#3043 (comment)

My guess is it'd be fine to switch to int32.

Maybe we can also update that confusing comment above _integer_types, once we understand what it says again ;)

I'm not sure what the original intention of this test was but Stéfan
suggests that asserting np.uint16 is enough and I tend to concur.
lagru (Member Author) commented Feb 19, 2024

I don't really follow. In #3043 (comment) it is explained that

These map directly to the underlying C types, char, short, int, long, long long.
Every other type is just an alias to one of these (ie, np.t1 is np.t2).

But it seems that this is not true for NumPy 2 on AMD64 and Windows. According to the tests np.int32 is clearly missing from the _supported_dtypes, so none of the supposed underlying C types is an alias of np.int32.

Furthermore, I don't get why the "underlying C types" are of relevance here. _convert seems to me like a function that allows converting between dtypes with automatic scaling. So it seems way more explicit to use np.int8, np.int32, np.int64, etc. here, like we do for the floating types.

Additionally, while np.int64 is covered by np.longlong, _convert(np.linspace(-1, 1)) overflows for the value 1.
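The distinction can be checked directly: the fixed-width names pin the item size on every platform, while the C-named types are platform-dependent aliases (whether they coincide with the fixed-width names below is the common case, not a guarantee):

```python
import numpy as np

# Fixed-width names are unambiguous on every platform:
print(np.dtype(np.int32).itemsize)  # 4
print(np.dtype(np.int64).itemsize)  # 8

# C-named types are platform-dependent aliases, so whether e.g.
# np.intc is np.int32 (or np.longlong is np.int64) can differ:
print(np.intc is np.int32, np.longlong is np.int64)
```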

in _convert. These should be more explicit and should be more reliable
in ensuring that for example int32 is available on all platforms.
skimage/_shared/dtype.py (outdated, resolved)
@lagru lagru force-pushed the maintenance/ci-n-prep-numpy2 branch from df4e634 to 2617ab1 Compare February 19, 2024 17:32
@lagru lagru force-pushed the maintenance/ci-n-prep-numpy2 branch from 2617ab1 to 76f9018 Compare February 19, 2024 17:43
Looks like that serves our purpose perfectly.
While these might be useful to define in one place to reuse in tests,
keep the PR focused for now.
This reverts commit 6e39a2e.
@@ -26,16 +26,6 @@ Use the ``quick search`` field in the navigation bar of the online
documentation to find mentions of keywords (segmentation,
rescaling, denoising, etc.) in the documentation.

API Discovery
Member:

Why is this taken out, if we still support it?

Member Author:

Right now lookfor is deprecated and its removal scheduled for 0.26. While that is so, I think it makes sense to remove this section. It's also in the TODO.txt.

To be honest, I forgot that it was marked as deprecated while the discussion in #7288 (comment) happened. I don't mind reverting the deprecation and this in a PR if you would like to see it stay...

jarrodmillman (Contributor) left a comment:

LGTM. If we run into any issues, we can fix them in a follow-up PR.

stefanv (Member) left a comment:

I think we should merge this and act on any further concerns as they arise. I did make a few comments, but let's discuss those for potential follow-up.

@@ -137,6 +137,9 @@ def try_all_threshold(image, figsize=(8, 5), verbose=True):

Examples
--------
.. testsetup::
Member:

Any way to get these out of the docstrings and into a configuration file, e.g., if we used doctest++?

Member Author:

doctest-plus doesn't have an option to do so in a configuration file, I think.

See #7289 (comment) for background on the current pattern. Like @soupault I actually like the explicitness of this and didn't like that the other option using __doctest_skip__ is hiding that doctests are skipped.

Note that these directives don't show up in our rendered HTML docstrings.

What do you think?

Contributor:

You could add something like

needs_matplotlib = [
    "draw/draw.py",
    "filters/thresholding.py",
    ...
]

collect_ignore = []
try:
    import matplotlib
except ImportError:
    collect_ignore += needs_matplotlib

to skimage/conftest.py.

Member:

It would be nice not to have the docstrings littered with statements that make little sense to new readers (for whom these are most useful).

Member Author:

Hmm, I see the value in that. .. testsetup might appear often enough via help(some_func) and using IPython's ? that it's a relevant concern.

I'd support an alternative solution if it makes it visible that a test is skipped. Otherwise it might get really confusing for contributors. Imagine them not having installed an optional dependency and wondering why the CI is showing red for something they can't reproduce because the doctest is just silently skipped by some collect_ignore somewhere they have no idea about.

I can't get the solution suggested by @jarrodmillman to work reliably, it seems like doctests are still collected. Furthermore, it and collect_ignore_glob can only ignore entire files or directories.

Contributor:

We've been using it for NetworkX reliably for several years. Can you describe how it is unreliable?

I don't think it has been confusing for contributors. We have had to explain how to skip tests to some contributors, but I suspect that will be true of any solution.

It does ignore entire files or directories. That hasn't been an issue for NetworkX.

Member Author:

Can you describe how it is unreliable?

Trying that out locally just doesn't seem to work. E.g. if I add "restoration/j_invariant.py", to

collect_ignore = [
    "io/_plugins",

and run

spin test -- -v --doctest-plus skimage/restoration/j_invariant.py

it's still picked up. I'm guessing it's because I am somehow using the wrong pattern? I couldn't figure out one that works to skip the module.

Also in case of skimage/restoration/j_invariant.py only the doctest of denoise_invariant needs to be skipped if pywt is not available. Skipping the entire module would skip two doctests unnecessarily.

If we can make skipping in doctestplus more visible, then I think we can satisfy everyone. I'm currently looking into it in scientific-python/pytest-doctestplus#246...

@@ -40,7 +40,9 @@ def test_overrange_tolerance_float():
     image *= max_value

     expected = np.ones_like(image)
-    output = flood_fill(image, (0, 1), 1.0, tolerance=max_value * 10)
+    with np.errstate(over="ignore"):
+        tolerance = max_value * 10
Member:

Hrm, is this safe?

Member Author (lagru, Feb 23, 2024):

Good catch, I used the wrong approach to "fix" this as previously the result was using Python scalars while now it results in an np.float32(inf) which is not the same as before. I'll address this in a follow-up PR. 👍

>>> tform_matrix.params  # doctest: +FLOAT_CMP
array([[-0.2178588368, 0.4192819131, -0.0343074756],
       [-0.0717941428, 0.0451643229, 0.0216072614],
       [ 0.2480621133, -0.4294781423, 0.0221019139]])
Member:

Again, can't this be a global doctest option?

Member Author (lagru, Feb 23, 2024):

There's the option in setup.cfg. Not fond of adding a setup.cfg again just for this.

I think the +FLOAT_CMP is actually not necessary. I added this before I discovered that another test was changing the precision for floating point comparisons. So I'll remove it in a follow-up PR.

Though, in cases like this I would still prefer the more explicit local option, unless we have a reason not to use the default precision globally?

Member:

A global config should be sufficient, but fine with having exceptions defined this way.

Comment:

For completeness, saimn pointed out in scientific-python/pytest-doctestplus#185 that pyproject.toml is supported since scientific-python/pytest-doctestplus#222

@stefanv stefanv merged commit 3a66e0b into main Feb 22, 2024
66 checks passed
@stefanv stefanv deleted the maintenance/ci-n-prep-numpy2 branch February 22, 2024 17:57
@stefanv stefanv added this to the 0.23 milestone Feb 22, 2024
stefanv (Member) commented Feb 22, 2024

Thank you, @lagru, that was a big job!
