Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build package for Pyodide out of tree #7265

Closed
wants to merge 7 commits into from

Conversation

lagru
Copy link
Member

@lagru lagru commented Dec 11, 2023

Description

Attempt to build our own wheel for the Pyodide distribution. If this proves "easy" we could test against it regularly and catch errors before releases. It would also make it easier to include JupyterLite in our HTML docs.

Related resources:

cc @stefanv

Checklist

Release note

Summarize the introduced changes in the code block below in one or a few sentences. The
summary will be included in the next release notes automatically:

...

@lagru lagru added the 🤖 type: Infrastructure CI, packaging, tools and automation label Dec 11, 2023
@rgommers
Copy link

Nice to see this @lagru. I was just revisiting my own NumPy PR, and saw this one linked. Hopefully this is not too difficult to get over the finish line now that Pyodide has meson-python support.

@lagru
Copy link
Member Author

lagru commented Dec 12, 2023

@rgommers, thanks for taking a look! Now this is running into the same LONG_BIT error I see locally and mentioned in pyodide/pyodide#4118. You comment makes it seem that it may be a regression in the toolchain. I tried pyodide-build 0.25.0a2 but that doesn't seem to make a difference. I'll keep looking but maybe the fix is obvious to you.

@rgommers
Copy link

Hmm, I was counting on that being resolved since the upstream bug was closed, and the fix confirmed to work for scikit-learn: pyodide/pyodide#4136. That was with a Python from a conda env though, and that was the case for me as well (at least locally). Is it the same for you? If it's a different case, it may be worth commenting on pyodide/pyodide#4118.

I unfortunately didn't get time to dig deeper this week.

@stefanv
Copy link
Member

stefanv commented Dec 19, 2023

I see the same problem, except that on Fedora wordsize.h is missing. But if I edit pyconfig.h to specifically include the 64-bit header, I replicate your problem:

$ cat /usr/include/python3.11/pyconfig.h
\\#include <bits/wordsize.h>

\\#if __WORDSIZE == 32
\\#include "pyconfig-32.h"
\\#elif __WORDSIZE == 64
#include "pyconfig-64.h"
\\#else
\\#error "Unknown word size"
\\#endif

I expect that pyodide is supposed to be looking inside the .pyodide-xbuildenv for header files, and not in /usr/include.

@stefanv
Copy link
Member

stefanv commented Dec 19, 2023

I can add the include path needed:

python -m build --no-isolation --wheel -Csetup-args=--cross-file=/home/stefan/src/scikit-image/tools/emscripten.meson.cross -Csetup-args=--includedir=.pyodide-xbuildenv/xbuildenv/pyodide-root/cpython/installs/python-3.11.3/include/python3.11

But I do not know how to remove the standard Python include that is added.
My read of https://github.com/pyodide/pyodide/pull/4136/files is that those should be removed automatically, but 🤷‍♂️ this is all brand new to me :)

@lagru
Copy link
Member Author

lagru commented Dec 19, 2023

Thanks for investigating @stefanv. I just checked locally that this still seems to happen for the most recent pyodide-build versions 0.24.1 and 0.25.0a2 even though the fix pyodide/pyodide#4136 should be included. EDIT: This happens regardless of me using a venv with my system's python or micromamba.

Locally, this version seems to successfully create a wheel when running
"pyodide build" opposed to using "python -m build ...".
@lagru
Copy link
Member Author

lagru commented Dec 21, 2023

Ha, apparently there is no need to use a meson.cross file together with python -m build ... for pyodide 0.25.0a2: pyodide build "just works"! 🎉 Now I only need to figure out how to run our test suite.

@lagru
Copy link
Member Author

lagru commented Dec 21, 2023

Okay, now I'm getting loads of OSError: [Errno 8] Bad file descriptor which I can reproduce locally as well.

@lagru
Copy link
Member Author

lagru commented Feb 12, 2024

Status update: it's probably reasonable to wait with this until after NumPy 2.0 has settled. See numpy/numpy#24603 (comment).

@agriyakhetarpal
Copy link

Hi there! I wanted to chime in into the conversation, since this was referenced in numpy/numpy#24603 – its successor, here: numpy/numpy#25894, should be merged soon (maybe during this week). Please let me know If there is any way @rgommers and I can drive this forward, we would be happy to help.

@lagru
Copy link
Member Author

lagru commented Feb 28, 2024

Thanks for keeping us up to date. I was quietly following along but I think it still might make sense to wait with this until Pyodide includes NumPy 2.0 officially? It's been some time but as I understand it the current approach with out-of-tree builds only targets one package, in this case scikit-image. So to build with NumPy 2.0 we'd have to wait until the Pyodide distribution includes this. Or we get NumPy 2.0 from somewhere else but that might be unnecessarily complicated when we can just wait? What do you think?

@rgommers
Copy link

Okay, now I'm getting loads of OSError: [Errno 8] Bad file descriptor which I can reproduce locally as well.

We encountered this problem in PyWavelets. It was due to loading .npy files. The problem seems to be at least partially the same here; the last CI log shows a lot of failures for:

os.path.join(os.path.dirname(__file__), 'disk_decompositions.npy')

This was fixed a few days ago: pyodide/pyodide#4541. The fix is not yet released, but it may be worth verifying that that takes care of all bad file descriptors with a local checkout of Pyodide (in case you don't have one at hand, maybe @agriyakhetarpal can give it a try?).

I was quietly following along but I think it still might make sense to wait with this until Pyodide includes NumPy 2.0 officially?

It is probably useful to not try to merge this PR before the needed .npy fix and other needed fixes are in a Pyodide release, rather than try to copy the .patch file from @agriyakhetarpal's NumPy PR. There isn't really anything numpy 2.0 specific here that I would expect to change though. Scikit-learn works with NumPy 1.26.x, so once this PR is green with that version I'd just merge it. When Pyodide upgrades to NumPy 2.0 (which may take several months), things should stay working.

@agriyakhetarpal
Copy link

t may be worth verifying that that takes care of all bad file descriptors with a local checkout of Pyodide (in case you don't have one at hand, maybe @agriyakhetarpal can give it a try?).

It is probably useful to not try to merge this PR before the needed .npy fix and other needed fixes are in a Pyodide release, rather than try to copy the .patch file from @agriyakhetarpal's NumPy PR.

I am happy to give this a try on a fork. However, the patch was for a different purpose – to allow NumPy to properly use vendored-meson though, isn't it? The easier solution to fix the failures here is to just convert all of the .npy files to the .npz file format, which were working even before pyodide/pyodide#4541 was merged.

There isn't really anything numpy 2.0 specific here that I would expect to change though. Scikit-learn works with NumPy 1.26.x, so once this PR is green with that version I'd just merge it. When Pyodide upgrades to NumPy 2.0 (which may take several months), things should stay working.

If it helps, we can keep this PR up to date before the imminent NumPy 2.0 release (or perhaps build these Pyodide/WASM wheels on both stable and development versions of NumPy – I'm not sure if that makes sense at this time).

I would be glad to write a new PR that builds upon the changes here, or help out with code review here if needed. Please let me know what works!

@agriyakhetarpal
Copy link

agriyakhetarpal commented Feb 29, 2024

In the meantime before this gets attention, I have tried updating the PR in this branch on my fork: agriyakhetarpal/add-emscripten-builds-out-of-tree

and owing to the fact that the patch for resolving the .npy file loading issue depends on another patch for Emscripten (it has to be backported because a new Emscripten release might take time), it's not very easy to apply it here – so, in b5274dc on this branch, I have converted all of the .npy files to .npz and updated their SHA-256 hashes just to test how the build goes – we can wait till a new release for Pyodide which should fix this change.

The package succeeds in compilation without any hiccups, but I received another failure during test collection due to pooch being unable to download the files via HTTP. I removed pooch from the workflow for now to proceed to use the data files (without checksum validation) from the local skimage/data/ directory, where it now succeeds in collecting all test cases and fails after 65% of the test suite passes, due to an issue with unresolved symbols during linkage: https://github.com/agriyakhetarpal/scikit-image/actions/runs/8100301106/job/22138456770#step:9:263

which I am always unsure of as to how to debug them, but I did find this resource in the Pyodide issue tracker: pyodide/pyodide#3865 (comment) but I do not think the discussion is completely related to what is happening here. I would appreciate any help I can receive on this!

@rgommers
Copy link

rgommers commented Mar 1, 2024

I did find this resource in the Pyodide issue tracker: pyodide/pyodide#3865 (comment)

That does look like exactly the same issue. And it doesn't look easy to work around. Let me add a comment on the upstream issue.

@lagru
Copy link
Member Author

lagru commented Mar 17, 2024

I would be glad to write a new PR that builds upon the changes here, or help out with code review here if needed. Please let me know what works!

@agriyakhetarpal, please by all means feel very welcome to go ahead and take over. 🙏 Re the NumPy 1 vs 2 issue, I think it's fine to just use the most recent NumPy version provided by the Pyodide distribution. Right now we are mainly interested to run our test suite with Pyodide and add JupyterLite support in our dev docs.

@agriyakhetarpal
Copy link

@agriyakhetarpal, please by all means feel very welcome to go ahead and take over. 🙏 Re the NumPy 1 vs 2 issue, I think it's fine to just use the most recent NumPy version provided by the Pyodide distribution. Right now we are mainly interested to run our test suite with Pyodide and add JupyterLite support in our dev docs.

Thank you for the response, @lagru. I had put this aside for a while after receiving the above unresolved symbol s_cmp error, but I am happy to restart debugging and work towards a solution. A potential solution that @rgommers and I had thought of—which had also come up in our discussion while doing the out-of-tree builds for Matplotlib recently—was to test scikit-image inside a Node.js server instead of a Pyodide-managed virtual environment; it may not fix the issue at hand, but it is still worth trying out. I will open up a draft PR as soon as possible for visibility-related purposes, to supersede this one.

@lagru
Copy link
Member Author

lagru commented Mar 19, 2024

Superseded by #7350.

@lagru lagru closed this Mar 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤖 type: Infrastructure CI, packaging, tools and automation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants