Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add out-of-tree Pyodide builds in CI for scikit-image #7350

Open
wants to merge 39 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 27 commits
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
12a2aec
Test emscripten workflow inspired by NumPy
lagru Dec 11, 2023
139c62d
Remove NumPy specific configuration
lagru Dec 12, 2023
7969f4a
Try out pyodide 0.25.0a2
lagru Dec 21, 2023
1fec286
Use correct file name for test requirements
lagru Dec 21, 2023
5cc0762
Specify test requirements without missing asv
lagru Dec 21, 2023
dc1e3bc
Remove unnecessary emscripten.meson.cross
lagru Dec 21, 2023
929dbcf
Account for unsupported subprocess
lagru Dec 21, 2023
7a90748
Update Emscripten build workflow and actions
agriyakhetarpal Feb 29, 2024
0dc7431
Change and reformat all `.npy` files to `.npz`
agriyakhetarpal Feb 29, 2024
01c61ba
Skip downloading files from pooch for now
agriyakhetarpal Feb 29, 2024
360e48a
Add JS script to run `pytest` test suite
agriyakhetarpal Mar 19, 2024
12f0bd4
Run and debug JS script for Emscripten job
agriyakhetarpal Mar 19, 2024
b529673
Run parallelized tests in serial mode on WASM
agriyakhetarpal Mar 19, 2024
2adf430
Skip test that requires reading from HTTP on WASM
agriyakhetarpal Mar 19, 2024
9afa53a
Skip test for `@run_in_parallel` decorator
agriyakhetarpal Mar 19, 2024
fa3154e
Add threading import and usage guards
agriyakhetarpal Mar 19, 2024
ee9dbc2
Skip all Matplotlib-related tests
agriyakhetarpal Mar 19, 2024
46fa541
Move `IS_WASM` check to correct location
agriyakhetarpal Mar 19, 2024
2c00ec5
Properly guard threading for multiscale features
agriyakhetarpal Mar 19, 2024
fe1777d
Set Agg Matplotlib backend via environment variable
agriyakhetarpal Mar 19, 2024
b55ae6f
Add a patch to fix NumPy array FS access
agriyakhetarpal Mar 19, 2024
24598b7
Apply patch to Emscripten SDK at build time
agriyakhetarpal Mar 20, 2024
c995211
Revert "Change and reformat all `.npy` files to `.npz`"
agriyakhetarpal Mar 20, 2024
c587e5c
Fix patch
agriyakhetarpal Mar 20, 2024
be525a4
Perform some cleanups
agriyakhetarpal Mar 20, 2024
f466193
Try using Agg backend programmatically
agriyakhetarpal Mar 20, 2024
e4cc188
Merge branch 'main' into add-emscripten-builds-out-of-tree
agriyakhetarpal Mar 25, 2024
37364b2
Merge branch 'main' into add-emscripten-builds-out-of-tree
agriyakhetarpal Mar 31, 2024
d02105b
Bump Pyodide version
agriyakhetarpal Mar 31, 2024
88903e2
Merge branch 'main' into add-emscripten-builds-out-of-tree
agriyakhetarpal Apr 11, 2024
0b1c933
Rewrite statement control flow for `is_wasm` check
agriyakhetarpal Apr 15, 2024
19159a4
Add licensing details for skimage JS test runner
agriyakhetarpal Apr 15, 2024
174f3e7
Remove `MPLBACKEND` env var from test step
agriyakhetarpal Apr 16, 2024
1df1753
Merge branch 'main' into add-emscripten-builds-out-of-tree
agriyakhetarpal Apr 16, 2024
e94b1cd
Return even earlier in run_in_parallel on WASM
lagru Apr 17, 2024
cdd9830
Mock ThreadPoolExecutor on WASM
lagru Apr 17, 2024
06230bd
Merge branch 'main' into add-emscripten-builds-out-of-tree
agriyakhetarpal May 21, 2024
e177755
Merge branch 'main' into add-emscripten-builds-out-of-tree
agriyakhetarpal May 29, 2024
a6754d5
Bump to Pyodide version 0.26.0
agriyakhetarpal May 29, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
88 changes: 88 additions & 0 deletions .github/workflows/emscripten.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,88 @@
# Copied from NumPy https://github.com/numpy/numpy/pull/25894
# https://github.com/numpy/numpy/blob/d2d2c25fa81b47810f5cbd85ea6485eb3a3ffec3/.github/workflows/emscripten.yml
#

name: Test Emscripten/Pyodide build

on:
pull_request:
branches:
- main
- maintenance/**
# TODO: refine after this is ready to merge
push:
# branches:
# - main
# - maintenance/**
workflow_dispatch:

env:
FORCE_COLOR: 3

concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true

permissions:
contents: read # to fetch code (actions/checkout)

jobs:
build-wasm-emscripten:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now this is only building and testing and not uploading the wheel somewhere, correct?

It's fine if we keep that out of scope for this PR but I'd still like to talk about how we might include the wheel in our doc building workflow?

We might also upload it as part of nightly wheel building workflow similar to numpy.

In case we decide tackle this, may we ping you for support? :D

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now this is only building and testing and not uploading the wheel somewhere, correct?

It's fine if we keep that out of scope for this PR but I'd still like to talk about how we might include the wheel in our doc building workflow?

Yes, that is correct. The doc builds workflow(s) aspect is still something I am experimenting with right now, but it has been completed for SciPy (scipy/scipy#20019) and PyWavelets (PyWavelets/pywt#728). The SciPy one is disabled right now, but the PyWavelets one is available and can be accessed on the PyWavelets documentation website at any of the examples under https://pywavelets.readthedocs.io/en/latest/ref/index.html. The issue we have at hand here is how to provide these wheels pre-installed into a JupyterLite-compatible kernel through either a custom Pyodide distribution or through any other mechanism, which I am actively working towards.

We might also upload it as part of nightly wheel building workflow similar to numpy.

In case we decide tackle this, may we ping you for support? :D

Absolutely! I would be more than happy to make the changes myself in a follow-up PR, considering that this would be beneficial for the documentation builds mentioned above – the plan will be to use these wheels, after they have been built, but only in the latest versions of the docs (the stable versions may use a pinned wheel from the Pyodide CDN), however, this is preliminary and I suspect things will change. I hope I can trouble you for a review on another PR soon after this one ;)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect. Then let's get to that after merging this PR. :D

name: Build scikit-image distribution for Pyodide
runs-on: ubuntu-22.04
# To enable this workflow on a fork, comment out:
# if: github.repository == 'scikit-image/scikit-image'
env:
PYODIDE_VERSION: 0.25.0
# PYTHON_VERSION and EMSCRIPTEN_VERSION are determined by PYODIDE_VERSION.
# The appropriate versions can be found in the Pyodide repodata.json
# "info" field, or in Makefile.envs:
# https://github.com/pyodide/pyodide/blob/main/Makefile.envs#L2
PYTHON_VERSION: 3.11.3
EMSCRIPTEN_VERSION: 3.1.46
NODE_VERSION: 18
steps:
- name: Checkout scikit-image
uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1

- name: Set up Python ${{ env.PYTHON_VERSION }}
id: setup-python
uses: actions/setup-python@0a5c61591373683505ea898e09a3ea4f39ef2b9c # v5.0.0
with:
python-version: ${{ env.PYTHON_VERSION }}

- name: Set up Emscripten toolchain
uses: mymindstorm/setup-emsdk@6ab9eb1bda2574c4ddb79809fc9247783eaf9021 # v14
with:
version: ${{ env.EMSCRIPTEN_VERSION }}
actions-cache-folder: emsdk-cache

# We need to apply a patch to the Emscripten SDK to work around a bug in
# the file system API. This patch is already applied in the Emscripten SDK
# upstream, but we need to apply it here because we are using an older
# version of the Emscripten SDK with Pyodide.
- name: Apply necessary patch(es)
run: |
patch -d $EMSDK/upstream/emscripten/ -p1 < tools/emscripten/patches/0001-fix-npy-file-access.patch

- name: Install pyodide-build
run: pip install "pydantic<2" pyodide-build==${{ env.PYODIDE_VERSION }}

- name: Build scikit-image for Pyodide
run: |
pyodide build

- name: Set up Node.js
uses: actions/setup-node@60edb5dd545a775178f52524783378180af0d1f8 # v4.0.2
with:
node-version: ${{ env.NODE_VERSION }}

- name: Set up Pyodide virtual environment and test scikit-image for Pyodide
env:
# Avoid using the default TkAgg backend, which requires GUI support
MPLBACKEND: "Agg"
run: |
pyodide venv .venv-pyodide
source .venv-pyodide/bin/activate
npm install pyodide@${{ env.PYODIDE_VERSION }}
node tools/emscripten/pytest_scikit_image_pyodide.js --pyargs skimage
2 changes: 2 additions & 0 deletions skimage/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -163,6 +163,8 @@ def _raise_build_error(e):
)
except FileNotFoundError:
pass
except OSError:
pass # If skimage is built with emscripten which does not support processes
else:
out, err = p.communicate()
if p.returncode == 0:
Expand Down
4 changes: 4 additions & 0 deletions skimage/_shared/_dependency_checks.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
from .version_requirements import is_installed
import sys
import platform

has_mpl = is_installed("matplotlib", ">=3.3")

is_wasm = (sys.platform == "emscripten") or (platform.machine() in ["wasm32", "wasm64"])
37 changes: 24 additions & 13 deletions skimage/_shared/testing.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,9 +3,10 @@
"""

import os
import platform
import re
import struct
import threading
import sys
import functools
import inspect
from tempfile import NamedTemporaryFile
Expand All @@ -32,6 +33,7 @@
from ..data._fetchers import _fetch
from ..util import img_as_uint, img_as_float, img_as_int, img_as_ubyte
from ._warnings import expected_warnings
from ._dependency_checks import is_wasm

import pytest

Expand All @@ -44,7 +46,6 @@

SKIP_RE = re.compile(r"(\s*>>>.*?)(\s*)#\s*skip\s+if\s+(.*)$")


# true if python is running in 32bit mode
# Calculate the size of a void * pointer in bits
# https://docs.python.org/3/library/struct.html
Expand Down Expand Up @@ -322,12 +323,17 @@ def fetch(data_filename):
pytest.skip(f'Unable to download {data_filename}', allow_module_level=True)


# Ref: about the lack of threading support in WASM, please see
# https://github.com/pyodide/pyodide/issues/237
def run_in_parallel(num_threads=2, warnings_matching=None):
"""Decorator to run the same function multiple times in parallel.

This decorator is useful to ensure that separate threads execute
concurrently and correctly while releasing the GIL.

It is currently skipped when running on WASM-based platforms, as
the threading module is not supported.

Parameters
----------
num_threads : int, optional
Expand All @@ -346,19 +352,24 @@ def run_in_parallel(num_threads=2, warnings_matching=None):
def wrapper(func):
@functools.wraps(func)
def inner(*args, **kwargs):
with expected_warnings(warnings_matching):
threads = []
for i in range(num_threads - 1):
thread = threading.Thread(target=func, args=args, kwargs=kwargs)
threads.append(thread)
for thread in threads:
thread.start()

if not is_wasm:
with expected_warnings(warnings_matching):
threads = []
for i in range(num_threads - 1):
import threading

thread = threading.Thread(target=func, args=args, kwargs=kwargs)
agriyakhetarpal marked this conversation as resolved.
Show resolved Hide resolved
threads.append(thread)
for thread in threads:
thread.start()

func(*args, **kwargs)

for thread in threads:
thread.join()
else:
func(*args, **kwargs)

for thread in threads:
thread.join()

return inner

return wrapper
Expand Down
2 changes: 2 additions & 0 deletions skimage/_shared/tests/test_testing.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
assert_stacklevel,
)
from skimage._shared import testing
from skimage._shared._dependency_checks import is_wasm

from skimage._shared._warnings import expected_warnings
from warnings import warn
Expand Down Expand Up @@ -81,6 +82,7 @@ def __init__(self):
doctest_skip_parser(c)


@pytest.mark.skipif(is_wasm, reason="Cannot start threads in WASM")
def test_run_in_parallel():
state = []

Expand Down
38 changes: 28 additions & 10 deletions skimage/feature/_basic_features.py
Original file line number Diff line number Diff line change
@@ -1,9 +1,10 @@
from itertools import combinations_with_replacement
import itertools
import sys
import platform
import numpy as np
from skimage import filters, feature
from skimage.util.dtype import img_as_float32
from concurrent.futures import ThreadPoolExecutor


def _texture_filter(gaussian_filtered):
Expand Down Expand Up @@ -72,6 +73,13 @@ def _mutiscale_basic_features_singlechannel(
features : list
List of features, each element of the list is an array of shape as img.
"""
# Guard threading import for Emscripten
threading_available = True
if (sys.platform == "emscripten") or (platform.machine() in ["wasm32", "wasm64"]):
threading_available = False
else:
from concurrent.futures import ThreadPoolExecutor

# computations are faster as float32
img = np.ascontiguousarray(img_as_float32(img))
if num_sigma is None:
Expand All @@ -83,16 +91,26 @@ def _mutiscale_basic_features_singlechannel(
base=2,
endpoint=True,
)
with ThreadPoolExecutor(max_workers=num_workers) as ex:
out_sigmas = list(
ex.map(
lambda s: _singlescale_basic_features_singlechannel(
img, s, intensity=intensity, edges=edges, texture=texture
),
sigmas,
if threading_available:
with ThreadPoolExecutor(max_workers=num_workers) as ex:
lagru marked this conversation as resolved.
Show resolved Hide resolved
out_sigmas = list(
ex.map(
lambda s: _singlescale_basic_features_singlechannel(
img, s, intensity=intensity, edges=edges, texture=texture
),
sigmas,
)
)
)
features = itertools.chain.from_iterable(out_sigmas)
features = itertools.chain.from_iterable(out_sigmas)
# assumed threading is not available, so we use a serial version
else:
out_sigmas = [
_singlescale_basic_features_singlechannel(
img, s, intensity=intensity, edges=edges, texture=texture
)
for s in sigmas
]
features = itertools.chain.from_iterable(out_sigmas)
return features


Expand Down
6 changes: 6 additions & 0 deletions skimage/feature/tests/test_util.py
Original file line number Diff line number Diff line change
Expand Up @@ -68,6 +68,9 @@ def test_mask_border_keypoints():
)
def test_plot_matches(shapes):
from matplotlib import pyplot as plt
from matplotlib import use

use('Agg')
Comment on lines +71 to +73
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with the MPLBACKEND environment variable set inside the workflow, it didn't work and the tests for Matplotlib were failing – I would suggest keeping this change here (and in other places).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't have as much context as @lagru here, but for what it's worth: in SciPy we set the backend to Agg for testing, and it's my understanding that that in general is a robust strategy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really follow why use('Agg') fixes things and MPLBACKEND="Agg" doesn't. E.g. for be525a4, pytest's error log already indicates that Agg is used.

This is before you try to use Agg programmatically

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand it enough either (perhaps Node.js is not properly letting environment variables pass to the Python interpreter). But sure, let me try to revert this change in a separate commit, again, at the end of addressing the review and we can see if things break in an isolated state. Either way, keeping just one of these options would be better.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have done a bit of reading about this as someone who does not have much JavaScript experience, and I think what is happening here is that Pyodide runs Python code within the browser environment (or in a different environment if a browser is not present), isolated from the Node.js process that spawns the Pyodide interpreter – therefore, it doesn't directly inherit environment variables from Node.js.
It could be tricky to debug further but I'm happy to do so, so I'll remove the MPLBACKEND environment variable from the workflow file for now, given that the programmatic approach with matplotlib.use will be beneficial for not only these Pyodide-specific tests, but for the general test suite on regular supported platforms too. Edit: removed in 174f3e7.


fig, ax = plt.subplots(nrows=1, ncols=1)

Expand Down Expand Up @@ -112,6 +115,9 @@ def test_plot_matches(shapes):
)
def test_plot_matched_features(shapes):
from matplotlib import pyplot as plt
from matplotlib import use

use('Agg')

fig, ax = plt.subplots()

Expand Down
2 changes: 2 additions & 0 deletions skimage/io/tests/test_io.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@

from skimage import io
from skimage._shared.testing import assert_array_equal, fetch
from skimage._shared._dependency_checks import is_wasm
from skimage.data import data_dir


Expand Down Expand Up @@ -47,6 +48,7 @@ def test_imread_file_url():
assert image.shape == (512, 512)


@pytest.mark.skipif(is_wasm, reason="no access to pytest-localserver")
agriyakhetarpal marked this conversation as resolved.
Show resolved Hide resolved
def test_imread_http_url(httpserver):
# httpserver is a fixture provided by pytest-localserver
# https://bitbucket.org/pytest-dev/pytest-localserver/
Expand Down
3 changes: 3 additions & 0 deletions skimage/io/tests/test_mpl_imshow.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,9 @@

plt = pytest.importorskip("matplotlib.pyplot")

if plt:
plt.switch_backend("Agg")


def setup():
io.reset_plugins()
Expand Down