Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UnivariateSpline gives varying results when multithreaded on Windows 10 #11828

Closed
thomasaarholt opened this issue Apr 9, 2020 · 7 comments · Fixed by #16053
Closed

UnivariateSpline gives varying results when multithreaded on Windows 10 #11828

thomasaarholt opened this issue Apr 9, 2020 · 7 comments · Fixed by #16053
Labels
defect A clear bug or issue that prevents SciPy from being installed or used as expected scipy.interpolate
Milestone

Comments

@thomasaarholt
Copy link

UnivariateSpline gives slightly varying results when maped in parallel using ThreadPoolExecutor. This does not happen with python's map or ProcessPoolExecutor. This happens on Windows (tested with 10), and does not happen on Ubuntu.

In the following example I compare performing a UnivariateSpline on 30 identical arrays of the first quarter of a sine curve. As seen from the bottom-left figure, particularly the beginning and end of the plot varies significantly from array to array. On the equivalent plots using map and ProcessPoolExecutor, this does not happen.

spline

If I replace UnivariateSpline with interp1d, also from scipy.interpolation, it does not give the strange result.

Reproducing code example:

import numpy as np
import matplotlib.pyplot as plt
import scipy.interpolate
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

from os import cpu_count
def func(d):
    x = np.linspace(0, np.pi/2, len(d))
    spline = scipy.interpolate.UnivariateSpline(x, d)
    return spline(x)

if __name__ == '__main__':
    print(f"Processors available: {cpu_count()}")

    # parallel=True uses the following map function
    samples, length = 50, 2000

    x = np.linspace(0, np.pi/2, length)

    # data is 30 identical first quarters of a sine curve
    data = np.repeat(np.sin(x)[None, :], samples, axis=0)

    with ThreadPoolExecutor(max_workers=cpu_count()) as executor:
        thread = list(executor.map(func, data))

    with ProcessPoolExecutor(max_workers=cpu_count()) as executor:
        process = list(executor.map(func, data))
    
    regular = list(map(func, data))

    fig, AX = plt.subplots(nrows=2, ncols=2, figsize=(7, 7))

    ax1, ax2, ax3, ax4 = AX.flatten()
    ax1.set_title(f'{samples} identical sin(x)')
    ax2.set_title(f'{samples} splines on sin(x) using map')
    ax3.set_title('using ThreadPoolExecutor')
    ax4.set_title('using ProcessPoolExecutor')

    for i in range(samples):
        ax1.plot(x, data[i])
        ax2.plot(x, regular[i])
        ax3.plot(x, thread[i])
        ax4.plot(x, process[i])

    plt.tight_layout()
    #plt.savefig('spline.png', dpi=200)
    plt.show()

Scipy/Numpy/Python version information:

import sys, scipy, numpy; print(scipy.__version__, numpy.__version__, sys.version_info)
1.3.2 1.18.1 sys.version_info(major=3, minor=8, micro=2, releaselevel='final', serial=0)

@miladsade96 miladsade96 added defect A clear bug or issue that prevents SciPy from being installed or used as expected scipy.interpolate labels Apr 9, 2020
@tjof2
Copy link

tjof2 commented Apr 11, 2020

See #4770 (related) and possibly #8856

@ev-br
Copy link
Member

ev-br commented Apr 11, 2020

How is gh-4470 related?

@tjof2
Copy link

tjof2 commented Apr 11, 2020

Wrong link, sorry! It's 4770, easy typo to make.

@sturlamolden
Copy link
Contributor

It might be the that the Fortran compilers used to build FITPACK have different default settings on Linux and Windows.

Fortran subroutines are generally not reentrant, because of things like implicit SAVE attributes. This is also why recursive functions in Fortran must be declared RECURSIVE. Unless we know for certain that the Fortran compiler will produce reentrant binaries, we must protect all calls to Fortran with a global lock (preferably not the GIL, but one unique to the module). However, in SciPy we mostly assume that a Fortran subroutine will be reentrant, and we do so by marking it “threadsafe” in f2py. This is an incorrect assumption, unless we have full control over the compiler and its settings.

For example, ifort has this misfeature turned on by default, unless we compile with one of the options -auto, -qopenmp, or -recursive, and the -reentrancy option must be used to specify the runtime library to use:
https://software.intel.com/en-us/forums/intel-fortran-compiler/topic/714808
https://software.intel.com/en-us/fortran-compiler-developer-guide-and-reference-reentrancy

gfortran requires -frecursive to avoid this issue:
https://gcc.gnu.org/onlinedocs/gfortran/Code-Gen-Options.html

If I am correct, this is a general problem with Fortran in SciPy, not just affecting FITPACK, but also e.g. ODEPACK, FFTPACK, and maybe even LAPACK (at least if we use OpenBLAS). Last time I suggested to protect all Fortran with locks I was ruled down, because we were going to fix it by ensuring the compilers did not use implicit SAVE. However, I do not think anything has actually been done to avoid it. This misfeature of Fortran is detrimental to any modern system with multithreading. It made sence to have this in the 1970s, because of limited memory on the computers, and it did not matter when all parallel computing was done with MPI, but now it can cause havoc on any system that uses multithreading. It must be taken more serious.

If I am correct, the problem was introduced here:
402a1ec

Solution? At least we must make sure -recursive is passed to ifort and -frecursive is passed to gfortran.

@ev-br
Copy link
Member

ev-br commented Apr 11, 2020

Ok, does the problem disappear if gh-4770 is reverted?

@pv
Copy link
Member

pv commented Apr 11, 2020 via email

@sturlamolden
Copy link
Contributor

@pv‘s solution will work, independent of the compiler, but we might have to tag every Fortran subroutine or function with “recursive”. Chances are this static allocation only happens with automatic arrays and not local variables. However, who wants to sit down and read through and edit all the legacy Fortran sources? Perhaps we can have the build scripts do this automatically, instead of manually patching all the Fortran sources? I don’t know. Personally I would just have distutils add the correct compiler flags for the most common Fortran compilers. But yes, there are many ways to solve this, including what @pv suggested.

(And we must not forget about LAPACK, when compiled with OpenBLAS or ATLAS. It could be affected as well, and we get it in through the backdoor.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defect A clear bug or issue that prevents SciPy from being installed or used as expected scipy.interpolate
Projects
Development

Successfully merging a pull request may close this issue.

7 participants