UnivariateSpline gives varying results when multithreaded on Windows 10 #11828
Comments
How is gh-4470 related?
Wrong link, sorry! It's 4770, easy typo to make.
It might be that the Fortran compilers used to build FITPACK have different default settings on Linux and Windows. Fortran subroutines are generally not reentrant, because of things like implicit SAVE attributes; this is also why recursive functions in Fortran must be declared RECURSIVE. Unless we know for certain that the Fortran compiler will produce reentrant binaries, we must protect all calls into Fortran with a global lock (preferably not the GIL, but one unique to the module).

However, in SciPy we mostly assume that a Fortran subroutine will be reentrant, and we do so by marking it "threadsafe" in f2py. This is an incorrect assumption unless we have full control over the compiler and its settings. For example, ifort has this misfeature turned on by default unless we compile with one of the options -auto, -qopenmp, or -recursive, and the -reentrancy option must be used to specify the runtime library to use. gfortran requires -frecursive to avoid this issue.

If I am correct, this is a general problem with Fortran in SciPy, not just affecting FITPACK, but also e.g. ODEPACK, FFTPACK, and maybe even LAPACK (at least if we use OpenBLAS). Last time I suggested protecting all Fortran with locks I was voted down, because we were going to fix it by ensuring the compilers did not use implicit SAVE. However, I do not think anything has actually been done to avoid it.

This misfeature of Fortran is detrimental to any modern system with multithreading. It made sense in the 1970s, because of limited memory on the computers, and it did not matter when all parallel computing was done with MPI, but now it can cause havoc on any system that uses multithreading. It must be taken more seriously.

If I am correct, the problem was introduced here:

Solution? At least we must make sure -recursive is passed to ifort and -frecursive is passed to gfortran.
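The module-level lock approach described above can be sketched in Python. This is a minimal illustration, not SciPy's actual wrapper code: `_unsafe_routine` stands in for a non-reentrant Fortran call (its module-level `_workspace` plays the role of a local variable with the implicit SAVE attribute), and all names are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One lock per wrapped Fortran module (not the GIL): serializes entry into
# routines that may not be reentrant, e.g. because of implicit SAVE.
_fitpack_lock = threading.Lock()

# Stand-in for static Fortran storage: a module-level buffer shared by all
# callers, mimicking a local variable with the implicit SAVE attribute.
_workspace = [0.0]

def _unsafe_routine(x):
    _workspace[0] = x
    total = 0.0
    for _ in range(1000):  # window in which another thread could clobber it
        total += _workspace[0]
    return total / 1000

def locked_call(x):
    # Every call into the possibly non-reentrant routine takes the lock,
    # so concurrent callers cannot interleave inside it.
    with _fitpack_lock:
        return _unsafe_routine(x)

def run(n=30):
    with ThreadPoolExecutor(max_workers=8) as ex:
        return list(ex.map(locked_call, [float(i) for i in range(n)]))

print(run(5))  # → [0.0, 1.0, 2.0, 3.0, 4.0]
```

With the lock, each thread gets back exactly its own input; removing the `with _fitpack_lock:` line reintroduces the race on the shared workspace.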
Ok, does the problem disappear if gh-4770 is reverted?
If it's really the gfortran recursive stuff, a more robust solution than (compiler-dependent) flags is to just add `recursive` to the subroutine declarations, since we're not stuck with f77.
@pv's solution will work, independent of the compiler, but we might have to tag every Fortran subroutine or function with "recursive". Chances are this static allocation only happens with automatic arrays and not local variables. However, who wants to sit down and read through and edit all the legacy Fortran sources? Perhaps we can have the build scripts do this automatically, instead of manually patching all the Fortran sources? I don't know. Personally I would just have distutils add the correct compiler flags for the most common Fortran compilers. But yes, there are many ways to solve this, including what @pv suggested. (And we must not forget about LAPACK when compiled with OpenBLAS or ATLAS. It could be affected as well, and we get it in through the backdoor.)
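The build-script idea above could look roughly like this: a hypothetical pass over the Fortran sources that prefixes plain `subroutine`/`function` declarations with `recursive`. This is only a sketch; real Fortran rewriting would also have to handle typed function declarations, continuation lines, and fixed-form column rules.

```python
import re

# Match a line-leading SUBROUTINE or FUNCTION keyword (case-insensitive).
# Lines already starting with RECURSIVE are left alone, because the keyword
# is then not at the start of the declaration.
_DECL = re.compile(r"^(\s*)(subroutine|function)\b",
                   re.IGNORECASE | re.MULTILINE)

def add_recursive(source: str) -> str:
    """Prefix plain subroutine/function declarations with 'recursive'."""
    return _DECL.sub(lambda m: f"{m.group(1)}recursive {m.group(2)}", source)
```

A usage example: `add_recursive("      subroutine fpcurf(x, y)")` yields `"      recursive subroutine fpcurf(x, y)"`, while declarations that already carry `recursive` pass through unchanged.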
UnivariateSpline gives slightly varying results when `map`ped in parallel using `ThreadPoolExecutor`. This does not happen with Python's built-in `map` or with `ProcessPoolExecutor`. It happens on Windows (tested with Windows 10) and does not happen on Ubuntu.

In the following example I compare performing a `UnivariateSpline` on 30 identical arrays of the first quarter of a sine curve. As seen from the bottom-left figure, the beginning and end of the plot in particular vary significantly from array to array. On the equivalent plots using `map` and `ProcessPoolExecutor`, this does not happen. If I replace `UnivariateSpline` with `interp1d`, also from `scipy.interpolate`, it does not give the strange result.

Reproducing code example:
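The reporter's exact code was not captured in this extract; the following is a sketch consistent with the description above (30 identical quarter-sine arrays, fitted serially and via `ThreadPoolExecutor`), assuming standard `scipy.interpolate.UnivariateSpline` usage.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor
from scipy.interpolate import UnivariateSpline

# 30 identical copies of the first quarter of a sine curve.
x = np.linspace(0, np.pi / 2, 100)
arrays = [np.sin(x)] * 30

def fit(y):
    # Fit a smoothing spline and evaluate it back on the original grid.
    return UnivariateSpline(x, y)(x)

serial = list(map(fit, arrays))           # deterministic everywhere

with ThreadPoolExecutor() as ex:
    threaded = list(ex.map(fit, arrays))  # reported to vary on Windows 10

# On Linux both paths agree; on Windows 10 the threaded results reportedly
# differ between arrays, especially near the ends of the interval.
print(max(np.abs(s - t).max() for s, t in zip(serial, threaded)))
```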
Scipy/Numpy/Python version information:
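The version report itself did not survive extraction; the usual way to regenerate it is something like:

```python
import sys
import numpy
import scipy

# Print the interpreter and library versions relevant to this report.
print(f"scipy:  {scipy.__version__}")
print(f"numpy:  {numpy.__version__}")
print(f"python: {sys.version}")
```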