ENH: linalg: array library interoperability #19068
Comments
I am a bit on the fringes for this, but having some familiarity with the linalg library, in my opinion the array API is focusing on the wrong functionality to start with. The main argument is the following: in this day and age, the algorithms are more or less established. There are not many ways to do an LU factorization or a Cholesky implementation. The compiled-code speed varies, but the algorithms themselves are quite static and very rarely does something new come up. For the last decade, the performance has lain in structure discovery and dispatching to the correct flavor of the algorithm. In fact, this is the key to many algorithms: you quickly run over the array to discover structure and then dispatch to triangular, Hermitian, Hessenberg, bidiagonal, etc. solvers. This is why we introduced the structure-discovery helper functions. On the other hand, these have to be as fast as possible to avoid adding overhead of their own; we already suffer from this today. Unfortunately we also send things to LAPACK internally, and that adds overhead too. Therefore the most important thing we need to see is these helper function implementations supporting the array API. Otherwise this is just going to be a case-switch of "if this library, use this function", because these helpers are going to be used extensively internally. To that end we already started some limited effort to remove Fortran code as much as we can in SciPy in #18566, like you mentioned. And once we are close to having those removed, we hope to switch to more compiled code without obfuscating the code too much.
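For context, structure-discovery helpers of this kind do exist in `scipy.linalg` today (e.g. `bandwidth`, `issymmetric`, `ishermitian`). A minimal sketch of the discover-then-dispatch pattern described above (`smart_solve` is a hypothetical name, not SciPy API):

```python
import numpy as np
from scipy.linalg import bandwidth, issymmetric, solve, solve_triangular

def smart_solve(A, b):
    # cheap scan with early exit; returns (lower, upper) bandwidths
    lower_bw, upper_bw = bandwidth(A)
    if upper_bw == 0:
        return solve_triangular(A, b, lower=True)   # lower-triangular path
    if lower_bw == 0:
        return solve_triangular(A, b, lower=False)  # upper-triangular path
    if issymmetric(A):
        return solve(A, b, assume_a='sym')          # symmetric LAPACK driver
    return solve(A, b)                              # general LU fallback
```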
Re: performance, do we not have code paths that traverse DGEMM and similar BLAS routines anyway?
This decision strikes me as less obvious--my guess would have been that we'd try to improve the performance of our existing implementations instead.
I think that is a good point. I see you moved that conversation to data-apis/array-api#675, so let's keep this part of the conversation there.
Yes, we should be careful here to not interfere with the efforts of converting Fortran code to Python/Cython/etc. I think it's a matter of @lucascolley not touching those functions. And in case there's overlap, let's help out and hopefully push the Fortran conversion forward faster.
I think you bring up a number of good points around performance of functions in `scipy.linalg`.
There is indeed. Even on CPU, some of the merged scikit-learn work showed that there are significant performance gains with PyTorch (simply comparing in envs that install wheels from PyPI), because it parallelizes more and builds against MKL, while we bundle OpenBLAS, which is significantly slower on average.
This is a good point and I agree that this wouldn't be the obvious choice. I still think that there would be some benefit to providing all of the standard functionality under the SciPy umbrella, but this is certainly not something that the standard suggests that consuming libraries should do. It depends on the magnitude of the drawbacks. It certainly isn't much work to add these functions. There may be large concerns in terms of maintenance or API-bloat, in which case it totally makes sense to not add these.
Indeed, but the part that I don't understand is probably the Array API definitions. I am not even really sure I understand correctly why the Array API has a linalg section.
That is correct. But it gets slightly more complicated for matrix decompositions and so on. SciPy doesn't have many array manipulations or ufuncs that operate over the array; hence matmuls or fast transpose-adds etc. are probably within the array vendor's functionality catalogue (or should be). We typically send the array somewhere (compiled-code land) and get the result. But the linalg section confuses me as to whether vendors are also providing linalg libs, and what SciPy's role would be. Hence I fully agree with your remark.
The former; they are functions of the namespace (e.g. `xp.linalg.inv`). For a slightly more concise example:

```python
xp = array_namespace(a)
# NumPy arrays keep the existing behaviour, non-standard kwargs included
if is_numpy(xp):
    return _inv(a, overwrite_a=overwrite_a, check_finite=check_finite)
# non-standard parameters raise for non-NumPy arrays
if overwrite_a:
    raise ValueError(arg_err_msg("overwrite_a"))
if not check_finite:
    raise ValueError(arg_err_msg("check_finite"))
# defer to the array library's own linalg extension if it has one
if hasattr(xp, 'linalg'):
    return xp.linalg.inv(a)
# last resort: convert to NumPy, compute, and convert back
a = np.asarray(a)
return xp.asarray(_inv(a))
```

(Support for parameters which are not in the standard could be looked into, but here we raise if a non-standard parameter is provided with a non-NumPy array.)
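Hypothetical behaviour of the code above, assuming the proposed dispatch lands (this is not current SciPy):

```python
import torch
from scipy import linalg

t = torch.tensor([[1., 2.], [3., 4.]])
t_inv = linalg.inv(t)              # would dispatch to torch.linalg.inv via xp.linalg
print(type(t_inv))                 # torch.Tensor, not numpy.ndarray
linalg.inv(t, check_finite=False)  # would raise ValueError: non-standard kwarg
```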
I am feeling a bit lost again. This is what we discussed extensively in the uarray proposal too. SciPy already has a linalg library. Why would I mediate another library's array going into another library's linalg function? Whoever is calling can also use that library's linalg directly.

Let's forget about this linalg part; the Array API, as I understand it, is about array standardization, right? All vendors agree that arrays should have x, y, z properties and A, B, C methods. That's all cool. Then consumers like SciPy, when they are implementing stuff, can follow the standard if they want to support these vendors through the Array API. So I keep an eye on the standard: does it have transpose? Yes, cool. Does it have something like norm? Nice, yes, cool again. Does it have isfinite? No? OK, then I need to find a way to do that. And so on. And independent of where the array is coming from, I can push it through the implementation.

But calling other linalg libraries makes no sense to me. This is exactly where we left off the previous discussion, and it seems like we are getting back to it again.
One reason to have this in SciPy: with our official support, existing `scipy.linalg` users would gain support for other array libraries without changing their code.
I fully agree, but why is this taking place under SciPy code? Let me put it another way: is every consumer going to jump through the hoops for torch or jax etc., basically all edge cases, just to implement array-API-compatible code? I think I am not able to convey this message clearly. In other words, the consumer is orchestrating the code path. This doesn't seem like an array API but a separate linalg API. I don't think any other library would actually do this, only SciPy. This is why I keep emphasizing the need for a consumer model before SciPy gets into this.
I agree that it is likely that only SciPy would do this.
So if we enumerate the options, the only remaining one is that SciPy hosts this dispatch itself.
Because I really don't see the benefit of it for SciPy (yet!). And frankly, I don't understand why SciPy should be the arbiter of this to start with. I am all in for the Array API, if I understand it correctly:

```python
# xp = array_namespace(a)
arr = xp.asarray([[1, 2, 3], [4, 5, 6]])  # whatever xp is
# the following just works because the array has all the necessary
# methods used in the inv implementation, and SciPy tries its best
# to use only that limited set of functionality from the Array API
arr_inv = scipy.linalg.inv(arr)
```

This is clearly a superior standardization that generalizes, compared to:

```python
xp = array_namespace(a)
arr = xp.asarray([[1, 2, 3], [4, 5, 6]])  # whatever xp is
# some code that checks if the kwargs are compatible
# ...
# then the dance we mentioned above
if is_numpy(xp):
    arr_inv = scipy.linalg.inv(arr)
elif hasattr(xp, 'linalg'):
    arr_inv = xp.linalg.inv(arr)       # it comes from some xp with linalg
elif is_gpu_array(arr):                # "gpu stuff", hypothetical check
    arr_inv = host_device_dance(arr)   # hypothetical host-device round trip
else:
    raise TypeError("don't know what to do with this input")
```

Why does SciPy have to host what other libraries do, while other libraries don't need to care about anything SciPy does? In other words, why should SciPy have to adopt this? It gains nothing other than a bit of overhead and quite some if-else switches, because its main use case is NumPy (overall speaking). Moreover, SciPy binds itself to this switching code and its maintenance, with 100% dependency on other libraries' implementation velocity. Have a look at Tyler's welch implementation in #19001; I consider it pretty convoluted, as per my remarks. And it also seems like we didn't settle on the basic functionality either (as one piece of anecdotal evidence: data-apis/array-api#627).

Probably I am not able to peek from your perspective regarding the unique positions, because this standard's goal is to remove that uniqueness (if I understand correctly), hence I would appreciate a bit more motivation to do this within SciPy. My current understanding is that the linalg extension of this API is over-reaching to implementers and consumers, which should not be the case. Gaining great performance just by shifting to the GPU is great, but if we are going to orchestrate the code such that it goes to another library, then it is much easier to use that library directly with its linalg extension. It is very much the same behavior as if we don't change anything. Actually, I am looking at ways to write array-API-conforming code myself.
My understanding is that the proposal being made in this issue (the one I labeled as the best bang for buck) is exactly the one that enables the first code example that you mark out as superior. The dance is hidden inside of `scipy.linalg.inv` itself.
In case I didn't emphasize it, let me clarify: I consider SciPy to be a consumer, and thus we, too, should not do this dance. I guess that's the crucial point. In the terminology of other Array API issues, consumers are the array consumers, not the end users. Or at least that's how I understood it. So there should be no dance anywhere. My question is why SciPy is the arbiter of this dance at all, if it is a consumer.
Asked differently: who can implement the linalg extension of the Array API without this dance?
Because of `array-api-compat`, nobody has to write that dance for the standard functions themselves: it already provides `linalg` wrappers so that `xp.linalg` exists for NumPy, CuPy and PyTorch alike.

To highlight a point of agreement, at least between us two: I agree that we should not be doing backend-specific checks like the ones you noted in #19001, so I'd strike those. The proposal does not involve backend-specific checks other than checking for true NumPy arrays.

Now, the ideal scenario would be that the standard covers everything we need, so that even that NumPy check could eventually go away.
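For reference, a minimal example of what those `array-api-compat` wrappers already provide today (NumPy shown; the same two lines work with `cupy` and `torch` arrays):

```python
import numpy as np
from array_api_compat import array_namespace

x = np.asarray([[2.0, 0.0], [0.0, 4.0]])
xp = array_namespace(x)    # a compat namespace wrapping numpy
x_inv = xp.linalg.inv(x)   # standard linalg-extension call, no SciPy involved
```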
Much clearer, thank you for the clarification. I think I completely agree, though there are certain details I just don't have enough information about. But they are not that relevant for this discussion. So I am kinda sorta thinking about the following:

```python
def sqrtm(A):
    # using the namespace checks directly to cut it short
    xp = array_namespace(A)
    if is_numpy(xp):
        q, z = scipy.linalg.qz(A)
    elif hasattr(xp, 'linalg'):
        # possibly different API
        q, z, _ = xp.linalg.qz(A, some_kwarg=True)
    else:
        raise ValueError("Don't know how to handle A")
    # then do a bunch of stuff with q and z,
    # maybe call some cythonized code
    q_2 = some_cythonized_func(q)
    ...
    # now say we need to call eigh
    if is_numpy(xp):
        L, V = scipy.linalg.eigh(A, subset_by_index=[2, 5])
    elif hasattr(xp, 'linalg'):
        # possibly different API
        L, V, _ = xp.linalg.eigh(A, some_kwarg=True)
        # sorting is left open in the spec:
        # https://data-apis.org/array-api/latest/extensions/generated/array_api.linalg.eigh.html
        L = xp.sort(L, descending=True)
    else:
        raise ValueError("Don't know how to handle A")
    # more code, possibly with more scipy.linalg/xp.linalg cases
    ...
    return M
```

Alternatively, we can collect things under big sub-functions:

```python
def sqrtm(A):
    xp = array_namespace(A)
    if is_numpy(xp):
        # scipy original implementation
        q, z = scipy.linalg._sqrtm(A)
    elif hasattr(xp, 'linalg'):
        # the array API compatible implementation
        q, z, _ = xp.linalg.qz(A, some_kwarg=True)
        ...
        L, V, _ = xp.linalg.eigh(A, some_kwarg=True)
        L = xp.sort(L, descending=True)
        ...
    else:
        raise ValueError("Don't know how to handle A")
    return M
```

This is what I can't wrap my head around yet. It seems to me that SciPy is selected as the common consumer for every vendor (well, like you said, let's say we convinced folks about the GPU stuff; Matt, at least, was impressed with the GPU results, so that seems to be the killer feature). And we need to take care of the array API quite comprehensively. Maybe we can move an array API namespace inside linalg with matching signatures?
I do think that a separate namespace is worth considering. Among other reasons, keeping the existing `scipy.linalg` signatures untouched would reduce disruption. My guiding principle for thinking about this is what the standard asks of consuming libraries. I will also note that my desire to remove things like the non-standard parameters is independent of this work.
That may be worth considering I think, especially if it helps make a step in the direction of resolving the long-standing issue of diverging behavior between `numpy.linalg` and `scipy.linalg`.
I don't really agree here. The key element of the proposed design (see gh-18286) is this sentence:

> The basic design principle I propose aiming for in all of SciPy for array libraries is: container type in == container type out.

The reasoning here seems to not take that into account. The design has several goals/consequences, composability among them.
The tentative argument here against changing the existing functions in place doesn't account for this either. On composability, let me also point out a scikit-learn example: https://github.com/scikit-learn/scikit-learn/blob/80c39a4966ee88150d7e5849ab21a8e17bb1fbc8/sklearn/decomposition/_pca.py#L545-L555. They'd like to be able to simply call the SciPy function there for any conforming array type.
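A sketch of what that principle means for users, assuming the proposed `scipy.linalg` support were in place (hypothetical behaviour, not current SciPy):

```python
import numpy as np
import torch
from scipy import linalg

A_np = np.eye(3)
assert isinstance(linalg.inv(A_np), np.ndarray)   # NumPy in, NumPy out

A_t = torch.eye(3)
assert isinstance(linalg.inv(A_t), torch.Tensor)  # torch in, torch out;
                                                  # no silent NumPy conversion
```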
Agreed with the spirit of this part - we should minimize backend-specific code paths.
Fair. I came into the discussion late, and the point of discussion at the time was whether or not the standard's `linalg` extension should be used at all. That said, for the rest of `scipy.linalg` the picture is less settled.
Over the past few weeks a lot of improvements in test infra were implemented, and @lucascolley's PR (gh-19005) for array API support in `scipy.fft` has landed.
@lucascolley I know there's a few moving pieces here - what is the current status of the plan?
I do believe that there may be a case for a standalone library for this kind of thing.
Thanks Robert, Ilhan, Ralf and Tyler for the discussion here! It looks like we've already answered some important questions, not least from my end. In tandem with opening my WIP PR for `linalg`, here is a summary of the plan.
To clarify on this point, I am no longer considering any API additions. I am currently writing a blog post which will describe the different changes required to convert a submodule. Testing aside, the changes are:

1. converting pure Python + NumPy code to array-agnostic code, using only functionality in the standard;
2. attempting conversion to NumPy (and back) around compiled code, which means that arrays on non-CPU devices are not supported there (a sketch follows this list);
3. dispatching functions which are covered by the standard's `linalg` extension to `xp.linalg` where available.
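A minimal sketch of (2), with a hypothetical helper name (`_compiled_fallback` is not real SciPy code):

```python
import numpy as np

def _compiled_fallback(compiled_func, a, xp):
    # np.asarray raises for arrays on non-CPU devices (e.g. a CUDA
    # torch tensor), which is where non-CPU support stops for these paths
    a_np = np.asarray(a)
    result = compiled_func(a_np)  # existing compiled-code implementation
    return xp.asarray(result)     # container type in == container type out
```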
My PR will tackle the functions covered by (3). Future PRs will tackle (1) and (2). After or alongside this work, separate PRs could implement calling out to specific libraries like CuPy and PyTorch where (2) would otherwise be hit. This is what Matt has done for some functions in `scipy.special`.
This is another type of work which can happen alongside what is above. Since this is just SciPy internal, I don't think that we need to figure this out fully before starting work on the other parts. I hope that this helps to clarify things! I would recommend saving any specific implementation questions for my PR, but feel free to ask anything about the vision for conversion which I have laid out here.
This can be found at https://labs.quansight.org/blog/scipy-array-api. |
Is your feature request related to a problem? Please describe.
Plans for addressing #18286 in `scipy.linalg`.

Currently, `scipy.linalg` only supports the use of NumPy arrays. Work towards interoperability has the potential to bring large performance improvements now and, with adoption of the array API standard, will bring compatibility with any future library which complies with the standard. `linalg` is an array API standard extension, which means that supporting it will bring automatic dispatching for all standard functions. `array-api-compat` has implemented `linalg` wrappers for `numpy`, `cupy` and `torch`, so that automatic dispatching is available today.

For those functions which use compiled code and are not in the standard, non-CPU device support will be limited to start off with. From the RFC:

> This does not mean that GPU support is off-limits for all of these functions right now. Firstly, if we can write non-standard functions in terms of standard functions, then they will become compatible with all compliant libraries. Secondly, we have the option to call to CuPy and PyTorch for some functions. While this is not the general, library-agnostic solution of the first option - the "long-term goal" - it may be worth doing this to unlock GPU support for some important functions.
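As an illustration of the first option, here is a hypothetical helper written purely in terms of standard functions - the principal square root of a Hermitian positive semi-definite matrix (`_sqrtm_psd` is not an existing SciPy function):

```python
def _sqrtm_psd(a, xp):
    # eigendecomposition via the standard's linalg extension
    w, v = xp.linalg.eigh(a)
    # clip tiny negative eigenvalues caused by rounding before the sqrt
    w = xp.where(w < 0, xp.zeros_like(w), w)
    # scale eigenvector columns by sqrt(eigenvalues) ...
    scaled = v * xp.expand_dims(xp.sqrt(w), axis=-2)
    # ... and reassemble v @ diag(sqrt(w)) @ v^H with standard calls only
    return scaled @ xp.conj(xp.matrix_transpose(v))
```

Since only standard calls appear, the same code runs wherever `xp` is a conforming namespace, CPU or GPU.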
Describe the solution you'd like.

A. Use the standard extension.

This should be (almost) possible in one PR, similar to #19005 in `fft`. This involves validating array inputs with `array_namespace` and using `xp.linalg`, if it is available, for all standard functions. If it is not available, a conversion to NumPy is attempted. For NumPy arrays, the existing SciPy implementation is used.

B. Re-write some functions to be array-agnostic.

Any function which does not use compiled code can be included here. Furthermore, any functions which can be rewritten in terms of standard functions will be included. There may be overlap here with the work to convert functions from Fortran to other languages (#18566), in the case that anything is converted to pure Python. As a result, this point may become multiple PRs, with future work coming as more compiled code is eliminated. This (first PR) will look similar to the `cluster` bits of #18668, and a follow-up PR to come for `fft`.

C. Call to CuPy and PyTorch.
We could also add calls to CuPy and PyTorch for any functions which they share with SciPy, and to which points A and B do not apply. This would ultimately be superseded by a dispatching mechanism, but there are no active plans for that right now.
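As a sketch of what this could look like for `lu` (the `is_cupy`/`is_torch` helpers alongside `array_namespace` are assumptions, and the return-value details would need reconciling):

```python
import numpy as np
from scipy.linalg import lu

def lu_dispatched(a, permute_l=False):
    xp = array_namespace(a)
    if is_cupy(xp):
        import cupyx.scipy.linalg
        # CuPy already ships a SciPy-compatible signature
        return cupyx.scipy.linalg.lu(a, permute_l=permute_l)
    if is_torch(xp):
        import torch
        p, l, u = torch.linalg.lu(a)  # (P, L, U) named tuple
        return (p @ l, u) if permute_l else (p, l, u)
    # existing NumPy/LAPACK path
    return lu(np.asarray(a), permute_l=permute_l)
```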
Which functions in each PR?

For A, the standard functions which are in SciPy are: `cholesky`, `det`, `eigh`, `eigvalsh`, `inv`, `pinv`, `qr`, `solve`, `svd` and `svdvals`.

The standard functions which would be added are: `cross`, `diagonal`, `matmul`, `matrix_norm` (can use SciPy's `norm`; see the sketch below), `matrix_power`, `matrix_rank`, `matrix_transpose`, `outer`, `slogdet`, `tensordot`, `trace`, `vecdot` and `vector_norm` (can also use SciPy's `norm`).

These functions to be added all either have implementations in `np` or `np.linalg`, or are aliases of other standard functions, so they will immediately work with NumPy arrays.

For B, the Special Matrices functions look doable. There may be others, but I haven't had a thorough look through.
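Regarding the note above that `matrix_norm` and `vector_norm` can use SciPy's `norm`: a minimal sketch of such a wrapper, with the signature taken from the standard (the implementation itself is an assumption, not existing SciPy code):

```python
from scipy.linalg import norm

def matrix_norm(x, *, keepdims=False, ord='fro'):
    # the standard defines matrix_norm as a reduction over the last two
    # axes; SciPy's norm already supports tuple axes and keepdims
    return norm(x, ord=ord, axis=(-2, -1), keepdims=keepdims)
```

`vector_norm` would need a little more care, since the standard flattens the input when `axis=None`.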
For C, some functions which have been suggested are `solve_triangular`, `lu`, `lu_solve` and `lstsq`.
Describe alternatives you've considered.

The number of PRs to split this work into is undecided. If there are multiple PRs, the order of them is also undecided.

If this all looks reasonable, I plan to work on a PR for A. The main question for this PR would be whether or not to touch non-standard functions. The advantage of this would be to have consistent validation with `array_namespace`, and behaviour with arrays coercible by `np.asarray`, across the whole submodule. The disadvantage of this would be additional review overhead, where the same functions would be revisited in PRs for B and C. I have no preference, so will do whatever the maintainers prefer. Regardless of when the work occurs, I think that it makes sense to have this consistent validation and behaviour eventually.

Overall, I am not familiar with `scipy.linalg` yet, so I may be missing a lot! Please feel free to give any corrections or suggestions.

References

- `cluster`: ENH: add machinery to support Array API #18668
- Converting `fft` to the standard: ENH: fft: support array API standard #19005
- `linalg` extension: https://data-apis.org/array-api/latest/extensions/linear_algebra_functions.html