You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Currently, scipy.linalg only supports the use of NumPy arrays. Work towards interoperability has the potential to bring large performance improvements now and, with adoption of the array API standard, will bring compatibility with any future library which complies with the standard.
linalg is an array API standard extension, which means that supporting it will bring automatic dispatching for all standard functions. array-api-compat has implemented linalg wrappers for numpy, cupy and torch, so that automatic dispatching is available today.
For those functions which use compiled code and are not in the standard, non-CPU device support will be limited to start off with. From the RFC:
When a non-NumPy array type sees compiled code in SciPy (which tends to use the NumPy C API), we have a couple of options:
dispatch back to the other library (PyTorch, CuPy, etc.).
convert to a NumPy array when possible (i.e., on CPU via the buffer protocol, DLPack, or array), use the compiled code in question, then convert back.
don't support this at all
I'll note that (1) is the long-term goal; how to implement this is out of scope for this proposal - for more on that, see the Appendix section. For now we choose to do (2) when possible, and (3) otherwise. Switching from that approach to (1) in the future will be backwards-compatible.
This does not mean that GPU support is off-limits for all of these functions right now. Firstly, if we can write non-standard functions in terms of standard functions, then they will become compatible with all compliant libraries. Secondly, we have the option to call to CuPy and PyTorch for some functions. While this is not the general, library-agnostic solution of (1) - the "long-term goal" - it may be worth doing this to unlock GPU support for some important functions.
Describe the solution you'd like.
A. Conform to the standard extension.
This should be (almost) possible in one PR, similar to scipy#19005 in fft. This involves validating array inputs with array_namespace and using xp.linalg, if it is available, for all standard functions. If it is not available, a conversion to NumPy is attempted. For NumPy arrays, the existing SciPy implementation is used.
For completeness and ease of use, I propose to add the standard functions which are not currently in scipy. Therefore, if a user wants to write array-agnostic code using the standard extension, they can just use SciPy, without going back to the array library for the functions. This will not take many lines of code as it is just dispatching to xp.linalg, or attempting to use functions from np.
I say almost possible since, to fully conform, the standard requires that functions on matrices support “batching” (i.e., the ability to perform the operation over a “stack” of matrices). There is also a discrepancy between the default mode of qr (full/complete vs economical/reduced) in the APIs. This, and potentially other similar issues, may be unsuitable to address here. These issues are really details of the API though, and lots of interoperability will still be achieved, despite not completely conforming immediately.
B. Re-write some functions to be array-agnostic.
Any function which does not use compiled-code can be included here. Furthermore, any functions which can be rewritten in terms of standard functions will be included. There may be overlap here with the work to convert functions from Fortran to other languages (scipy#18566), in the case that anything is converted to pure Python. As a result, this point may become multiple PRs, with future work coming as more compiled code is eliminated.
This (first PR) will look similar to the cluster bits of scipy#18668, and a follow-up PR to come for fft.
C. Call to CuPy and PyTorch.
We could also add calls to CuPy and PyTorch for any functions which they share with SciPy, and to which points A and B do not apply. This would ultimately be superseded by a dispatching mechanism, but there are no active plans for that right now.
Which functions in each PR?
For A, the standard functions which are in SciPy are: cholesky, det, eigh, eighvalsh, inv, pinv, qr, solve, svd and svdvals.
The standard functions which would be added are: cross, diagonal, matmul, matrix_norm (can use SciPy's norm), matrix_power, matrix_rank, matrix_transpose, outer, slogdet, tensordot, trace, vecdot, vector_norm (can use SciPy's norm).
These functions to be added all either have implementations in np or np.linalg, or are aliases of other standard functions, so they will immediately work with NumPy arrays.
For B, the Special Matrices functions look doable. There may be others, but I haven't had a thorough look through.
For C, some functions which have been suggested are solve_triangular, lu, lu_solve and lstsq.
Describe alternatives you've considered.
The number of PRs to split this work into is undecided. If there are multiple PRs, the order of them is also undecided.
If this all looks reasonable, I plan work on a PR for A. The main question for this PR would be whether or not to touch non-standard functions. The advantage of this would be to have consistent validation with array_namespace and behaviour with arrays coercible by np.asarray, across the whole submodule. The disadvantage of this would be additional review overhead, where the same functions would be revisited in PRs for B and C. I have no preference, so will do whatever the maintainers prefer. Regardless of when the work occurs, I think that it makes sense to have this consistent validation and behaviour eventually.
Overall, I am not familiar with scipy.linalg yet, so I may be missing a lot! Please feel free to give any corrections or suggestions.
Is your feature request related to a problem? Please describe.
Plans for addressing scipy#18286 in
scipy.linalg
.Currently,
scipy.linalg
only supports the use of NumPy arrays. Work towards interoperability has the potential to bring large performance improvements now and, with adoption of the array API standard, will bring compatibility with any future library which complies with the standard.linalg
is an array API standard extension, which means that supporting it will bring automatic dispatching for all standard functions.array-api-compat
has implementedlinalg
wrappers fornumpy
,cupy
andtorch
, so that automatic dispatching is available today.For those functions which use compiled code and are not in the standard, non-CPU device support will be limited to start off with. From the RFC:
This does not mean that GPU support is off-limits for all of these functions right now. Firstly, if we can write non-standard functions in terms of standard functions, then they will become compatible with all compliant libraries. Secondly, we have the option to call to CuPy and PyTorch for some functions. While this is not the general, library-agnostic solution of (1) - the "long-term goal" - it may be worth doing this to unlock GPU support for some important functions.
Describe the solution you'd like.
A. Conform to the standard extension.
This should be (almost) possible in one PR, similar to scipy#19005 in
fft
. This involves validating array inputs witharray_namespace
and usingxp.linalg
, if it is available, for all standard functions. If it is not available, a conversion to NumPy is attempted. For NumPy arrays, the existing SciPy implementation is used.For completeness and ease of use, I propose to add the standard functions which are not currently in
scipy
. Therefore, if a user wants to write array-agnostic code using the standard extension, they can just use SciPy, without going back to the array library for the functions. This will not take many lines of code as it is just dispatching toxp.linalg
, or attempting to use functions fromnp
.I say almost possible since, to fully conform, the standard requires that functions on matrices support “batching” (i.e., the ability to perform the operation over a “stack” of matrices). There is also a discrepancy between the default mode of
qr
(full
/complete
vseconomical
/reduced
) in the APIs. This, and potentially other similar issues, may be unsuitable to address here. These issues are really details of the API though, and lots of interoperability will still be achieved, despite not completely conforming immediately.B. Re-write some functions to be array-agnostic.
Any function which does not use compiled-code can be included here. Furthermore, any functions which can be rewritten in terms of standard functions will be included. There may be overlap here with the work to convert functions from Fortran to other languages (scipy#18566), in the case that anything is converted to pure Python. As a result, this point may become multiple PRs, with future work coming as more compiled code is eliminated.
This (first PR) will look similar to the
cluster
bits of scipy#18668, and a follow-up PR to come forfft
.C. Call to CuPy and PyTorch.
We could also add calls to CuPy and PyTorch for any functions which they share with SciPy, and to which points A and B do not apply. This would ultimately be superseded by a dispatching mechanism, but there are no active plans for that right now.
Which functions in each PR?
For A, the standard functions which are in SciPy are:
cholesky
,det
,eigh
,eighvalsh
,inv
,pinv
,qr
,solve
,svd
andsvdvals
.The standard functions which would be added are:
cross
,diagonal
,matmul
,matrix_norm
(can use SciPy'snorm
),matrix_power
,matrix_rank
,matrix_transpose
,outer
,slogdet
,tensordot
,trace
,vecdot
,vector_norm
(can use SciPy'snorm
).These functions to be added all either have implementations in
np
ornp.linalg
, or are aliases of other standard functions, so they will immediately work with NumPy arrays.For B, the Special Matrices functions look doable. There may be others, but I haven't had a thorough look through.
For C, some functions which have been suggested are
solve_triangular
,lu
,lu_solve
andlstsq
.Describe alternatives you've considered.
The number of PRs to split this work into is undecided. If there are multiple PRs, the order of them is also undecided.
If this all looks reasonable, I plan work on a PR for A. The main question for this PR would be whether or not to touch non-standard functions. The advantage of this would be to have consistent validation with
array_namespace
and behaviour with arrays coercible bynp.asarray
, across the whole submodule. The disadvantage of this would be additional review overhead, where the same functions would be revisited in PRs for B and C. I have no preference, so will do whatever the maintainers prefer. Regardless of when the work occurs, I think that it makes sense to have this consistent validation and behaviour eventually.Overall, I am not familiar with
scipy.linalg
yet, so I may be missing a lot! Please feel free to give any corrections or suggestions.References
cluster
: ENH: add machinery to support Array API scipy/scipy#18668fft
to the standard: ENH: fft: support array API standard scipy/scipy#19005linalg
Extension: https://data-apis.org/array-api/latest/extensions/linear_algebra_functions.htmlThe text was updated successfully, but these errors were encountered: