Segfault in fsspmdm #805
Comments
x86 displacements are limited to 32-bit signed integers. But |
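To make the 32-bit limit concrete, here is a minimal sketch of the range check. It is not taken from the libxsmm generator. The helper names and the row-major offset formula (row index times leading dimension times element size) are assumptions for illustration only.

```c
#include <stdint.h>

/* Returns 1 if byte_offset can be encoded as a signed 32-bit
 * x86 displacement, 0 otherwise. */
static int fits_in_disp32(int64_t byte_offset) {
  return byte_offset >= INT32_MIN && byte_offset <= INT32_MAX;
}

/* Hypothetical layout (an assumption, not libxsmm's actual formula):
 * byte offset of row `row` in a row-major matrix whose leading
 * dimension is `ld` elements of `elem_size` bytes each.
 * Computed in 64 bits so the product itself cannot wrap for
 * realistic sizes. */
static int64_t row_byte_offset(int64_t row, int64_t ld, int64_t elem_size) {
  return row * ld * elem_size;
}
```

Under these assumptions, with a leading dimension of 2,400,000 and 8-byte elements, `row_byte_offset(111, 2400000, 8)` is 2,131,200,000 and still fits, while row 112 gives 2,150,400,000 and does not. With the leading dimension halved to 1,200,000, row 112 fits again.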
@hfp Are you okay to add a check limiting the total size ( |
In my eyes this is a hotfix. I would like to see where the bug is in the code generation; we can easily fix the large-displacement issue with the SIB addressing mode.
The main offender for ldb is https://github.com/libxsmm/libxsmm/blob/main_stable/src/generator_spgemm_csr_asparse_reg.c#L723 (with the displacement defined at https://github.com/libxsmm/libxsmm/blob/main_stable/src/generator_spgemm_csr_asparse_reg.c#L597), while for ldc it is https://github.com/libxsmm/libxsmm/blob/main_stable/src/generator_spgemm_csr_asparse_reg.c#L734.
I am OK with it (my first thought was also to check the input). However, with Alex' fix this is not necessary except as a hotfix. If support for the full/anticipated range keeps slipping, we can still deploy a range check.
So the most efficient means of supporting this is probably through using several registers for storing |
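As a rough illustration of the register-based ideas discussed in this thread, the sketch below splits a large row offset into a part that would be added to a base register and a remainder that still fits a 32-bit displacement. This is only the arithmetic, not generator code. The type `split_offset_t`, both function names, and the idea of re-basing every fixed number of rows are hypothetical and not part of libxsmm.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: how far to advance a base register and
 * what 32-bit displacement remains relative to that advanced base. */
typedef struct {
  int64_t base_advance; /* bytes to add to the base register */
  int32_t disp;         /* remaining displacement, fits in 32 bits */
} split_offset_t;

/* Number of consecutive rows reachable from one base address using
 * only a non-negative 32-bit displacement. Returns -1 if the stride
 * is non-positive or a single stride already exceeds the limit. */
static int64_t rows_per_base(int64_t stride_bytes) {
  if (stride_bytes <= 0 || stride_bytes > INT32_MAX) return -1;
  return INT32_MAX / stride_bytes + 1; /* local rows 0 .. INT32_MAX/stride */
}

/* Splits row * stride_bytes into base_advance + disp.
 * Returns 0 on success, -1 on invalid input. */
static int split_row_offset(int64_t row, int64_t stride_bytes,
                            split_offset_t *out) {
  const int64_t rpb = rows_per_base(stride_bytes);
  if (rpb < 0 || row < 0 || out == NULL) return -1;
  /* Rows are grouped into blocks of rpb rows; each block gets its own base. */
  out->base_advance = (row / rpb) * rpb * stride_bytes;
  out->disp = (int32_t)((row % rpb) * stride_bytes);
  return 0;
}
```

For an assumed stride of 19,200,000 bytes, 112 rows are reachable per base, so row 200 splits into a base advance of 2,150,400,000 bytes plus a displacement of 1,689,600,000 bytes. Whether a real generator would re-base like this, or instead keep an index register for SIB addressing, is a design choice this sketch does not settle.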
I observe that libxsmm_fsspmdm_create segfaults when ldb and ldc are large. The ldb/ldc cutoff at which the segfault appears seems to vary somewhat with the size of the A matrix.
I managed to reproduce the issue with the pyfr samples in the libxsmm repository. Below are three examples of segfaults. In the first two cases the A matrices are roughly the same size but have different nnz. Halving ldb/ldc for the first two results in a successful run, and both fail at ldb=ldc=2,400,000 as shown. The last one uses a larger A matrix with roughly the same nnz as the first example, and it fails at ldb=ldc=1,200,000.
I used the latest available version of libxsmm, but I first observed a segfault a few months ago when running PyFR on Intel Skylake and ARM (Graviton2/3); I just wasn't able to pinpoint where it originated until now. I believe the present issue was the root cause in all of those cases, so it has likely existed for at least a few months.
I ran the above examples on an i7-1185G7 (Willow Cove). To build libxsmm I just ran 'make' in the main folder and then 'make' again in samples/pyfr.