Benchmarks result in INTERNAL ASSERT FAILED when run with device mps #355

Open
apullin opened this issue Nov 20, 2023 · 4 comments
Labels
enhancement New feature or request

Comments


apullin commented Nov 20, 2023

Running the benchmark script main.py with the mps device results in an assertion failure:

Traceback (most recent call last):
  File "/Users/apullin/personal/pyg/pytorch_sparse/benchmark/main.py", line 174, in <module>
    correctness(dataset)
  File "/Users/apullin/anaconda3/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/apullin/personal/pyg/pytorch_sparse/benchmark/main.py", line 43, in correctness
    mat.fill_cache_()
  File "/Users/apullin/anaconda3/lib/python3.10/site-packages/torch_sparse-0.6.18-py3.10-macosx-11.0-arm64.egg/torch_sparse/tensor.py", line 286, in fill_cache_
    self.storage.fill_cache_()
  File "/Users/apullin/anaconda3/lib/python3.10/site-packages/torch_sparse-0.6.18-py3.10-macosx-11.0-arm64.egg/torch_sparse/storage.py", line 470, in fill_cache_
    self.rowptr()
  File "/Users/apullin/anaconda3/lib/python3.10/site-packages/torch_sparse-0.6.18-py3.10-macosx-11.0-arm64.egg/torch_sparse/storage.py", line 209, in rowptr
    rowptr = torch.ops.torch_sparse.ind2ptr(row, self._sparse_sizes[0])
  File "/Users/apullin/anaconda3/lib/python3.10/site-packages/torch/_ops.py", line 692, in __call__
    return self._op(*args, **kwargs or {})
RuntimeError: ind.device().is_cpu() INTERNAL ASSERT FAILED at "/Users/apullin/personal/pyg/pytorch_sparse/csrc/cpu/convert_cpu.cpp":8, please report a bug to PyTorch. ind must be CPU tensor

Invocation was:
python main.py --device=mps
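For readers unfamiliar with the op named in the traceback above: ind2ptr turns sorted COO row indices into a CSR row pointer. The following is a pure-Python sketch of that computation, for illustration only. It is not torch-sparse's implementation, and the name ind2ptr_reference is made up here.

```python
from itertools import accumulate
from typing import List, Sequence


def ind2ptr_reference(ind: Sequence[int], num_rows: int) -> List[int]:
    """Compress sorted row indices into a CSR row pointer of length num_rows + 1."""
    counts = [0] * (num_rows + 1)
    for r in ind:
        counts[r + 1] += 1  # an entry in row r shifts every pointer after row r
    return list(accumulate(counts))  # running sum yields the row pointer


rowptr = ind2ptr_reference([0, 0, 1, 3], num_rows=4)
print(rowptr)  # [0, 2, 3, 3, 4]
```

Row i of the matrix then owns the entries in the half-open range rowptr[i] to rowptr[i + 1], which is why an empty row (row 2 here) shows up as two equal neighbouring values.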

Notably, I also had to comment out lines 66 and 84, which contain import torch.mps, to get this to run. Are those imports actually needed?
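One possible explanation for having to comment out those imports, offered as an assumption because the source of main.py is not quoted in this thread: in Python, an `import pkg.sub` statement inside a function makes `pkg` a local name for the whole function body, so any use of `pkg` before that statement runs raises UnboundLocalError. The sketch below reproduces the effect with the standard-library os module standing in for torch. The function names are illustrative.

```python
import os
import os.path  # module-level import: binds `os` globally, safe to use anywhere


def broken(use_first_branch: bool) -> str:
    if use_first_branch:
        # `os` is treated as a local variable in this function because of the
        # import statement further down, and it has not been assigned yet.
        return os.sep
    import os.path  # binds the name `os` in the function's local scope
    return os.path.sep


def fixed(use_first_branch: bool) -> str:
    # No function-level import, so `os` resolves to the module-level binding.
    if use_first_branch:
        return os.sep
    return os.path.sep


try:
    broken(True)
except UnboundLocalError as exc:
    print("broken(True) failed:", type(exc).__name__)

print(fixed(True) == fixed(False))
```

If the benchmark does have this shape, moving the import to the top of the file (or dropping it when the submodule is already loaded) would avoid the error without commenting anything out.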

A similar error occurs in an applied problem that uses mps as the device, e.g. a GCN with a sparse adjacency matrix.
(This came up in coursework. I will have to put together a minimal working example so that I don't post a solution.)

running:

torch==2.1.0
torch-scatter==2.1.2
torch-sparse==0.6.18
torch_geometric==2.4.0

on Python 3.10
on Apple Silicon (M2 Max)

Owner

rusty1s commented Nov 26, 2023

Sorry for the late reply. Our custom kernels do not currently support the mps backend.
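Until such support exists, one workaround is to keep the sparse operations on the CPU whenever mps is requested. The helper below is a minimal sketch of that idea under stated assumptions: resolve_sparse_device and UNSUPPORTED_DEVICE_TYPES are hypothetical names, not part of torch-sparse, and the set of unsupported device types is taken only from the statement above.

```python
import warnings

# Assumption: device types the custom torch-sparse kernels cannot handle,
# based solely on the maintainer's statement in this thread.
UNSUPPORTED_DEVICE_TYPES = frozenset({"mps"})


def resolve_sparse_device(requested: str) -> str:
    """Return the device string to use for sparse ops, falling back to "cpu"."""
    device_type = requested.split(":", 1)[0]  # "mps:0" -> "mps"
    if device_type in UNSUPPORTED_DEVICE_TYPES:
        warnings.warn(
            f"Sparse kernels are not available on '{requested}'; using 'cpu'."
        )
        return "cpu"
    return requested
```

The returned string could be passed to torch.device(...) when building the sparse tensors, with dense results moved to the accelerator afterwards via .to("mps"). The cost is an extra host-to-device copy per step, so this trades speed for the ability to run at all.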

Author

apullin commented Nov 27, 2023

Darn. Sadly, it looks like Apple does not provide any kind of GPU BLAS.
Does this mean the sparse ops kernels would have to be manually implemented as Metal shaders?

Owner

rusty1s commented Nov 27, 2023

I haven't looked at the detailed backend code required for mps yet, but yeah, it would mean we need to register MPS as a backend for torch-sparse and then implement this functionality in Metal.


This issue had no activity for 6 months. It will be closed in 2 weeks unless there is some new activity. Is this issue already resolved?

@github-actions github-actions bot added the stale label May 26, 2024
@rusty1s rusty1s added enhancement New feature or request and removed stale labels May 27, 2024