Trying to stack tensors from different devices in `_pad_to_max_length` in Whisper batched inference
#30223
This issue seems to be due to the following line, added in #29065 to fix #29036. That fix doesn't work with batched inference on GPU/MPS because the tensor it creates ends up on the wrong device:
transformers/src/transformers/models/whisper/generation_whisper.py
Line 146 in 4f7b434
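To illustrate the failure mode, here is a minimal sketch of the padding pattern involved. The function below is a hypothetical stand-in for Whisper's `_pad_to_max_length`, not the actual implementation; the point is that `torch.full` (like `torch.tensor`) defaults to CPU, so padding created without an explicit `device` cannot be concatenated or stacked with sequences living on GPU/MPS ("Expected all tensors to be on the same device"). Creating the padding on the input's own device avoids the error:

```python
import torch

def pad_to_max_length(sequences, pad_token_id):
    """Right-pad a list of 1-D token tensors to a common length.

    Hypothetical stand-in for Whisper's _pad_to_max_length: the key detail
    is allocating the padding on the same device as the input sequences.
    """
    max_len = max(seq.shape[-1] for seq in sequences)
    padded = []
    for seq in sequences:
        pad = torch.full(
            (max_len - seq.shape[-1],),
            pad_token_id,
            dtype=seq.dtype,
            device=seq.device,  # match the input's device instead of defaulting to CPU
        )
        padded.append(torch.cat([seq, pad]))
    return torch.stack(padded)
```

With the `device=seq.device` argument removed, calling this on CUDA or MPS tensors reproduces the device-mismatch error described above.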
System Info
transformers version: 4.40.0.dev0 (bf9a7ab)
Who can help?
@ylacombe @sanchit-gandhi
Expected behavior
No error