I'm trying to use a derivative project (subsai) with whisperX as its back-end. It works perfectly and generates my subtitle file as desired when run normally, but it fails as soon as I try to enable speaker diarization - IMO the killer feature of whisperX over the other whisper implementations supported by subsai.
I am nearly certain that this is not a bug in subsai, because when I try to upload the same file to victor-upmeet's demo instance, I get the same result: works perfectly when run normally, but returns the following when I enable speaker diarization:
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.2.2. To apply the upgrade to your files permanently, run python -m pytorch_lightning.utilities.upgrade_checkpoint ../root/.cache/torch/whisperx-vad-segmentation.bin
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.1.0+cu121. Bad things might happen unless you revert torch to 1.x.
Traceback (most recent call last):
  File "/root/.pyenv/versions/3.11.8/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
    response.raise_for_status()
  File "/root/.pyenv/versions/3.11.8/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 502 Server Error: Bad Gateway for url: https://huggingface.co/pyannote/segmentation/resolve/2022.07/pytorch_model.bin

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/.pyenv/versions/3.11.8/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1261, in hf_hub_download
    metadata = get_hf_file_metadata(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/root/.pyenv/versions/3.11.8/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/root/.pyenv/versions/3.11.8/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1674, in get_hf_file_metadata
    r = _request_wrapper(
        ^^^^^^^^^^^^^^^^^
  File "/root/.pyenv/versions/3.11.8/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 369, in _request_wrapper
    response = _request_wrapper(
               ^^^^^^^^^^^^^^^^^
  File "/root/.pyenv/versions/3.11.8/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 393, in _request_wrapper
    hf_raise_for_status(response)
  File "/root/.pyenv/versions/3.11.8/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py", line 371, in hf_raise_for_status
    raise HfHubHTTPError(str(e), response=response) from e
huggingface_hub.utils._errors.HfHubHTTPError: 502 Server Error: Bad Gateway for url: https://huggingface.co/pyannote/segmentation/resolve/2022.07/pytorch_model.bin

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/.pyenv/versions/3.11.8/lib/python3.11/site-packages/cog/server/worker.py", line 217, in _predict
    result = predict(**payload)
             ^^^^^^^^^^^^^^^^^^
  File "/src/predict.py", line 167, in predict
    result = diarize(audio, result, debug, huggingface_access_token, min_speakers, max_speakers)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/src/predict.py", line 291, in diarize
    diarize_model = whisperx.DiarizationPipeline(model_name='pyannote/speaker-diarization@2.1',
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.pyenv/versions/3.11.8/lib/python3.11/site-packages/whisperx/diarize.py", line 19, in __init__
    self.model = Pipeline.from_pretrained(model_name, use_auth_token=use_auth_token).to(device)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.pyenv/versions/3.11.8/lib/python3.11/site-packages/pyannote/audio/core/pipeline.py", line 136, in from_pretrained
    pipeline = Klass(**params)
               ^^^^^^^^^^^^^^^
  File "/root/.pyenv/versions/3.11.8/lib/python3.11/site-packages/pyannote/audio/pipelines/speaker_diarization.py", line 130, in __init__
    model: Model = get_model(segmentation, use_auth_token=use_auth_token)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.pyenv/versions/3.11.8/lib/python3.11/site-packages/pyannote/audio/pipelines/utils/getter.py", line 75, in get_model
    model = Model.from_pretrained(
            ^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.pyenv/versions/3.11.8/lib/python3.11/site-packages/pyannote/audio/core/model.py", line 624, in from_pretrained
    path_for_pl = hf_hub_download(
                  ^^^^^^^^^^^^^^^^
  File "/root/.pyenv/versions/3.11.8/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/root/.pyenv/versions/3.11.8/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1406, in hf_hub_download
    raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
To my uneducated eyes, it looks like it's failing to connect to Hugging Face: the first error in the chain is a 502 Bad Gateway when fetching the pyannote/segmentation model file. I have entered my access token correctly and accepted the terms of both the segmentation and the speaker-diarization models. Here is the audio file in question. (The actual audio to be transcribed is nearly 2 hours long, but I trimmed it to ~25-minute segments to see if reducing the file size would help. It made no difference.)
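Since a 502 is a server-side error and can be transient, one thing that might be worth trying is simply retrying the model load with a growing delay. Below is a minimal sketch of that idea, not anything whisperX or subsai provides: `retry_with_backoff` is a hypothetical helper name, and the commented-out usage only mirrors the `whisperx.DiarizationPipeline(...)` call visible in the traceback, with `HF_TOKEN` as a placeholder.

```python
import time


def retry_with_backoff(load, attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call load() until it succeeds, doubling the wait after each failure.

    Hypothetical helper for illustration; `load` is any zero-argument callable.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return load()
        except Exception as error:  # e.g. the HfHubHTTPError / LocalEntryNotFoundError above
            last_error = error
            if attempt < attempts - 1:
                # Waits base_delay, 2*base_delay, 4*base_delay, ...
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"still failing after {attempts} attempts") from last_error


# Hypothetical usage, mirroring the call shown in the traceback:
#
# import whisperx
# diarize_model = retry_with_backoff(
#     lambda: whisperx.DiarizationPipeline(
#         model_name="pyannote/speaker-diarization@2.1",
#         use_auth_token=HF_TOKEN,  # placeholder for the access token
#     )
# )
```

If the download still fails after several attempts, the problem is probably not a momentary hiccup, and retrying will not help with a rejected token or unaccepted model terms.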
Any ideas how to overcome this?