Other languages #145

Open
peregilk opened this issue Dec 18, 2023 · 2 comments

Comments

@peregilk

I noticed that passing the language on to WhisperX does not work if that language is not in WhisperX's accepted list. However, WhisperX does support loading your own model by specifying which Wav2Vec model to use for alignment. Is this possible with whisper-diarization?

I want to load a Norwegian model:
NbAiLabBeta/nb-whisper-small

Then I want WhisperX to accept the Wav2Vec model:
NbAiLab/nb-wav2vec2-300m-bokmaal

With WhisperX, you can then do:
whisperx examples/sample01.wav --model NbAiLabBeta/nb-whisper-small --align_model NbAiLab/nb-wav2vec2-300m-bokmaal --batch_size 4

This is an alternative to passing --language, which only works for {en, fr, de, es, it, ja, zh, nl, uk, pt}; for those languages WhisperX simply looks up its preferred alignment model.
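
To make the request concrete, here is a rough sketch of the selection logic I have in mind, written as a small Python helper that builds the WhisperX command line. The helper name `build_whisperx_command` and the `DEFAULT_ALIGN_LANGUAGES` set are hypothetical and not part of WhisperX or whisper-diarization; the language list is simply the one quoted above, and only the flags already shown in the command above are used.

```python
import shlex

# Language codes listed in this issue as having a built-in alignment model.
# This set is copied from the issue text and may be out of date.
DEFAULT_ALIGN_LANGUAGES = {"en", "fr", "de", "es", "it", "ja", "zh", "nl", "uk", "pt"}


def build_whisperx_command(audio, model, language=None, align_model=None, batch_size=4):
    """Build a whisperx argument list (hypothetical helper, for illustration only)."""
    cmd = ["whisperx", audio, "--model", model]
    if align_model is not None:
        # An explicit alignment model takes priority over the language lookup.
        cmd += ["--align_model", align_model]
    elif language in DEFAULT_ALIGN_LANGUAGES:
        # Let WhisperX pick its preferred alignment model for this language.
        cmd += ["--language", language]
    else:
        raise ValueError(
            f"No default alignment model for language {language!r}; pass align_model."
        )
    cmd += ["--batch_size", str(batch_size)]
    return cmd


cmd = build_whisperx_command(
    "examples/sample01.wav",
    "NbAiLabBeta/nb-whisper-small",
    align_model="NbAiLab/nb-wav2vec2-300m-bokmaal",
)
print(shlex.join(cmd))
```

With the Norwegian models above, this should reproduce the command I posted. Asking for a language outside the list without an alignment model raises an error instead of silently failing.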

@MahmoudAshraf97
Owner

Thanks for the suggestion, I'll work on it.

@peregilk
Author

Great. FYI, I submitted a pull request to WhisperX to add support for Norwegian (Bokmål and Nynorsk), and it was accepted. In the next official release, Norwegian Bokmål (no) and Norwegian Nynorsk (nn) should be accepted.
