
New library possibly faster than Jax or just a hoax? #157

Open
BBC-Esq opened this issue Nov 15, 2023 · 0 comments

Comments


BBC-Esq commented Nov 15, 2023

Has anyone seen this repository? https://github.com/Vaibhavs10/insanely-fast-whisper

It makes the incredible claim that it's approximately 6x faster than faster-whisper. I checked the GitHub repo and the source code isn't readily visible (even though there's a "src" folder), but there is a library on PyPI that you can install named "insanely-fast-whisper", located at https://pypi.org/project/insanely-fast-whisper/.

Apparently, you can use it either with or without FlashAttention2, although I couldn't get FlashAttention2 to install.

Does faster-whisper use FlashAttention2?
Does anyone know what the backend is? Is it using faster-whisper under the hood, with the only difference being the "batch_size" parameter that lets it process more segments of the audio file at once?

Even when I change the batch_size to 1, however, it still runs approximately 2x faster than faster-whisper, not 6x like it claims.

My test was as follows...

Using faster-whisper:
large-v2 model in float32 (converted to CTranslate2 format, of course)
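For reference, the faster-whisper side of the comparison would look roughly like this (a minimal sketch, assuming faster-whisper is installed; `format_segment` is a helper of my own, and "audio.wav" is a placeholder path, not the actual file tested):

```python
def format_segment(start, end, text):
    """Render one transcription segment as a timestamped line (my own helper)."""
    return f"[{start:6.2f} -> {end:6.2f}] {text.strip()}"

if __name__ == "__main__":
    # faster-whisper's API: WhisperModel plus transcribe()
    from faster_whisper import WhisperModel

    # large-v2 in float32 on GPU, matching the test described above
    model = WhisperModel("large-v2", device="cuda", compute_type="float32")

    # transcribe() returns a lazy generator of segments; decoding happens
    # as you iterate, so timing should wrap the loop, not just this call
    segments, info = model.transcribe("audio.wav")  # placeholder path
    for seg in segments:
        print(format_segment(seg.start, seg.end, seg.text))
```

Note that faster-whisper decodes segments sequentially here, which is exactly why a batched pipeline could pull ahead on long files.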

For "insanely-fast-whisper" I transcribed the same audio file. Here's the relevant portion of my script:

import torch
from transformers import pipeline

# Initialize the pipeline with Whisper large-v2 in float32 on the first GPU
pipe = pipeline("automatic-speech-recognition",
                "openai/whisper-large-v2",
                torch_dtype=torch.float32,
                device="cuda:0")

# Convert the model to BetterTransformer for faster attention
pipe.model = pipe.model.to_bettertransformer()

# Process the audio file
outputs = pipe("[REMOVED PATH TO FILE FOR PRIVACY REASONS]",
               chunk_length_s=30,
               batch_size=1,
               return_timestamps=True)

Again, even though the batch size was 1, it was still approximately 2x as fast. I didn't get a chance to test accuracy, but does anyone know what this library is based on?
