HLS Transcription Stops After A Couple Minutes #205

Open
austinm1120 opened this issue Apr 21, 2024 · 4 comments
Labels
help wanted Extra attention is needed

austinm1120 commented Apr 21, 2024

I set up a local HLS stream playing a long video of someone talking.

Everything works well until exactly 2 minutes in, when the transcription stops completely.

INFO:faster_whisper:Processing audio with duration 00:07.936
INFO:faster_whisper:Processing audio with duration 00:02.984
INFO:faster_whisper:Processing audio with duration 00:03.032
INFO:faster_whisper:Processing audio with duration 00:01.432
INFO:faster_whisper:Processing audio with duration 00:03.480
INFO:faster_whisper:Processing audio with duration 00:05.272
INFO:faster_whisper:Processing audio with duration 00:01.152
INFO:faster_whisper:Processing audio with duration 00:03.200
INFO:faster_whisper:Processing audio with duration 00:02.548
INFO:faster_whisper:Processing audio with duration 00:04.596
INFO:faster_whisper:Processing audio with duration 00:01.796
INFO:faster_whisper:Processing audio with duration 00:03.844
INFO:faster_whisper:Processing audio with duration 00:02.484
INFO:faster_whisper:Processing audio with duration 00:02.484
INFO:faster_whisper:Processing audio with duration 00:02.484
INFO:faster_whisper:Processing audio with duration 00:02.484
INFO:faster_whisper:Processing audio with duration 00:02.484
INFO:faster_whisper:Processing audio with duration 00:02.484
INFO:faster_whisper:Processing audio with duration 00:02.484

In the server logs I can see that the server processes chunks of variable length. The problem starts when the "00:02.484" chunks keep getting processed. I'm unsure whether the client keeps sending the same chunk and the server keeps transcribing it, so the client appears to be "stuck", or whether something is stuck in a different loop.
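To check for the repeated-chunk pattern described above without scanning the log by eye, here is a minimal sketch that parses the `faster_whisper` log lines and reports the longest run of identical consecutive durations. The helper names `parse_durations` and `longest_repeat` are hypothetical, and the regex assumes the `MM:SS.mmm` duration format shown in the log above.

```python
import re
from itertools import groupby

# Assumes the "MM:SS.mmm" duration format seen in the log above.
DURATION_RE = re.compile(r"Processing audio with duration (\d+):(\d+\.\d+)")


def parse_durations(log_text):
    """Return the chunk durations in seconds, in log order."""
    durations = []
    for line in log_text.splitlines():
        match = DURATION_RE.search(line)
        if match:
            minutes, seconds = match.groups()
            durations.append(int(minutes) * 60 + float(seconds))
    return durations


def longest_repeat(durations):
    """Return (duration, run_length) for the longest run of identical
    consecutive durations, or (None, 0) if the list is empty."""
    best = (None, 0)
    for value, run in groupby(durations):
        length = len(list(run))
        if length > best[1]:
            best = (value, length)
    return best
```

A long run of one duration at the end of the log suggests the server is being fed the same audio buffer repeatedly, though it does not by itself show whether the client or the server is at fault.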

Setting use_vad to True doesn't seem to make a difference.

I have tried this on both Mac (M3 Max chip) and Windows 10, with both the Docker and the Python server. All combinations produce the same result.

This is the client code:

from whisper_live.client import TranscriptionClient
client = TranscriptionClient(
  "localhost",
  9090,
  lang="en",
  translate=False,
  model="small",
  use_vad=False,
)


client(hls_url="http://localhost.m3u8")

makaveli10 (Collaborator) commented

@austinm1120 Thanks for reporting the issue. Does whisper-live behave similarly with other input types as well, or only with HLS?

makaveli10 added the "help wanted" (Extra attention is needed) label on May 29, 2024

nums commented Jun 3, 2024

Same issue with RTMP / RTSP.

aliuspetraska commented

The same happens to me, but in my case it's exactly 10 minutes every time:

[INFO]: Server disconnected due to overtime.
[INFO]: Websocket connection closed: 1000: 

I'm investigating on my own; if I find a fix, I'll let you know.

aliuspetraska commented

OK, I found the "issue" :) It's not a bug; it was designed to work that way: https://github.com/collabora/WhisperLive/blob/main/whisper_live/server.py#L28

So in my case I'll refactor the code to strip that part out.
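For anyone wondering how a limit like this produces a disconnect at exactly the same moment every time, here is a minimal sketch of a per-client connection timer. It is not WhisperLive's actual code: `ConnectionTimer`, `add`, and `is_overtime` are hypothetical names, and the 600-second default is only an assumption that mirrors the 10 minutes observed above.

```python
import time


class ConnectionTimer:
    """Hypothetical tracker that flags clients connected longer than a limit."""

    def __init__(self, max_connection_time=600.0, clock=time.monotonic):
        # 600 seconds is an assumed default matching the observed 10 minutes.
        self.max_connection_time = max_connection_time
        self._clock = clock
        self._start_times = {}

    def add(self, client_id):
        """Record the moment a client connects."""
        self._start_times[client_id] = self._clock()

    def is_overtime(self, client_id):
        """Return True once the client has reached the time limit."""
        elapsed = self._clock() - self._start_times[client_id]
        return elapsed >= self.max_connection_time
```

A server loop that checks `is_overtime` before handling each audio frame would close the connection at the same elapsed time on every run, which matches the "disconnected due to overtime" message. Raising the limit, or removing the check as described above, would lift the cap.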
