Delayed Microphone Audio Capture #7681
Comments
Thanks @majweldon for the kind words! cc @hannahblair and @dawoodkhan82 as well as this relates to frontend validation |
taking a look |
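Okay, so I haven't been able to reproduce the extent of lag that you describe. If I record a very long audio clip (~5 min), I do encounter two sources of lag:
1. Generating the waveform in the browser (this is new in gradio 4.x). However, on my MacBook Pro this only takes ~2s.
2. Processing the file for saving in the backend. This can take 5-6 seconds. However, this is identical in gradio 4.x and 3.42, so I'm not sure why you wouldn't see this in 3.42.
I made a PR (#7917) that improves the performance of (2), though it only works if the recorded audio format is "wav". Can you try installing gradio from this PR? Do the following:
1. pip install https://gradio-builds.s3.amazonaws.com/b35e3ae839d208520180299077f4ce57bb96fca4/gradio-4.25.0-py3-none-any.whl
2. Change your audio component to gr.Audio(sources=["microphone"], type="filepath", format="wav")
See if you notice a difference in performance and lmk. |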
Thank you so much for your time and effort @abidlabs.
I have done as you said (with the url pasted into my requirements.txt), and
am using the .wav format. It builds and runs with the PR library, but I
still have a significant lag (about 12 seconds per minute of recorded
audio) before I can process any audio data.
I have attached the error log for reference. Here, I press the submit
button once before the audio captures and once afterwards. I can tell the
audio captures because the waveform in the gradio interface will refresh,
though it is visibly the same waveform. Post capture, there is a valid
audio path passed to my transcribe function which is missing in the
pre-capture.
Mike :)
On Tue, 2 Apr 2024 at 16:59, aliabid94 ***@***.***> wrote:
Okay so I haven't exactly been able to reproduce the extent of lag that
you describe. If I record a very long audio (~5 min), I do encounter two
sources of lag:
1. Generating the waveform in the browser (this is new to gradio
4.x.). However on my Macbook pro, this only takes ~2s.
2. Processing the file for saving in the backend. This can take 5-6
seconds. However, this is identical in gradio 4.x and 3.42, so I'm not sure
why you wouldn't see this in 3.42.
I made a PR that improves the performance of (2) - it only works though if
the recorded audio format is "wav". Can you try installing the gradio from this
PR <#7917>. So do the following:
1. pip install https://gradio-builds.s3.amazonaws.com/b35e3ae839d208520180299077f4ce57bb96fca4/gradio-4.25.0-py3-none-any.whl
2. Change your audio component to gr.Audio(sources=["microphone"], type="filepath", format="wav")
See if you notice a difference in performance and lmk.
```
===== Application Startup at 2024-04-05 15:49:34 =====
Running on local URL: http://0.0.0.0:7860
To create a public link, set `share=True` in `launch()`.
Received audio file path: None
Attempt 1 of 1 failed with error: Invalid file: None
###############Failed to open audio file after 1 attempts.##############
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/gradio/queueing.py", line 522, in process_events
    response = await route_utils.call_process_api(
  File "/usr/local/lib/python3.9/site-packages/gradio/route_utils.py", line 260, in call_process_api
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.9/site-packages/gradio/blocks.py", line 1750, in process_api
    data = self.postprocess_data(fn_index, result["prediction"], state)
  File "/usr/local/lib/python3.9/site-packages/gradio/blocks.py", line 1521, in postprocess_data
    self.validate_outputs(fn_index, predictions)  # type: ignore
  File "/usr/local/lib/python3.9/site-packages/gradio/blocks.py", line 1495, in validate_outputs
    raise ValueError(
ValueError: An event handler (transcribe) didn't receive enough output values (needed: 3, received: 1).
Wanted outputs:
[<gradio.components.textbox.Textbox object at 0x7f73f7c80220>, <gradio.components.number.Number object at 0x7f73f7c80340>, <gradio.components.number.Number object at 0x7f73f7c80490>]
Received outputs:
[None]
```
************ After the lag for audio capture, I push re-submit and the error is gone
Received audio file path: /tmp/gradio/be60e81248568cc78a52a8bd6c9accaa3fdc6193/audio.wav
Dear fellow scholars, the medications are Tylenol, Metoprolol, and Aspirin. What a time to be alive!
**Medications:**
- Acetaminophen
- Metoprolol
- Aspirin
|
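The `ValueError` in the log above appears when `transcribe` is called with a path of `None` and hands back a single value while three output components are wired up. A minimal sketch of one way to guard against that, assuming the handler can simply report that the audio is not ready: `run_model`, the placeholder return values, and the message text are hypothetical stand-ins, and the meaning of the two `Number` outputs is not known from the thread.

```python
from typing import Optional, Tuple


def run_model(audio_path: str) -> Tuple[str, float, float]:
    """Hypothetical stand-in for the real transcription step."""
    return f"(transcript of {audio_path})", 0.0, 0.0


def transcribe(audio_path: Optional[str]) -> Tuple[str, float, float]:
    # The handler receives None if submit is pressed before the recording is captured.
    if audio_path is None:
        # Still return one value per output component (Textbox, Number, Number).
        return "Audio is not ready yet; please wait and resubmit.", 0.0, 0.0
    return run_model(audio_path)


def build_demo():
    # Imported lazily so the functions above can be exercised without gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=transcribe,
        # Audio settings as suggested earlier in this thread.
        inputs=gr.Audio(sources=["microphone"], type="filepath", format="wav"),
        outputs=[gr.Textbox(), gr.Number(), gr.Number()],
    )
```

Calling `build_demo().launch()` would start the app. This only avoids the traceback on an early submit; it does not shorten the capture delay itself.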
Did the PR make any difference at all? If you're still seeing that much lag when the processing time should have been cut, then perhaps it's a network issue? Are you running your demo locally or over a server? |
I didn't see any difference with the PR, unfortunately.
My demo is running on the hugging face server, and I see similar behaviour
at work, at home, and on my mobile device.
Would network issues affect latency differently between the libraries?
I can see the waveform and play back the audio within 5-6 seconds in both
versions, similar to what you report. With the 4.x versions, though, it
takes much longer before I can pass the audio to my function (transcribe);
it seems to have to wait until audio.wav is written to disk and can be
passed in as a filepath.
Thanks again,
Mike :)
On Fri, 5 Apr 2024 at 14:35, aliabid94 ***@***.***> wrote:
Did the PR make any difference at all? If you're still seeing that much
lag when the processing time should have been cut, then perhaps it's a
network issue? Are you running your demo locally or over a server?
|
Describe the bug
Ver 3.48.0 (Desired Behaviour)
- As soon as I push stop recording on a microphone input, I can push submit (for transcription) with no errors. That is, the file seems usable from the moment stop is pushed.
Ver 4.21.0
- Once I stop a recording, I have to wait some time for the audio to 'capture' before I can push submit. This delay is about 1 second for every 10 seconds of recording, so it can be substantial for 5+ minutes of audio. I don't mind if there is additional latency but, ideally, I could push the submit button as soon as I am done recording and come back once everything is done.
Thanks for building and supporting Gradio - it has changed my professional life for the better in a big way.
Mike :)
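To put the reported rate in concrete terms, here is a small sketch that scales the delay linearly with recording length. The linear scaling is an assumption based on the "1 second for every 10 seconds" estimate above, and `capture_delay` is an illustrative helper, not part of Gradio.

```python
def capture_delay(recording_s: float, delay_s: float, per_s: float) -> float:
    """Estimated wait before submit works, assuming delay grows linearly."""
    return recording_s * delay_s / per_s


five_minutes = 5 * 60

# Rate from this report: about 1 s of delay per 10 s of recording.
print(capture_delay(five_minutes, 1, 10))   # 30.0

# Rate reported elsewhere in the thread: about 12 s per minute of recording.
print(capture_delay(five_minutes, 12, 60))  # 60.0
```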
Have you searched existing issues? 🔎
Reproduction
Screenshot
No response
Logs
System Info
Severity
I can work around it