long-form/streaming support? #53
Comments
We're planning to release long-form and streaming soon, once we've had some bandwidth to push code for faster inference... By the way, can you point me to how you're generating 500+ chars / streaming with XTTS? I've tried https://huggingface.co/spaces/coqui/xtts, but it has a 200-char limit.
Hey @vatsalaggarwal, is that release still in the pipeline?
@platform-kit, yes, the release is still planned. We just released fine-tuning capabilities (#93). We are now going to start working on long-form & streaming. Would love insights on the below.
@sidroopdaska The way I did this in my implementation of XTTS (https://github.com/Render-AI/cog-xtts-v2/blob/main/predict.py) was to split the text into chunks (sentences in my case, though it could be done in other ways), render each sentence as an audio output, and then concatenate the audio. You do lose some context this way, but it makes the output very stable (avoiding weird outputs where the voice trails off as the duration increases, for example).
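For reference, the chunk-and-concatenate approach can be sketched roughly like this. This is a minimal illustration, not the code from the linked repo: `tts_fn` is a hypothetical stand-in for whatever model call actually synthesizes a single sentence (e.g. an XTTS inference call), and the sentence splitter is a naive regex.

```python
import re
import numpy as np

def split_into_sentences(text: str) -> list[str]:
    # Naive splitter: break after ., !, or ? followed by whitespace.
    # A real pipeline might use nltk or spaCy for better boundaries.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def synthesize_long_form(text: str, tts_fn) -> np.ndarray:
    """Render each sentence independently and concatenate the audio.

    `tts_fn` is assumed to map a sentence string to a 1-D float32
    numpy array of samples; plug the real model call in here.
    """
    chunks = [tts_fn(sentence) for sentence in split_into_sentences(text)]
    if not chunks:
        return np.zeros(0, dtype=np.float32)
    return np.concatenate(chunks)
```

Each sentence is generated with fresh context, which trades some prosodic continuity for stability; inserting a short silence between chunks before concatenating can make the joins sound more natural.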
@sidroopdaska I want to use it in role plays, and the audio is mostly 500+ chars, so generation takes a long time. Is there a streaming mode planned, like in XTTS?