Different tones on different parts of the text #490
Comments
I second this. Some sort of markup language would be nice, if one exists.
Hi,
@porky11 - how could sarcasm or whispering be applied to an output voice without training on relevant voice recordings? Is there some process you have in mind that could be applied to the audio to achieve this? My suspicion is that there isn't a viable way to do this without the audio and the training.
In my experience, you need audio data for each case (sarcasm, whispering, etc.), and then a multi-speaker model needs to be trained with each case as a different "speaker". This is exactly what the Thorsten emotional voice (German) does.
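To make the "one speaker per tone" idea concrete, here is a rough sketch of how marked-up text could be split into segments, each routed to a speaker id of a multi-speaker model. The `[tone]...[/tone]` tag syntax, the `TONE_TO_SPEAKER` table, and the `split_by_tone` helper are all made up for illustration; nothing here is an existing feature, and the actual synthesis call is left out on purpose.

```python
import re

# Hypothetical mapping from tone name to the speaker id it was trained as.
TONE_TO_SPEAKER = {"neutral": 0, "whisper": 1, "sarcasm": 2}

# Hypothetical markup: [tone]text[/tone], no nesting.
TAG = re.compile(r"\[(\w+)\](.*?)\[/\1\]", re.DOTALL)


def split_by_tone(text, default="neutral"):
    """Return a list of (speaker_id, text) pairs in reading order."""
    segments = []
    pos = 0
    for match in TAG.finditer(text):
        # Untagged text before this tag uses the default tone.
        plain = text[pos:match.start()].strip()
        if plain:
            segments.append((TONE_TO_SPEAKER[default], plain))
        tone = match.group(1)
        if tone not in TONE_TO_SPEAKER:
            raise ValueError(f"no speaker trained for tone {tone!r}")
        inner = match.group(2).strip()
        if inner:
            segments.append((TONE_TO_SPEAKER[tone], inner))
        pos = match.end()
    # Untagged text after the last tag.
    tail = text[pos:].strip()
    if tail:
        segments.append((TONE_TO_SPEAKER[default], tail))
    return segments
```

Each pair could then be synthesized separately with the matching speaker id. Combinations such as whispering sarcasm would need their own entry (and their own training data), since this sketch does not handle nested tags.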
It would be nice if it were possible to apply specific tones to single words or whole sentences.
For example:
(also all combinations, like whispering sarcasm)
Is this already possible somehow?
I saw that espeak generates emphasis markers anyway, so maybe these could be altered manually in some way?
Alternatively, I could probably train and use different variants of a voice. However, it seems it is not possible to switch voices without causing pauses, even when setting "sentence_silence" to 0. Still, this would probably be the best workaround so far.
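If the pauses come from quiet samples at the start and end of each synthesized chunk (an assumption; they might have another cause), one possible post-processing step is to trim those samples before joining the chunks. A minimal sketch on raw 16-bit PCM samples, with a hypothetical `threshold` value that would need tuning per voice:

```python
from array import array


def trim_silence(samples, threshold=200):
    """Drop leading and trailing samples whose amplitude is <= threshold."""
    start = 0
    end = len(samples)
    while start < end and abs(samples[start]) <= threshold:
        start += 1
    while end > start and abs(samples[end - 1]) <= threshold:
        end -= 1
    return samples[start:end]


def join_segments(chunks, threshold=200):
    """Concatenate 16-bit PCM chunks after trimming silence from each one."""
    out = array("h")
    for chunk in chunks:
        out.extend(trim_silence(chunk, threshold))
    return out
```

Here each chunk is an `array("h")` of signed 16-bit samples at the same sample rate. Trimming everything would make words run together, so in practice a short fixed gap might have to be added back between chunks.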
It would still be nice if such a feature existed, preferably without the need of training new voices (if possible).