Examples of a good fine-tune? #65
Replies: 5 comments 19 replies
-
https://www.youtube.com/watch?v=Tuz7_7q0Pr0 Trained on this interview: https://www.youtube.com/watch?v=ozOoONmJ9EQ
-
These are the results I got using the default config for 50 epochs (past SLM adversarial training). SLM adversarial training is the most VRAM-consuming part. I don't know how to mitigate this problem and fit it on smaller machines. Maybe techniques used for LLM fine-tuning could help, as we are working with large speech language models here.
-
Here are the results I have from 2 different models. Aurora: 50 epochs with joint training after epoch 10, 8 hours of audio, single voice, batch size 2, max length 220. AuroraTest1.webm
-
I fine-tuned on LibriTTS using a single Brazilian Portuguese speaker, with approximately 24 hours of audio over 60 epochs. Link: https://drive.google.com/file/d/1pBqHbIuuaO7jvMsnnpbjrsFAPcHZKr41/view?usp=sharing I'm using the multilingual PL-BERT. Does anyone have an idea why there is this annoying noise at the end of the audio clip? Thanks! Jonathan S. Santos
-
I believe it is due to the lack of a silence pad of at least 400 ms. Another thing: if the audio clips are longer than the length setting, work it out from the duration in seconds and the sample rate. I haven't tried training in Portuguese yet. As soon as I finish the LLMs I will release a Portuguese checkpoint. If you can, please share your checkpoint.
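The two suggestions above can be sketched roughly as follows. This is only an illustration, not code from the StyleTTS2 repository: the helper names are hypothetical, and the 24 kHz sample rate and hop length of 300 samples are assumptions that should be checked against your own config. The max length of 220 frames is the value mentioned in an earlier reply.

```python
from typing import List, Sequence


def pad_trailing_silence(wave: Sequence[float], sample_rate: int,
                         pad_ms: float = 400.0) -> List[float]:
    """Append pad_ms milliseconds of digital silence (zeros) to a mono clip."""
    n_pad = int(round(sample_rate * pad_ms / 1000.0))
    return list(wave) + [0.0] * n_pad


def frames_to_seconds(n_frames: int, hop_length: int, sample_rate: int) -> float:
    """Convert a length in spectrogram frames to seconds."""
    return n_frames * hop_length / sample_rate


def exceeds_max_len(wave: Sequence[float], hop_length: int,
                    max_len_frames: int) -> bool:
    """Return True if the clip has more frames than the length setting allows."""
    n_frames = len(wave) // hop_length
    return n_frames > max_len_frames


if __name__ == "__main__":
    SAMPLE_RATE = 24000  # assumption: check your config
    HOP_LENGTH = 300     # assumption: check your config
    MAX_LEN = 220        # frames, value quoted earlier in the thread

    clip = [0.1] * (3 * SAMPLE_RATE)  # a dummy 3-second clip
    padded = pad_trailing_silence(clip, SAMPLE_RATE)
    print(len(padded) - len(clip))                              # samples of silence added
    print(frames_to_seconds(MAX_LEN, HOP_LENGTH, SAMPLE_RATE))  # 2.75
    print(exceeds_max_len(clip, HOP_LENGTH, MAX_LEN))           # True
```

Under these assumed values, a length setting of 220 frames corresponds to 2.75 seconds of audio, so a 3-second clip would be longer than the training window.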
-
Does anyone have an example of a good fine-tuned StyleTTS2 model?
The only one I can find is the LJSpeech model, which sounds really good! But I'm wondering what some other narrators / speakers would sound like, especially voices further outside the training dataset. Thanks, and awesome work on this.