Can I train the Chinese model? #111
Replies: 20 comments 5 replies
-
Look at issue #41 to check the current progress.
-
You can, but with the current English PL-BERT the quality won't be as good as originally proposed. I'm working on a multilingual PL-BERT now; it may take one or two months to finish.
-
See yl4579/StyleTTS#10 for more details.
-
@yl4579 I trained StyleTTS 2 successfully on Chinese data, and it sounds very good. Since wavlm-base-plus only supports English, I used a Chinese HuBERT model as the SLM. Now I want to train a model for both Chinese and English, but I can't find a pre-trained model that supports Chinese and English at the same time. Do you have any suggestions for the SLM?
-
You can try the Whisper encoder, which was trained on multiple languages. You can also try multilingual wav2vec 2.0: https://huggingface.co/facebook/wav2vec2-large-xlsr-53
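If it helps, swapping in a multilingual SLM should (if I recall the StyleTTS2 config layout correctly) be mostly a config change. The `slm` field names below follow the repo's `Configs/config.yml`, but the `hidden`/`nlayers` values for wav2vec2-large-xlsr-53 are my assumptions (1024-dim hidden states, 24 transformer layers plus the embedding output), so double-check them against the loaded model:

```yaml
slm:
  model: 'facebook/wav2vec2-large-xlsr-53'  # multilingual SLM instead of wavlm-base-plus
  sr: 16000       # XLSR-53 also expects 16 kHz audio
  hidden: 1024    # assumed: wav2vec2-large hidden size
  nlayers: 25     # assumed: 24 transformer layers + embedding output
  initial_channel: 64
```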
-
Did you use the English PL-BERT or did you train PL-BERT with Chinese data?
-
I trained PL-BERT with Chinese data.
-
What is your modeling unit? IPA or pinyin?
-
@Moonmore The modeling unit is pinyin. test.zip is a synthesized sample.
-
Do you use pinyin tones when training the Chinese PL-BERT? I believe StyleTTS uses F0 for Chinese tones. Can a PL-BERT with tones work with StyleTTS?
-
I trained the Chinese PL-BERT without pinyin tones, but a PL-BERT with tones may also work normally, so you can try it.
-
How many samples did you use to train the Chinese PL-BERT?
-
@zhouyong64 I used about 84,000,000 text sentences to train the Chinese PL-BERT model.
-
Sounds really good. I would like to ask: can't the pinyin unit you mentioned be decomposed into phones? And how do you align the PL-BERT input with the text input?
-
@Moonmore @zhouyong64 Sorry for the wrong information yesterday: I trained PL-BERT with tones, and trained the ASR model without tones.
-
So can I understand it this way: all text-related models are trained with the same phoneme units, and features are obtained for each minimal pronunciation unit, e.g. ni3 hao3 -> n i3 h ao3, so the input length is 4 and the model's output length is also 4, for both the text encoder and the BERT model? And how do you construct the PL-BERT labels?
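To make the ni3 hao3 -> n i3 h ao3 decomposition above concrete, here is a minimal sketch (my own illustration, not code from this thread or the StyleTTS2 repo) that splits each toned pinyin syllable into an initial plus a toned final by longest-prefix match:

```python
# Split toned pinyin syllables into initial + toned final,
# so "ni3 hao3" becomes ["n", "i3", "h", "ao3"]: 4 units in, 4 units out.

INITIALS = [
    "zh", "ch", "sh",  # two-letter initials must be matched first
    "b", "p", "m", "f", "d", "t", "n", "l",
    "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w",
]

def split_syllable(syllable: str) -> list[str]:
    """Split one toned pinyin syllable into [initial, final] (or [final])."""
    for ini in INITIALS:
        if syllable.startswith(ini):
            return [ini, syllable[len(ini):]]
    return [syllable]  # zero-initial syllables such as "an4"

def pinyin_to_phones(text: str) -> list[str]:
    """Convert space-separated toned pinyin into a flat phone sequence."""
    phones: list[str] = []
    for syllable in text.split():
        phones.extend(split_syllable(syllable))
    return phones

print(pinyin_to_phones("ni3 hao3"))  # ['n', 'i3', 'h', 'ao3']
```

With units like these, the text encoder, the aligner, and PL-BERT all consume the same length-4 sequence, which is what keeps them aligned.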
-
@Moonmore
-
@hermanseu Thank you for your reply.
-
How can the above be applied to StyleTTS 2? Is there already a complete repo specialized in Mandarin using this G2PW that I could look at? As a non-expert I am looking at the puzzle pieces but don't see the entire picture. Perhaps it's too early in the development.
-
@hermanseu Bro, when you trained the ASR module, do you mean the phonemes were decomposed without tones? For example
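If I have understood @hermanseu correctly, the same phone sequence can feed both models: tone digits are kept for PL-BERT and dropped for the ASR (text aligner) labels. A minimal sketch of what that might look like (my own illustration, not this repo's code):

```python
import re

def strip_tones(phones: list[str]) -> list[str]:
    """Drop a trailing tone digit (1-5, with 5 as neutral) from each phone."""
    return [re.sub(r"[1-5]$", "", p) for p in phones]

# Toned finals for PL-BERT, toneless labels for the ASR aligner.
plbert_phones = ["n", "i3", "h", "ao3"]
asr_phones = strip_tones(plbert_phones)
print(asr_phones)  # ['n', 'i', 'h', 'ao']
```

Note both sequences keep the same length, so alignments learned by the ASR still index the toned PL-BERT inputs one-to-one.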
-
I want to train the Chinese model. Do you support mixed input in Chinese and English?
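One common way to handle mixed Chinese/English input in the front end (a sketch of my own, not something from this thread) is to split the text into Han and non-Han runs and route each run to the matching G2P; the `zh`/`en` labels below would then select a pinyin or English G2P downstream:

```python
import re

def segment_by_script(text: str) -> list[tuple[str, str]]:
    """Return (script, run) pairs: 'zh' for Han character runs, else 'en'."""
    runs = []
    for match in re.finditer(r"[\u4e00-\u9fff]+|[^\u4e00-\u9fff]+", text):
        run = match.group()
        script = "zh" if re.match(r"[\u4e00-\u9fff]", run) else "en"
        runs.append((script, run.strip()))
    return [(s, r) for s, r in runs if r]  # drop whitespace-only runs

print(segment_by_script("我喜欢 machine learning 模型"))
# [('zh', '我喜欢'), ('en', 'machine learning'), ('zh', '模型')]
```

The harder part, as discussed above, is a shared phone inventory and an SLM covering both languages; the segmentation itself is straightforward.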