Skip to content

[Last Updated 2021] TTS from Cookie. Messy and experimental!

License

Notifications You must be signed in to change notification settings

CookiePPP/cookietts

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

94 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

This repo kinda works!

Check back in a week. Thanks.

Missing stuff;

  • Normalize transcripts before running Montreal Forced Aligner

Install/Setup:

Clone this repo: https://github.com/CookiePPP/cookietts.git

This will make a folder called cookietts where the command is run, and clone the repo into said folder.

Run cd cookietts

This will move you into the cloned repo.

Run pip install -e .

This will 'install' the package (without moving around any files).

Run pip install -r requirements.txt

This will install dependencies.

e.g: pytorch, numpy, scipy, tensorboard, librosa

Install Nvidia/Apex.

Nvidia/Apex contains the LAMB optimizer and faster running fused optimizers.

This also allows for fp16 (mixed-precision) training, which saves VRAM and runs faster on RTX cards.

(please nag me if you cannot install Apex. Pytorch added native fp16 (mixed-precision) support some time ago and I might be able to set that up as an alternative)

That should be the main stuff.

If something fails during preprocessing, check you have ffmpeg and sox installed as well.

If something else fails, you can create an 'issue' on the top of the github page.


Usage:

Head into the CookieTTS folder and read around.

If you want to train custom multispeaker models then folders 0, 1 and 2 are of interest.

If you want to use an already trained model to generate new speech then folder 5 (_5_infer) is of interest.

About

[Last Updated 2021] TTS from Cookie. Messy and experimental!

Topics

Resources

License

Stars

Watchers

Forks