Yangyangii/Tacotron-pytorch

Tacotron - pytorch implementation

Prerequisite

  • python 3.6
  • pytorch 1.0
  • librosa, scipy, tqdm, tensorboardX

Dataset

Usage

  1. Download the dataset above and set its path in config.py. Then run the command below. The first argument controls signal preprocessing; the second controls metadata generation (train/test split).

    python prepro.py 1 1
    
  2. The model needs to train for 100k+ steps (10+ hours).

    python train.py
    
  3. After training, you can synthesize some speech from text.

    python synthesize.py
    

Attention

  • In speech synthesis, the attention module is important. If the model trains normally, the attention alignment is monotonic, as in the following figures.
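
A rough way to check the monotonic alignment described above without looking at a plot is sketched below. This is not part of the repository: the function names and the toy matrices are hypothetical, and the alignment is assumed to be a list of rows, one per decoder step, each holding the attention weights over encoder steps.

```python
def attended_positions(alignment):
    """For each decoder step, return the index of the encoder step with the largest weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in alignment]


def is_monotonic(alignment, tolerance=0):
    """True if the attended encoder position never moves back by more than `tolerance` steps."""
    positions = attended_positions(alignment)
    return all(b >= a - tolerance for a, b in zip(positions, positions[1:]))


# Toy alignments (hypothetical data): rows are decoder steps, columns are encoder steps.
good = [
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.3, 0.7],
]
bad = [
    [0.1, 0.1, 0.8],
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
]

print(is_monotonic(good))
print(is_monotonic(bad))
```

A small positive `tolerance` may be useful in practice, since real alignments can jitter by a frame or two even when training is healthy.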

Notes

  • I used bilinear attention instead of MLP attention in the model.
  • I adjusted some momentum values to stabilize the model. This alleviates overfitting.
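
To illustrate the difference between the two attention scoring functions mentioned in the notes, here is a minimal NumPy sketch. It is not the repository's code: the function names, shapes, and random weights are assumptions made for illustration. Bilinear attention scores each encoder state with `query^T W key`, while MLP (additive) attention uses `v^T tanh(W_q query + W_k key)`.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def bilinear_scores(query, keys, W):
    """score_i = query^T W key_i, with query (d_q,), keys (T, d_k), W (d_q, d_k)."""
    return keys @ (W.T @ query)


def mlp_scores(query, keys, W_q, W_k, v):
    """score_i = v^T tanh(W_q query + W_k key_i), with W_q (d_q, h), W_k (d_k, h), v (h,)."""
    return np.tanh(query @ W_q + keys @ W_k) @ v


# Hypothetical sizes: 5 encoder steps, 4-dim decoder query, 3-dim encoder states.
rng = np.random.default_rng(0)
T, d_q, d_k, h = 5, 4, 3, 8
query = rng.standard_normal(d_q)
keys = rng.standard_normal((T, d_k))

W = rng.standard_normal((d_q, d_k))
bilinear_weights = softmax(bilinear_scores(query, keys, W))

W_q = rng.standard_normal((d_q, h))
W_k = rng.standard_normal((d_k, h))
v = rng.standard_normal(h)
mlp_weights = softmax(mlp_scores(query, keys, W_q, W_k, v))

# The context vector is the attention-weighted sum of the encoder states.
context = bilinear_weights @ keys
print(bilinear_weights.shape, mlp_weights.shape, context.shape)
```

The bilinear form needs a single weight matrix and no nonlinearity, so it is cheaper per step than the MLP form, which carries two projections plus the vector `v`.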

Other Codes