LlmKira/VitsServer

Vits-Server 🔥

⚡ A VITS ONNX server designed for fast inference, supporting streaming and additional inference settings to enable model preference settings and optimize performance.

Advantages 💪

  • Long voice generation with streaming support; long audio is inferred in batches and merged.
  • Automatic language type parsing for text, eliminating the need for manual language recognition and segmentation.
  • Supports multiple audio formats, including ogg, wav, flac, and silk.
  • Multiple models can be initialized at once, with streaming inference.
  • Additional inference settings to enable per-model preference settings and optimize performance.
  • Automatic conversion of PTH models to ONNX.
  • Support for multiple languages, including Chinese, English, Japanese, and Korean, with multi-language, multi-model merging (task batches are dispatched to different models).

API Documentation 📖

We provide a ready-to-use client for calling the server.

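# VITS is the client class provided by this project; import it from wherever it lives in your setup.
client = VITS("http://127.0.0.1:9557")
# Request speech for the given text using model "model_01" and speaker 0.
res = client.generate_voice(model_id="model_01", text="你好，世界！", speaker_id=0, audio_type="wav",
                            length_scale=1.0, noise_scale=0.5, noise_scale_w=0.5, auto_parse=True)
# Write the audio to disk chunk by chunk as it arrives.
with open("output.wav", "wb") as f:
    for chunk in res.iter_content(chunk_size=1024):
        if chunk:
            f.write(chunk)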

Running 🏃

We recommend running the server in a virtual environment. Because this project's dependencies may conflict with packages already installed on your system, we suggest using pipenv to manage them.

Config Server 🐚

Configuration lives in the .env file, which includes the following fields:

VITS_SERVER_HOST=0.0.0.0
VITS_SERVER_PORT=9557
VITS_SERVER_RELOAD=false
# VITS_SERVER_WORKERS=1
# VITS_SERVER_INIT_CONFIG="https://....json"
# VITS_SERVER_INIT_MODEL="https://.....pth or onnx"

Alternatively, set the environment variables with the following commands:

export VITS_SERVER_HOST="0.0.0.0"
export VITS_SERVER_PORT="9557"
export VITS_SERVER_RELOAD="false"
export VITS_DISABLE_GPU="false"

VITS_SERVER_RELOAD makes the server restart automatically when a file changes.
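As a rough illustration of how these variables could be consumed, the sketch below reads them from a mapping and falls back to the values shown above. The `load_settings` helper and the keys of the returned dict are hypothetical and not part of Vits-Server; the actual server may parse its configuration differently.

```python
import os


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings such as "true", "1" or "yes"."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None) -> dict:
    """Hypothetical helper: collect the VITS_* variables into a plain dict."""
    env = os.environ if env is None else env
    return {
        "host": env.get("VITS_SERVER_HOST", "0.0.0.0"),
        "port": int(env.get("VITS_SERVER_PORT", "9557")),
        "reload": _as_bool(env.get("VITS_SERVER_RELOAD", "false")),
        "disable_gpu": _as_bool(env.get("VITS_DISABLE_GPU", "false")),
    }


if __name__ == "__main__":
    # Prints the settings resolved from the current process environment.
    print(load_settings())
```

Passing a plain dict instead of relying on `os.environ` keeps the helper easy to exercise without touching the real environment.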

Running from pipenv 🐍 and pm2.json 🚀

# Install system build dependencies
apt-get update &&
  apt-get install -y build-essential libsndfile1 vim gcc g++ cmake
apt install python3-pip
pip3 install pipenv
pipenv install  # Create the virtual environment and install dependencies
pipenv shell    # Activate the virtual environment
python3 main.py # Run the server in the foreground
# Press Ctrl+C to stop it, then install pm2 to run it in the background
apt install npm
npm install pm2 -g
pm2 start pm2.json
# The server now runs in the background

A one-click script is also available to install pipenv and npm:

curl -LO https://raw.githubusercontent.com/LlmKira/VitsServer/main/deploy_script.sh && chmod +x deploy_script.sh && ./deploy_script.sh

Building from Docker 🐋

A prebuilt image is available on Docker Hub; pull it with docker pull sudoskys/vits-server:main.

You can also build the image from the Dockerfile:

docker build -t <image-name> .

where <image-name> is the name you want to give to the image. Then, use the following command to start the container:

docker run -d -p 9557:9557 -v <local-path>/vits_model:/app/model <image-name>

where <local-path> is the local folder path you want to map to the /app/model directory in the container.

Model Configuration 📁

Place each model file (model.pth or model.onnx) and its corresponding model.json in the model folder. A .pth file is converted to .onnx automatically.

You can also set VITS_SERVER_INIT_CONFIG and VITS_SERVER_INIT_MODEL in .env to have the model files downloaded:

VITS_SERVER_INIT_CONFIG="https://....json"
VITS_SERVER_INIT_MODEL="https://.....pth?trace=233 or onnx?trace=233"

Model folder structure:

.
├── 1000_epochs.json
├── 1000_epochs.onnx
├── 1000_epochs.pth
├── 233_epochs.json
├── 233_epochs.onnx
└── 233_epochs.pth

In this example the model IDs are 1000_epochs and 233_epochs, matching the file names without their extensions.

After adding model files to the model folder, restart the server.
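To make the naming convention concrete, here is a minimal sketch that derives model IDs from a folder laid out like the one above. It assumes an ID is the shared file stem of a .json config and a .onnx or .pth weight file; `discover_model_ids` is a hypothetical helper, not the server's own loader.

```python
from pathlib import Path

# Weight file extensions mentioned in this README.
MODEL_SUFFIXES = {".onnx", ".pth"}


def discover_model_ids(model_dir) -> list:
    """Return stems that have a .json config plus a .onnx or .pth weight file."""
    files = [p for p in Path(model_dir).iterdir() if p.is_file()]
    configs = {p.stem for p in files if p.suffix == ".json"}
    weights = {p.stem for p in files if p.suffix in MODEL_SUFFIXES}
    # Only stems present in both sets count as usable models.
    return sorted(configs & weights)


if __name__ == "__main__":
    # "model" is the folder name used in this README; adjust it to your setup.
    if Path("model").is_dir():
        print(discover_model_ids("model"))
```

A weight file without a matching .json (or the reverse) is skipped in this sketch rather than reported as an error.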

Model Extension Design 🔍

You can add extra fields to the model configuration so that information such as the model name for a given model ID can be retrieved through the API.

{
  //...
  "info": {
    "name": "coco",
    "description": "a vits model",
    "author": "someone",
    "cover": "https://xxx.com/xxx.jpg",
    "email": "xx@ws.com"
  },
  "infer": {
    "noise_scale": 0.667,
    "length_scale": 1.0,
    "noise_scale_w": 0.8
  }
  //....
}

infer holds the default (preferred) inference settings for the model.

info is the model information.
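One way to picture how the infer block might act as a set of defaults is sketched below: values from the model configuration are used unless the caller supplies its own. This precedence rule and the `resolve_infer_settings` helper are assumptions for illustration; the server's actual behavior may differ.

```python
import json

# Inference parameters that appear in the "infer" block of this README.
INFER_KEYS = ("noise_scale", "length_scale", "noise_scale_w")


def resolve_infer_settings(model_config: dict, **overrides) -> dict:
    """Start from the model's "infer" block and apply non-None overrides."""
    defaults = model_config.get("infer", {})
    settings = {k: v for k, v in defaults.items() if k in INFER_KEYS}
    # Assumption: an explicit per-request value wins over the model default.
    settings.update(
        {k: v for k, v in overrides.items() if k in INFER_KEYS and v is not None}
    )
    return settings


if __name__ == "__main__":
    config = json.loads(
        '{"infer": {"noise_scale": 0.667, "length_scale": 1.0, "noise_scale_w": 0.8}}'
    )
    # Only length_scale is overridden; the other values come from the config.
    print(resolve_infer_settings(config, length_scale=1.2, noise_scale=None))
```

Note that the configuration excerpt in this README uses // placeholders, so it would need to be valid JSON before `json.loads` can parse it.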

How can I retrieve this model information?

You can access {your_base_url}/model/list?show_speaker=True&show_ms_config=True to obtain detailed information about model roles and configurations.
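The request can be scripted with the standard library, as in the sketch below. The path and query parameters come from the URL above; the helper names are hypothetical, and the assumption that the endpoint returns JSON has not been verified against the server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_model_list_url(base_url: str, show_speaker: bool = True,
                         show_ms_config: bool = True) -> str:
    """Build the /model/list URL with the two query flags."""
    query = urlencode({"show_speaker": show_speaker, "show_ms_config": show_ms_config})
    return f"{base_url.rstrip('/')}/model/list?{query}"


def fetch_model_list(base_url: str, timeout: float = 10.0):
    """Fetch the model list; assumes the response body is JSON."""
    with urlopen(build_model_list_url(base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_model_list() against a running server.
    print(build_model_list_url("http://127.0.0.1:9557"))
```

Calling `fetch_model_list("http://127.0.0.1:9557")` requires the server to be running on the default host and port from the configuration section.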

TODO 📝

  • Test Silk format
  • Docker for automatic deployment
  • Shell script for automatic deployment

Acknowledgements 🙏

We would like to acknowledge the contributions of the following projects in the development of this project: