New V4 VAD Released

Improved quality
Improved performance
Both 8k and 16k sampling rates are now supported by the ONNX model
Batching is now supported by the ONNX model
Added audio_forward method for one-line processing of one or several audio signals without postprocessing
Hotfix applied: the wrong model had been uploaded
Minor hotfix re. PyTorch version
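
As a rough illustration of what processing a whole audio "without postprocessing" implies, the sketch below splits a signal into fixed-size windows, zero-pads the tail, and returns one raw score per window. It is not the library's implementation: `split_into_windows`, `raw_scores` and the `mean_abs` scorer are hypothetical stand-ins for the model, and the window size of 512 samples is an assumption rather than a documented default of audio_forward.

```python
from typing import Callable, List, Sequence


def split_into_windows(samples: Sequence[float], window_size: int) -> List[List[float]]:
    """Split samples into equal-sized windows, zero-padding the last one."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    windows = []
    for start in range(0, len(samples), window_size):
        window = list(samples[start:start + window_size])
        # Pad the final (possibly shorter) window with zeros.
        window.extend([0.0] * (window_size - len(window)))
        windows.append(window)
    return windows


def raw_scores(
    samples: Sequence[float],
    window_size: int,
    score_window: Callable[[List[float]], float],
) -> List[float]:
    """Return one raw score per window, with no thresholding or merging."""
    return [score_window(w) for w in split_into_windows(samples, window_size)]


def mean_abs(window: List[float]) -> float:
    # Hypothetical stand-in for a VAD model: mean absolute amplitude.
    return sum(abs(x) for x in window) / len(window)


# 1300 samples: silence followed by a constant-amplitude signal.
audio = [0.0] * 1000 + [0.5] * 300
scores = raw_scores(audio, 512, mean_abs)
```

Turning such per-window scores into speech segments (thresholding, merging, minimum durations) is the postprocessing step that this kind of call leaves out.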

17 Dec 15:22

snakers4

v3.1

8ebaf13

New V3 ONNX VAD Released

We were finally able to port the model to ONNX:

Compact model (~100k params);
Both PyTorch and ONNX models are not quantized;
Same quality model as the latest best PyTorch release;
Only 16 kHz is available for now (ONNX has some issues with if-statements and/or tracing vs. scripting, and fails with cryptic errors);
In our tests, ONNX is 2-3x faster than PyTorch on short audios (chunks); the gap narrows with larger batches or long audios;
Audio examples and non-core models moved out of the repo to save space;

07 Dec 12:17

snakers4

v3.0

236d250

New V3 Silero VAD is Already Here

Main changes

One VAD to rule them all! New model includes the functionality of the previous ones with improved quality and speed!
Flexible sampling rate: 8000 Hz and 16000 Hz are supported;
Flexible chunk size: the minimum chunk size is just 30 milliseconds!
100k parameters;
GPU and batching are supported;
Radically simplified examples;
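
To make the relationship between chunk duration and sampling rate concrete, here is a small arithmetic sketch. The helper names `chunk_size_samples` and `chunk_duration_ms` are hypothetical and are not part of the library; the model may also expect specific window sizes, so this only shows the conversion.

```python
def chunk_size_samples(duration_ms: float, sampling_rate: int) -> int:
    """Number of samples in a chunk of the given duration."""
    return int(round(duration_ms * sampling_rate / 1000))


def chunk_duration_ms(num_samples: int, sampling_rate: int) -> float:
    """Duration in milliseconds of a chunk with the given number of samples."""
    return num_samples * 1000 / sampling_rate


# A 30 ms chunk at the two supported sampling rates.
min_chunk_16k = chunk_size_samples(30, 16000)  # 480 samples
min_chunk_8k = chunk_size_samples(30, 8000)    # 240 samples

# A 1536-sample window at 16000 Hz lasts 96 ms.
window_ms = chunk_duration_ms(1536, 16000)
```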

Migration

Please see the new examples.

The new get_speech_timestamps is a simplified, unified replacement for the deprecated get_speech_ts and get_speech_ts_adaptive methods.

speech_timestamps = get_speech_timestamps(wav, model, sampling_rate=16000)
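
The return format is not spelled out here. Assuming the result is a list of dictionaries with `start` and `end` keys expressed in samples, a conversion to seconds might look like the sketch below. The helper `to_seconds` and the sample values are hypothetical, made up for illustration.

```python
from typing import Dict, List


def to_seconds(
    timestamps: List[Dict[str, int]],
    sampling_rate: int,
    ndigits: int = 2,
) -> List[Dict[str, float]]:
    """Convert sample-based start/end offsets to seconds."""
    return [
        {
            "start": round(ts["start"] / sampling_rate, ndigits),
            "end": round(ts["end"] / sampling_rate, ndigits),
        }
        for ts in timestamps
    ]


# Hypothetical output for a 16000 Hz recording (values are illustrative only).
example_timestamps = [
    {"start": 8000, "end": 24000},
    {"start": 40000, "end": 56000},
]
in_seconds = to_seconds(example_timestamps, sampling_rate=16000)
```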

The new VADIterator class serves as an example for streaming tasks, replacing the deprecated VADiterator and VADiteratorAdaptive.

vad_iterator = VADIterator(model)
window_size_samples = 1536

for i in range(0, len(wav), window_size_samples):
    speech_dict = vad_iterator(wav[i:i + window_size_samples], return_seconds=True)
    if speech_dict:
        print(speech_dict, end=' ')
vad_iterator.reset_states()
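
In a streaming loop like the one above, the per-chunk results usually need to be assembled into complete segments. The sketch below assumes each result is either `None`, a dictionary with a single `start` key, or a dictionary with a single `end` key; that event format is an assumption, and `pair_events` is a hypothetical helper, not part of the library.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def pair_events(
    events: Iterable[Optional[Dict[str, float]]],
) -> Tuple[List[Tuple[float, float]], Optional[float]]:
    """Pair streaming start/end events into (start, end) segments.

    Returns the completed segments and the start of a still-open
    segment, if the stream ended while speech was in progress.
    """
    segments: List[Tuple[float, float]] = []
    current_start: Optional[float] = None
    for event in events:
        if not event:
            continue  # no boundary detected in this chunk
        if "start" in event:
            current_start = event["start"]
        elif "end" in event and current_start is not None:
            segments.append((current_start, event["end"]))
            current_start = None
    return segments, current_start


# Hypothetical event stream, in seconds (values are illustrative only).
events = [None, {"start": 0.5}, None, {"end": 1.7}, {"start": 2.1}, None]
segments, open_start = pair_events(events)
```

Returning the open start separately lets the caller decide whether to close the last segment at the end of the stream or to discard it.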

07 Dec 09:07

snakers4

v2.0-legacy

a345715

V2 Legacy Release for History

This is a technical tag, so that users who do not want to use the newer models can simply check out this tag.

Releases: snakers4/silero-vad