automatic speech recognition with wav2vec2

Use any wav2vec2 model with a microphone.

Setup

I recommend installing this project in a virtual environment.

python3 -m venv ./venv
source ./venv/bin/activate
pip install -r requirements.txt

Depending on your Linux distribution, installing pyaudio may fail with an error that portaudio was not found. On Ubuntu, you can solve this by installing the "portaudio19-dev" package.

sudo apt install portaudio19-dev

Finally, you can test the speech recognition:

python live_asr.py

Possible Issues:

The code uses the system's default audio device. Make sure that your system's default audio device is set correctly.
"attempt to connect to server failed": you can safely ignore this message from pyaudio. It only means that pyaudio can't connect to the JACK audio server.

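If it is unclear which device the system treats as its default, it can help to list the input devices that PyAudio sees. The following is a minimal sketch, not part of this project: the helper names `input_devices`, `query_pyaudio_devices`, and `print_input_devices` are made up for illustration, and whether a printed name can be passed as `device_name` to `LiveWav2Vec2` is an assumption you should check against `live_asr.py`.

```python
from typing import Dict, Iterable, List


def input_devices(infos: Iterable[Dict]) -> List[Dict]:
    """Keep only devices that can record (at least one input channel)."""
    return [info for info in infos if info.get("maxInputChannels", 0) > 0]


def query_pyaudio_devices() -> List[Dict]:
    """Ask PyAudio for the info dict of every device it knows about."""
    # Imported lazily so the filter above can be used without PyAudio installed.
    import pyaudio

    audio = pyaudio.PyAudio()
    try:
        count = audio.get_device_count()
        return [audio.get_device_info_by_index(i) for i in range(count)]
    finally:
        # Always release the PortAudio resources again.
        audio.terminate()


def print_input_devices() -> None:
    """Print the index and name of every device that has input channels."""
    for info in input_devices(query_pyaudio_devices()):
        print(f"{info['index']}\t{info['name']}")
```

Calling `print_input_devices()` inside the virtual environment prints one line per recording device, which makes it easier to spot a microphone that is missing or not selected.
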
Usage

You can use any wav2vec2 model from the Hugging Face model hub. Just set the model name; all files are downloaded on first execution.

from live_asr import LiveWav2Vec2

english_model = "facebook/wav2vec2-large-960h-lv60-self"
german_model = "maxidl/wav2vec2-large-xlsr-german"

# Pick a model and the audio device to record from.
asr = LiveWav2Vec2(german_model, device_name="default")
asr.start()

try:
    while True:
        # Fetch the latest transcript with its audio length and inference time.
        text, sample_length, inference_time = asr.get_last_text()
        print(f"{sample_length:.3f}s\t{inference_time:.3f}s\t{text}")
except KeyboardInterrupt:
    # Ctrl+C ends the loop and stops the recognizer.
    asr.stop()
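
The loop above prints the audio length and the inference time for each recognized utterance. To judge whether a model keeps up with live speech, the two values can be combined into a real-time factor (inference time divided by audio length). This is a small sketch that is not part of `live_asr.py`; the helper names `real_time_factor` and `format_result` are hypothetical, and it assumes both values are in seconds, as the printed `s` suffix suggests.

```python
def real_time_factor(sample_length: float, inference_time: float) -> float:
    """Inference time divided by audio length; below 1.0 is faster than real time."""
    if sample_length <= 0:
        raise ValueError("sample_length must be positive")
    return inference_time / sample_length


def format_result(text: str, sample_length: float, inference_time: float) -> str:
    """Build one tab-separated output line that also reports the real-time factor."""
    rtf = real_time_factor(sample_length, inference_time)
    return f"{sample_length:.3f}s\t{inference_time:.3f}s\tRTF {rtf:.2f}\t{text}"
```

Inside the loop, `print(format_result(text, sample_length, inference_time))` would replace the existing `print` call. A real-time factor that stays well below 1.0 suggests the model is fast enough for live use on your hardware.
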

oliverguhr/wav2vec2-live
