Keeping the same speaker in different files #777

Open
wallaceblaia opened this issue Apr 11, 2024 · 1 comment

Comments

@wallaceblaia

First off, thank you for your fantastic work here. I am working on a project where I aim to translate and dub YouTube live streams almost in real-time. I've managed to achieve a delay of 3 minutes, but I'm looking to reduce this even further.

In my implementation, I capture the live stream and cut it into segments of approximately 1 minute each, using a technique that places cuts only between words. After processing each segment with Demucs, I send it to the WhisperX pipeline. However, the speaker labels vary from one audio file to the next. Is there a way to preserve the speaker embedding data across multiple audio files that contain the same speakers? I use the labels returned by diarization to dub into another language, but each file gives me different labels for the same speaker.
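To make the idea concrete, here is a minimal sketch of what I mean by preserving embeddings across files. It assumes I can get one embedding vector per local speaker for each segment (how to extract that from the diarization model is not shown and would depend on the version in use). `SpeakerRegistry`, the `GLOBAL_xx` labels, and the `0.7` threshold are hypothetical names and values for illustration, not part of WhisperX.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SpeakerRegistry:
    """Hypothetical helper: one running-mean embedding per global speaker."""

    def __init__(self, threshold=0.7):
        self.threshold = threshold  # assumed value; needs tuning on real data
        self.centroids = {}         # global label -> mean embedding
        self.counts = {}            # global label -> segments seen so far

    def assign(self, chunk_embeddings):
        """Map one segment's local labels to global labels, one-to-one."""
        # Score every (local, global) pair and visit the best matches first.
        pairs = sorted(
            ((cosine(emb, cen), local, glob)
             for local, emb in chunk_embeddings.items()
             for glob, cen in self.centroids.items()),
            reverse=True,
        )
        mapping, used = {}, set()
        for score, local, glob in pairs:
            if score < self.threshold:
                break  # everything after this is too dissimilar
            if local not in mapping and glob not in used:
                mapping[local] = glob
                used.add(glob)
        for local, emb in chunk_embeddings.items():
            if local in mapping:
                self._update(mapping[local], emb)
            else:
                # No known speaker is close enough: register a new one.
                glob = f"GLOBAL_{len(self.centroids):02d}"
                self.centroids[glob] = list(emb)
                self.counts[glob] = 1
                mapping[local] = glob
        return mapping

    def _update(self, glob, emb):
        """Fold a new embedding into the running mean for this speaker."""
        n = self.counts[glob]
        self.centroids[glob] = [
            (c * n + e) / (n + 1) for c, e in zip(self.centroids[glob], emb)
        ]
        self.counts[glob] = n + 1


# Toy 3-dimensional vectors stand in for real speaker embeddings.
registry = SpeakerRegistry()
map1 = registry.assign({"SPEAKER_00": [1.0, 0.0, 0.0], "SPEAKER_01": [0.0, 1.0, 0.0]})
# In the second segment the diarizer happens to swap the two local labels.
map2 = registry.assign({"SPEAKER_00": [0.1, 0.9, 0.0], "SPEAKER_01": [0.9, 0.1, 0.0]})
# A third segment contains a voice that matches nobody seen so far.
map3 = registry.assign({"SPEAKER_00": [0.0, 0.0, 1.0]})
```

With something like this, I would rewrite the local labels of each segment through the returned mapping before dubbing, so the same person keeps the same global label even when the per-file labels are swapped. The greedy matching is only one possible choice; an optimal assignment or a clustering step over all stored embeddings might work better.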

@wallaceblaia changed the title from "Mantendo o mesmo falante em arquivos diferente" to "Keeping the same speaker in different files" on Apr 11, 2024
@SeeknnDestroy

Hey @wallaceblaia, did you find a solution for this?
