Is this repo usable for a production use case!! #158

Open
utility-aagrawal opened this issue Jan 23, 2024 · 7 comments

Comments

@utility-aagrawal

Hi All,

I am wondering whether anyone has used this repo in production. I currently use OpenAI Whisper for transcription and now want to add speaker diarization. I have tried pyannote in the past, but the results from this repo look much better. My concern is that the source code was not written with production use in mind: it is not very flexible, it emits too many log messages, and so on. I could rewrite the code myself, but then I would have to keep my version in sync with any future updates to the repo. I would appreciate the community's input on this. Thanks!

@utility-aagrawal
Author

@MahmoudAshraf97, I would appreciate your take on this! Thanks for sharing your work!

@MahmoudAshraf97
Owner

Hello, and thanks for the input. Please open a PR with any changes you think are useful, and we can discuss them together.

@utility-aagrawal
Author

@MahmoudAshraf97, thanks for your understanding! This is what I want to do:

  1. Leave existing functionalities as-is.

  2. Please see the attached .txt file. Currently, many messages, warnings, and logs are printed to the command line. I want to make this optional so that users can choose whether to see them.
    whisper_diarization_stdout.txt

  3. Users should be able to run the whole pipeline locally if they want, meaning they can download all the models into a directory beforehand. faster-whisper and whisperX's load_align_model already support this. I can check whether the other models can be used this way too. Do you know if this is feasible, and which other models the pipeline uses? I still have to go through the code, so I don't have the answer yet.

  4. Format the code for readability and usability.
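To make item 2 concrete, here is a minimal sketch of an opt-in "quiet" switch using only the Python standard library. The name `quiet` and its parameters are hypothetical and not part of this repo. It also makes two assumptions worth checking against the real pipeline: that the noisy libraries log through loggers that defer to the root logger, and that they write through Python's `sys.stdout`/`sys.stderr` rather than directly from native code.

```python
import contextlib
import io
import logging
import warnings


@contextlib.contextmanager
def quiet(enabled=True, level=logging.ERROR):
    """Hypothetical helper: silence prints, warnings, and low-level logs.

    With enabled=False it does nothing, so existing behaviour is unchanged.
    """
    if not enabled:
        yield
        return

    root = logging.getLogger()
    previous_level = root.level
    root.setLevel(level)  # only affects loggers that defer to the root logger
    sink = io.StringIO()  # captured text is discarded when the block exits
    try:
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            with contextlib.redirect_stdout(sink):
                with contextlib.redirect_stderr(sink):
                    yield
    finally:
        # Always restore the caller's log level, even if the body raised.
        root.setLevel(previous_level)
```

A caller could then wrap the pipeline as `with quiet(enabled=not args.verbose): ...`, where `args.verbose` stands in for whatever flag the CLI ends up exposing. Libraries that configure their own logger levels or print from C/C++ extensions would need extra handling.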

Let me know what you think. These changes will take some time, so I wanted to align with you before I start. Thanks!
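For item 3, a rough sketch of what "local only" loading could look like for the Whisper model. The helper names and the `model_root/name` directory layout are hypothetical. The sketch assumes faster-whisper's `WhisperModel` accepts a local directory path together with `local_files_only=True`, and that `HF_HUB_OFFLINE` is set before `huggingface_hub` is first imported. The other models in the pipeline are not covered here.

```python
import os
from pathlib import Path


def resolve_local_model(model_root, name):
    """Return the directory of a pre-downloaded model, or fail fast.

    The model_root/name layout is an assumption made for this sketch.
    """
    path = Path(model_root) / name
    if not path.is_dir():
        raise FileNotFoundError(f"model '{name}' not found under {model_root}")
    return path


def load_whisper_offline(model_root, name, device="cpu"):
    """Load a faster-whisper model from disk without touching the network."""
    # Ask Hugging Face Hub clients to stay offline. This only takes effect
    # if it is set before huggingface_hub is imported for the first time.
    os.environ.setdefault("HF_HUB_OFFLINE", "1")

    # Imported lazily so the path check above works without the dependency.
    from faster_whisper import WhisperModel

    path = resolve_local_model(model_root, name)
    return WhisperModel(str(path), device=device, local_files_only=True)
```

Failing with `FileNotFoundError` before any loader runs keeps a missing model from silently turning into a download attempt.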

@utility-aagrawal
Author

@MahmoudAshraf97, do you have any feedback?

@utility-aagrawal
Author

@MahmoudAshraf97, any thoughts?

@aedocw

aedocw commented May 8, 2024

I'm not speaking for @MahmoudAshraf97 here, but if you take a look at his response from Jan 24, it's pretty clear. This is an open source project that he works on for his own reasons. @utility-aagrawal, you are treating it like a commercial product that you are paying for.

If you want these changes, you are free to implement them and submit PRs to get them merged into the project. If you are not a developer, you could pay someone to do the work and submit the patches.

@transcriptionstream
Contributor

I have this running in a production environment. It's stable, consistent, and does a great job.

4 participants