Parallel request to the server #188

Open
alpcansoydas opened this issue Mar 21, 2024 · 1 comment

Comments


alpcansoydas commented Mar 21, 2024

For example, suppose the server side is deployed. Can it handle multiple parallel transcription requests? How many requests can it handle, and will there be any performance issues? It may be a basic question, but I want to know about it. Thanks :)


cjpais commented Mar 21, 2024

Yes, it can. On an RTX 4080 I am able to get 4 parallel streams without issue. 4 streams is the default value for max_clients in the TranscriptionServer class; you can specify more or fewer for your particular application.

With 4 parallel streams the performance impact looks minimal to my eye, but I have not benchmarked it.

One note: each stream loads into VRAM, so eventually you will be limited by VRAM.
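For reference, a minimal sketch of starting the server with a custom client limit. Only the class name (TranscriptionServer) and the max_clients setting come from the comment above; the import path, the run() keyword arguments, and where max_clients is accepted are assumptions, so check the server source for the exact API.

```python
# Minimal sketch: start the transcription server with a higher client limit.
# The module path and the placement of max_clients are assumptions; only the
# TranscriptionServer class and the max_clients setting come from the
# discussion above.
from whisper_live.server import TranscriptionServer  # assumed module path

server = TranscriptionServer()
server.run(
    host="0.0.0.0",
    port=9090,
    max_clients=8,  # default is 4; raise or lower to fit available VRAM
)
```

Keep in mind the VRAM note above: each additional client loads into GPU memory, so the practical ceiling for max_clients is set by your card, not by the parameter itself.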
