Proposal for GPU Capacity Estimation Based on Pixel Processing #2971

Open
FranckUltima opened this issue Mar 5, 2024 · 2 comments
Labels
status: triage this issue has not been evaluated yet

Comments

@FranckUltima

Currently, orchestrators must determine the encoding and decoding capabilities of their nodes themselves, based on the GPU type and a benchmark. However, it is extremely difficult to estimate how many streams an orchestrator can accept before it falls out of real time.

Indeed, should we estimate the number of streams based on a 4K to 1080p/720p/480p/360p profile (at 30 or 60 fps), on a 1080p to 720p/480p profile, or even on a 720p to 480p profile? The number of streams a GPU can accept varies considerably between these cases, and misjudging it can lead to a loss of real time and a decrease in the quality of service for customers.

Could we instead consider defining the limits of GPUs by the quantity of pixels processed? That way, if a new job would exceed the GPU's current capacity, it would be rejected.

I believe it would be more accurate to estimate the number of pixels a GPU can process (even if this also varies with the codec, etc.). This would allow us to make the best use of the orchestrators' capabilities without risking a loss of real time and the resulting impact on quality of service for customers.
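
To make the idea concrete, here is a minimal sketch of what a pixel-based admission check could look like. Everything in it is hypothetical (the `PixelCapacity` type, the field names, the example budget); it is not existing go-livepeer code, and the real per-GPU pixels-per-second figure would have to come from a benchmark.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// pixelRate returns the pixels per second implied by a rendition.
func pixelRate(width, height, fps int) int64 {
	return int64(width) * int64(height) * int64(fps)
}

// PixelCapacity tracks how many pixels per second a GPU is allowed to
// process and admits or rejects new jobs against that budget.
type PixelCapacity struct {
	mu         sync.Mutex
	maxPixels  int64 // benchmarked pixels/sec the GPU can sustain in real time
	usedPixels int64 // pixels/sec currently committed to running jobs
}

// ErrCapacity is returned when a job would push the GPU past its budget.
var ErrCapacity = errors.New("pixel capacity exceeded")

// Admit reserves capacity for a job, or rejects it if the budget is exceeded.
func (c *PixelCapacity) Admit(jobPixels int64) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.usedPixels+jobPixels > c.maxPixels {
		return ErrCapacity
	}
	c.usedPixels += jobPixels
	return nil
}

// Release frees the capacity once a job finishes.
func (c *PixelCapacity) Release(jobPixels int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.usedPixels -= jobPixels
}

func main() {
	// Hypothetical budget, roughly what a benchmark might report for one GPU.
	gpu := &PixelCapacity{maxPixels: 2_000_000_000}

	// A 1080p30 source with 720p30 and 480p30 renditions: count the decode
	// plus each encode against the budget.
	job := pixelRate(1920, 1080, 30) + pixelRate(1280, 720, 30) + pixelRate(854, 480, 30)

	if err := gpu.Admit(job); err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Println("accepted, pixels/sec committed:", job)
}
```

Counting both the source decode and every output rendition against the same budget is only one possible convention; the exact accounting (and whether decode and encode should be tracked separately) is open for discussion.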

@github-actions github-actions bot added the status: triage this issue has not been evaluated yet label Mar 5, 2024
@leszko
Contributor

leszko commented Mar 6, 2024

@FranckUltima thanks for raising this proposal. I think pixel capacity is a better measurement than max GPUs/sessions. Another thing we could consider is defining a "unit of work", like the vCPU in cloud providers. I mention this because, with the Livepeer AI work in place, we may need to define some abstract unit of capacity.
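
A rough sketch of what such an abstract "unit of work" could look like, with all names and conversion factors made up for illustration: each job type (transcoding, AI inference) translates its own workload into the same unit so they can be admitted against one per-GPU budget.

```go
package main

import "fmt"

// CapacityUnits is an abstract "unit of work", in the spirit of a vCPU:
// every job type converts its workload into the same unit so that
// transcoding and AI jobs can share one per-GPU budget.
type CapacityUnits int64

// Job is anything that can report its cost in abstract units.
type Job interface {
	Units() CapacityUnits
}

// TranscodeJob derives its cost from the pixel rate it must sustain.
type TranscodeJob struct {
	PixelsPerSec int64
}

func (j TranscodeJob) Units() CapacityUnits {
	// Assumption: 1 unit == 1M pixels/sec; a real factor would come from
	// benchmarking each GPU model.
	return CapacityUnits(j.PixelsPerSec / 1_000_000)
}

// AIJob derives its cost from a model-specific benchmark score.
type AIJob struct {
	BenchmarkUnits CapacityUnits
}

func (j AIJob) Units() CapacityUnits { return j.BenchmarkUnits }

func main() {
	jobs := []Job{
		TranscodeJob{PixelsPerSec: 1920 * 1080 * 30}, // 62 units
		AIJob{BenchmarkUnits: 40},
	}
	var total CapacityUnits
	for _, j := range jobs {
		total += j.Units()
	}
	fmt.Println("total capacity units requested:", total) // 102
}
```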

@Tibia2000

Tibia2000 commented Mar 6, 2024

I think it should be configurable by decoded and encoded pixels as an OR rule, because GPUs have different chip counts. If only a max-encoded-pixels flag is used, that limit might not be saturated before the decoded-pixel capacity is reached and your GPU crashes. By the way, decoded pixels could also be incorporated into the payment-for-work system.
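
As a sketch of the OR rule described above (again with hypothetical names and numbers): separate decode and encode pixel budgets, and a job is rejected as soon as either budget would be exceeded.

```go
package main

import "fmt"

// Hypothetical per-GPU limits, in pixels per second, benchmarked separately
// for the decode and encode paths.
type GPULimits struct {
	MaxDecodedPixels int64
	MaxEncodedPixels int64
}

// Load already committed on the GPU.
type GPULoad struct {
	DecodedPixels int64
	EncodedPixels int64
}

// CanAccept applies the OR rule: the job is rejected if either the decode
// budget or the encode budget would be exceeded.
func CanAccept(lim GPULimits, load GPULoad, jobDecoded, jobEncoded int64) bool {
	if load.DecodedPixels+jobDecoded > lim.MaxDecodedPixels {
		return false
	}
	if load.EncodedPixels+jobEncoded > lim.MaxEncodedPixels {
		return false
	}
	return true
}

func main() {
	lim := GPULimits{MaxDecodedPixels: 800_000_000, MaxEncodedPixels: 1_200_000_000}
	load := GPULoad{DecodedPixels: 750_000_000, EncodedPixels: 300_000_000}

	// A 1080p30 source to decode plus a 720p30 rendition to encode.
	jobDecoded := int64(1920 * 1080 * 30)
	jobEncoded := int64(1280 * 720 * 30)

	// Prints false: the decode budget is nearly full even though encode has room.
	fmt.Println("accepted:", CanAccept(lim, load, jobDecoded, jobEncoded))
}
```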
