
Add queue per GPU to ensure sequential inference requests #18

Open
yondonfu opened this issue Jan 29, 2024 · 2 comments

Comments

@yondonfu (Member)

No description provided.

@ad-astra-video

I like this! Most easily accessible GPUs can likely only handle one job at a time, so perhaps a one-job queue per GPU? Could this tie into the worker callback reporting which step has completed, so the O would know it's, say, 75% done with the current job? The O could then allocate the next request to the GPU closest to finishing; see the sketch below.
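A minimal sketch of the idea, assuming an asyncio-based Python worker. The `GPUWorker` class, `dispatch` helper, and `progress` field are all hypothetical, not names from the actual codebase:

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class GPUWorker:
    """Hypothetical per-GPU worker: a 1-slot queue serializes inference
    requests, and step-level progress is exposed so an orchestrator could
    pick the GPU closest to finishing its current job."""
    gpu_id: int
    queue: asyncio.Queue = field(default_factory=lambda: asyncio.Queue(maxsize=1))
    progress: float = 0.0  # fraction of current job completed (0.0-1.0)

    async def run(self):
        while True:
            job = await self.queue.get()
            total_steps = job["steps"]
            for step in range(total_steps):
                await asyncio.sleep(0.01)  # stand-in for one inference step
                self.progress = (step + 1) / total_steps  # e.g. 0.75 = 75% done
            self.progress = 0.0  # reset once the job finishes
            self.queue.task_done()


async def dispatch(workers: list[GPUWorker], job: dict):
    # Prefer an idle GPU; otherwise pick the one closest to finishing.
    idle = [w for w in workers if w.queue.empty() and w.progress == 0.0]
    target = idle[0] if idle else max(workers, key=lambda w: w.progress)
    await target.queue.put(job)  # blocks while the 1-slot queue is full


async def main():
    workers = [GPUWorker(gpu_id=i) for i in range(2)]
    tasks = [asyncio.create_task(w.run()) for w in workers]
    for _ in range(4):
        await dispatch(workers, {"steps": 50})
    await asyncio.gather(*(w.queue.join() for w in workers))
    for t in tasks:
        t.cancel()


asyncio.run(main())
```

In practice the worker would presumably report progress back to the O over its existing callback mechanism rather than shared state, but the single-slot queue per GPU captures the serialization described above.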

@rickstaa (Contributor) commented May 8, 2024

Tracked internally in https://linear.app/livepeer-ai-spe/issue/LIV-318/.


3 participants