
Process text-to-image requested image count sequentially #66

Open · wants to merge 1 commit into main
Conversation

ad-astra-video

Updated text-to-image to process images sequentially when more than one is requested.

This provides more stable GPU memory usage, and I believe inference time is roughly linear in the number of images for text-to-image models. If a user wants faster inference, they can split the request into separate requests that would go to separate orchestrators.

This is a quick fix for issue #49.
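The sequential approach described above can be sketched as follows. This is an illustrative sketch, not the actual ai-worker code: `pipeline` stands in for a diffusers-style text-to-image callable, and the parameter names are assumptions.

```python
def generate_sequentially(pipeline, prompt, num_images, seeds):
    """Run the pipeline once per image instead of one batched call.

    Peak GPU memory stays at the single-image level, and total time
    grows roughly linearly with num_images.
    """
    images = []
    for i in range(num_images):
        # Each call produces exactly one image, keyed to its own seed.
        result = pipeline(prompt, num_images_per_prompt=1, seed=seeds[i])
        images.extend(result)
    return images
```

The trade-off is latency: a batched call can amortize per-call overhead, while the loop keeps memory bounded regardless of the requested count.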

Some questions:

  1. Should we set a sensible limit per request? The timeout appears to be 30s, so 15 seems like a good limit that would let most graphics cards respond within the timeout.
  2. Is there a check for when a set of images is requested but a non-matching number of seeds is provided? For example, a batch of 10 requested but only 5 seeds provided. I did not see where the seeds would be filled out to match the requested image count.
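Question 2 could be handled by padding the seed list to the requested image count. This is a hypothetical helper, not existing ai-worker code; the function name and seed range are illustrative only.

```python
import random

def fill_seeds(requested_images, seeds):
    """Pad (or truncate) the provided seed list to match the image count.

    Extra images get freshly drawn random seeds so every image is
    reproducible if the seeds are reported back to the caller.
    """
    seeds = list(seeds or [])[:requested_images]
    while len(seeds) < requested_images:
        seeds.append(random.randint(0, 2**32 - 1))
    return seeds
```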

@ad-astra-video
Author

Did some testing: once I get past 12 images, the first ones processed return ErrNotFound from the broadcaster. 15 images return 3 ErrNotFound, 13 return 1, and 20 return 8.

Is there a limit of 12 somewhere, or is this just my setup?

@ad-astra-video
Author

ad-astra-video commented Apr 21, 2024

> Did some testing: once I get past 12 images, the first ones processed return ErrNotFound from the broadcaster. 15 images return 3 ErrNotFound, 13 return 1, and 20 return 8.
>
> Is there a limit of 12 somewhere, or is this just my setup?

I think I found it: the MemoryDriver in go-tools has a cache length of 12.

The node uses a session-based cache setup here: https://github.com/livepeer/go-livepeer/blob/930533388e62d21b339cccd8d6f4348dc14c5e75/cmd/livepeer/starter/starter.go#L1170.

The memory driver is in the livepeer go-tools repo: https://github.com/livepeer/go-tools/blob/ac33a3c30a694a743f0077630aaf62423b5fd1f9/drivers/local.go#L17
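A fixed-length cache would explain the exact error counts observed above. The toy model below (Python for illustration; the real MemoryDriver is Go, and the eviction details here are assumptions) shows how storing more results than the cache holds drops the oldest entries, so fetching them later fails.

```python
from collections import OrderedDict

class FixedCache:
    """Toy model of a fixed-length result cache (the go-tools
    MemoryDriver keeps 12 entries); eviction policy is illustrative."""

    def __init__(self, max_len=12):
        self.max_len = max_len
        self.items = OrderedDict()

    def put(self, key, value):
        self.items[key] = value
        while len(self.items) > self.max_len:
            # Oldest entry is evicted; a later fetch of it fails.
            self.items.popitem(last=False)

    def get(self, key):
        if key not in self.items:
            raise KeyError("ErrNotFound")  # mirrors the broadcaster error
        return self.items[key]
```

Storing 15 results evicts the first 3, which matches "15 images returns 3 ErrNotFound" (and 13 → 1, 20 → 8).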

@eliteprox
Contributor

> 1. Should we set a sensible limit per request? The timeout appears to be 30s, so 15 seems like a good limit that would let most graphics cards respond within the timeout.

I think we should set a reasonable limit on the number of images sent to a given orchestrator and send the remainder to the next orchestrator in the pool, if one is available.

> 2. Is there a check for when a set of images is requested but a non-matching number of seeds is provided? For example, a batch of 10 requested but only 5 seeds provided. I did not see where the seeds would be filled out to match the requested image count.

I'll have to debug/learn the multi-file workflow first, but this line should generate a seed when one isn't provided: https://github.com/livepeer/ai-worker/pull/66/files#diff-2952f66b536acb78e9bb5ee0337a00485762b802438a3209508b3b0ee088212dR62
