
Enable configurable queue concurrency at project level #2022

Open · 1 task
christad92 opened this issue Apr 23, 2024 · 2 comments

christad92 commented Apr 23, 2024

Configurable concurrency is a mechanism that lets us restrict how many workflows/jobs the worker can run in parallel within a project. When concurrency is set to 1, only one work order/run/job can be processed at a time; everything else is enqueued until it completes.

How this worked in v1

OpenFn often will:
1. Get a bunch of CSV files from SFTP and convert them to JSON (the input data).
2. Post the JSON payloads to the project Inbox, in the order we want them processed (e.g. post Contacts data first, then the related Donations data, because Donations can't exist without Contacts).
3. OpenFn then processes all the Contacts messages one at a time before moving on to the Donations messages, working chronologically through the Inbox/Runs queue.

  • This is configurable in the project settings as an integer input
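To make that concrete, here is a minimal sketch of how such an integer setting might be stored. The `projects.concurrency` column, its default, and the "NULL means unlimited" convention are assumptions for illustration, not the existing schema:

```sql
-- Hypothetical migration: store the per-project concurrency dial as an integer.
-- NULL could mean "no limit"; 1 would reproduce the strictly-serial v1 behaviour.
ALTER TABLE projects
  ADD COLUMN concurrency integer
  CHECK (concurrency IS NULL OR concurrency >= 1);
```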

Questions:

  • What is the current behaviour? Do we allow multiple jobs to run concurrently?
@taylordowns2000 changed the title from "Enable configurable concurrency at project level" to "Enable configurable queue concurrency at project level" on Apr 24, 2024

taylordowns2000 commented Apr 24, 2024

I'd say we need two things here, almost certainly separate issues. Something around: https://github.com/OpenFn/lightning/blob/main/lib/lightning/runs/queue.ex#L12

  1. Artificially limit (or increase?) the number of runs being executed at once per project (a concurrency dial, just like v1). Something like...?
i'm the worker, give me runs to execute...
WHERE NOT IN runs.state == executing AND runs.project_id IN (
  some sort of list of projects?
)
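For illustration, a more explicit version of that pseudocode as a Postgres-style claim query. The table and column names (`runs.state = 'executing'`, `runs.project_id`, `runs.inserted_at`, and the hypothetical `projects.concurrency` column sketched above) are assumptions, not the actual query in queue.ex:

```sql
-- Claim the next available run, skipping projects that are already at their
-- configured concurrency limit. Names are illustrative.
SELECT r.id
FROM runs r
JOIN projects p ON p.id = r.project_id
WHERE r.state = 'available'
  AND (
    p.concurrency IS NULL
    OR (
      SELECT count(*) FROM runs x
      WHERE x.project_id = r.project_id AND x.state = 'executing'
    ) < p.concurrency
  )
ORDER BY r.inserted_at
LIMIT 1
FOR UPDATE OF r SKIP LOCKED;
```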
  2. Artificially add a per-project "cool-down" to stall after each run. Something like...?
i'm the worker, give me runs to execute...
WHERE NOT IN min(age(now(), runs.finished_at)) > '10 seconds' AND runs.project_id IN (
  some sort of list of projects?
)
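Again for illustration only, the cool-down idea as a Postgres-style query: a project is only eligible if none of its runs finished within the last 10 seconds. Column names are the same assumptions as above:

```sql
-- Skip any project whose most recent run finished less than 10 seconds ago.
SELECT r.id
FROM runs r
WHERE r.state = 'available'
  AND NOT EXISTS (
    SELECT 1 FROM runs x
    WHERE x.project_id = r.project_id
      AND x.finished_at > now() - interval '10 seconds'
  )
ORDER BY r.inserted_at
LIMIT 1
FOR UPDATE OF r SKIP LOCKED;
```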

Nice blog that @stuartc found - https://docs.hatchet.run/blog/multi-tenant-queues#introducing-concurrency-limits

And here's the UI from v1:

[Images: two screenshots of the v1 UI]

christad92 (Author) commented
Stu has yet to spike an approach to implementing this. cc: @stuartc
