Can't get run_at from ActiveJob #6196
Running jobs don't have access to any state except their arguments. They are designed to be very, very lightweight: `MyJob.new.perform(args...)`. If you give me a higher-level idea of what you are trying to accomplish, I can advise more.
Thanks, @mperham. We do some very heavy computations in Sidekiq jobs and have to do a lot of concurrency management that is pretty dynamic. It can't be as simple as "here is a uniqueness key for a job, don't run multiple of these"; instead it needs to be "I know the resources I am going to hit, and I need to see if anything is hitting many of those right now." We implemented something pretty good for this: it queries the Redis backend, looks at the WorkingSet entries and their string payloads, and counts the running jobs that are using different subsets of the resources in question. If there are too many, we reschedule the job we were considering starting. We use `run_at` to tie-break in the scenario where a stack of jobs all start at the same time (like after a restart), deciding which ones win.

We implemented a workaround that looks up the current job's `run_at` in the same list of running jobs using its `jid`. However, we found that the current job is often not in that result set. Looking at the code, it appears the WorkingSet data for each worker is only synced by the heartbeat method every 10 seconds, which undermines the plan above. Empirically, though, it seems to sync much more often than that. We are using Pro with `reliable_push` and `super_fetch`, if that affects it.

I know we could write extra code that pushes the whole payload of the request into our own Redis collection, but that means a bunch of extra Redis calls and more latency. Right now, we are considering calling …
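The tie-break described above can be sketched as a pure function. This is an assumption about how it might look, not code from the project: `wins_tie_break?` and the entry hashes are hypothetical, shaped loosely like the WorkingSet payloads (`jid`, `run_at` as epoch seconds). The job with the oldest `run_at` wins, with `jid` as a secondary key so every process reaches the same decision.

```ruby
# Hypothetical tie-break helper. `entries` is an array of hashes shaped
# like WorkingSet payloads: {"jid" => ..., "run_at" => epoch_seconds}.
# The entry with the oldest run_at wins; ties fall back to the lowest
# jid so the outcome is deterministic across competing processes.
def wins_tie_break?(entries, my_jid)
  winner = entries.min_by { |e| [e["run_at"], e["jid"]] }
  winner && winner["jid"] == my_jid
end

entries = [
  { "jid" => "aaa", "run_at" => 1_700_000_010 },
  { "jid" => "bbb", "run_at" => 1_700_000_005 },
  { "jid" => "ccc", "run_at" => 1_700_000_005 },
]

wins_tie_break?(entries, "bbb") # => true  (oldest run_at, lowest jid)
wins_tie_break?(entries, "aaa") # => false (newer run_at, so it reschedules)
```

A losing job would reschedule itself (e.g. with `perform_in`) rather than start work.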
This is getting weirder. We put a loop with a sleep in an `around_perform` on our jobs. In it, we grab the WorkingSet (via `Sidekiq::WorkSet.new`) and look for jobs that have one of our arguments. We are only checking one arg for simplicity. We sometimes have that loop run for 60 seconds and never see a match (while most of the time we do see matches). To head off the question: we do make a new WorkSet object each time around the loop, too. Any idea how that is possible?
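For illustration, the polling loop described above might look like the sketch below. The method name is hypothetical and the WorkSet lookup is injected as a block so the control flow is visible without a live Redis; the real code would build a fresh `Sidekiq::WorkSet` inside the block each iteration.

```ruby
# Sketch of the around_perform polling loop. Yields each iteration to
# fetch the current work entries (in real code: a fresh Sidekiq::WorkSet),
# then scans their payload strings for our jid. Returns the matching
# entry, or nil once `timeout` seconds have elapsed.
def wait_for_self_in_workset(jid, timeout: 60, interval: 1)
  deadline = Time.now + timeout
  while Time.now < deadline
    entries = yield # e.g. collect the `work` values from Sidekiq::WorkSet.new
    hit = entries.find { |e| e["payload"].to_s.include?(jid) }
    return hit if hit
    sleep interval
  end
  nil
end

# With a stubbed lookup the loop returns on the first pass:
found = wait_for_self_in_workset("abc123", timeout: 2, interval: 0) do
  [{ "payload" => %({"jid":"abc123"}) }]
end
found # => the stubbed entry containing "abc123"
```

The symptom in the comment above (60 seconds with no match) is consistent with the entries simply not being published yet, which the next reply explains.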
The WorkSet is updated asynchronously every N seconds. It's not real-time. https://github.com/sidekiq/sidekiq/wiki/API#workers
You won't be able to solve your issue with the API. It's not designed for that purpose. With Sidekiq 7+ you can throttle job execution from a specific queue with a Capsule. Sidekiq Enterprise provides various rate limiters which can map onto limited resources to gracefully handle time periods where you might be overloaded.
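A minimal sketch of the Capsule approach mentioned above, assuming Sidekiq 7+; the queue name and concurrency value here are placeholders, not anything from this thread:

```ruby
# config/initializers/sidekiq.rb (sketch, Sidekiq 7+).
# Jobs pushed to the "heavy" queue run on their own capsule with at
# most 2 worker threads, independent of the default capsule's
# concurrency, which caps how many touch the limited resource at once.
Sidekiq.configure_server do |config|
  config.capsule("heavy") do |cap|
    cap.concurrency = 2
    cap.queues = %w[heavy]
  end
end
```

This trades the hand-rolled WorkingSet scan for a static per-queue cap, which fits when the contended resource maps cleanly onto a queue.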
Hi!

This could be a doc bug, but I can't find a way to get at the Sidekiq payload (the one you see in the WorkSet for each job) from within the running ActiveJob. `run_at` is the field we need in particular, but it would be great to get the whole payload, not just the ActiveJob parts.