How to ensure ordering of jobs in case of delayed retries. #68
Comments
Thank you for your question. I am afraid that currently, when a job fails, the queue is not halted, so the other jobs waiting to be processed will be processed as soon as a worker is free.
It is critical, unfortunately. Our use case is a number of event queues for webhooks (each queue representing a customer's subscription), where we would like to submit events in the proper order. We see that in practice, webhooks sometimes fail (e.g. the customer's endpoint is temporarily unavailable) and need to be retried, but we can't have those events move to the back of the queue, because order matters. As a dummy example, imagine two events occurring in this order: the system goes offline, then the system comes back online.
If we sent these events in inverted order, the outcome on the customer's end would be completely wrong, since they would assume the system is offline and might cease communication with it.
Ok, so this function would be specific to groups, where a group would not continue processing new jobs until the previous one has either completed or failed; furthermore, this feature would only make sense with a concurrency of 1.
You're right. We're already using a concurrency of 1 extensively to enforce sequential processing, because there are a lot of cases for us that warrant that. Preserving order on retries is just one more flavor. If it's a bad fit for BullMQ, we could work around the issue with the following strategy, I guess:
This is absolutely feasible for us. We just figured that ordered processing (including retries with backoff delays) would be a common scenario, so we wanted to discuss this with you first 👍
It wouldn't be specific to groups though: we thought about creating a queue for each customer (rather than groups with a customer ID), which would reduce the complexity of the retries remarkably compared to queues that would still have to process events for other groups.
We are working on a solution for this in BullMQ and then we will extend it to groups as well. This is the PR: taskforcesh/bullmq#2465
You guys rock! Looking forward to the implementation :)
Just wondering how this is progressing. I'm processing ordered sports facts from third parties, and being able to block at the group level would be fantastic.
@Adam-Burke yes, we have this PR almost ready. The biggest issue I see is that the order still cannot be guaranteed as long as you have more than one worker: even though workers will pick jobs up in order, due to network latencies and such, it is possible that one worker will start processing a job before another worker that is running on a different machine or in a different process.
Could there be a way to ensure that jobs from the same group are always processed by the same worker (assuming it's still running)? That way you could still scale out workers but have group-based, at-least-once, ordered processing. Either way, I think it would still be quite useful for our purposes.
@Adam-Burke Let's see. If you used groups with a max concurrency of 1, then it is guaranteed that only 1 job will be processed per group at a time, so order is guaranteed within a group, except for the failure case with retries. So if we supported this case (keeping order within a group for retries), would that solve your use case? UPDATE: sorry for the confusion, now I see that this issue is exactly about this... so yes, basically we will support this case soon.
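For reference, here is a rough sketch of what per-customer ordering with a group concurrency of 1 could look like. It assumes BullMQ Pro's groups API; the queue name, connection settings, and the `deliverWebhook` helper below are placeholders, not part of this issue:

```ts
import { QueuePro, WorkerPro } from '@taskforcesh/bullmq-pro';

const connection = { host: 'localhost', port: 6379 };

// Hypothetical stand-in for the real webhook delivery (e.g. an HTTP POST to the customer endpoint).
async function deliverWebhook(data: unknown): Promise<void> {
  console.log('delivering webhook', data);
}

// One queue shared by all customers; each customer's events go into their own group.
const queue = new QueuePro('webhooks', { connection });

// Jobs added with the same group id are kept in FIFO order within that group.
await queue.add('event', { type: 'system.offline' }, { group: { id: 'customer-42' } });
await queue.add('event', { type: 'system.online' }, { group: { id: 'customer-42' } });

// With a group concurrency of 1, at most one job per group is active at a time,
// so order is preserved within a group (except, as discussed above, across retries).
const worker = new WorkerPro(
  'webhooks',
  async job => {
    await deliverWebhook(job.data);
  },
  { connection, group: { concurrency: 1 } },
);
```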
Hello,
for our current use case, we are adding jobs to a queue and would like them to be successfully processed in the same order they were added (FIFO).
So when a job fails, the desired outcome would be that the job is retried automatically after some delay before moving on to the next job on the queue.
For example, if we have these 3 jobs in our queue [1, 2, 3], with job 1 being the first added job, here is what a possible execution would look like: job 1 is processed and fails; after a delay, job 1 is retried and succeeds; then job 2 is processed, and then job 3.
How can we achieve this with bullmq?
Thank you in advance for the support!
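For context, a minimal sketch of the standard retry configuration in BullMQ (a worker with concurrency 1 plus per-job attempts and exponential backoff). As noted in the first reply above, a failed job's delayed retry does not halt the queue, so jobs 2 and 3 could still run before job 1's retry; keeping strict FIFO across retries is what the linked PR is meant to address. The queue name, connection settings, and the `processEvent` helper below are placeholders:

```ts
import { Queue, Worker } from 'bullmq';

const connection = { host: 'localhost', port: 6379 };

// Hypothetical stand-in for the actual event processing logic.
async function processEvent(data: unknown): Promise<void> {
  console.log('processing event', data);
}

const queue = new Queue('events', { connection });

// Each job is retried up to 3 times, with an exponential backoff starting at 5 seconds.
const retryOpts = { attempts: 3, backoff: { type: 'exponential', delay: 5000 } };
await queue.add('event', { id: 1 }, retryOpts);
await queue.add('event', { id: 2 }, retryOpts);
await queue.add('event', { id: 3 }, retryOpts);

// Concurrency 1 means only one job is processed at a time, but when a job fails it is
// moved to the delayed set for its backoff period, and the next waiting job is picked up.
const worker = new Worker(
  'events',
  async job => {
    await processEvent(job.data);
  },
  { connection, concurrency: 1 },
);
```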