[Bug]: Excessive Redis Connections and waitUntilFinished Timeout in High-Load Environments #2749
Comments
Any reaction?
@manast I am facing the same issue. I am developing a real-time AI and image conversion tool. The user use I think the
@daimalou it may be useful, but it is not the proper way to use queues. As I see it, you would be better off just spawning a NodeJS worker thread, running the job, and waiting for it to complete, rather than using a queue.
@daimalou btw, SSE is used for sending data from the server to the client, so I am not sure what you mean by using it to send images to the server :/. In any case, whether you use SSE or web sockets does not matter: you can easily tell the client when a job has completed without relying on
@manast yes, my description was a bit unclear. I use https://github.com/Azure/fetch-event-source, which can POST some data to the server and receive the server's response as SSE.
Version
v5.12.12
Platform
NodeJS
What happened?
Environment:
Kubernetes: k3s with 3 nodes
Redis: Sentinel configuration with 3 nodes
Description:
Our system is designed to create a large number of queues. We noticed that while it's possible to pass an already established Redis connection for creating Queue, Job, and Worker instances, the QueueEvents class duplicates the passed connection using duplicate().
During our load tests we ran between 100 and 600 queues, and each queue created jobs that returned results. Under low load, waitUntilFinished worked as expected and returned the job result. Under high load, however, waitUntilFinished did not return and hung until the TTL expired.
To debug this, I implemented a parallel polling mechanism that ran the scripts.isFinished script, which indicated that the job had indeed reached the completed status and I could retrieve its result, even though waitUntilFinished remained stuck.
Additionally, the excessive number of connections created by QueueEvents due to the duplicate() method led to us exceeding the maximum number of Redis connections, especially under high concurrency.
Workaround:
As a temporary workaround, I replaced waitUntilFinished with periodic execution of scripts.isFinished to check job completion.
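A minimal sketch of the polling approach, for illustration only. `pollUntilFinished`, `PollResult`, and the default interval and timeout are hypothetical names and values, not part of BullMQ; the `check` callback stands in for whatever status lookup is used (scripts.isFinished in our case), so the sketch runs without Redis.

```typescript
// Result of one status check. "pending" means the job has not finished yet.
type PollResult<T> =
  | { status: "pending" }
  | { status: "completed"; value: T }
  | { status: "failed"; reason: string };

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical helper (not a BullMQ API): calls `check` repeatedly until the
// job is reported as completed or failed, or until the timeout would be exceeded.
async function pollUntilFinished<T>(
  check: () => Promise<PollResult<T>>,
  intervalMs = 500, // assumed default, tune for your load
  timeoutMs = 30_000, // assumed default, plays the role of the TTL
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const result = await check();
    if (result.status === "completed") return result.value;
    if (result.status === "failed") throw new Error(result.reason);
    // Give up if the next sleep would take us past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`job did not finish within ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
}
```

With BullMQ, `check` could for example re-fetch the job with `Job.fromId` and map the value of `getState()` onto these three statuses. That wiring is left out here because it needs a live Redis. Polling trades some latency and extra Redis round trips for not needing a QueueEvents connection per queue.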
Issue Summary:
waitUntilFinished does not return in high-load scenarios, even when the job is completed according to scripts.isFinished.
QueueEvents creates redundant connections by duplicating the Redis connection, which leads to exceeding the maximum number of connections under high concurrency.
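To put rough numbers on the connection growth, here is a back-of-the-envelope sketch. It is a simplified model and not a measurement: it assumes Queue, Job, and Worker instances all reuse the shared connection(s) and that each QueueEvents instance adds exactly one duplicated connection. It ignores any other connections the library may open internally. `estimateConnections`, `exceedsLimit`, the process count, and the limit of 1000 are hypothetical.

```typescript
// Hypothetical model: shared connections plus one duplicate() per QueueEvents,
// with one QueueEvents instance per queue.
function estimateConnections(queueCount: number, sharedConnections = 1): number {
  return sharedConnections + queueCount;
}

// True if `processes` identical application processes together would need
// more connections than the Redis server allows (`maxClients`).
function exceedsLimit(queueCount: number, maxClients: number, processes = 1): boolean {
  return processes * estimateConnections(queueCount) > maxClients;
}

// Upper end of our load test: 600 queues in one process.
const perProcess = estimateConnections(600); // 1 + 600 = 601

// Assumed deployment: 3 processes against a Redis limit of 1000 clients.
const overLimit = exceedsLimit(600, 1000, 3); // 3 * 601 = 1803 > 1000
```

Under this model the connection count grows linearly with the number of queues, so the duplicated QueueEvents connections dominate as soon as there are more than a handful of queues.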
How to reproduce.
Relevant log output