back pressure #118
I used zmq to dispatch events to separate Node worker processes. Pseudo code below: BROKER.js sits in your trigger and farms work out to worker processes.
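A minimal sketch of that pattern, assuming the classic zeromq v5 push/pull socket API; the port, the payload shape, and the handleEvent() worker function are illustrative placeholders:

// BROKER.js - runs next to the binlog trigger and pushes each event out.
const zmq = require('zeromq');

const push = zmq.socket('push');
push.bindSync('tcp://127.0.0.1:3000'); // port is an arbitrary placeholder

function dispatch(event) {
  // Connected pull workers receive messages round-robin, so work spreads across processes.
  push.send(JSON.stringify(event));
}

// WORKER.js - run one copy per core / machine.
const pull = zmq.socket('pull');
pull.connect('tcp://127.0.0.1:3000');
pull.on('message', (msg) => {
  handleEvent(JSON.parse(msg)); // handleEvent() is a hypothetical processing function
});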
That is interesting, but it just moves the problem to the ZeroMQ send buffer.
I have some ideas up my sleeve, yet to explore. Would getting the code working on more cores / more machines help? What number of transactions per second are you seeing when things go pear-shaped?

I have in the backlog a plan to get the code to support active/active simultaneous connections, round-robin style, with no single point of failure, i.e. have one machine process the odd event / SQL ids and the other machine process the rest. zmq does this out of the box.

The underlying design of this is based on replication, so when the core triggers start banking up we would know via SECONDS_BEHIND_MASTER, right? Could this be used to help back off? (YET TO TRY THIS; a sketch follows the snippet below.)

const instance = new MySQLEvents(connection, {
  startAtEnd: true, // WHAT IF WE SET THIS TO FALSE AND REPLAY THE ENTIRE TRANSACTION HISTORY???
  excludedSchemas: {
    mysql: true,
  },
});
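A minimal sketch of that back-off check, assuming the mysql2/promise client; the connection details, the 5-second threshold, and the pause()/resume() hooks passed in are hypothetical placeholders:

// lag-backoff.js - poll the replica's SHOW SLAVE STATUS and back off when it lags.
const mysql = require('mysql2/promise');

const MAX_LAG_SECONDS = 5; // arbitrary threshold, tune for your workload

async function watchReplicationLag(pause, resume) {
  const conn = await mysql.createConnection({ host: 'replica-host', user: 'monitor', password: 'secret' });
  setInterval(async () => {
    const [rows] = await conn.query('SHOW SLAVE STATUS');
    const lag = rows[0] ? Number(rows[0].Seconds_Behind_Master) : 0;
    if (lag > MAX_LAG_SECONDS) {
      pause();  // e.g. stop reading / dispatching binlog events for a while
    } else {
      resume(); // replica has caught up, carry on
    }
  }, 1000);
}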
Here we could mark the events / transactions that we've processed, then on subsequent starts skip all of those.

ZMQ has a few more options to try and introduces the concept of a high water mark - https://stackoverflow.com/questions/41941702/one-zeromq-socket-per-thread-or-per-call. Once the high water mark is reached, ZMQ will start dropping inbound messages (I get that this may be unacceptable for your use case).

UPDATE: did some more digging into backpressure / streams. It seems like @bllevy partially solved this problem with Node.js streams for MySQL -> Google BigQuery: https://github.com/bllevy/mysql_to_gbq/blob/master/config.js#L86. Found these Node.js libraries; perhaps something off the shelf can be used.

UPDATE: https://aws.amazon.com/kinesis/data-streams/faqs/

UPDATE: this library has a pause and resume (and stop). @stephen-dahl - I'm looking to use this in conjunction with the last processed id on startup.

UPDATE: I got the resume functionality working using DynamoDB (I save the last processed row id once I send it off for processing, set playback to start from the beginning, and drop everything with an id less than that). I abandoned zmq for AWS SQS, because of high availability across boxes: SQS simplified a heap of code and no sockets are needed. SQS dispatches work to any compute instance that's ready for work, it can be elastically scaled, there's a health check from the ELB (Elastic Load Balancer), and it offers active/passive failover (as opposed to being a single point of failure).
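A minimal sketch of that checkpoint-and-dispatch idea, assuming the aws-sdk v2 DynamoDB DocumentClient and SQS clients; the table name, queue URL, stream key, and rowId field are illustrative placeholders rather than the exact setup described above:

// checkpoint-dispatch.js - skip already-processed events on startup, then hand new
// ones to SQS so any worker box can pick them up.
const AWS = require('aws-sdk');

const ddb = new AWS.DynamoDB.DocumentClient({ region: 'us-east-1' });
const sqs = new AWS.SQS({ region: 'us-east-1' });

const CHECKPOINT_TABLE = 'binlog-checkpoints'; // assumed table keyed by { stream }
const QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/123456789012/binlog-events'; // placeholder

async function loadLastProcessedId() {
  const res = await ddb.get({ TableName: CHECKPOINT_TABLE, Key: { stream: 'orders' } }).promise();
  return res.Item ? res.Item.lastId : 0;
}

async function dispatch(event, lastId) {
  if (event.rowId <= lastId) return lastId; // already handled on a previous run - skip it
  await sqs.sendMessage({ QueueUrl: QUEUE_URL, MessageBody: JSON.stringify(event) }).promise();
  await ddb.put({ TableName: CHECKPOINT_TABLE, Item: { stream: 'orders', lastId: event.rowId } }).promise();
  return event.rowId;
}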
zongji is overwhelming downstream processes with too much data too fast. I need a way to slow it down until my processor is ready for more. Can this implement a streams interface with back pressure?
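One shape such an interface could take, as a minimal sketch: wrap the binlog source in a Node Readable object stream, so push() returning false becomes the signal to throttle the source. The source.pause() / source.resume() hooks are hypothetical; zongji itself exposes start() and stop(), so a real adapter would have to map onto those, and the 'binlog' event name here simply mirrors its emitter.

// binlog-stream.js - expose binlog events as a Readable stream with backpressure.
const { Readable } = require('stream');

function createBinlogStream(source) {
  const stream = new Readable({
    objectMode: true,
    read() {
      // The consumer has drained its buffer and wants more: let the source emit again.
      source.resume(); // hypothetical hook
    },
  });

  source.on('binlog', (evt) => {
    // push() returns false when the stream's internal buffer is full;
    // throttle the source until read() is called again.
    if (!stream.push(evt)) {
      source.pause(); // hypothetical hook
    }
  });

  return stream;
}

// Usage: pipe into any Writable that processes events at its own pace.
// createBinlogStream(binlogSource).pipe(slowProcessor);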