feat: a general commit back framework for source #16736
non-persistent Pulsar is not the same as 0 retention. With 0 retention, data does persist, but it will not be kept after it has been acked by all subscriptions. In our case, we should ack a message only after it has been committed. More details in https://pulsar.apache.org/docs/next/cookbooks-retention-expiry/
Thanks for raising the questions. First, I already have a WIP PR #16733 if you want to preview the implementation details I have in mind. For question 1:
I think @fuyufjh answered this:
Do you mean heavier on the RW side or the MQ side? If the former, I think it's handled by backpressure, so no worries. Not sure about the effect of a long ack timeout on the MQ's side, though.
What does this mean? For question 2: We don't need to persist the ack ids. They are just kept in memory. I think it's OK to lose them and let the upstream resend the unacked messages.
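To illustrate the "just kept in memory" point, here is a minimal sketch of how such a buffer could look. `Epoch`, `AckId`, and the `Acker` trait are hypothetical illustration names, not actual RisingWave or MQ APIs: ack ids are staged per epoch and only flushed to the MQ once that epoch's barrier commits.

```rust
use std::collections::BTreeMap;

type Epoch = u64;
type AckId = String;

/// Hypothetical handle for sending acks back to the message queue.
trait Acker {
    fn ack(&self, ids: &[AckId]);
}

#[derive(Default)]
struct AckBuffer {
    /// Ack ids grouped by the epoch in which their messages were read.
    pending: BTreeMap<Epoch, Vec<AckId>>,
}

impl AckBuffer {
    /// Remember an ack id in memory; nothing is sent to the MQ yet.
    fn stage(&mut self, epoch: Epoch, id: AckId) {
        self.pending.entry(epoch).or_default().push(id);
    }

    /// Once `epoch` is durably committed, ack every message read in it
    /// or in any earlier epoch. If the node crashes before this point,
    /// the buffer is lost and the MQ redelivers the unacked messages.
    fn commit(&mut self, epoch: Epoch, acker: &impl Acker) {
        let not_yet_committed = self.pending.split_off(&(epoch + 1));
        for ids in self.pending.values() {
            acker.ack(ids);
        }
        self.pending = not_yet_committed;
    }
}
```

A crash before `commit` simply drops the map, and the MQ's redelivery of unacked messages covers recovery, matching the at-least-once behavior described above.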
Got your concern. Indeed, we must be careful when using a source with an ack timeout; otherwise, if it times out before our epoch commits, the message will be sent again in the next epoch, effectively causing a livelock. For the case of Google Pub/Sub, they provide the modifyAckDeadline API to push back the deadline. I suppose most Pub/Sub systems that don't rely on offsets must have similar functions to provide exactly-once delivery.
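To make the modifyAckDeadline idea concrete, a sketch of a deadline keep-alive loop is below. `DeadlineClient` and `keep_alive` are hypothetical names, not the actual Google client API, though the official clients expose an equivalent of the modifyAckDeadline RPC:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::Duration;

/// Hypothetical thin wrapper over a Pub/Sub-style subscriber client.
trait DeadlineClient {
    fn modify_ack_deadline(&self, ack_ids: &[String], extension: Duration);
}

/// Keep the messages' deadline ahead of the barrier: extend it well
/// before it expires (e.g. every 20s for a 60s deadline) until the
/// commit path flips `committed`, at which point the normal ack runs.
fn keep_alive(client: &impl DeadlineClient, ack_ids: &[String], committed: &AtomicBool) {
    while !committed.load(Ordering::Acquire) {
        client.modify_ack_deadline(ack_ids, Duration::from_secs(60));
        std::thread::sleep(Duration::from_secs(20));
    }
}
```

This way a slow barrier never lets the deadline lapse mid-epoch, which is what would otherwise trigger the redelivery livelock described above.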
A log store would introduce huge complexity. Please just ignore this extreme case for now; a 30-min barrier will cause way more problems than this.
BTW, Pulsar supports log semantics and cumulative ack, so we just need to commit the latest offset, like CDC (see the sketch after this comment). Also, it seems currently we don't …
It will clean data unless …
Note that TTL (automatic ack) is different from retention. To further clarify, there are actually 2 issues related to ack:
P.S. Pub/Sub's retention duration is not related to ack.
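For the cumulative-ack / log-offset case mentioned above, the per-message buffer degenerates into a single high-water mark. A minimal sketch, with `Offset`, `OffsetCommitter`, and `HighWaterMark` as hypothetical illustration names rather than any real API:

```rust
type Offset = u64;

/// Hypothetical handle that commits a single cumulative position
/// (Pulsar's cumulative ack, or a CDC/Kafka-style offset commit).
trait OffsetCommitter {
    fn commit_up_to(&self, offset: Offset);
}

#[derive(Default)]
struct HighWaterMark {
    latest: Option<Offset>,
}

impl HighWaterMark {
    /// Only the maximum offset matters; acking it implies all earlier ones.
    fn observe(&mut self, offset: Offset) {
        self.latest = Some(self.latest.map_or(offset, |o| o.max(offset)));
    }

    /// On epoch commit, acknowledge everything up to the latest offset.
    fn commit(&mut self, committer: &impl OffsetCommitter) {
        if let Some(offset) = self.latest.take() {
            committer.commit_up_to(offset);
        }
    }
}
```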
For reference, some facts about ack deadlines. Flink recommends:
Pub/Sub acknowledgement deadline options:
https://cloud.google.com/pubsub/docs/subscription-properties#ack_deadline
Thanks for answering the questions raised above.
Take the example above: a 30-minute MQ-side timeout, and a barrier latency that is now 35 minutes. RW received all the data, but the MQ thought the messages got lost, so RW will receive the same data one more time in the future. And there is a chance of losing data when the barrier latency remains at a high level: messages that fail delivery multiple times can go into the dead letter queue and will not be delivered again (this is configurable, just FYI: https://cloud.google.com/pubsub/docs/handling-failures).
Hi all, I am revisiting the issue of implementing a general ack mechanism in source; thanks @xxchan for mentioning this.
My major questions are:
@yuhao-su can you share more about the case? IIRC, a non-persistent Pulsar topic cannot seek to a specific position, and I think that is expected.
Originally posted by @tabVersion in #15464 (comment)