-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix sequence change race condition #14805
Fix sequence change race condition #14805
Conversation
merge new code
@panhongan Thank you for PR, can you please elaborate the issue ? |
Sync diff from master
sync apache/druid master
sync panhongan/druid master
Is it same as #14995? cc @AmatyaAvadhanula |
Yes, that's the same issue. But there was race condition, need to safely handle that. |
...rc/test/java/org/apache/druid/indexing/seekablestream/SeekableStreamIndexTaskRunnerTest.java
Fixed
Show fixed
Hide fixed
...rc/test/java/org/apache/druid/indexing/seekablestream/SeekableStreamIndexTaskRunnerTest.java
Fixed
Show fixed
Hide fixed
...rc/test/java/org/apache/druid/indexing/seekablestream/SeekableStreamIndexTaskRunnerTest.java
Fixed
Show fixed
Hide fixed
...rc/test/java/org/apache/druid/indexing/seekablestream/SeekableStreamIndexTaskRunnerTest.java
Fixed
Show fixed
Hide fixed
...rc/test/java/org/apache/druid/indexing/seekablestream/SeekableStreamIndexTaskRunnerTest.java
Fixed
Show fixed
Hide fixed
@pranavbhole @abhishekagarwal87 @kfaraz Can you help review and merge? |
Hi @panhongan. It seems like the new tests pass with or without the changes. Could you please try adding more deterministic tests if possible? |
…on' into fix-sequence-change-race-condition
@AmatyaAvadhanula @abhishekagarwal87 I improved the unit test. Also I see the #14995 was merged to fix the same issue. From my side, |
@panhongan , do you still see the issue happening after #14995 was merged? |
This pull request has been marked as stale due to 60 days of inactivity. |
Fixes #XXXX.
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner#maybePersistAndPublishSequences
Description
Exception:
org.apache.druid.java.util.common.ISE: Attempting to publish with empty segment set, but total row count was not 0: [257963].
Context:
intermediateHandoffPeriod: PT20M
Case Analysis:
18:20 ingestion task started with sequence:
===================================
18:40 1st hand-off succeed with sequence:
2023-08-06T18:40:26,525 INFO [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver - Pushing [2] segments in background
2023-08-06T18:40:26,526 INFO [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver - Pushing segments: [
datasource.1_2023-08-06T18:15:00.000Z_2023-08-06T18:30:00.000Z_2023-08-06T17:27:23.279Z_91,
datasource.1_2023-08-06T18:30:00.000Z_2023-08-06T18:45:00.000Z_2023-08-06T17:17:58.416Z_12
]
===========================
19:00 2nd hand-off succeed with sequence:
2023-08-06T19:00:26,069 INFO [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver - Pushing [3] segments in background
2023-08-06T19:00:26,070 INFO [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver - Pushing segments: [
datasource.1_2023-08-06T18:30:00.000Z_2023-08-06T18:45:00.000Z_2023-08-06T17:17:58.416Z_54,
datasource.1_2023-08-06T18:45:00.000Z_2023-08-06T19:00:00.000Z_2023-08-06T17:25:18.834Z_38,
datasource.1_2023-08-06T19:00:00.000Z_2023-08-06T19:15:00.000Z_2023-08-06T18:20:49.142Z_39
]
==============================
Go on running with 3rd sequence:
2023-08-06T19:02:58,507 INFO [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver - Pushing [0] segments in background
2023-08-06T19:02:58,507 INFO [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Persisted rows[0] and (estimated) bytes[0]
2023-08-06T19:02:58,509 Error while publishing segments for sequenceNumber
[
SequenceMetadata{
sequenceId=1, sequenceName='datasource.1_ec6e2d1b6806983_1',
assignments=[],
startOffsets={162=652421674, 2=1236043200, 82=653189675, 202=656183403, 42=1231829967, 122=660629209},
exclusiveStartPartitions=[],
endOffsets={162=652734796, 2=1236359960, 82=653501954, 202=656497034, 42=1232144385, 122=660942851},
sentinel=false, checkpointed=true
}
]
org.apache.druid.java.util.common.ISE: Attempting to publish with empty segment set, but total row count was not 0: [257963].
Question:
Related Source Code:
By checking the source code, found that there is an race condition issue:
Code Block A:maybePersistAndPublishSequences()
Code Block B:After pushing segment metadata succeed
Fixed the bug ...
Renamed the class ...
Added a forbidden-apis entry ...
Release note
Key changed/added classes in this PR
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner#maybePersistAndPublishSequences
This PR has: