chaos mesh daily test: ch-benchmark-pg-cdc data verification failed #15312
Comments
There is a null pointer issue in the log, which has been fixed in release-1.8, so I suggest we rerun the pipeline with a new image. I also notice the pipeline is using the dedicated source, which is problematic in the chaos-mesh test, as explained in #15141 (comment). Please refactor the pipeline to ensure the historical data is empty, or use the shared source instead. cc @cyliu0
The test failed with the shared source.
I found the same error logs as in #15141.
Hi @xuefengze, please rerun the test with |
https://buildkite.com/risingwave-test/chaos-mesh/builds/730#018eb902-b9e8-44f1-83a4-058cc8232d99 |
This time there are more rows in the RW tables:
I also ran against the nightly image (750), and the symptom is the same: RW has more rows than upstream PG.
During a previous refactor of the source parser, I found a possible cause of CDC offset rewind, which according to @StrikeW may be a cause of this issue. The buggy case is this: if a transaction always begins in the middle of one source message batch and ends in the next batch, we always yield stream chunks immediately when the transaction is committed. Heartbeat messages that update the offset can then be delayed indefinitely. After a series of such transactions, the earliest heartbeat message may finally be emitted, but its offset may already be invalid by then. I created PR #16608 to fix this bug. Let's hope it fixes this issue 🙏
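For readers following along, here is a toy Python model of the failure mode described above. It is only a sketch and not RisingWave's actual parser: the `Msg` type, the `parse_batches` function, and the rule "heartbeats are flushed at the end of a batch only when no transaction is open" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Msg:
    kind: str    # "begin", "row", "commit", or "heartbeat" (hypothetical message kinds)
    offset: int  # upstream offset carried by the message


def parse_batches(batches):
    """Toy parser. Returns emitted events as ("chunk", offset) or ("heartbeat", offset)."""
    emitted = []
    in_txn = False
    rows = []                # offsets of rows buffered for the open transaction
    pending_heartbeats = []  # heartbeat offsets waiting to be emitted

    for batch in batches:
        for msg in batch:
            if msg.kind == "heartbeat":
                pending_heartbeats.append(msg.offset)
            elif msg.kind == "begin":
                in_txn = True
            elif msg.kind == "row":
                rows.append(msg.offset)
            elif msg.kind == "commit":
                in_txn = False
                # Chunk is yielded immediately on commit, in the middle of the batch.
                emitted.append(("chunk", rows[-1] if rows else msg.offset))
                rows = []
        # Assumed rule: heartbeats are only flushed at a batch boundary
        # when no transaction is still open.
        if not in_txn:
            emitted.extend(("heartbeat", off) for off in pending_heartbeats)
            pending_heartbeats = []
    return emitted


# Every batch except the last ends with an open transaction.
batches = [
    [Msg("heartbeat", 1), Msg("begin", 2), Msg("row", 3)],
    [Msg("commit", 4), Msg("heartbeat", 5), Msg("begin", 6), Msg("row", 7)],
    [Msg("commit", 8), Msg("heartbeat", 9), Msg("begin", 10), Msg("row", 11)],
    [Msg("commit", 12), Msg("heartbeat", 13)],
]
for event in parse_batches(batches):
    print(event)
```

In this model the heartbeat with offset 1 is only emitted after the chunk at offset 11, so applying it as the latest offset would move the recorded offset backwards.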
I am building an image for the branch. Let's see whether it fixes the issue.
I ran the chaos mesh test against the PR, but it fails in q3, whereas the nightly image passes the test.
I think we should take a look before merging the PR.
@StrikeW Is the issue fixed? |
The chaos mesh pipeline hasn't failed for a long time. I think we can close this issue and open new ones if the pipeline fails again.
Chaos-mesh test failed (ch-benchmark-pg-cdc). The experiment made the meta unavailable for 20 seconds.
Data verification fails: