feat(metrics): add cdc consume lag metrics #13877

Merged: 11 commits, Dec 11, 2023

Conversation

StrikeW
Contributor

@StrikeW StrikeW commented Dec 8, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

The payload.source.ts_ms field in a Debezium message is the timestamp at which the change was made in the database. We use this timestamp to compute the event lag as seen by the source executor, i.e. how far behind the upstream change the executor is when it receives the event.
https://debezium.io/documentation/reference/stable/connectors/mysql.html#mysql-create-events
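A minimal sketch of the lag computation described above, not the code in this PR. The function name `cdc_lag_ms` and the clamping to zero are assumptions made for illustration; the only input taken from the description is the upstream timestamp in milliseconds.

```rust
/// Lag in milliseconds between the upstream change (`source_ts_ms`, taken
/// from payload.source.ts_ms) and the moment the event is observed.
/// Assumption: the result is clamped at zero so that clock skew between
/// hosts never yields a negative observation.
fn cdc_lag_ms(source_ts_ms: i64, observed_at_ms: i64) -> i64 {
    observed_at_ms.saturating_sub(source_ts_ms).max(0)
}

fn main() {
    // Hypothetical event: changed upstream at t = 1_000 ms, seen at t = 2_500 ms.
    let lag = cdc_lag_ms(1_000, 2_500);
    println!("lag = {} ms", lag);

    // An observer whose clock is behind the database reports zero lag.
    println!("skewed lag = {} ms", cdc_lag_ms(2_500, 1_000));
}
```

Each computed value would then be recorded as one observation in the lag histogram.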

close #13440

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features; see "Sqlsmith: Sql feature generation", #7934.)
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

codecov bot commented Dec 8, 2023

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (f1e103f) 68.34% compared to head (1b9e109) 68.33%.
Report is 9 commits behind head on main.

Files Patch % Lines
src/connector/src/source/cdc/source/message.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #13877      +/-   ##
==========================================
- Coverage   68.34%   68.33%   -0.01%     
==========================================
  Files        1528     1528              
  Lines      263303   263313      +10     
==========================================
- Hits       179947   179936      -11     
- Misses      83356    83377      +21     
Flag Coverage Δ
rust 68.33% <90.00%> (-0.01%) ⬇️

Collaborator

@hzxa21 hzxa21 left a comment

Rest LGTM. Thanks for the PR!

long sourceTsMs =
sourceStruct == null
? System.currentTimeMillis()
: sourceStruct.getInt64("ts_ms");
Collaborator

Dumb question: is ts_ms guaranteed to be a UTC-based timestamp?

Contributor Author

I think so, since timestamps in MySQL and PG are stored as UTC.
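For comparison, here is a rough Rust rendering of the fallback in the Java snippet above: use the upstream timestamp when the source struct is present, otherwise the current time. The names `now_ms` and `source_ts_or_now` are hypothetical. Milliseconds since the Unix epoch are defined relative to UTC, so a value obtained this way does not depend on the local time zone.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Current wall-clock time as milliseconds since the Unix epoch (UTC).
fn now_ms() -> i64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is set before the Unix epoch")
        .as_millis() as i64
}

/// Returns the upstream timestamp when the event carries one, and falls back
/// to the current time otherwise (mirroring the `sourceStruct == null` branch).
fn source_ts_or_now(source_ts_ms: Option<i64>) -> i64 {
    source_ts_ms.unwrap_or_else(now_ms)
}

fn main() {
    println!("with source:    {}", source_ts_or_now(Some(1_702_000_000_000)));
    println!("without source: {}", source_ts_or_now(None));
}
```

With the fallback, an event that has no source struct reports a lag close to zero rather than being dropped from the metric.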

@@ -114,13 +115,19 @@ var record = event.value();
committer.markProcessed(event);
continue;
}
// get upstream event time from the "source" field
var sourceStruct = ((Struct) record.value()).getStruct("source");
Collaborator

Will there be any difference between payload.source.ts_ms and payload.ts_ms for direct CDC?

Contributor Author

payload.ts_ms is the time at which the connector processed the event, i.e. the connector's processing time, whereas payload.source.ts_ms is the time of the change in the database.
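To make the distinction concrete, the sketch below keeps both timestamps side by side. The struct `EventTimestamps` and its methods are hypothetical names used only for illustration, and the sample values are made up.

```rust
/// Hypothetical holder for the two timestamps carried by a Debezium event.
#[derive(Debug, Clone, Copy)]
struct EventTimestamps {
    /// payload.source.ts_ms: when the change was made in the database.
    source_ts_ms: i64,
    /// payload.ts_ms: when the connector processed the event.
    connector_ts_ms: i64,
}

impl EventTimestamps {
    /// Time between the database change and the connector processing it.
    fn capture_delay_ms(&self) -> i64 {
        self.connector_ts_ms - self.source_ts_ms
    }

    /// Lag relative to the database change for a consumer that receives
    /// the event at `received_at_ms`.
    fn end_to_end_lag_ms(&self, received_at_ms: i64) -> i64 {
        received_at_ms - self.source_ts_ms
    }
}

fn main() {
    let ts = EventTimestamps {
        source_ts_ms: 1_702_000_000_000,
        connector_ts_ms: 1_702_000_000_040,
    };
    let received_at_ms = 1_702_000_000_100;
    println!("capture delay:  {} ms", ts.capture_delay_ms());
    println!("end-to-end lag: {} ms", ts.end_to_end_lag_ms(received_at_ms));
}
```

Measuring from source_ts_ms includes the capture delay in the reported lag; measuring from the connector's own ts_ms would leave it out.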

let opts = histogram_opts!(
"source_cdc_event_lag_duration_milliseconds",
"source_cdc_lag_latency",
exponential_buckets(1.0, 2.0, 20).unwrap(), // max 1048s
Collaborator

nit: the max bucket will be 1 * 2^(20-1) ms = 524288 ms ~= 524s, since the 20 buckets use exponents 0 through 19.

Contributor Author

Thanks for pointing that out.
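The bucket arithmetic in this thread can be checked with a small std-only sketch. The function below is a re-implementation written for illustration, assuming the usual semantics of exponential buckets (upper bounds start, start * factor, start * factor^2, ... for `count` buckets); it is not the helper called in the diff.

```rust
/// Upper bounds of `count` exponential buckets: start, start * factor, ...
/// Illustrative re-implementation, not the library helper used in the PR.
fn exponential_buckets(start: f64, factor: f64, count: usize) -> Vec<f64> {
    let mut bounds = Vec::with_capacity(count);
    let mut next = start;
    for _ in 0..count {
        bounds.push(next);
        next *= factor;
    }
    bounds
}

fn main() {
    // Same arguments as in the diff: start = 1 ms, factor = 2, 20 buckets.
    let buckets = exponential_buckets(1.0, 2.0, 20);
    let max_ms = buckets.last().copied().unwrap_or(0.0);
    println!(
        "{} buckets, largest bound = {} ms (~{:.0} s)",
        buckets.len(),
        max_ms,
        max_ms / 1000.0
    );
}
```

The largest finite bound is 2^19 = 524288 ms; reaching about 1048 s (2^20 ms) would take 21 buckets.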

@StrikeW StrikeW enabled auto-merge December 11, 2023 02:54
@StrikeW StrikeW added this pull request to the merge queue Dec 11, 2023
Merged via the queue into main with commit ddbd74b Dec 11, 2023
28 of 29 checks passed
@StrikeW StrikeW deleted the siyuan/cdc-lag-metrics branch December 11, 2023 03:58
Development

Successfully merging this pull request may close these issues.

use ts_ms to determine the lag between the source database update and Debezium
2 participants