feat(metrics): add internal latency of actors, MVs and sinks #19639
Changes from all commits
e460b6a
d0fbe4d
997e0c5
353a0d2
28f4dca
3e214fb
330a18a
a426ba7
@@ -744,6 +744,26 @@ def section_streaming(outer_panels):
                        ),
                    ],
                ),
                panels.timeseries_latency(
                    "Latency of Materialize Views & Sinks",
Review comment: This seems to be easier to interpret than the epoch one. Should we add it to dev dashboard also? (Actually I've never checked user dashboard before... 🤪)
Reply: Dev dashboard shows epoch instead of latency. The latency is actually calculated by
                    "The current epoch that the Materialize Executors or Sink Executor are processing. If an MV/Sink's epoch is far behind the others, "
                    "it's very likely to be the performance bottleneck",
                    [
                        panels.target(
                            # Here we use `max`, but it makes little difference: any of the sampled `current_epoch` values makes sense.
                            f"max(timestamp({metric('stream_mview_current_epoch')}) - {epoch_to_unix_millis(metric('stream_mview_current_epoch'))}/1000) by (table_id) * on(table_id) group_left(table_name) group({metric('table_info')}) by (table_id, table_name)",
                            "{{table_id}} {{table_name}}",
                        ),
                        panels.target(
                            f"max(timestamp({metric('log_store_latest_read_epoch')}) - {epoch_to_unix_millis(metric('log_store_latest_read_epoch'))}/1000) by (sink_id, sink_name)",
                            "{{sink_id}} {{sink_name}} (output)",
                        ),
                        panels.target(
                            f"max(timestamp({metric('log_store_latest_write_epoch')}) - {epoch_to_unix_millis(metric('log_store_latest_write_epoch'))}/1000) by (sink_id, sink_name)",
                            "{{sink_id}} {{sink_name}} (enqueue)",
                        ),
                    ]
                ),
            ],
        )
    ]
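For readers following the queries in this hunk, here is a minimal Python sketch of the arithmetic they appear to perform: the sample timestamp in seconds, minus the epoch converted to Unix milliseconds and divided by 1000, with the maximum taken per label. The conversion itself is not shown in the diff, so the 16-bit shift and the 2021-04-01 base date below are assumptions, and the function names are hypothetical stand-ins rather than the dashboard's actual helpers.

```python
# Assumed constants (NOT taken from the diff): the epoch is assumed to hold
# physical milliseconds in its upper bits, counted from a fixed base date.
ASSUMED_SHIFT_BITS = 16
ASSUMED_BASE_UNIX_MILLIS = 1_617_235_200_000  # 2021-04-01T00:00:00Z


def epoch_value_to_unix_millis(epoch, shift_bits=ASSUMED_SHIFT_BITS,
                               base_millis=ASSUMED_BASE_UNIX_MILLIS):
    """Hypothetical numeric counterpart of the `epoch_to_unix_millis` helper."""
    return (epoch >> shift_bits) + base_millis


def latency_seconds(sample_unix_seconds, epoch):
    """Mirror of `timestamp(metric) - epoch_to_unix_millis(metric) / 1000`."""
    return sample_unix_seconds - epoch_value_to_unix_millis(epoch) / 1000


def max_latency_by_key(samples):
    """Mirror of `max(...) by (key)`.

    `samples` is an iterable of (key, sample_unix_seconds, epoch) tuples.
    """
    result = {}
    for key, sample_unix_seconds, epoch in samples:
        latency = latency_seconds(sample_unix_seconds, epoch)
        result[key] = max(result.get(key, latency), latency)
    return result
```

Under these assumptions, taking `max` of the latency corresponds to the series with the oldest epoch at sampling time, which is why the choice of aggregation matters little when all samples for one MV or sink are close together.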
BTW, if an upstream MV is blocked, will all downstream MVs have epoch lag? (If so, we can't locate the root cause from this metric alone.) 🤔
Besides, backpressure may also affect the upstream MV's epoch...?
I found that you once raised the idea of showing the epoch on the DAG. 🤔
#13481 (comment)
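To make the concern in this thread concrete, here is a rough sketch of how per-MV latency could be combined with the dependency DAG to narrow down candidates: a lagging node whose upstreams are not lagging is a more likely origin than one that merely inherits lag. The function, the input shapes, and the MV names are all hypothetical, and the result is only a hint, since backpressure can also make an upstream MV lag.

```python
def root_cause_candidates(upstreams, latency, threshold):
    """Return lagging nodes that have no lagging upstream.

    upstreams: dict mapping node -> list of its direct upstream nodes (hypothetical shape)
    latency:   dict mapping node -> observed latency in seconds
    threshold: latency above which a node counts as lagging
    """
    lagging = {node for node, value in latency.items() if value > threshold}
    return sorted(
        node
        for node in lagging
        if not any(up in lagging for up in upstreams.get(node, ()))
    )


# Hypothetical example: mv_a feeds mv_b, which feeds mv_c; mv_d is independent.
example_upstreams = {"mv_b": ["mv_a"], "mv_c": ["mv_b"], "mv_d": []}
example_latency = {"mv_a": 30.0, "mv_b": 31.0, "mv_c": 32.0, "mv_d": 1.0}
candidates = root_cause_candidates(example_upstreams, example_latency, threshold=10.0)
```

In this example all three chained MVs lag, but only `mv_a` has no lagging upstream, so it is the single candidate. The latency panel alone would show three lagging series without that distinction.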