feat(dashboard): visualize average backpressure rather than spot backpressure #18219
The calculation is hard to explain or understand. As a monitoring component, I'd like to keep it straightforward.
I am thinking about a simpler way to address the issue. Instead of using the diff of delta_blocking_duration time, how about just using current value - first value, i.e. the value accumulated since the web page was opened? In other words, the web page shows the average backpressure over 0-5s, 0-10s, 0-15s, ... as the user stays on the page. When severe backpressure is happening, the longer the user stays, the more accurate the average becomes.
Your approach is also acceptable to me, but it's a bit hard to understand. In particular, the duration spilled here comes from a collected barrier, i.e. it happened in the past, yet it is spilled into the future.
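A minimal sketch of the cumulative average suggested above, assuming the counter is read in nanoseconds together with a wall-clock timestamp in milliseconds. The `Sample` type and the `averageBackpressure` name are hypothetical and not taken from the dashboard code.

```typescript
// Hypothetical sample shape: cumulative blocking time (ns) read at a wall-clock time (ms).
interface Sample {
  timestampMs: number
  blockingNs: number
}

// Average backpressure since the first sample, as a ratio of blocked time to elapsed time:
// (current value - first value) / elapsed time.
function averageBackpressure(first: Sample, current: Sample): number {
  const elapsedNs = (current.timestampMs - first.timestampMs) * 1e6
  if (elapsedNs <= 0) {
    return 0 // no time has elapsed yet, so there is nothing to average over
  }
  return (current.blockingNs - first.blockingNs) / elapsedNs
}
```

Keeping the first sample fixed is what makes the window grow (0-5s, 0-10s, ...) as the page stays open; a sliding window would instead replace `first` with an older sample from a bounded buffer.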
Approved. Please pick the one you like the most. :)
Average BP makes sense to me. That's a very good solution, because of this:
Let me try it.
…pressure (#18219) (#18258) Co-authored-by: Noel Kwan <[email protected]>
Signed-off-by: Bugen Zhao <[email protected]>
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
See the demo: https://www.notion.so/risingwave-labs/Backpressure-Graph-improvements-a11b0c5c74d54202922acb071aca72a0?pvs=4
Currently the dashboard calculates the backpressure with the following mechanism:
actor_buffer_output_blocking_ns metric from meta/prometheus at a fixed duration (5s).
blocking_duration
.0, since actor_buffer_output_blocking_ns will ONLY be incremented after the chunk has been yielded downstream.
We can just compute average BP to solve this problem. cr @fuyufjh.
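To illustrate the difference, here is a hedged sketch that computes per-interval (spot) backpressure from consecutive counter readings and compares it with the average over the whole window. The readings are invented for illustration, and `spotBackpressure` is a hypothetical name rather than a function from the dashboard.

```typescript
// Spot backpressure per interval: delta of the cumulative counter divided by the interval length.
function spotBackpressure(blockingNs: number[], intervalMs: number): number[] {
  const intervalNs = intervalMs * 1e6
  const rates: number[] = []
  for (let i = 1; i < blockingNs.length; i++) {
    rates.push((blockingNs[i] - blockingNs[i - 1]) / intervalNs)
  }
  return rates
}

// Invented readings at 5s intervals: the actor is blocked for 15s, but the counter
// only increases once the chunk is finally yielded downstream.
const readings = [0, 0, 0, 15e9]
const spot = spotBackpressure(readings, 5000)

// Average over the whole window: (last - first) / total elapsed time.
const totalElapsedNs = (readings.length - 1) * 5000 * 1e6
const average = (readings[readings.length - 1] - readings[0]) / totalElapsedNs
```

With these made-up readings, the spot series reports no backpressure for the first two intervals and then attributes all of the blocking to the last one, while the average spreads it over the full window.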
Further PRs:
see: #18176
Other changes
I changed the suggested steps for running the dashboard in the readme so that there's actually some live workload running. Previously it was just a series of DDLs with no data running through them.
Checklist
./risedev check (or alias, ./risedev c)
Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.