-
Notifications
You must be signed in to change notification settings - Fork 599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(dashboard): add back_pressure_rate UI #14863
Conversation
73d9ad4
to
2772601
Compare
b6e6f83
to
9ae89d7
Compare
I add a GUI switch for it. If selecct |
dashboard/pages/fragment_graph.tsx
Outdated
} | ||
doFetch() | ||
return () => { } | ||
}, [toast]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The second argument of useEffect
- dependencies
, tells React to re-run the function once any value of it is changed. Here you pass a toast
? Seems to be a misunderstanding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I follow the code style in the codebase.
toast(e, "warning") |
risingwave/dashboard/pages/cluster.tsx
Line 173 in 4500a08
toast(e, "warning") |
I know it will trigger re-run. But in current logic, no parameters suitable for here. So I just put the toast
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, do some more investigation. Maybe this is the problem why there is a mismatch on rate between my impelmentation and the previous impelementation. Will remove the toast
. This will result multiple thread keep call and calculate the back pressure. Then the interval of the call changed. The final rate become inaccurate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can run npm run lint
and the linter will help to fix that. Generally all outside variables captured by the closure should be included.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/risingwavelabs/risingwave/compare/eric/no_toast
I tried removing it from dependencies
and everything seemed to work. cc. @BugenZhao Can you also take a try?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already tried to remove toast
here, everything still seems to works.
You can run npm run lint and the linter will help to fix that. Generally all outside variables captured by the closure should be included.
I tried npm run lint
, and it will remind me to add the dependencies previousBP
. I applied the suggestion, but it will cause the function do infinite quickly re-trigger which is not we want here.
So in current logic, the previousBP
are not suitable to put into dependencies.
NVM, find a way to refactor it.
dashboard/pages/fragment_graph.tsx
Outdated
@@ -168,8 +169,8 @@ function buildFragmentDependencyAsEdges( | |||
|
|||
const SIDEBAR_WIDTH = 200 | |||
|
|||
type BackPressureAlgo = "p50" | "p90" | "p95" | "p99" | |||
const backPressureAlgos: BackPressureAlgo[] = ["p50", "p90", "p95", "p99"] | |||
type BackPressureAlgo = "p50" | "p90" | "p95" | "p99" | "current" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommend to make current
the default option. Previously we use Disabled
by default because Prometheus is not always available, but now current
don't depend on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it seems too obscure to users that p50/90/99
is from Prometheus while current
is from Meta.
A better way is probably making two ListBox
: 1) Data source: Embedded or Prometheus 2) Only shows for Prometheus: average/p50/p99
Let me try some large data test locally. Only try some small data before. |
@fuyufjh @BugenZhao
We should not compared with the |
Co-authored-by: Bugen Zhao <[email protected]>
Co-authored-by: Bugen Zhao <[email protected]>
Co-authored-by: Bugen Zhao <[email protected]>
Are you asking that whether we should show back-pressure on actor-level or fragment-level granularity? I think actor-level information is useless and too verbose in most cases. The only cases I can think of is inbalance i.e. data skew, this happens sometimes, so it would be nice if we allow users to click to dive into actor-level information in the dashabord. |
|
GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
---|---|---|---|---|---|
9425213 | Triggered | Generic Password | fbc0e59 | ci/scripts/regress-test.sh | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely. Learn here the best practices.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
Our GitHub checks need improvements? Share your feedbacks!
Oh, I mean this, in old api, it directly use the following code to get metrics from prometheus:
My understand is it will do average on actor id, and group by fragment_id, downstream_fragment_id. I implememnt the new api following this old logic. Just want to confirm Average logic is what we want. |
BTW, I find some compile problem when marge the latest main branch. I find the problem also exist in the latest main branch which not include my PR. So it should not introduced in this PR. Record it in #15219 , I guess it should related to the recent merged PRs in past 1 month. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for the rest
Co-authored-by: Eric Fu <[email protected]>
|
||
// Get back pressure from meta node -> compute node | ||
export async function getBackPressureWithoutPrometheus() { | ||
const response = await api.get("/metrics/back_pressures") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we move it to another path such as "embedded_metrics" or sth? It's mixed with the metrics fetched from Prometheus, which looks confusing to me.
risingwave/src/meta/src/dashboard/mod.rs
Lines 423 to 428 in ae7d7ba
.route("/metrics/cluster", get(prometheus::list_prometheus_cluster)) | |
.route( | |
"/metrics/actor/back_pressures", | |
get(prometheus::list_prometheus_actor_back_pressure), | |
) | |
.route("/metrics/back_pressures", get(get_back_pressure)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, /metrics/actor/back_pressures
actually output fragment-level metrics instead of actor-level... 😂 Legacy issues
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will solve the problem in a seperate PR
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
In #13830, we plan to add a UI page to show page pressure rate without using prometheus. We already add the API in meta to get the current back pressure. In this PR
current
, will shift from prometheus data into this API.actor-id
,fragement id
,next fragement id
back pressure
. We convert theback pressure
intoback pressure rate
, and also aggregate byfragement id
,next fragement id
onactor id
. Then apply the value in streaming graph.Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.