chore(grafana): relative CPU usage #17401
grafana/README.md
./risedev configure # make sure that grafana is turned on
./risedev dev full
I guess update means there is already a Grafana instance running 🤔, so there is no need to call risedev full to start a new cluster.
Will undo this change. I added it here because I did not have a cluster running 😁
What's the difference with the old CPU usage metrics?
f"sum(rate({metric('process_cpu_seconds_total')}[$__rate_interval])) by ({COMPONENT_LABEL}, {NODE_LABEL}) / avg({metric('process_cpu_core_num')}) by ({COMPONENT_LABEL}, {NODE_LABEL}) > 0",
"cpu usage (avg per core) - {{%s}} @ {{%s}}"
% (COMPONENT_LABEL, NODE_LABEL),
),
],
),
panels.timeseries_cpu(
It depends on k8s metrics, which means we need a flag to ensure it can only be used in the Cloud environment.
Let me look into that... So far I was fine with the way it is. In the risedev env it will simply show no data.
This panel shows CPU utilization. For example, if we allocate 4 cores (4C) to the CN and the CN uses 3C, the utilization is 75%.
@Nebulazhang is right here. We also define the high CPU usage alerts (see here) as
I think it would be helpful to see the same numbers in the dev dashboard when debugging.
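To make the 4C/3C example from this thread concrete, here is a minimal sketch of the arithmetic behind a utilization figure that is relative to the allocated CPU. The function name `cpu_utilization` and its parameters are illustrative only and do not come from the PR.

```python
def cpu_utilization(used_cores: float, allocated_cores: float) -> float:
    """Return CPU usage as a fraction of the allocated cores.

    Hypothetical helper for illustration; not part of the dashboard code.
    """
    if allocated_cores <= 0:
        raise ValueError("allocated_cores must be positive")
    return used_cores / allocated_cores


# The example from the comment: 4 cores allocated to the CN, 3 cores in use.
utilization = cpu_utilization(used_cores=3.0, allocated_cores=4.0)
print(f"{utilization:.0%}")  # 75%
```

The existing "cpu usage (avg per core)" panel instead divides by `process_cpu_core_num`, which is why the two panels can show different numbers for the same process.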
LGTM
@tabVersion The panel will not work in non-k8s environments (e.g. risedev). In those cases it will show that no data is available. I added a description that points that out. Are you fine with that? If not, we would need some mechanism to hide panels without data, but I feel like this is overkill 😆
…o cajan93/cpu-relative
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
We get alerts that are relative to the CPU limit of the container. Let's add a panel that reflects this CPU usage.
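As a rough sketch of what "CPU usage relative to the container limit" can look like in PromQL, the helper below builds such an expression as a string. It assumes the cAdvisor metric `container_cpu_usage_seconds_total` and the kube-state-metrics metric `kube_pod_container_resource_limits`; the expression actually used in this PR is not shown here and may differ. The function name `relative_cpu_expr` and the `demo` namespace are hypothetical.

```python
def relative_cpu_expr(namespace: str, rate_interval: str = "$__rate_interval") -> str:
    """Build a PromQL expression for CPU usage divided by the CPU limit, per pod.

    Assumption: cAdvisor and kube-state-metrics are scraped, so this only
    returns data in a k8s environment.
    """
    selector = f'namespace="{namespace}", container!=""'
    # CPU cores actually consumed, summed over the containers of each pod.
    usage = (
        f"sum(rate(container_cpu_usage_seconds_total{{{selector}}}[{rate_interval}]))"
        " by (namespace, pod)"
    )
    # CPU cores the pod is allowed to use according to its resource limits.
    limit = (
        f'sum(kube_pod_container_resource_limits{{{selector}, resource="cpu"}})'
        " by (namespace, pod)"
    )
    return f"{usage} / {limit}"


expr = relative_cpu_expr("demo")
print(expr)
```

Outside k8s neither metric exists, so a panel built on an expression like this would simply show no data, which matches the behavior described in the review comments.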
Checklist
./risedev check (or alias, ./risedev c)
Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.