Chaos Mesh compute-node-limit-bandwidth create table statement fails too slow? #14226

Open
Tracked by #14213
lmatz opened this issue Dec 27, 2023 · 2 comments

lmatz commented Dec 27, 2023

Describe the bug

https://buildkite.com/risingwave-test/longevity-chaos-mesh/builds/367#018ca985-8aa6-42f4-bc2b-40843b071743
(Please ignore the first execution/tab; it failed for unrelated environmental reasons.)

This experiment was implemented by @xuefengze

experiments="[{\"action\": \"bandwidth\", \"cases\": [{\"mode\": [\"one\", \"all\"], \"label\": {\"key\": \"risingwave/component\", \"value\": \"compute\"}, \"duration\": \"10m\", \"direction\": \"to\", \"bandwidth\": {\"rate\": \"1mbps\", \"limit\": 20971520, \"buffer\": 10000}}]}]"

We have two experiments: one limits the bandwidth of only one compute node, and the other limits the bandwidth of all the compute nodes. In both, the limitation lasts for 10 minutes.
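For readers who want to see how the config above maps to the two runs, here is a minimal sketch that parses the `experiments` value (shell escaping removed) and expands it into one entry per mode. The helper name `expand_runs` and the flattened field names are hypothetical, and treating `limit` as a byte count is an assumption, not something stated in this issue.

```python
import json

# The experiments value from the issue, with the shell escaping removed.
EXPERIMENTS = (
    '[{"action": "bandwidth", "cases": [{"mode": ["one", "all"], '
    '"label": {"key": "risingwave/component", "value": "compute"}, '
    '"duration": "10m", "direction": "to", '
    '"bandwidth": {"rate": "1mbps", "limit": 20971520, "buffer": 10000}}]}]'
)


def expand_runs(raw):
    """Return one flat dict per (action, mode) pair in the config."""
    runs = []
    for experiment in json.loads(raw):
        for case in experiment["cases"]:
            for mode in case["mode"]:
                label = case["label"]
                runs.append({
                    "action": experiment["action"],
                    "mode": mode,
                    "target": f'{label["key"]}={label["value"]}',
                    "duration": case["duration"],
                    "rate": case["bandwidth"]["rate"],
                    # Assumption: "limit" is in bytes, so convert to MiB.
                    "limit_mib": case["bandwidth"]["limit"] / (1024 * 1024),
                })
    return runs


for run in expand_runs(EXPERIMENTS):
    print(run)
```

This yields two entries, one with mode `one` and one with mode `all`, both targeting `risingwave/component=compute` for `10m`.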

[Screenshot: SCR-20231227-isx]

In the first experiment, the bandwidth limitation was applied around 12:34:02.
The create table t1 query was issued around 12:38:07 but returned only around 12:44:04.
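To make the timing easier to reason about, here is a small sketch that works out the intervals from the three timestamps above. It assumes the limitation was lifted exactly 10 minutes after it was applied. All timestamps are approximate ("around"), so treat the results as rough.

```python
from datetime import datetime, timedelta


def at(hms):
    """Parse an HH:MM:SS wall-clock time from the build log."""
    return datetime.strptime(hms, "%H:%M:%S")


limit_applied = at("12:34:02")
query_issued = at("12:38:07")
query_returned = at("12:44:04")

# Assumption: the limit ends exactly 10 minutes after it starts.
limit_lifted = limit_applied + timedelta(minutes=10)

issued_after_limit = query_issued - limit_applied
query_latency = query_returned - query_issued
returned_after_lift = query_returned - limit_lifted

print("limit lifted at:        ", limit_lifted.time())
print("issued after limit by:  ", issued_after_limit)
print("create table latency:   ", query_latency)
print("returned after lift by: ", returned_after_lift)
```

With these values the statement took about 5 minutes 57 seconds and returned roughly 2 seconds after the nominal end of the limitation. That could mean it only completed once bandwidth was restored, but nothing in this issue confirms it.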

There are two questions:

  1. Why did the select query issued before create table t1 succeed while create table t1 itself failed?
  2. Why does the create table t1 statement take several minutes to return? Since we typically expect create statements to finish quickly, does it make sense to reduce the timeout and return a message sooner, e.g. in under 10 seconds? This is similar to the issue found in Chaos Mesh compute-meta-network-partition batch query fails occasionally #14217.
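As a rough illustration of the "fail fast" behavior asked about in question 2, here is a client-side sketch that bounds how long a statement may run and returns a message when the deadline passes. It is not RisingWave's implementation and it is not the server-side timeout discussed here. `run_with_deadline` and `fake_create_table` are hypothetical names, and the slow statement is simulated with a sleep.

```python
import asyncio


async def run_with_deadline(statement_coro, timeout_s=10.0):
    """Await a statement, giving up after timeout_s seconds."""
    try:
        return await asyncio.wait_for(statement_coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        # The awaiting task is cancelled; a real server may keep working.
        return f"statement did not finish within {timeout_s} s"


async def fake_create_table(delay_s):
    # Hypothetical stand-in for a slow `create table t1`; it only sleeps.
    await asyncio.sleep(delay_s)
    return "CREATE_TABLE"


# Fast statement: finishes well inside the deadline.
print(asyncio.run(run_with_deadline(fake_create_table(0.01), timeout_s=0.5)))
# Slow statement: the deadline fires first and a message is returned.
print(asyncio.run(run_with_deadline(fake_create_table(1.0), timeout_s=0.05)))
```

A client-side deadline like this only stops the caller from waiting. Whether the DDL is actually abandoned on the server is a separate question and is what this issue is really asking about.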

Error message/log

No response

To Reproduce

No response

Expected behavior

No response

How did you deploy RisingWave?

No response

The version of RisingWave

No response

Additional context

No response

@lmatz lmatz added the type/bug (Something isn't working) and found-by-chaos-mesh labels Dec 27, 2023
@github-actions github-actions bot added this to the release-1.6 milestone Dec 27, 2023
@lmatz lmatz modified the milestones: release-1.6, release-1.7 Jan 9, 2024

fuyufjh commented Mar 6, 2024

Might be the same as #14217?

Can we run an await tree dump if it reproduces?

@fuyufjh fuyufjh modified the milestones: release-1.7, release-1.8 Mar 6, 2024
@lmatz lmatz modified the milestones: release-1.8, release-1.9 Apr 8, 2024
@lmatz lmatz modified the milestones: release-1.9, release-1.10 May 14, 2024

github-actions bot commented Aug 1, 2024

This issue has been open for 60 days with no activity.

If you think it is still relevant today, and needs to be done in the near future, you can comment to update the status, or just manually remove the no-issue-activity label.

You can also confidently close this issue as not planned to keep our backlog clean.
Don't worry if you think the issue is still valuable to continue in the future.
It's searchable and can be reopened when it's time. 😄
