reglngvty-20231128-150243 OOM #13711

Closed
lmatz opened this issue Nov 29, 2023 · 5 comments

lmatz commented Nov 29, 2023

OOM

Setting: nightly run of all nexmark queries (16 sets of nexmark queries) with 10000 throughput

https://risingwave-labs.slack.com/archives/C048NM5LNKX/p1701220086545809

Grafana Metric https://grafana.test.risingwave-cloud.xyz/d/EpkBw5W4k/risingwave-dev-dashboard?orgId=1&var-datasource=Prometheus:%20test-useast1-eks-a&var-namespace=reglngvty-20231128-150243&from=1701185131000&to=1701220086000
Grafana Logs https://grafana.test.risingwave-cloud.xyz/d/liz0yRCZz1/log-search-dashboard?orgId=1&var-data_source=Logging:%20test-useast1-eks-a&var-namespace=reglngvty-20231128-150243&from=1701185131000&to=1701220086000
Buildkite Job https://buildkite.com/risingwave-test/longevity-test/builds/826
[Screenshot: SCR-20231129-flp]

Some additional context:

Note that "16 sets of nexmark queries" means every query is run 16 times, that is, a total of 16 * 25 = 400 queries running in the testing instance. This is the first time we have run such a test.

lmatz added the type/bug and found-by-longevity-test labels Nov 29, 2023
github-actions bot added this to the release-1.5 milestone Nov 29, 2023

lmatz commented Nov 30, 2023

Looks like no single component is using too much memory...

Does anyone have an idea?

fuyufjh self-assigned this Nov 30, 2023

fuyufjh commented Nov 30, 2023

This run already contains #13648.

I have tested #13648 with the longevity test (all nexmark queries), and it should pass.

Let me take a look.

fuyufjh commented Nov 30, 2023

Well, the memory manager already tries its best to evict cache, so it's not its fault.

image

wcy-fdu self-assigned this Nov 30, 2023
fuyufjh modified the milestones: release-1.5, release-1.6 Dec 4, 2023

fuyufjh commented Dec 4, 2023

It's because too many queries are running in the instance: 16 sets of nexmark queries.

This seems to be expected. When we have 400 queries running in parallel, there are many small, fragmented pieces of memory usage that add up and exhaust the memory.

"looks like no single component using too much memory..."

Yes. This is actually a good sign; otherwise, we would have something to fix...
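
To illustrate the point about many small footprints adding up, here is a rough back-of-the-envelope sketch. The query counts (16 sets of 25 queries) come from this issue; the per-query footprint and node size are hypothetical numbers chosen only for illustration, and the sketch assumes each query holds a fixed baseline of memory that cache eviction cannot reclaim.

```python
# Back-of-the-envelope sketch: many small per-query footprints can exceed
# a node's memory even when no single query looks large.
# PER_QUERY_MIB and NODE_MIB are hypothetical, not measured values.

QUERIES_PER_SET = 25   # from the issue: "16 * 25 queries"
SETS = 16              # from the issue: "16 sets of nexmark queries"

PER_QUERY_MIB = 48.0       # hypothetical baseline that eviction cannot reclaim
NODE_MIB = 16 * 1024.0     # hypothetical 16 GiB of memory on the node


def total_memory_mib(num_queries: int, per_query_mib: float) -> float:
    """Aggregate footprint if every query holds roughly per_query_mib."""
    return num_queries * per_query_mib


def fits(num_queries: int, per_query_mib: float, node_mib: float) -> bool:
    """True if the aggregate footprint stays within the node's memory."""
    return total_memory_mib(num_queries, per_query_mib) <= node_mib


num_queries = SETS * QUERIES_PER_SET  # 400 queries in total

for label, n in [("one set", QUERIES_PER_SET), ("16 sets", num_queries)]:
    total = total_memory_mib(n, PER_QUERY_MIB)
    verdict = "fits" if fits(n, PER_QUERY_MIB, NODE_MIB) else "exceeds memory"
    print(f"{label}: {n} queries, {total:.0f} MiB of {NODE_MIB:.0f} MiB, {verdict}")
```

Under these assumed numbers a single set stays well within the node's memory, while 16 sets exceed it, even though no individual query uses more than a few dozen MiB.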

fuyufjh commented Dec 4, 2023

I think this should not be considered a bug.

fuyufjh closed this as not planned Dec 4, 2023