bug: source splits can be unevenly assigned to workers when there are too many actors #14333
Comments
Could this be caused by #14170, especially the last commit?
It seems the assignment was previously somewhat random (?), but now all splits are assigned to one node.
How many source actors will be created in this case? If there are more than 24 actors on a node, I think #14170 will lead to this problem. 🤔
I get it. We have 3 sources, each with 8 partitions. For each source, each compute node has 8 source actors because it has 8 cores, so there are 24 actors in total. Previously the assignment was random, so sometimes it was relatively even (7-8-9) and sometimes uneven (3-8-13), which could OOM. Now I have added a cmp by actor id to make the assignment deterministic, so all 8 splits are assigned to the 8 actors with the lowest actor ids (all on the same node). This is the same for all 3 sources, so one node is assigned 24 splits. The previous behavior was also not ideal. How can we improve that? 🤔 Basically the problem is
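To make the reasoning above concrete, here is a minimal Python sketch of the described behavior. It is not RisingWave code: the node names, the helper functions, and the assumption that actor ids are contiguous per node (so the lowest ids of every source land on the same node) are all hypothetical and chosen only to reproduce the 24-splits-on-one-node outcome.

```python
from collections import Counter

# Hypothetical cluster layout taken from the numbers in the comment above.
NODES = ["node-0", "node-1", "node-2"]  # assumed node names
ACTORS_PER_NODE = 8                     # one source actor per core
SPLITS_PER_SOURCE = 8                   # 8 partitions per source
NUM_SOURCES = 3


def build_actors(source_idx):
    """Return {actor_id: node} for one source.

    Assumption: actor ids are allocated contiguously, node by node.
    """
    base = source_idx * len(NODES) * ACTORS_PER_NODE
    actors = {}
    for n, node in enumerate(NODES):
        for k in range(ACTORS_PER_NODE):
            actors[base + n * ACTORS_PER_NODE + k] = node
    return actors


def assign_lowest_ids(actors, num_splits):
    """Give one split to each of the num_splits actors with the lowest ids."""
    chosen = sorted(actors)[:num_splits]
    return Counter(actors[actor_id] for actor_id in chosen)


def splits_per_node():
    """Total number of splits placed on each node across all sources."""
    total = Counter()
    for s in range(NUM_SOURCES):
        total += assign_lowest_ids(build_actors(s), SPLITS_PER_SOURCE)
    return {node: total.get(node, 0) for node in NODES}


print(splits_per_node())  # {'node-0': 24, 'node-1': 0, 'node-2': 0}
```

Under these assumptions the ordering by actor id is deterministic but ignores which node an actor lives on, which is why every source picks the same node.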
https://buildkite.com/risingwave-test/longevity-test/builds/883#018ccab2-3a2b-4d12-aace-6757affb4abe
1 topic, 1 unified source, parallelism 3. An MV is created for each logical source (3 MVs in total), and there are 8*25 nexmark MVs built on top of these 3 base MVs.
Wait, if
I guess that in the benchmark script the parallelism only affects the query MVs, not the 3 source MVs, so my reasoning still applies.
Shall we close this issue as fixed? cc @xxchan
I'm not sure. We haven't implemented rack-aware scheduling, so the problem can still happen. Do you think it's not a large concern and that we will not implement it in the near future? 🤔
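For illustration only, one way a node-aware assignment could look is sketched below in Python. This is a hypothetical sketch, not the scheduler RisingWave uses or plans to use: `assign_node_aware`, the shared `node_load` map, and the node names are all assumptions. Each split goes to the currently least-loaded node, then to the least-loaded actor on that node, with ties broken by name and actor id so the result stays deterministic.

```python
from collections import defaultdict


def assign_node_aware(actors, splits, node_load):
    """Assign splits to actors while balancing load across nodes.

    actors:    {actor_id: node} for one source (hypothetical shape)
    splits:    list of split identifiers for that source
    node_load: {node: splits already placed}, shared across sources and
               updated in place
    Returns {actor_id: [split, ...]}.
    """
    by_node = defaultdict(list)
    for actor_id in sorted(actors):
        by_node[actors[actor_id]].append(actor_id)

    assignment = {actor_id: [] for actor_id in actors}
    for split in splits:
        # Least-loaded node first; the node name breaks ties deterministically.
        node = min(by_node, key=lambda n: (node_load.get(n, 0), n))
        # Within that node, the actor with the fewest splits; lowest id wins ties.
        actor = min(by_node[node], key=lambda a: (len(assignment[a]), a))
        assignment[actor].append(split)
        node_load[node] = node_load.get(node, 0) + 1
    return assignment


def demo():
    """3 sources x 8 splits on 3 nodes with 8 actors each (assumed layout)."""
    nodes = ["node-0", "node-1", "node-2"]
    node_load = {}
    for s in range(3):
        actors = {s * 24 + n * 8 + k: nodes[n] for n in range(3) for k in range(8)}
        splits = [f"source{s}-split{p}" for p in range(8)]
        assign_node_aware(actors, splits, node_load)
    return node_load


print(demo())  # {'node-0': 8, 'node-1': 8, 'node-2': 8}
```

Sharing `node_load` across sources is the design choice that matters here: balancing each source in isolation would still let the leftover splits of every source pile up on the same node.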
The source splits have not been evenly assigned since nightly-20231228.
reglngvty-20231228-150237 (nightly-20231228)
reglngvty-20231227-150231 (nightly-20231227)
Code diff: 4695ad1...aa9dcac
Any ideas? cc. @shanicky
Originally posted by @fuyufjh in #14324 (comment)