
Batching and re-build the chunk in the exchange receiver to avoid too low selectivity visibility bitmap #15713

Open
st1page opened this issue Mar 15, 2024 · 6 comments

@st1page
Contributor

st1page commented Mar 15, 2024

Proposed by @fuyufjh.

Currently, our stream exchange relies heavily on the chunk's visibility bitmap. On the sender side, the chunk is cloned into each dispatcher, and each dispatcher then marks invisible the rows that do not belong to its downstream shard:

```rust
StreamChunk::with_visibility(ops.clone(), chunk.columns().into(), vis_map);
```

So, supposing the downstream fragment has N parallelisms, the selectivity of the visibility bitmap would be 1/N.
On the receiver side, the exchange will then receive two kinds of chunks:

  • compacted chunks with low cardinality (chunk_size/N) from the remote shuffle
  • chunks with a low-selectivity visibility bitmap (1/N)

Neither is friendly to our vectorized processing engine.
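To make the 1/N figure concrete, here is a minimal self-contained Rust sketch (not the actual dispatcher code; `vis_map_for_shard` and the plain `Vec<bool>` are stand-ins for the real hash dispatcher and `Bitmap` types):

```rust
/// Build the visibility bitmap one dispatcher would keep for its shard:
/// a row stays visible only if it hashes to that shard, so on average
/// only 1/num_shards of the bits are set.
fn vis_map_for_shard(row_hashes: &[u64], shard: u64, num_shards: u64) -> Vec<bool> {
    row_hashes.iter().map(|h| h % num_shards == shard).collect()
}

fn main() {
    // Fake per-row hash values standing in for the chunk's hash-key columns.
    let row_hashes: Vec<u64> = (0..1024u64).map(|i| i.wrapping_mul(2654435761)).collect();
    let num_shards = 4;
    for shard in 0..num_shards {
        let vis = vis_map_for_shard(&row_hashes, shard, num_shards);
        let selectivity = vis.iter().filter(|&&v| v).count() as f64 / vis.len() as f64;
        println!("shard {shard}: selectivity ~ {selectivity:.2}"); // ~0.25 for N = 4
    }
}
```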

github-actions bot added this to the release-1.8 milestone Mar 15, 2024
@lmatz
Contributor

lmatz commented Mar 15, 2024

Is it possible to rearrange/rebuild the rows in the chunk before the exchange, grouping the rows that belong to the same downstream so that they are consecutive?

Then, on the receiver side, we don't compact and still get a chunk with a low-selectivity visibility bitmap, but the following operator still gets the benefit of vectorized processing, as it only needs to process a consecutive range of rows.
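A hedged sketch of that reordering (hypothetical helper, not existing code): compute a permutation that groups row indices by target shard, so each downstream's rows become one consecutive range describable by a `(start, end)` pair instead of a sparse bitmap:

```rust
/// Group row indices by target shard, keeping the original relative order
/// within each shard. Returns the permutation to apply to the chunk's rows
/// plus the half-open (start, end) range owned by each shard.
fn group_rows_by_shard(row_hashes: &[u64], num_shards: u64) -> (Vec<usize>, Vec<(usize, usize)>) {
    let mut perm = Vec::with_capacity(row_hashes.len());
    let mut ranges = Vec::with_capacity(num_shards as usize);
    for shard in 0..num_shards {
        let start = perm.len();
        perm.extend(
            row_hashes
                .iter()
                .enumerate()
                .filter(|(_, h)| **h % num_shards == shard)
                .map(|(i, _)| i),
        );
        ranges.push((start, perm.len()));
    }
    (perm, ranges)
}
```

(The per-shard pass here is O(rows × shards) for clarity; a single counting-sort-style pass would do the same work in one sweep.)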

@st1page
Contributor Author

st1page commented Mar 15, 2024

Is it possible to rearrange/rebuild the rows in the chunk before the exchange, grouping the rows that belong to the same downstream so that they are consecutive?

Then, on the receiver side, we don't compact and still get a chunk with a low-selectivity visibility bitmap, but the following operator still gets the benefit of vectorized processing, as it only needs to process a consecutive range of rows.

But what is the advantage compared with rebuilding on the receiver side? 🤔 The overhead is the same either way: one more copy and constructing the chunk.

st1page self-assigned this Mar 15, 2024
@lmatz
Contributor

lmatz commented Mar 15, 2024

But what is the advantage compared with rebuilding on the receiver side?

Each of the N receivers needs to do its own rebuild, vs. rebuilding just once on the sender side.

Edit:
I see, there is a batching step on the receiver side, so the cost can be amortized.
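For illustration, a minimal sketch of that amortized receiver-side batching (assumed types; one `i64` column stands in for a whole row, and this is not the actual chunk-builder API): visible rows from many sparse chunks are buffered and re-emitted as dense chunks, so the copy cost is paid once per full output chunk rather than once per sparse input chunk:

```rust
/// Buffer only the visible rows of incoming low-selectivity chunks and emit
/// a dense chunk of `capacity` rows whenever the buffer fills.
struct ChunkBatcher {
    buf: Vec<i64>,
    capacity: usize,
}

impl ChunkBatcher {
    fn new(capacity: usize) -> Self {
        Self { buf: Vec::with_capacity(capacity), capacity }
    }

    fn push(&mut self, rows: &[i64], vis: &[bool]) -> Vec<Vec<i64>> {
        let mut out = Vec::new();
        for (row, visible) in rows.iter().zip(vis) {
            if *visible {
                self.buf.push(*row);
                if self.buf.len() == self.capacity {
                    // The rebuild cost is paid once per *full* chunk here,
                    // not once per sparse input chunk: this is the amortization.
                    out.push(std::mem::replace(
                        &mut self.buf,
                        Vec::with_capacity(self.capacity),
                    ));
                }
            }
        }
        out
    }
}
```

With N = 4 downstream parallelisms and a chunk size of 1024, roughly four sparse input chunks coalesce into each dense output chunk.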

st1page changed the title Batching and re-build the chunk in the exchange receiver to avoid too lower selectivity visibility bitmap Batching and re-build the chunk in the exchange receiver to avoid too low selectivity visibility bitmap Mar 17, 2024
st1page modified the milestones: release-1.9, release-1.10 May 13, 2024
@st1page
Contributor Author

st1page commented May 13, 2024

Update: rebuilding the chunk in all cases makes Nexmark performance degrade; see #16055.

Contributor

github-actions bot commented Aug 1, 2024

This issue has been open for 60 days with no activity.

If you think it is still relevant today, and needs to be done in the near future, you can comment to update the status, or just manually remove the no-issue-activity label.

You can also confidently close this issue as not planned to keep our backlog clean.
Don't worry if you think the issue is still valuable to continue in the future.
It's searchable and can be reopened when it's time. 😄

@st1page
Contributor Author

st1page commented Aug 19, 2024

#17968
