feat(stream): merge stream chunks at MergeExecutor #17968
src/stream/src/executor/project.rs
} else {
    chunk
};
// let chunk = if chunk.selectivity() <= self.materialize_selectivity_threshold {
This was somewhat of a workaround before. As we now compact all data after Exchange, the problem should be mostly resolved.
Rest LGTM.
Curious: how did you export the benchmark comparison result to the table and bar graph? It would be helpful for testing any PR that could significantly affect performance.
LGTM! Should we monitor the chunk size after BufferChunks?
Rest LGTM
/// A wrapper that buffers the `StreamChunk`s from upstream until no more ready items are available.
/// Besides, any message other than `StreamChunk` will trigger the buffered `StreamChunk`s
/// to be emitted immediately, as well as the message itself.
struct BufferChunks<S: Stream> {
Nit: Maybe we can call this BufferedChunkReader, since it's very similar to tokio::io::BufReader and java.io.BufferedReader.
I exported the result from Metabase RW Compare to Excel to calculate the geomean.
I'd like to take a look at this PR, but generally I don't think we need it in the future.
This PR has been open for 60 days with no activity. If it's blocked by code review, feel free to ping a reviewer or ask someone else to review it. If you think it is still relevant today, and have time to work on it in the near future, you can comment to update the status, or just manually remove the label. You can also confidently close this PR to keep our backlog clean. (If no further action is taken, the PR will be automatically closed after 7 days. Sorry! 🙏)
Closing this PR as no further action was taken after it was marked as stale for 7 days. Sorry! 🙏 You can reopen it when you have time to continue working on it.
Well... Let me merge this today, as we just released 2.1.
GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
---|---|---|---|---|---|
9425213 | Triggered | Generic Password | cb84263 | ci/scripts/e2e-source-test.sh | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secret safely. Learn here the best practices.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following these best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection on pre-commit to catch secrets before they leave your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
Resolves #17824 (comment). Please see the background there.
Related work: #17967, but the approach of this PR avoids additional pending time. This is because BufferChunks will build and return the ready chunks immediately when the inner stream returns Poll::Pending.
Benchmark result
Overall, it's a bit better than before.
I will run a longevity test as well before merging it.
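For readers unfamiliar with the buffering behavior described above, here is a minimal, std-only sketch of the idea. It is not the code in this PR: `Message`, the closure-based `poll_inner`, and the choice to concatenate buffered chunks into a single chunk are all simplifying assumptions made for illustration, and the real `BufferChunks` wraps a `Stream` of `StreamChunk`s. The point it illustrates is that buffered chunks are flushed as soon as the inner source reports `Poll::Pending` (or yields a non-chunk message), so no extra waiting is introduced.

```rust
use std::collections::VecDeque;
use std::task::Poll;

/// Hypothetical stand-in for a stream message: a data chunk or a barrier.
#[derive(Debug, Clone, PartialEq)]
enum Message {
    Chunk(Vec<i64>),
    Barrier(u64),
}

/// Illustrative wrapper; `poll_inner` mimics polling the upstream stream.
struct BufferChunks<F> {
    poll_inner: F,
    buffered: Vec<Vec<i64>>,  // chunks received but not yet emitted
    ready: VecDeque<Message>, // messages waiting to be returned
    finished: bool,           // inner source has ended
}

impl<F: FnMut() -> Poll<Option<Message>>> BufferChunks<F> {
    fn new(poll_inner: F) -> Self {
        Self { poll_inner, buffered: Vec::new(), ready: VecDeque::new(), finished: false }
    }

    /// Merge all buffered chunks into one (an assumption of this sketch).
    fn flush(&mut self) {
        if !self.buffered.is_empty() {
            let merged: Vec<i64> = self.buffered.drain(..).flatten().collect();
            self.ready.push_back(Message::Chunk(merged));
        }
    }

    fn poll_next(&mut self) -> Poll<Option<Message>> {
        loop {
            if let Some(msg) = self.ready.pop_front() {
                return Poll::Ready(Some(msg));
            }
            if self.finished {
                return Poll::Ready(None);
            }
            match (self.poll_inner)() {
                // Keep draining while the inner source has chunks ready.
                Poll::Ready(Some(Message::Chunk(rows))) => self.buffered.push(rows),
                // Any other message flushes the buffer, then follows it.
                Poll::Ready(Some(other)) => {
                    self.flush();
                    self.ready.push_back(other);
                }
                Poll::Ready(None) => {
                    self.finished = true;
                    self.flush();
                }
                // Nothing more is ready: emit what we have instead of waiting.
                Poll::Pending => {
                    if self.buffered.is_empty() {
                        return Poll::Pending;
                    }
                    self.flush();
                }
            }
        }
    }
}

/// Drives the wrapper over a scripted sequence of inner poll results.
fn demo() -> Vec<Poll<Option<Message>>> {
    let mut script = VecDeque::from(vec![
        Poll::Ready(Some(Message::Chunk(vec![1, 2]))),
        Poll::Ready(Some(Message::Chunk(vec![3]))),
        Poll::Pending,
        Poll::Ready(Some(Message::Chunk(vec![4]))),
        Poll::Ready(Some(Message::Barrier(1))),
        Poll::Ready(None),
    ]);
    let mut wrapper =
        BufferChunks::new(move || script.pop_front().unwrap_or(Poll::Ready(None)));
    (0..4).map(|_| wrapper.poll_next()).collect()
}
```

In `demo`, the first two chunks are merged and returned on the first call because the scripted source then reports `Poll::Pending`; the third chunk is emitted ahead of the barrier that follows it.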
Checklist
./risedev check (or alias, ./risedev c)
Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.