-
Notifications
You must be signed in to change notification settings - Fork 598
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(topn): reduce unnecessary table scan in streaming (Group)TopN executors #13832
Conversation
…State Signed-off-by: Richard Chien <[email protected]>
Signed-off-by: Richard Chien <[email protected]>
Signed-off-by: Richard Chien <[email protected]>
cache.insert(topn_row.cache_key, (&topn_row.row).into()); | ||
if cache.len() == cache_size_limit { | ||
table_row_count = None; // cache becomes full, we cannot get precise table row count this time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we shall check this euqal before insert, so we can know that the number of record in hummock is exactly larger than cache. As following:
if cache.len() == cache_size_limit {
table_row_count = None;
break;
}
cache.insert(topn_row.cache_key, (&topn_row.row).into());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait a minute, I'm still working on this. Not ready😁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite get it. When fill_high_cache
being called, the high
cache will never be full, so this part of code is just to insert into the cache until it reaches the size limit.
} | ||
|
||
pub fn delete(&mut self, value: impl Row) { | ||
self.state_table.delete(value); | ||
if let Some(row_count) = self.row_count.as_mut() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we ensure that the delete key must exist in hummock ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe now we have state table delete sanity check right?
Signed-off-by: Richard Chien <[email protected]>
@@ -325,19 +351,20 @@ impl TopNCacheTrait for TopNCache<false> { | |||
&& (self.offset == 0 || cache_key > *self.low.last_key_value().unwrap().0) | |||
{ | |||
// The row is in mid | |||
self.middle.remove(&cache_key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why to move this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to keep the entries in cache consistent with table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea sounds OK to me. May test whether it works.
Signed-off-by: Richard Chien <[email protected]>
Signed-off-by: Richard Chien <[email protected]>
Did you tested the perf issue is resolved by this? |
Will run a test soon |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #13832 +/- ##
==========================================
- Coverage 68.06% 68.05% -0.02%
==========================================
Files 1535 1535
Lines 264826 264874 +48
==========================================
+ Hits 180266 180269 +3
- Misses 84560 84605 +45
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
The throughput doesn't change too much while the bloom filter false-positive rate does decrease compared to #13797. |
Signed-off-by: Richard Chien <[email protected]>
Signed-off-by: Richard Chien <[email protected]>
Signed-off-by: Richard Chien <[email protected]>
The barrier's piling up is because at the beginning, the source emits insert operations, and after 5 minutes passed, the temporal filter began to emit delete operations. Then the performance degraded, source executors were backpressured. But the temporal filter can not be back pressured so it still emits the delete message. |
It is included🥵 |
The low performance might because of #13968 |
Signed-off-by: Richard Chien <[email protected]>
Signed-off-by: Richard Chien <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge the PR first because it actually solve a part of performance issue.
nexmark |
Can you share the effect? like how much table iter ops was reduced after this PR |
As originally reported in #13797, the bloom filter false-positive rate is high (up to 50% in most time), which means near half of table iter operations are wasted. After this PR, the FPR is near 0% in most time, around 20% when warming up (many groups are new, so that manually maintained in-executor However, throughput was still not improved as expected. As @st1page found, another problem (possibly the true IO bottleneck) is that GroupTopN doesn't fetch missing groups concurrently (#13968), which seems likely to prevent from storage layer batching. |
…ecutors (#13832) Signed-off-by: Richard Chien <[email protected]>
…ecutors (#13832) Signed-off-by: Richard Chien <[email protected]>
Signed-off-by: Richard Chien <[email protected]> Co-authored-by: Richard Chien <[email protected]>
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
According to #13797 and some further investigation, when using a temporal filter with short time period, it's very likely that the
high
part of TopN cache frequently being deleted to empty. And because TopN executors (actually the managed TopN state) just blindly scan the state table to fillhigh cache
, the performance can be significantly impacted.This PR maintains a
row_count
in the managed TopN state, so that it's possible to avoid table scan when cache is empty if the managed state knowns the cache is still synced with the table (row counts match).Fixes #13797.
Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.