-
Notifications
You must be signed in to change notification settings - Fork 590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(compaction): partition all levels by vnode #11903
Conversation
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
license-eye has totally checked 3994 files.
Valid | Invalid | Ignored | Fixed |
---|---|---|---|
1758 | 1 | 2235 | 0 |
Click to see the invalid file list
- src/meta/src/hummock/compaction/picker/intral_sub_level_picker.rs
src/meta/src/hummock/compaction/picker/intral_sub_level_picker.rs
Outdated
Show resolved
Hide resolved
Signed-off-by: Little-Wallace <[email protected]>
7838018
to
0ce7761
Compare
Signed-off-by: Little-Wallace <[email protected]>
75e02df
to
71a0b73
Compare
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
proto/hummock.proto
Outdated
@@ -55,6 +56,7 @@ message IntraLevelDelta { | |||
uint64 l0_sub_level_id = 2; | |||
repeated uint64 removed_table_ids = 3; | |||
repeated SstableInfo inserted_table_infos = 4; | |||
uint32 partition_vnode_count = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about unify partition_vnode_count
-> vnode_partition_count
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch. I will rename it.
src/common/src/config.rs
Outdated
@@ -890,7 +890,7 @@ pub mod default { | |||
} | |||
|
|||
pub fn partition_vnode_count() -> u32 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
@@ -188,6 +222,17 @@ impl DynamicLevelSelectorCore { | |||
std::cmp::min(base_bytes_max, cur_level_size) | |||
}; | |||
|
|||
if levels.can_partition_by_vnode() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if levels.can_partition_by_vnode() {
partition_level(
levels.member_table_ids[0],
levels.vnode_partition_count as usize,
levels.get_level(ctx.base_level),
&mut ctx
);
}
IMO, its more intuitive to remove line:176 and directly modify or clear ctx.target_partitions in function partition_level
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No... Because it may return in line 199
max_size * SCORE_BASE * (level.vnode_partition_count as u64) | ||
/ ctx.level_max_bytes[level_idx], | ||
); | ||
// Reduce the level num of l0 non-overlapping sub_level |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix comments
picker_type: PickerType::BaseLevelCompaction(ctx.target_partitions.clone()), | ||
}); | ||
} else { | ||
// Reduce the level num of l0 non-overlapping sub_level |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
Some(compact_task) | ||
} | ||
|
||
pub fn is_trivial_move_task(task: &CompactTask) -> bool { | ||
if task.split_weight_by_vnode > 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add comments
let mut compaction_bytes = 0; | ||
let mut compaction_file_count = 0; | ||
let mut input_levels = vec![]; | ||
if l0.sub_levels.len() - idx < self.config.level0_sub_level_compact_level_count as usize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we put it as the first judgment condition to reduce calculations?
input_levels.reverse(); | ||
|
||
let vnode_partition_count = | ||
if compaction_bytes >= self.config.sub_level_max_compaction_bytes || wait_enough { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After this pr, shall we consider to increase the sub_level_max_compaction_bytes
as default ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is not a good idea, because if the sub_level_max_compaction_bytes
increase, the avg count of input level compacting to base-level may decrease.
|
||
let mut skip_pending_compact = false; | ||
|
||
for part in &self.partitions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about encapsulating the logic of basic
and partition
into separate functions?
I'm temporarily suspending the review and will return as soon as possible |
Signed-off-by: Little-Wallace <[email protected]>
Codecov Report
@@ Coverage Diff @@
## main #11903 +/- ##
==========================================
+ Coverage 69.71% 69.97% +0.25%
==========================================
Files 1427 1410 -17
Lines 237432 237064 -368
==========================================
+ Hits 165518 165876 +358
+ Misses 71914 71188 -726
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 376 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Signed-off-by: Little-Wallace <[email protected]>
Bound::Excluded(key) => key_range.compare_right_with_user_key(key.as_ref()), | ||
Bound::Unbounded => { | ||
let right_key = FullKey::decode(&key_range.right); | ||
assert!(right_key.user_key.table_id.table_id == table_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we directly check SstableInfo only contains the single table_id
?
can_partition | ||
} | ||
|
||
pub fn partition_level( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can_partition_level
and can_partition_level
seems duplicated, can we unify them to the single function ? (It might result in more copies)
use crate::hummock::level_handler::LevelHandler; | ||
|
||
#[derive(Default, Clone, Debug)] | ||
pub struct PartitionLevelInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move PartitionLevelInfo
and LevelPartition
to mod.rs
, that are shared
{ | ||
return Some(ret); | ||
} | ||
self.pick_l0_intra(levels, level_handlers, partitions, stats) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we have the dedicated IntraPicker
, we don't need to keep pick_l0_intra in the LevelCompactionPicker
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Emmmm, this method is to call IntraPicker
.
) -> Option<CompactionInput> { | ||
let l0 = levels.l0.as_ref().unwrap(); | ||
let target_level = levels.get_level(self.target_level); | ||
let min_sub_level_id = l0_partitions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about directly to check l0_partitions.is_empty()
.
Externally we determine if the level can trigger the can_partition
. In what case would we use the PartitionLevelCompactionPicker but l0_partitions is empty?
This will degrade to the whole level merging l0
, is this expected? Maybe we need a metrics to monitor
if level.level_type() != LevelType::Nonoverlapping | ||
|| level_handlers[0].is_level_pending_compact(level) | ||
{ | ||
stats.skip_by_pending_files += 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect metrics are reported when we encounter Overlapping level
if is_new_user_key { | ||
if switch_builder { | ||
need_seal_current = true; | ||
} else if builder.reach_capacity() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how to avoid the huge sst if we remove the reach_max_sst_size
check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because we do not request vnode_change
, so it is not necessary to meet the condition reach_max_sst_size
.
Can we refactor it before merge to unify the restriction code into validator, like the main branch? |
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
This PR changes too much code. To make our system keep stable in some time, I decide to only port partial of this PR to main branch in #12534 |
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
decide base level size by partition.
We used to increase base level size directly no matter whether these sstables are partitioned by vnode. It may cause some bad case which can not choose a compact task. This PR will bring a simple strategy, which partition sstables by vnode before select files. RisingWave only chooses files in one partition, it could avoid choose a big partition.
Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.