Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(compaction): partition all levels by vnode #11903

Closed
wants to merge 19 commits into from

Conversation

Little-Wallace
Copy link
Contributor

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

decide base level size by partition.

We used to increase base level size directly no matter whether these sstables are partitioned by vnode. It may cause some bad case which can not choose a compact task. This PR will bring a simple strategy, which partition sstables by vnode before select files. RisingWave only chooses files in one partition, it could avoid choose a big partition.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

license-eye has totally checked 3994 files.

Valid Invalid Ignored Fixed
1758 1 2235 0
Click to see the invalid file list
  • src/meta/src/hummock/compaction/picker/intral_sub_level_picker.rs

Signed-off-by: Little-Wallace <[email protected]>
@Little-Wallace Little-Wallace force-pushed the wallace/partition-group branch from 7838018 to 0ce7761 Compare August 29, 2023 07:02
@Little-Wallace Little-Wallace requested a review from a team as a code owner August 30, 2023 12:20
Signed-off-by: Little-Wallace <[email protected]>
@Little-Wallace Little-Wallace force-pushed the wallace/partition-group branch from 75e02df to 71a0b73 Compare August 31, 2023 02:15
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
@@ -55,6 +56,7 @@ message IntraLevelDelta {
uint64 l0_sub_level_id = 2;
repeated uint64 removed_table_ids = 3;
repeated SstableInfo inserted_table_infos = 4;
uint32 partition_vnode_count = 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about unify partition_vnode_count -> vnode_partition_count

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. I will rename it.

@@ -890,7 +890,7 @@ pub mod default {
}

pub fn partition_vnode_count() -> u32 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@@ -188,6 +222,17 @@ impl DynamicLevelSelectorCore {
std::cmp::min(base_bytes_max, cur_level_size)
};

if levels.can_partition_by_vnode()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if levels.can_partition_by_vnode() {
    partition_level(
                levels.member_table_ids[0],
                levels.vnode_partition_count as usize,
                levels.get_level(ctx.base_level),
                &mut ctx
            );

}

IMO, its more intuitive to remove line:176 and directly modify or clear ctx.target_partitions in function partition_level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No... Because it may return in line 199

max_size * SCORE_BASE * (level.vnode_partition_count as u64)
/ ctx.level_max_bytes[level_idx],
);
// Reduce the level num of l0 non-overlapping sub_level
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix comments

picker_type: PickerType::BaseLevelCompaction(ctx.target_partitions.clone()),
});
} else {
// Reduce the level num of l0 non-overlapping sub_level
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Some(compact_task)
}

pub fn is_trivial_move_task(task: &CompactTask) -> bool {
if task.split_weight_by_vnode > 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add comments

let mut compaction_bytes = 0;
let mut compaction_file_count = 0;
let mut input_levels = vec![];
if l0.sub_levels.len() - idx < self.config.level0_sub_level_compact_level_count as usize
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put it as the first judgment condition to reduce calculations?

input_levels.reverse();

let vnode_partition_count =
if compaction_bytes >= self.config.sub_level_max_compaction_bytes || wait_enough {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After this pr, shall we consider to increase the sub_level_max_compaction_bytes as default ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is not a good idea, because if the sub_level_max_compaction_bytes increase, the avg count of input level compacting to base-level may decrease.


let mut skip_pending_compact = false;

for part in &self.partitions {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about encapsulating the logic of basic and partition into separate functions?

@Li0k
Copy link
Contributor

Li0k commented Sep 11, 2023

I'm temporarily suspending the review and will return as soon as possible

Signed-off-by: Little-Wallace <[email protected]>
@codecov
Copy link

codecov bot commented Sep 12, 2023

Codecov Report

Merging #11903 (2325058) into main (a7b7688) will increase coverage by 0.25%.
Report is 7 commits behind head on main.
The diff coverage is 91.13%.

❗ Current head 2325058 differs from pull request most recent head 1c8ab1c. Consider uploading reports for the commit 1c8ab1c to get more accurate results

@@            Coverage Diff             @@
##             main   #11903      +/-   ##
==========================================
+ Coverage   69.71%   69.97%   +0.25%     
==========================================
  Files        1427     1410      -17     
  Lines      237432   237064     -368     
==========================================
+ Hits       165518   165876     +358     
+ Misses      71914    71188     -726     
Flag Coverage Δ
rust 69.97% <91.13%> (+0.25%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
src/meta/src/hummock/compaction/picker/mod.rs 100.00% <ø> (ø)
...ummock/compaction/tombstone_compaction_selector.rs 0.00% <0.00%> (ø)
src/meta/src/hummock/mod.rs 0.00% <ø> (ø)
src/meta/src/hummock/test_utils.rs 93.98% <0.00%> (-2.78%) ⬇️
src/storage/src/hummock/sstable/builder.rs 90.05% <ø> (-1.68%) ⬇️
...action/picker/partition_level_compaction_picker.rs 75.75% <75.75%> (ø)
...ck_sdk/src/compaction_group/hummock_version_ext.rs 75.70% <89.65%> (+0.33%) ⬆️
src/storage/hummock_test/src/compactor_tests.rs 95.11% <92.20%> (-0.36%) ⬇️
src/meta/src/hummock/compaction/mod.rs 89.68% <92.45%> (+3.11%) ⬆️
...action/picker/partition_intral_sub_level_picker.rs 93.16% <93.16%> (ø)
... and 17 more

... and 376 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Signed-off-by: Little-Wallace <[email protected]>
Bound::Excluded(key) => key_range.compare_right_with_user_key(key.as_ref()),
Bound::Unbounded => {
let right_key = FullKey::decode(&key_range.right);
assert!(right_key.user_key.table_id.table_id == table_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we directly check SstableInfo only contains the single table_id ?

can_partition
}

pub fn partition_level(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can_partition_level and can_partition_level seems duplicated, can we unify them to the single function ? (It might result in more copies)

use crate::hummock::level_handler::LevelHandler;

#[derive(Default, Clone, Debug)]
pub struct PartitionLevelInfo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move PartitionLevelInfo and LevelPartition to mod.rs, that are shared

{
return Some(ret);
}
self.pick_l0_intra(levels, level_handlers, partitions, stats)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have the dedicated IntraPicker, we don't need to keep pick_l0_intra in the LevelCompactionPicker

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Emmmm, this method is to call IntraPicker.

) -> Option<CompactionInput> {
let l0 = levels.l0.as_ref().unwrap();
let target_level = levels.get_level(self.target_level);
let min_sub_level_id = l0_partitions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about directly to check l0_partitions.is_empty().

Externally we determine if the level can trigger the can_partition. In what case would we use the PartitionLevelCompactionPicker but l0_partitions is empty?

This will degrade to the whole level merging l0, is this expected? Maybe we need a metrics to monitor

if level.level_type() != LevelType::Nonoverlapping
|| level_handlers[0].is_level_pending_compact(level)
{
stats.skip_by_pending_files += 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect metrics are reported when we encounter Overlapping level

if is_new_user_key {
if switch_builder {
need_seal_current = true;
} else if builder.reach_capacity() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how to avoid the huge sst if we remove the reach_max_sst_size check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because we do not request vnode_change, so it is not necessary to meet the condition reach_max_sst_size .

@Li0k
Copy link
Contributor

Li0k commented Sep 14, 2023

Can we refactor it before merge to unify the restriction code into validator, like the main branch?

@Little-Wallace
Copy link
Contributor Author

This PR changes too much code. To make our system keep stable in some time, I decide to only port partial of this PR to main branch in #12534

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants