Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(compaction): support merge compaction group #18188

Merged
merged 21 commits into from
Sep 10, 2024

Conversation

Li0k
Copy link
Contributor

@Li0k Li0k commented Aug 22, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

part of #17898, This PR introduces a new ability to merge existing Compaction Groups by adding a new GroupMerge delta, some changes introduced in this pr:

  1. ctl support merge group
  2. support GroupMerge for hummock
  3. reorg code
  4. rename GroupSplit timer event to GroupSchedule

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@Li0k Li0k requested review from wenym1, zwang28 and hzxa21 August 22, 2024 08:57
@Li0k
Copy link
Contributor Author

Li0k commented Aug 22, 2024

@wenym1
hi. After merging this PR, we will have the ability to Merge, but this PR does not support auto-scheduling yet. We can explore how to proactively merge newly generated Groups through the Merge interface after Backfill.

Copy link
Collaborator

@hzxa21 hzxa21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most comments are minor. Thanks for the efforts!

new_version_delta.pre_apply();

// purge right_group_id
compaction_groups_txn.purge(HashSet::from_iter(get_compaction_group_ids(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we directly call compaction_groups_txn.remove(...)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good Catch, We've already triggered Group Destory in the Merge Delta process, so we really don't need to purge again here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain more on this? I think here we are manipulating the CompactionGroup metadata in CompactionGroupManager, not in HummockVersion. Why is this related to GroupDestroy delta?

src/storage/hummock_sdk/src/compaction_group/mod.rs Outdated Show resolved Hide resolved
src/storage/hummock_sdk/src/compaction_group/mod.rs Outdated Show resolved Hide resolved
src/storage/hummock_sdk/src/compaction_group/mod.rs Outdated Show resolved Hide resolved
src/storage/hummock_sdk/src/compaction_group/mod.rs Outdated Show resolved Hide resolved
src/storage/hummock_sdk/src/compaction_group/mod.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@hzxa21 hzxa21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Given that the sub_level_id assignment is changed to not rely on epoch in this PR, I suggest we run another round of testing before merging.

Copy link
Contributor

@zwang28 zwang28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@hzxa21 hzxa21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM

.get_sst_ids_by_group_id(right_group_id),
)
{
if !sst_id_set.insert(sst_id) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which case will we have duplicate SST ids? Is it impossible and we just have the check here as a safeguard?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#18431

Before 18431, commit_epoch may generate the duplicated sst_id, This check is just for forward compatibility.

}

// TODO(li0k): remove this check (Since the current split_sst does not change key_range, this check can not be removed, otherwise concate will fail.)
// check branched sst on non-overlap level
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nits: this comment is stale?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, we can always maintain this check

Comment on lines 141 to 143
for level_idx in 0..max_level {
let left_level = left_levels.levels.get(level_idx).unwrap();
let right_level = right_levels.levels.get(level_idx).unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is confusing that levels.get(level_idx) is getting level level_idx + 1. At the first glance, I thought we are checking level 0 as well here. How about using the get_level here?

@Li0k Li0k enabled auto-merge September 10, 2024 07:59
@Li0k Li0k added this pull request to the merge queue Sep 10, 2024
Merged via the queue into main with commit 99d6121 Sep 10, 2024
32 of 33 checks passed
@Li0k Li0k deleted the li0k/storage_support_merge_cg_part branch September 10, 2024 08:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants