-
Notifications
You must be signed in to change notification settings - Fork 590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(storage): store compaction group id in per table info #17125
Conversation
…mmock-snapshot-group
…mmock-snapshot-group
…mmock-snapshot-group
…mmock-snapshot-group
…mmock-snapshot-group
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Li0k @Little-Wallace PTAL
option deprecated = true; | ||
repeated uint32 table_ids_add = 1 [deprecated = true]; | ||
repeated uint32 table_ids_remove = 2 [deprecated = true]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it mean that we can completely deprecate GroupMetaChange
in GroupDelta
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, because we won't update the member_table_ids
anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to add deprecated
in L107 then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added deprecated
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we miss [deprecated = true]
in L107 for GroupMetaChange
.
); | ||
} else if let Some(group_change) = &summary.group_table_change { | ||
// TODO: may deprecate this branch? This enum variant is not created anywhere |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed. @Li0k@Little-Wallace What do you think? DO we still need to keep this branch for future usage?
@@ -894,6 +927,7 @@ pub fn build_initial_compaction_group_levels( | |||
vnode_partition_count: 0, | |||
}); | |||
} | |||
#[expect(deprecated)] // for backward-compatibility of previous hummock version delta |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we planning to deprecate the whole Levels struct? I thought only member_table_ids
will be deprecated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, only the field is deprecated. The annotation is applied over the whole struct because we can't apply on a single fie.d
option deprecated = true; | ||
repeated uint32 table_ids_add = 1 [deprecated = true]; | ||
repeated uint32 table_ids_remove = 2 [deprecated = true]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to add deprecated
in L107 then?
@@ -395,7 +378,12 @@ impl HummockManager { | |||
let group = CompactionGroupInfo { | |||
id: levels.group_id, | |||
parent_id: levels.parent_group_id, | |||
member_table_ids: levels.member_table_ids.clone(), | |||
member_table_ids: current_version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: how about sorting it before returning to keep the output consistent with what we did prior to this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we may somehow depend on that member_table_ids
is sorted, I have switched to use BtreeSet
instead of HashSet
so that we don't need to explicitly sort the collected vector.
@@ -337,7 +337,12 @@ pub(super) fn calc_new_write_limits( | |||
new_write_limits.insert( | |||
*id, | |||
WriteLimit { | |||
table_ids: levels.member_table_ids.clone(), | |||
table_ids: version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto: how about sorting the table_ids here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to btreeset already.
compaction_group_member_tables: Arc<HashMap<CompactionGroupId, HashSet<TableId>>>, | ||
table_compaction_group_id: Arc<HashMap<TableId, CompactionGroupId>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we use Arc
here? I don't see any caller of fn compaction_group_member_tables
or table_compaction_group_id
holding the Arc. Using Arc means we need to do a full in memory index rebuild instead of maintain the index incrementally on apply_delta.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. I have remove maintaining table_compaction_group_id
and change to rebuild it when used, which is the same as before this PR.
For compaction_group_member_tables
, I have removed the Arc around it and changed to incrementally maintain it. A debug_assert_eq is applied to compare with the fully rebuilt version to ensure correctness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have addressed the comment. @hzxa21 PTAL
option deprecated = true; | ||
repeated uint32 table_ids_add = 1 [deprecated = true]; | ||
repeated uint32 table_ids_remove = 2 [deprecated = true]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added deprecated
.
@@ -395,7 +378,12 @@ impl HummockManager { | |||
let group = CompactionGroupInfo { | |||
id: levels.group_id, | |||
parent_id: levels.parent_group_id, | |||
member_table_ids: levels.member_table_ids.clone(), | |||
member_table_ids: current_version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we may somehow depend on that member_table_ids
is sorted, I have switched to use BtreeSet
instead of HashSet
so that we don't need to explicitly sort the collected vector.
@@ -337,7 +337,12 @@ pub(super) fn calc_new_write_limits( | |||
new_write_limits.insert( | |||
*id, | |||
WriteLimit { | |||
table_ids: levels.member_table_ids.clone(), | |||
table_ids: version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to btreeset already.
compaction_group_member_tables: Arc<HashMap<CompactionGroupId, HashSet<TableId>>>, | ||
table_compaction_group_id: Arc<HashMap<TableId, CompactionGroupId>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. I have remove maintaining table_compaction_group_id
and change to rebuild it when used, which is the same as before this PR.
For compaction_group_member_tables
, I have removed the Arc around it and changed to incrementally maintain it. A debug_assert_eq is applied to compare with the fully rebuilt version to ensure correctness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for the PR
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
After we introduce per state table info in hummock version, we can further store the corresponding compaction group id of a table inside the per table info. Member table ids in each
Levels
compaction group and related fields in version delta will be deprecated, so that we don't have to maintain the consistency between table info map and compaction group member table ids.Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.