-
Notifications
You must be signed in to change notification settings - Fork 591
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(storage): maintain sub_level id order #18915
Conversation
.latest_version() | ||
.max_sub_level_id | ||
.map(|id| id + 1) | ||
.unwrap_or(FIRST_SUB_LEVEL_ID), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be backward compatible, should we init the max_sub_level_id
on meta startup to be the actual max sub-level id and we use the max_sub_level_id
field only for the new sublevel's id?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be backward compatible, should we init the max_sub_level_id on meta startup
This has been done in src/storage/hummock_sdk/src/version.rs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Therefore, reaching line 169 indicates that the version is indeed a fresh new one.
Alternatively we can make the in-mem HummockVersion:: max_sub_level_id
u64 directly instead of Option<64>, in order to avoid such unwrap_or
whenever reading max_sub_level_id
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's ok for now, the newly introduced fields keep the Option restrictions.
When line 169 is reached, the sub_level will no longer exist in the compaciton group, so it is correct to use FIRST_SUB_LEVEL_ID
for initialisation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we use the max_sub_level_id field only for the new sublevel's id
True, it's a bug. Thanks.
@@ -764,6 +759,14 @@ impl HummockVersion { | |||
&version_delta.state_table_info_delta, | |||
&changed_table_info, | |||
); | |||
|
|||
// The max_sub_level_id may have been increased, e.g. during commit_epoch or merge_group. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should max with self.max_sub_level_id
and otherwise it will be reset to 0 when all data are clean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Thanks.
@@ -883,7 +886,7 @@ impl HummockVersion { | |||
) | |||
}); | |||
|
|||
group_split::merge_levels(left_levels, right_levels); | |||
group_split::merge_levels(left_levels, right_levels, self.max_sub_level_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it's better to maintain a per-compaction group max_sub_level_id instead of a global one, and then we don't need to pass the global one here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm open to either option. @Li0k
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer global uniqueness, for me at the moment this sub level id has no more semantics. So globally unique is more concise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest LGTM, thanks for the quick fix
.latest_version() | ||
.max_sub_level_id | ||
.map(|id| id + 1) | ||
.unwrap_or(FIRST_SUB_LEVEL_ID), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's ok for now, the newly introduced fields keep the Option restrictions.
When line 169 is reached, the sub_level will no longer exist in the compaciton group, so it is correct to use FIRST_SUB_LEVEL_ID
for initialisation.
@@ -883,7 +886,7 @@ impl HummockVersion { | |||
) | |||
}); | |||
|
|||
group_split::merge_levels(left_levels, right_levels); | |||
group_split::merge_levels(left_levels, right_levels, self.max_sub_level_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer global uniqueness, for me at the moment this sub level id has no more semantics. So globally unique is more concise.
This comment was marked as resolved.
This comment was marked as resolved.
.latest_version() | ||
.max_sub_level_id | ||
.map(|id| id + 1) | ||
.unwrap_or(FIRST_SUB_LEVEL_ID); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For extra safety, I suggest we fill in max_sub_level_id
using Epoch::now()
if it is None on meta startup (in load_meta_store_state_impl) after all existing deltas are applied. In this case, we can make sure max_sub_level_id
is larger than any previously generated sub level id.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this should be the correct way.
Previously I didn't consider the delta replay for backward-compatibility. Backward compatibility shouldn't be done during pb deserialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored:
- In-mem max_sub_level_id's type is u64 instead of Option now.
- backward compatibility is ensured during
load_meta_store_state_impl
.
I prefer #18893 to this PR.
|
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
This PR reverts #18886, then resolves the same issue by maintaining a new field
max_sub_level_id
inHummockVersion
.It has advantage of keeping the non-decreasing order of sub_level_id, in contrast to #18886.
Checklist
./risedev check
(or alias,./risedev c
)Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.