refactor(storage): remove writing delete ranges in flush #14776
Conversation
    mut largest_table_key,
    point_range_pairs,
} = Self::get_table_key_ends(table_id, point_range_pairs);
let mut largest_table_key = Bound::Included(Bytes::new());
In the removed method get_table_key_ends, we had the following initial values. When there are no delete ranges, they remain unchanged:
let mut largest_table_key = Bound::Included(Bytes::new());
let mut smallest_table_key = BytesMut::new();
let mut smallest_empty = true;
if let Some(item) = payload.last() {
    if whether_update_largest_key(&largest_table_key, &item.0) {
        largest_table_key = Bound::Included(item.0.clone().0);
    }
}
if let Some(item) = payload.first() {
    if smallest_empty || item.0.as_ref().lt(smallest_table_key.as_ref()) {
    if item.0.as_ref().lt(smallest_table_key.as_ref()) {
The value of smallest_empty is always true when there is no range delete.
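Once delete ranges are gone, both checks collapse. A minimal sketch of what the computation reduces to (types simplified to plain Bytes; the real payload uses TableKey):

use std::ops::Bound;

use bytes::Bytes;

/// Sketch: with no range tombstones, the batch key range comes straight
/// from a sorted, non-empty payload; no smallest_empty flag is needed.
fn key_range(payload: &[(Bytes, Bytes)]) -> (Bytes, Bound<Bytes>) {
    debug_assert!(!payload.is_empty(), "imm payload must be non-empty");
    let smallest_table_key = payload.first().unwrap().0.clone();
    let largest_table_key = Bound::Included(payload.last().unwrap().0.clone());
    (smallest_table_key, largest_table_key)
}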
    next_idx: usize,
}

impl SharedBufferDeleteRangeIterator {
    pub(crate) fn new(inner: Arc<SharedBufferBatchInner>) -> Self {
        Self { inner, next_idx: 0 }
    pub(crate) fn new(
The logic is moved from SharedBufferBatchInner::new. This method is now only used in tests.
If it is only used in tests, should we add #[cfg(any(test, feature = "test"))]?
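For reference, the gating could look like this (a sketch only; it assumes the surrounding types from the diff above):

impl SharedBufferDeleteRangeIterator {
    /// Compiled only for unit tests or builds with the `test` feature,
    /// since production code no longer constructs this iterator.
    #[cfg(any(test, feature = "test"))]
    pub(crate) fn new(inner: Arc<SharedBufferBatchInner>) -> Self {
        Self { inner, next_idx: 0 }
    }
}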
@@ -1261,6 +1260,9 @@ pub(crate) mod tests {
    assert_eq!(key_count, scan_count);
}

#[ignore]
// may update the test after https://github.com/risingwavelabs/risingwave/pull/14320 is merged.
Link #14320
del_iter.earliest_delete_since(epoch_with_gap.pure_epoch());
if value.is_delete() {
    table_key_last_delete_epoch = epoch_with_gap.pure_epoch();
} else if earliest_range_delete_which_can_see_key < table_key_last_delete_epoch {
When there are no range deletes in del_iter, earliest_delete_since will return u64::MAX, so we can never reach this branch.
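A tiny sketch of why the branch is dead (values are illustrative):

// With an empty del_iter, earliest_delete_since yields u64::MAX, and the
// `else if` comparison can never hold for any real delete epoch.
let earliest_range_delete_which_can_see_key = u64::MAX; // no range tombstones
let table_key_last_delete_epoch = 100u64; // any valid epoch
assert!(!(earliest_range_delete_which_can_see_key < table_key_last_delete_epoch));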
Thanks a lot for the effort to clean up the code.
@@ -77,7 +68,6 @@ pub(crate) struct SharedBufferBatchInner {
    imm_ids: Vec<ImmId>,
    /// The epochs of the data in batch, sorted in ascending order (old to new)
    epochs: Vec<HummockEpoch>,
    monotonic_tombstone_events: Vec<MonotonicDeleteEvent>,
    largest_table_key: Bound<Bytes>,
After removing delete ranges, can largest_table_key ever be anything other than Included? I think we can make it Bytes instead of Bound<Bytes>.
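Sketched, the suggested field change (KeyRangeSketch is a made-up name; the real fields live in SharedBufferBatchInner, with the other fields elided here):

use bytes::Bytes;

// With point keys only, the upper bound is always inclusive, so plain
// Bytes carries the same information as Bound<Bytes>.
pub struct KeyRangeSketch {
    smallest_table_key: Bytes,
    largest_table_key: Bytes, // previously Bound<Bytes>
}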
It seems that without range deletes, largest_table_key and smallest_table_key are always the last and first keys in the payload. If so, we don't have to store these two keys manually; we can always take the last or first payload key when we want the largest or smallest key.

I have updated the PR to remove tracking of the largest and smallest keys in the imm. Correctness depends on the imm payload being non-empty. I have inspected our current codebase: all call paths that build an imm ensure the payload is non-empty and well sorted, and a debug_assert is added to enforce this as well.
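A sketch of the payload-derived accessors this implies (the field and type names here are assumptions, not the PR's actual code):

impl SharedBufferBatchInner {
    /// Sketch: smallest key is the first entry of the sorted payload.
    /// Safe because every imm is built from a non-empty, sorted payload.
    fn smallest_table_key(&self) -> &Bytes {
        &self.payload.first().expect("non-empty imm payload").0
    }

    /// Sketch: largest key is the last entry of the sorted payload.
    fn largest_table_key(&self) -> &Bytes {
        &self.payload.last().expect("non-empty imm payload").0
    }
}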
The in-memory compaction code is updated. @StrikeW PTAL.
@@ -582,7 +574,7 @@ impl<S: StateStoreWrite + StateStoreRead> LocalStateStore for MemtableLocalState
}
self.inner.ingest_batch(
    kv_pairs,
    delete_ranges,
    vec![],
Can we remove delete_ranges in StateStoreWrite::ingest_batch as well?
Temporarily we cannot: the in-memory and sled state stores depend on passing delete ranges to remove keys below the table watermark.
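Roughly why (an illustrative sketch; vnode_prefixed and the variable names are made up, not real helpers): these backends emulate a per-vnode table watermark by ingesting an equivalent delete range.

// Illustrative only: clear everything in this vnode's keyspace that sorts
// below the watermark by expressing the watermark as a delete range.
let delete_ranges = vec![(
    Bound::Included(vnode_prefixed(vnode, Bytes::new())),       // vnode keyspace start
    Bound::Excluded(vnode_prefixed(vnode, watermark.clone())),  // up to the watermark
)];
inner.ingest_batch(kv_pairs, delete_ranges, write_options)?;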
LGTM
LGTM
LGTM
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
Since on the streaming side we now write per-vnode table watermarks instead of range tombstones, this PR removes the delete-ranges parameter of the flush method of the local state store. The immutable memtable will no longer contain any range tombstones, and the related code is removed to make the codebase cleaner.
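Sketched, the interface change (a simplification; the real trait methods are async, and StorageResult here is a placeholder for the store's actual types):

use std::ops::Bound;

use bytes::Bytes;

type StorageResult<T> = Result<T, ()>; // placeholder error type for the sketch

// Before: range tombstones were passed to the state store at flush time.
trait LocalStateStoreBefore {
    fn flush(&mut self, delete_ranges: Vec<(Bound<Bytes>, Bound<Bytes>)>) -> StorageResult<usize>;
}

// After: flush takes no delete ranges, and imms never carry range tombstones.
trait LocalStateStoreAfter {
    fn flush(&mut self) -> StorageResult<usize>;
}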
Checklist
./risedev check (or alias ./risedev c)

Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.