feat(storage): support reverse scan #12570

Merged
merged 31 commits into from
May 21, 2024

Conversation

Contributor

@Little-Wallace Little-Wallace commented Sep 27, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Support reverse scan for range cache expansion.
Some batch queries, such as select a from table order by b desc limit 1, could also use reverse scan to reduce time complexity.
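
To illustrate why a reverse scan helps a query like `order by b desc limit 1`, here is a minimal sketch. It assumes `b` is the ordered key of the underlying index, and it uses a standard-library `BTreeMap` as a stand-in for the storage layer; the function names and sample data are hypothetical and are not part of this PR.

```rust
use std::collections::BTreeMap;

/// Forward scan: walks every entry in ascending key order and keeps the last
/// one, so the cost grows with the number of rows.
fn last_by_forward_scan(index: &BTreeMap<i64, &'static str>) -> Option<(i64, &'static str)> {
    let mut last = None;
    for (b, a) in index.iter() {
        last = Some((*b, *a));
    }
    last
}

/// Reverse scan: starts at the largest key and stops after one entry.
fn last_by_reverse_scan(index: &BTreeMap<i64, &'static str>) -> Option<(i64, &'static str)> {
    index.iter().next_back().map(|(b, a)| (*b, *a))
}

/// Hypothetical sample data: column `b` is the ordered key, column `a` the value.
fn sample_index() -> BTreeMap<i64, &'static str> {
    BTreeMap::from([(10, "row-10"), (30, "row-30"), (20, "row-20")])
}
```

Calling `last_by_reverse_scan(&sample_index())` yields the row with the largest `b` without visiting the other rows, which is the effect the description hopes to get from reverse scan.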

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@Little-Wallace Little-Wallace marked this pull request as ready for review October 8, 2023 08:11
@codecov

codecov bot commented Oct 8, 2023

Codecov Report

Merging #12570 (6fee500) into main (129ab28) will increase coverage by 0.04%.
Report is 4 commits behind head on main.
The diff coverage is 60.84%.

@@            Coverage Diff             @@
##             main   #12570      +/-   ##
==========================================
+ Coverage   69.32%   69.37%   +0.04%     
==========================================
  Files        1470     1471       +1     
  Lines      241541   242154     +613     
==========================================
+ Hits       167448   167991     +543     
- Misses      74093    74163      +70     
Flag Coverage Δ
rust 69.37% <60.84%> (+0.04%) ⬆️

Files Coverage Δ
src/storage/hummock_test/src/state_store_tests.rs 84.53% <100.00%> (ø)
src/storage/src/hummock/iterator/forward_user.rs 97.11% <100.00%> (+0.05%) ⬆️
src/storage/src/hummock/iterator/mod.rs 46.07% <ø> (ø)
src/storage/src/hummock/sstable/mod.rs 91.62% <ø> (+3.91%) ⬆️
src/storage/src/hummock/utils.rs 79.18% <100.00%> (-0.35%) ⬇️
src/storage/src/store.rs 51.03% <ø> (ø)
...storage/hummock_test/src/bin/replay/replay_impl.rs 0.00% <0.00%> (ø)
src/stream/src/common/table/state_table.rs 88.57% <75.00%> (-0.08%) ⬇️
src/storage/src/hummock/iterator/backward_user.rs 97.58% <96.87%> (+0.33%) ⬆️
...age/src/hummock/sstable/delete_range_aggregator.rs 93.86% <94.11%> (-0.04%) ⬇️
... and 8 more

... and 19 files with indirect coverage changes

@Li0k Li0k self-requested a review October 8, 2023 09:05
@MrCroxx MrCroxx self-requested a review October 11, 2023 08:25
@wenym1 wenym1 requested review from hzxa21 and wenym1 October 11, 2023 08:26
@stdrc stdrc self-requested a review November 3, 2023 08:09
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Contributor

@MrCroxx MrCroxx left a comment

Could you please add some microbenches for reverse iterators and a comparison between forward iterators and backward iterators? There was a microbench for forward and backward block iterators.

Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
@Little-Wallace
Contributor Author

[image]

Reverse scan performs much worse than the normal (forward) scan.

Signed-off-by: Little-Wallace <[email protected]>
@Little-Wallace
Contributor Author

[image]

For short scans, they are similar.

Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Contributor

@wenym1 wenym1 left a comment

Rest LGTM

src/storage/src/hummock/iterator/mod.rs (outdated, resolved)
src/storage/src/hummock/store/version.rs (outdated, resolved)
Signed-off-by: Little-Wallace <[email protected]>
Contributor

@wenym1 wenym1 left a comment

Rest LGTM

src/storage/src/hummock/store/version.rs (outdated, resolved)
Contributor

@wenym1 wenym1 left a comment

LGTM!

Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Comment on lines 449 to 457
if first_full_key > key {
    // The semantic of `seek_fn` will ensure that `first_key` <= table_key of `key`.
    // At the beginning we have checked that `self.table_id` <= table_id of `key`.
    // Therefore, when `first_full_key` > `key`, the only possibility is that
    // `first_key` == table_key of `key`, and `self.table_id` == table_id of `key`,
    // the `self.epoch` > epoch of `key`.
    assert_eq!(first_key, key.user_key.table_key);
}
self.iter = Some((RustIteratorOfBuilder::Seek(iter), first_key, first_value));
Collaborator

Instead of checking first_full_key > key, I think we should check first_key == key && key.epoch_with_gap > self.epoch. When that condition is hit, we need to call iter.next.

Example:

Memtable in epoch2:
 k1
 k2

rev iter seek(FullKey(k2, epoch3)) should point to k1 instead of k2
because (k1, epoch2) < (k2, epoch3) < (k2, epoch2)

Note that we cannot simply check first_full_key < key, because this doesn't guarantee that first_key == key under the semantics of a reverse iterator.
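
The example above can be sketched in a small, self-contained model. This is only an illustration under stated assumptions: full keys sort by table key ascending and then by epoch descending (as the inequality in the example implies), a memtable holds sorted table keys that all share one epoch, and a plain `u64` stands in for `epoch_with_gap`. The `FullKey` and `Memtable` types and the `rev_seek` helpers here are hypothetical and are not the types used in the PR.

```rust
use std::cmp::Ordering;

/// Hypothetical full key: table key ascending, then epoch descending.
#[derive(Debug, Clone, PartialEq, Eq)]
struct FullKey {
    table_key: &'static str,
    epoch: u64,
}

impl Ord for FullKey {
    fn cmp(&self, other: &Self) -> Ordering {
        self.table_key
            .cmp(other.table_key)
            // A newer (larger) epoch sorts first for the same table key.
            .then_with(|| other.epoch.cmp(&self.epoch))
    }
}

impl PartialOrd for FullKey {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Hypothetical memtable: every entry was written in the same epoch.
struct Memtable {
    epoch: u64,
    table_keys: Vec<&'static str>, // sorted ascending
}

impl Memtable {
    /// Reverse seek: the largest entry whose full key is <= `key`.
    fn rev_seek(&self, key: &FullKey) -> Option<&'static str> {
        // Position on the last table key that is <= the seek table key.
        let mut iter = self
            .table_keys
            .iter()
            .rev()
            .skip_while(|tk| **tk > key.table_key);
        let first = iter.next()?;
        if *first == key.table_key && key.epoch > self.epoch {
            // (first, self.epoch) sorts after the seek key, so step once more.
            return iter.next().copied();
        }
        Some(*first)
    }

    /// Reference version that compares full keys directly.
    fn rev_seek_reference(&self, key: &FullKey) -> Option<&'static str> {
        self.table_keys
            .iter()
            .rev()
            .copied()
            .find(|tk| FullKey { table_key: *tk, epoch: self.epoch } <= *key)
    }
}
```

With a memtable at epoch 2 holding `k1` and `k2`, `rev_seek` for `(k2, epoch 3)` lands on `k1`, while `rev_seek` for `(k2, epoch 2)` lands on `k2`; both agree with the reference version.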

Collaborator

We should add a test case for this as well.

Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Comment on lines 473 to 475
DirectionEnum::Backward => {
    assert_lt!(next_key, first_key);
}
Collaborator

Although it fixes the corner case mentioned here, it introduces a bug for a case that worked before:

Memtable in epoch2:
 k1

rev iter seek(FullKey(k2, epoch2))

The code will panic at L466 because (k1, epoch2) < (k2, epoch2) but k1 != k2.

This is exactly why I said in my previous comment:

Note that we cannot simply check first_full_key < key, because this doesn't guarantee that first_key == key under the semantics of a reverse iterator.

If the UT passes with the current code, I think we lack a UT for this straightforward case.
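
The point of this comment can be checked with a few lines of Rust. This is a rough sketch, not the PR's code: it models a full key as a tuple of table key and `std::cmp::Reverse` epoch (so newer epochs sort first), and `needs_extra_step` is a hypothetical helper encoding the condition proposed earlier in the review.

```rust
use std::cmp::Reverse;

/// Hypothetical full key: table key ascending, then epoch descending.
type FullKey = (&'static str, Reverse<u64>);

fn full_key(table_key: &'static str, epoch: u64) -> FullKey {
    (table_key, Reverse(epoch))
}

/// The reverse iterator must step once more only when the first entry has the
/// same table key as the seek key and the seek epoch is newer than the
/// memtable epoch. Comparing full keys alone cannot tell these cases apart.
fn needs_extra_step(first_key: &str, memtable_epoch: u64, seek: FullKey) -> bool {
    let (seek_table_key, Reverse(seek_epoch)) = seek;
    first_key == seek_table_key && seek_epoch > memtable_epoch
}
```

Here `full_key("k1", 2) < full_key("k2", 2)` holds even though the table keys differ, so an assertion that the table keys are equal would fail, whereas `needs_extra_step("k1", 2, full_key("k2", 2))` is simply `false` and the iterator stays on `k1`.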

Contributor Author

I added some tests for this case.

Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Signed-off-by: Little-Wallace <[email protected]>
Collaborator

@hzxa21 hzxa21 left a comment

Rest LGTM

match iter.next() {
Some((first_key, first_value)) => {
if first_key.eq(&key.user_key.table_key) && self.epoch < key.epoch_with_gap {
// The semantic of `seek_fn` will ensure that `first_key` >= table_key of `key`.
Collaborator

nits: should be first_key <= table_key of key in the comment.

Contributor Author

Fixed

Some((first_key, first_value)) => {
if first_key.eq(&key.user_key.table_key) && self.epoch < key.epoch_with_gap {
// The semantic of `seek_fn` will ensure that `first_key` >= table_key of `key`.
// At the beginning we have checked that `self.table_id` >= table_id of `key`.
Collaborator

nits: should be self.table_id <= table_id of key in the comment

Contributor Author

Fixed

@Little-Wallace Little-Wallace added this pull request to the merge queue May 21, 2024
Merged via the queue into main with commit dac892e May 21, 2024
27 of 28 checks passed
@Little-Wallace Little-Wallace deleted the reverse-iter branch May 21, 2024 11:27
4 participants