Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: ensure each storage read only involves one vnode #15289

Merged
merged 1 commit into from
Mar 7, 2024

Conversation

hzxa21
Copy link
Collaborator

@hzxa21 hzxa21 commented Feb 27, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Given that we already ensure storage iter won't read across vnodes in executors, we add strict assertion check in HummockStorage::iter and LocalHummockStorage::iter to ensure the provided key range is within one vnode to make this assumption explicit. There are several benefits:

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@fuyufjh fuyufjh changed the title chore: ensure each storage read only involve one vnode (WIP) fix: ensure each storage read only involve one vnode (WIP) Feb 29, 2024
@github-actions github-actions bot added type/fix Bug fix and removed type/chore labels Feb 29, 2024
@hzxa21 hzxa21 changed the title fix: ensure each storage read only involve one vnode (WIP) fix: ensure each storage read only involve one vnode Mar 1, 2024
@hzxa21 hzxa21 changed the title fix: ensure each storage read only involve one vnode feat: ensure each storage read only involve one vnode Mar 1, 2024
@github-actions github-actions bot added type/feature and removed type/fix Bug fix labels Mar 1, 2024
@hzxa21 hzxa21 requested review from wenym1 and Li0k March 1, 2024 11:22
@hzxa21 hzxa21 changed the title feat: ensure each storage read only involve one vnode feat: ensure each storage read only involves one vnode Mar 1, 2024
@hzxa21 hzxa21 force-pushed the patrick/ensure-single-vnode-read branch from e832693 to 2684da6 Compare March 1, 2024 11:37
Copy link
Contributor

@wenym1 wenym1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM!

// Only 1 staging sst is provided
assert_eq!(1, hummock_read_snapshot.1.len());
}
// #[tokio::test]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to comment out the test?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a test only for read_filter_for_batch

let read_version_vec = {
let read_guard = self.read_version_mapping.read();
read_guard
.get(&table_id)
.map(|v| {
v.values()
.filter(|v| !v.read_arc().is_replicated())
.filter(|v| {
let read_version = v.read();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will make it fail to read the replicated data by batch read. May add a warning or panic here to avoid misuse.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Added a warning log when there are no non-replicated read version but replicated read version found.

Note that it is also not possible to read replicated data by batch read prior to this PR.

@@ -416,6 +429,14 @@ impl StateStoreRead for HummockStorage {
epoch: u64,
read_options: ReadOptions,
) -> impl Future<Output = StorageResult<Self::IterStream>> + '_ {
let (l_vnode_inclusive, r_vnode_exclusive) = vnode_range(&key_range);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about making it unified, moving assert logic to functions vnode and vnode_part.

I'm not sure whether we need to strictly enforce checks on every call?

Copy link
Collaborator Author

@hzxa21 hzxa21 Mar 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about making it unified, moving assert logic to functions vnode and vnode_part.

fn vnode(range) also have the same assertion. Given that the cost of the assertion is small, I would rather make the assertion more explicit by putting in at the beginning of all the iter calls.

I'm not sure whether we need to strictly enforce checks on every call?

This is the main motivation of this PR to avoid misuage of storage iter.

@@ -260,6 +259,14 @@ impl LocalStateStore for LocalHummockStorage {
key_range: TableKeyRange,
read_options: ReadOptions,
) -> StorageResult<Self::IterStream<'_>> {
let (l_vnode_inclusive, r_vnode_exclusive) = vnode_range(&key_range);
assert_eq!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

// Only 1 staging sst is provided
assert_eq!(1, hummock_read_snapshot.1.len());
}
// #[tokio::test]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a test only for read_filter_for_batch

@@ -102,79 +102,54 @@ impl TableWatermarksIndex {
self.read_watermark(vnode, HummockEpoch::MAX)
}

pub fn range_watermarks(
pub fn rewrite_range_with_table_watermark(
&self,
epoch: HummockEpoch,
key_range: &mut TableKeyRange,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just question: why not return a new key_range after rewrite instead of "mut ref "?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we also do in-place update prior to this PR. I am okay both ways so I can also change that if you think it is important.

@Li0k
Copy link
Contributor

Li0k commented Mar 4, 2024

After this PR, we remove the SkipWaterIterator from the read path. How about move it to compactor/iterator mod

@hzxa21
Copy link
Collaborator Author

hzxa21 commented Mar 5, 2024

After this PR, we remove the SkipWaterIterator from the read path. How about move it to compactor/iterator mod

compactor/iterator.rs is already big enough with ~800 LoC. I would rather keep SkipWatermarkIterator in a separate file.

Copy link
Contributor

@Li0k Li0k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the efforts

@hzxa21 hzxa21 force-pushed the patrick/ensure-single-vnode-read branch from 94679aa to 6d6beb2 Compare March 5, 2024 05:49
@hzxa21 hzxa21 force-pushed the patrick/ensure-single-vnode-read branch from c91da17 to 72e6c9e Compare March 6, 2024 04:45
@hzxa21 hzxa21 added this pull request to the merge queue Mar 7, 2024
Merged via the queue into main with commit 3a244ec Mar 7, 2024
26 of 27 checks passed
@hzxa21 hzxa21 deleted the patrick/ensure-single-vnode-read branch March 7, 2024 02:38
hzxa21 added a commit that referenced this pull request Mar 29, 2024
hzxa21 added a commit that referenced this pull request Apr 2, 2024
hzxa21 added a commit that referenced this pull request Apr 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants