
feat(storage): separate key and values in imm to reduce allocation #15300

Merged: 6 commits into main from yiming/separate-imm-key-values, Mar 4, 2024

@wenym1 wenym1 commented Feb 27, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Previously, to store multiple versions of a key's value in an imm, we used a two-dimensional vector: the outer vector held all entries, and each inner vector held the multi-version values of one key. However, each inner vector is small, and when imm merge is disabled each inner vector holds only a single value, so every key incurs an unnecessary vector allocation. In this PR, we instead store the multi-version values of all keys in a single vector. Each key entry stores an extra value offset that marks where its own values start. The end offset is the value offset of the next entry, or the end of the vector if the current entry is the last one.
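
As a rough illustration of the layout change described above (not the actual RisingWave code), the sketch below flattens a nested `(key, versions)` structure into one entry vector and one shared value vector. The names `Entry` and `flatten` and the generic parameters are assumptions made for this example; only the `value_offset` field name comes from the PR.

```rust
/// One key entry in the flattened layout.
/// Hypothetical type for illustration; only `value_offset` is named in the PR.
#[derive(Debug, PartialEq)]
struct Entry<K> {
    key: K,
    /// Index in the shared value vector where this key's versions start.
    value_offset: usize,
}

/// Flatten `(key, versions)` pairs into one entry vector and one value vector.
/// The nested form needs one allocation per key for its inner vector;
/// the flat form needs two allocations in total.
fn flatten<K, V>(nested: Vec<(K, Vec<V>)>) -> (Vec<Entry<K>>, Vec<V>) {
    // Pre-size both vectors so neither has to grow while filling.
    let total: usize = nested.iter().map(|(_, versions)| versions.len()).sum();
    let mut entries = Vec::with_capacity(nested.len());
    let mut values = Vec::with_capacity(total);
    for (key, versions) in nested {
        // The current length of `values` is where this key's versions begin.
        entries.push(Entry {
            key,
            value_offset: values.len(),
        });
        values.extend(versions);
    }
    (entries, values)
}
```

With three keys holding 2, 1, and 3 versions respectively, the entries would carry the offsets 0, 2, and 3, and the value vector would hold all six values back to back.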

@StrikeW StrikeW left a comment

the new structure LGTM, and I think you may conduct a perf test to check the improvement.

&values[entries[i].value_offset
..entries
.get(i + 1)
.map(|entry| entry.value_offset)

clever trick to use next entry as end offset 😄
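
For readers following along, here is a minimal, self-contained sketch of the end-offset lookup the quoted snippet appears to perform. The snippet above is cut off, so the `unwrap_or(values.len())` fallback is an assumption based on the PR description ("or the vector end if the current entry is the last one"); `Entry` and `values_of` are hypothetical names, not the PR's actual types.

```rust
/// Hypothetical entry type; only the `value_offset` field is taken from the PR.
struct Entry {
    key: Vec<u8>,
    /// Start of this key's values in the shared value slice.
    value_offset: usize,
}

/// Return the slice of values belonging to entry `i`.
/// Panics if `i` is out of bounds, like ordinary slice indexing.
fn values_of<'a, V>(entries: &[Entry], values: &'a [V], i: usize) -> &'a [V] {
    let start = entries[i].value_offset;
    // The end is the next entry's start offset; the last entry has no
    // successor, so it runs to the end of the value slice (assumed fallback).
    let end = entries
        .get(i + 1)
        .map(|entry| entry.value_offset)
        .unwrap_or(values.len());
    &values[start..end]
}
```

Because the end offset is derived from the neighbouring entry, no per-entry length field has to be stored or kept in sync.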

wenym1 commented Mar 4, 2024

the new structure LGTM, and I think you may conduct a perf test to check the improvement.

added a benchmark

main

bench-imm-merge/single-epoch time:   [261.48 ms 262.47 ms 263.49 ms]
bench-imm-merge/multi-epoch   time:   [337.32 ms 339.80 ms 343.92 ms]

this branch

bench-imm-merge/single-epoch time:   [204.69 ms 205.04 ms 205.48 ms]
bench-imm-merge/multi-epoch   time:   [326.13 ms 326.91 ms 327.72 ms]

@wenym1 wenym1 added this pull request to the merge queue Mar 4, 2024
Merged via the queue into main with commit 940d6c7 Mar 4, 2024
28 of 29 checks passed
@wenym1 wenym1 deleted the yiming/separate-imm-key-values branch March 4, 2024 09:06