feat(mito): implements row group level page cache #2688

Merged
15 commits merged into GreptimeTeam:develop from feat/page-cache on Nov 20, 2023

Conversation

evenyag
Contributor

@evenyag evenyag commented Nov 2, 2023

I hereby agree to the terms of the GreptimeDB CLA

What's changed and what's your intention?

This PR implements a row group-level page cache for the mito engine.

It adds a new page reader CachedPageReader that returns pages of a row group from the cache.
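To illustrate the idea, here is a minimal sketch of a reader that serves pages from a shared in-memory slice. It is not the code from this PR: the `Page` struct is a hypothetical stand-in for a decoded Parquet page, and the reader is modelled as a plain `Iterator` rather than the `parquet` crate's page reader trait.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a decoded (already decompressed) Parquet page.
#[derive(Debug, Clone, PartialEq)]
pub struct Page {
    pub bytes: Vec<u8>,
}

/// Sketch of a reader that yields pages from a shared, cached slice
/// instead of reading and decompressing them from the file.
pub struct CachedPageReader {
    pages: Arc<[Page]>,
    next: usize,
}

impl CachedPageReader {
    /// Builds a reader over pages that were cached earlier.
    pub fn new(pages: Arc<[Page]>) -> Self {
        Self { pages, next: 0 }
    }
}

impl Iterator for CachedPageReader {
    type Item = Page;

    /// Returns the next cached page, or `None` once all pages are consumed.
    fn next(&mut self) -> Option<Page> {
        let page = self.pages.get(self.next)?.clone();
        self.next += 1;
        Some(page)
    }
}
```

Because the pages sit behind an `Arc`, several readers can be built over the same cached slice without copying the slice itself; this sketch clones each page as it is yielded, which a real implementation may avoid.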

The first time we read a row group, we load all of its pages and put them into the cache. On later reads, we fetch the cached pages and build a CachedPageReader from them.

We use (region id, file id, row group index, column index) as the cache key, so each cache entry holds the pages of one column within one row group.
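The key and the load-on-miss flow could look roughly like the sketch below. The names `PageKey`, `PageCache`, and `get_or_load`, the field types (for example a `String` file id), and the unbounded `HashMap` are all assumptions made for illustration; the cache added by this PR lives in src/mito2/src/cache.rs and may differ in every one of these details.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Cache key with the four components named in the description.
/// The field types are assumptions for this sketch.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct PageKey {
    pub region_id: u64,
    pub file_id: String,
    pub row_group_idx: usize,
    pub column_idx: usize,
}

/// Hypothetical decompressed page payload.
pub type Page = Vec<u8>;

/// Unbounded map-backed cache; a real cache would also bound its size.
#[derive(Default)]
pub struct PageCache {
    entries: Mutex<HashMap<PageKey, Arc<Vec<Page>>>>,
}

impl PageCache {
    /// Returns the cached pages for `key`. On a miss, runs `load`
    /// (while holding the lock) and caches its result before returning it.
    pub fn get_or_load<F>(&self, key: PageKey, load: F) -> Arc<Vec<Page>>
    where
        F: FnOnce() -> Vec<Page>,
    {
        let mut entries = self.entries.lock().unwrap();
        Arc::clone(entries.entry(key).or_insert_with(|| Arc::new(load())))
    }
}
```

Two keys that differ only in `column_idx` map to separate entries, which matches the per-column, per-row-group granularity described above.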

Cached pages are stored decompressed, so a query that hits the cache skips the decompression step, which reduces total scan time by 20% to 30%.

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.

Refer to a related PR or issue link (optional)

@evenyag evenyag force-pushed the feat/page-cache branch 2 times, most recently from e0a00b4 to f5ed41a Compare November 9, 2023 04:08
@evenyag evenyag marked this pull request as ready for review November 9, 2023 04:32

codecov bot commented Nov 9, 2023

Codecov Report

Merging #2688 (8d43cb7) into develop (730a3fa) will decrease coverage by 0.33%.
The diff coverage is 89.39%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #2688      +/-   ##
===========================================
- Coverage    85.47%   85.14%   -0.33%     
===========================================
  Files          777      779       +2     
  Lines       125919   126339     +420     
===========================================
- Hits        107626   107573      -53     
- Misses       18293    18766     +473     

Collaborator

@fengjiachun fengjiachun left a comment

LGTM

Review threads:
config/standalone.example.toml (resolved)
config/standalone.example.toml (resolved)
src/mito2/src/cache.rs (resolved)
src/mito2/src/config.rs (outdated, resolved)
src/mito2/src/sst/parquet/page_reader.rs (resolved)
src/mito2/src/sst/parquet/page_reader.rs (resolved)
src/mito2/src/sst/parquet/reader.rs (outdated, resolved)
src/mito2/src/sst/parquet/row_group.rs (resolved)
@evenyag evenyag requested a review from killme2008 November 17, 2023 03:16
Contributor

@killme2008 killme2008 left a comment

LGTM. There is a conflict, @evenyag.

@evenyag evenyag enabled auto-merge November 20, 2023 02:49
@evenyag evenyag added this pull request to the merge queue Nov 20, 2023
Merged via the queue into GreptimeTeam:develop with commit ce959dd Nov 20, 2023
18 checks passed
@evenyag evenyag deleted the feat/page-cache branch November 20, 2023 03:07