feat(mito): Add cache manager #2488
Codecov Report
Additional details and impacted files
@@             Coverage Diff              @@
##           develop    #2488       +/-   ##
===========================================
- Coverage    84.98%   84.61%    -0.37%
===========================================
  Files          724      727        +3
  Lines       115298   115603      +305
===========================================
- Hits         97987    97822      -165
- Misses       17311    17781      +470
Great work! Finally, we have a cache!
`sst_meta_cache_size` doesn't take effect for now, right?
It takes effect. The default value is non-zero.
LGTM
* feat: add cache manager
* feat: add cache to reader builder
* feat: add AsyncFileReaderCache
* feat: Impl AsyncFileReaderCache
* chore: move moka dep to workspace
* feat: add moka cache to the manager
* feat: implement parquet meta cache
* test: test cache manager
* feat: consider vec size
* style: fix clippy
* test: fix config api test
* feat: divide cache
* test: test disabling meta cache
* test: fix config api test
* feat: remove meta cache if file is purged
I hereby agree to the terms of the GreptimeDB CLA
What's changed and what's your intention?
This PR adds a manager, `CacheManager`, to cache data for the mito engine. This is the first step towards implementing a cache component for the engine.

It implements a cache layer, `AsyncFileReaderCache`, for parquet's `AsyncFileReader` trait so we can return our cached metadata to `ParquetRecordBatchStreamBuilder`. The `AsyncFileReaderCache` also holds locally cached metadata. This is unused for now but will be useful when we implement row group readers that need to build `ParquetRecordBatchStreamBuilder` multiple times.

For now, the `CacheManager` only caches the metadata of parquet files, so the performance improvement from this PR alone may be limited. I'll start refactoring our `ParquetReader` to support caching other data after this PR is merged.

Checklist
Refer to a related PR or issue link (optional)
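
To make the behaviour described in this PR easier to picture, here is a minimal sketch of a size-bounded metadata cache that is disabled when its configured size is zero, weighs entries by their vector contents, and drops an entry when its file is purged. It is not the code from this PR: the PR builds on the moka crate, while this sketch uses only the standard library with simple FIFO eviction. Apart from `CacheManager` and `sst_meta_cache_size`, every name here (`SstMeta`, `MetaCache`, `get_sst_meta`, `put_sst_meta`, `remove_sst_meta`) and the `u64` file id are assumptions made for illustration.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the decoded metadata of one parquet file.
#[derive(Debug)]
pub struct SstMeta {
    pub row_group_offsets: Vec<u64>,
}

impl SstMeta {
    /// Approximate in-memory size, including the heap-allocated vector.
    pub fn weight(&self) -> usize {
        std::mem::size_of::<Self>()
            + self.row_group_offsets.len() * std::mem::size_of::<u64>()
    }
}

/// Size-bounded map with FIFO eviction (an assumption; moka uses its own policy).
struct MetaCache {
    capacity: usize,
    used: usize,
    entries: HashMap<u64, Arc<SstMeta>>,
    order: VecDeque<u64>,
}

impl MetaCache {
    fn remove(&mut self, file_id: u64) {
        if let Some(meta) = self.entries.remove(&file_id) {
            self.used -= meta.weight();
            self.order.retain(|id| *id != file_id);
        }
    }

    fn insert(&mut self, file_id: u64, meta: Arc<SstMeta>) {
        let weight = meta.weight();
        if weight > self.capacity {
            return; // The entry can never fit, so skip caching it.
        }
        self.remove(file_id); // Replace any previous entry for this file.
        // Evict the oldest entries until the new one fits.
        while self.used + weight > self.capacity {
            let Some(oldest) = self.order.pop_front() else { break };
            if let Some(evicted) = self.entries.remove(&oldest) {
                self.used -= evicted.weight();
            }
        }
        self.entries.insert(file_id, meta);
        self.order.push_back(file_id);
        self.used += weight;
    }
}

/// Sketch of a manager whose metadata cache is optional.
pub struct CacheManager {
    sst_meta_cache: Option<Mutex<MetaCache>>,
}

impl CacheManager {
    /// A size of zero disables the metadata cache entirely.
    pub fn new(sst_meta_cache_size: usize) -> Self {
        let sst_meta_cache = (sst_meta_cache_size > 0).then(|| {
            Mutex::new(MetaCache {
                capacity: sst_meta_cache_size,
                used: 0,
                entries: HashMap::new(),
                order: VecDeque::new(),
            })
        });
        Self { sst_meta_cache }
    }

    pub fn get_sst_meta(&self, file_id: u64) -> Option<Arc<SstMeta>> {
        let cache = self.sst_meta_cache.as_ref()?.lock().unwrap();
        cache.entries.get(&file_id).cloned()
    }

    pub fn put_sst_meta(&self, file_id: u64, meta: Arc<SstMeta>) {
        if let Some(cache) = &self.sst_meta_cache {
            cache.lock().unwrap().insert(file_id, meta);
        }
    }

    /// Called when a file is purged so stale metadata is not served.
    pub fn remove_sst_meta(&self, file_id: u64) {
        if let Some(cache) = &self.sst_meta_cache {
            cache.lock().unwrap().remove(file_id);
        }
    }
}
```

In this sketch a reader would call `get_sst_meta` before decoding a file footer and `put_sst_meta` after a miss, and the purge path would call `remove_sst_meta`. Making the cache an `Option` keeps the disabled case to a single check per call.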