-
Notifications
You must be signed in to change notification settings - Fork 332
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor: add Compactor
trait to abstract the compaction
#4097
Conversation
@evenyag I'm unsure if it's appropriate to place the |
3c6a6da
to
8ef6dde
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4097 +/- ##
==========================================
- Coverage 84.94% 84.60% -0.34%
==========================================
Files 1018 1020 +2
Lines 178688 178846 +158
==========================================
- Hits 151780 151310 -470
- Misses 26908 27536 +628 |
@v0y4g3r Please take a look |
8354f52
to
71191af
Compare
71191af
to
93f5a74
Compare
Note: I add some new important refactors beyond the code review comments:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks GTM. It's better that let @v0y4g3r take a look.
Regarding the interaction between database instance and compactor, it's better to put the pick and manifest update work in database instance, and the compactor focuses on merging SST files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I add the For the real compactor service, the logic can be:
When the datanode receives the response, it can use the @v0y4g3r PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Maybe there's no need to specifically add a send_to_dn
method. The compactor just finish merging sst files and report the results and datanode can poll this result in its RemoteJobScheduler
.
I hereby agree to the terms of the GreptimeDB CLA.
Refer to a related PR or issue link (optional)
What's changed and what's your intention?
The main purpose of the PR is to create a small module to encapsulate the logic of compaction without initializing the whole Mito engine.
It's very useful for creating the cli compact tool and making compaction a service.
For the compact tool that I'm working based on the PR, we can use the
Compactor
as the following code(greptime cli compact --region-id 4398046511104
) :Add the
Compactor
traitThe new file
src/mito2/src/compaction/compactor.rs
creates the new traitCompactor
to encapsulate the compaction. The trait will split the compaction into multiple steps:merge_ssts()
update_manifest()
compact()
: The default implementation will combine themerge_ssts()
andupdate_manifest()
.The implementation of
merge_ssts()
andupdate_manifests()
are from the originalhandle_compaction()
.The PR also provides the
DefaultCompactor
, which is the default implementation ofCompactor
. The original compaction will be replaced byDefaultCompactor
.Refactor the
Picker
traitTo better collaborate with
Compactor
, thePicker
outputPickerOutput
instead of creating the internal compaction task:Add
open_compaction_region()
It's heavy to use
RegionOpener::open()
to open the region, so I decided to create a lightweight version:open_compaction_region()
. The function will acceptCompactionRequest
and other necessary arguments and returnCompactionRegion
, which is a subset ofMitoRegion
.Checklist