feat(mito): merge reader for mito2 #2210

Merged: 25 commits into GreptimeTeam:develop from feat/mito2-merge on Aug 24, 2023

@evenyag evenyag commented Aug 19, 2023

I hereby agree to the terms of the GreptimeDB CLA

What's changed and what's your intention?

This PR implements a MergeReader that merges multiple sorted batch sources into one and removes duplicate rows.

The MergeReader requires every input source to yield sorted and deduplicated Batches. The reader uses a heap to sort and collect batches for each series.
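
For readers unfamiliar with heap-based merging, here is a minimal row-level sketch of the idea. It is not the code in this PR: the `Row` tuple, the `merge_dedup` function, and the assumed sort order (series, then timestamp, then sequence descending so the newest row wins) are illustrative assumptions, and the real reader works on Batches rather than single rows.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical row for illustration: (series id, timestamp, sequence).
type Row = (u32, i64, u64);

/// Merges sources that are each sorted by (series, timestamp, sequence descending)
/// and keeps only the first row seen for every (series, timestamp) pair,
/// which is the row with the highest sequence under the assumed order.
fn merge_dedup(sources: Vec<Vec<Row>>) -> Vec<Row> {
    // `BinaryHeap` is a max-heap, so entries are wrapped in `Reverse` to pop the
    // smallest key first. Entry layout:
    // (series, timestamp, Reverse(sequence), source index, offset in source).
    let mut heap = BinaryHeap::new();
    for (idx, src) in sources.iter().enumerate() {
        if let Some(&(series, ts, seq)) = src.first() {
            heap.push(Reverse((series, ts, Reverse(seq), idx, 0usize)));
        }
    }

    let mut out: Vec<Row> = Vec::new();
    while let Some(Reverse((series, ts, Reverse(seq), idx, pos))) = heap.pop() {
        // A row is a duplicate if the previous output row has the same
        // series and timestamp.
        let is_dup = out.last().map_or(false, |&(s, t, _)| s == series && t == ts);
        if !is_dup {
            out.push((series, ts, seq));
        }
        // Advance the source the popped row came from.
        if let Some(&(s, t, q)) = sources[idx].get(pos + 1) {
            heap.push(Reverse((s, t, Reverse(q), idx, pos + 1)));
        }
    }
    out
}
```

Because each source is already sorted, the heap never holds more than one entry per source, so merging n rows from k sources costs O(n log k) comparisons.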

Currently, for simplicity and performance reasons, the reader only removes deleted rows and does not filter rows by sequence. As a result, a read may return rows from another write request that is still in progress.
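
The trade-off can be illustrated with a small sketch. Everything here is hypothetical and not taken from the PR: the local `OpType` and `Row` types, `drop_deleted`, and `filter_by_sequence` (a filter the reader does not apply) are stand-ins that only show the difference in behavior.

```rust
/// Stand-in for the operation type attached to each row (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OpType {
    Put,
    Delete,
}

/// Hypothetical row carrying only the fields needed for this illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Row {
    ts: i64,
    seq: u64,
    op: OpType,
}

/// What the description says the reader does: drop rows marked as deleted.
fn drop_deleted(rows: &[Row]) -> Vec<Row> {
    rows.iter()
        .copied()
        .filter(|r| r.op != OpType::Delete)
        .collect()
}

/// What the reader does NOT do yet: hide rows whose sequence is newer than
/// the latest sequence visible to the reader. Shown only for contrast.
fn filter_by_sequence(rows: &[Row], max_visible_seq: u64) -> Vec<Row> {
    rows.iter()
        .copied()
        .filter(|r| r.seq <= max_visible_seq)
        .collect()
}
```

With rows at sequences 3 (put), 4 (delete), and 9 (put, from a write still in flight), `drop_deleted` returns the rows at sequences 3 and 9, whereas additionally applying `filter_by_sequence` with a visible sequence of 5 would hide the in-flight row.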

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.

Refer to a related PR or issue link (optional)

@evenyag evenyag self-assigned this Aug 22, 2023
@evenyag evenyag marked this pull request as ready for review August 22, 2023 12:52

codecov bot commented Aug 22, 2023

Codecov Report

Merging #2210 (7f60281) into develop (fdb5ad2) will decrease coverage by 0.33%.
Report is 2 commits behind head on develop.
The diff coverage is 91.45%.

@@             Coverage Diff             @@
##           develop    #2210      +/-   ##
===========================================
- Coverage    84.92%   84.59%   -0.33%     
===========================================
  Files          704      705       +1     
  Lines       114826   115167     +341     
===========================================
- Hits         97512    97425      -87     
- Misses       17314    17742     +428     

@v0y4g3r v0y4g3r left a comment

LGTM

Three resolved review threads on src/mito2/src/read/merge.rs (two marked outdated).
@evenyag evenyag added this pull request to the merge queue Aug 24, 2023
Merged via the queue into GreptimeTeam:develop with commit 4ee1034 Aug 24, 2023
@evenyag evenyag deleted the feat/mito2-merge branch August 24, 2023 03:51
paomian pushed a commit to paomian/greptimedb that referenced this pull request Oct 19, 2023
* feat: Implement slice and first/last timestamp for Batch

* feat(mito): implements sort/concat for Batch

* chore: fix typo

* chore: remove comments

* feat: sort and dedup

* test: test batch operations

* chore: cast enum to test op type

* test: test filter related api

* sytle: fix clippy

* feat: implement Node and CompareFirst

* feat: merge reader wip

* feat: merge wip

* feat: use batch's operation to sort and dedup

* feat: implement BatchReader for MergeReader

* feat: simplify codes

* test: test merge reader

* refactor: use test util to create batch

* refactor: remove unused imports

* feat: update comment

* chore: remove metadata() from Source

* chroe: update comment

* feat: source supports batch iterator

* chore: update comment