Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(remote_wal): impl kafka log store #2971

Merged
merged 25 commits into from
Dec 25, 2023
Merged

Conversation

niebayes
Copy link
Contributor

@niebayes niebayes commented Dec 21, 2023

I hereby agree to the terms of the GreptimeDB CLA

What's changed and what's your intention?

  • Implemented a ClientManager to manage construction and accesses of rskafka client. Through the manager, the kafka log store is able to contact kafka cluster.
  • Implemented a KafkaLogStore to provide read/write interfaces for interacting with Kafka remote wal.
  • Built and initialized KafkaLogStore upon starting the datanode.

TODO

  • Integration tests.

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.

Refer to a related PR or issue link (optional)

@niebayes niebayes self-assigned this Dec 21, 2023
@niebayes niebayes marked this pull request as draft December 21, 2023 03:25
@niebayes niebayes requested review from fengjiachun, v0y4g3r, killme2008, MichaelScofield and WenyXu and removed request for fengjiachun December 21, 2023 09:47
@niebayes niebayes marked this pull request as ready for review December 21, 2023 09:47
Copy link

codecov bot commented Dec 21, 2023

Codecov Report

Attention: 186 lines in your changes are missing coverage. Please review.

Comparison is base (d4ac873) 85.83% compared to head (0f4ef3e) 85.72%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #2971      +/-   ##
===========================================
- Coverage    85.83%   85.72%   -0.11%     
===========================================
  Files          780      783       +3     
  Lines       126196   126435     +239     
===========================================
+ Hits        108319   108389      +70     
- Misses       17877    18046     +169     

src/log-store/src/kafka/client_manager.rs Outdated Show resolved Hide resolved
src/log-store/src/kafka/log_store.rs Outdated Show resolved Hide resolved
src/log-store/src/kafka/log_store.rs Outdated Show resolved Hide resolved
src/log-store/src/kafka/log_store.rs Outdated Show resolved Hide resolved
src/log-store/src/kafka/log_store.rs Outdated Show resolved Hide resolved
src/log-store/src/kafka/log_store.rs Show resolved Hide resolved
src/log-store/Cargo.toml Outdated Show resolved Hide resolved
src/log-store/src/error.rs Outdated Show resolved Hide resolved
src/log-store/src/error.rs Outdated Show resolved Hide resolved
src/log-store/src/kafka/client_manager.rs Show resolved Hide resolved
src/log-store/src/kafka/client_manager.rs Outdated Show resolved Hide resolved
src/log-store/src/kafka/client_manager.rs Outdated Show resolved Hide resolved
src/log-store/src/kafka/log_store.rs Outdated Show resolved Hide resolved
src/log-store/src/kafka/log_store.rs Outdated Show resolved Hide resolved
src/log-store/src/kafka/log_store.rs Outdated Show resolved Hide resolved
src/log-store/src/kafka/record_utils.rs Show resolved Hide resolved
@niebayes niebayes added the docs-not-required This change does not impact docs. label Dec 22, 2023
@niebayes niebayes enabled auto-merge December 22, 2023 08:37
@v0y4g3r
Copy link
Contributor

v0y4g3r commented Dec 23, 2023

Had an offline discussion with @WenyXu . We all agree that the split/merge of WAL entries should happen inside logstore. It's the logstore that know the exact log entry size limit and how to handle the split/merge. AFAIK rskafka employs some batch mechanism so that we can simply forwrd wal entries to Kafka records in a 1:1 manner and let rskafka to handle the batching stuff. Only one thing to notice: be careful to handle large single wal entries, split those large entries inside logstore and ensure the splited parts get produced sequentially and atomically.

src/log-store/src/kafka/log_store.rs Outdated Show resolved Hide resolved
src/log-store/src/kafka/log_store.rs Outdated Show resolved Hide resolved
src/log-store/src/kafka/record_utils.rs Outdated Show resolved Hide resolved
@niebayes niebayes disabled auto-merge December 23, 2023 16:53
@niebayes niebayes enabled auto-merge December 25, 2023 08:03
@niebayes
Copy link
Contributor Author

Splitting a way-too large log entry is in planning. However, it may be implemented in a subsequent PR since we need to investigate how rskafka client handles a way-too large log entry and then design an appropriate splitting strategy.

@waynexia waynexia mentioned this pull request Dec 25, 2023
8 tasks
@niebayes niebayes requested a review from WenyXu December 25, 2023 08:19
Copy link
Contributor

@killme2008 killme2008 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost LGTM

config/datanode.example.toml Outdated Show resolved Hide resolved
src/log-store/src/error.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@fengjiachun fengjiachun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@niebayes niebayes added this pull request to the merge queue Dec 25, 2023
Merged via the queue into develop with commit bab198a Dec 25, 2023
20 checks passed
@niebayes niebayes deleted the feat/impl_kafka_log_store branch December 25, 2023 09:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs-not-required This change does not impact docs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants