feat: introduce iceberg-rust catalog #17308

Merged
merged 3 commits into from
Jun 26, 2024

Conversation

ZENOTME
Contributor

@ZENOTME ZENOTME commented Jun 18, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

iceberg-rust is now the Iceberg Rust SDK maintained by upstream. It is under more active development, and we will gradually migrate to it from icelake.
For now, iceberg-rust has strong support for reading. #17277 aims to introduce iceberg-rust and use it in our batch Iceberg scan operator.

This PR is the first part of #17277. This PR:

  1. introduces the iceberg-rust crate
  2. introduces the catalog for iceberg-rust. Because the first stage only targets reads, the catalog implementations focus on the table-loading interface and leave the other interfaces as TODO. We can implement them in the future.
    • rest catalog (fully supported upstream)
    • jni catalog (only load_table is implemented)
    • storage catalog (only load_table is implemented)
      Upstream does not plan to support a storage catalog, so we need to implement it in RisingWave.
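
The "only implement load_table" approach can be sketched as follows. This is a minimal illustration, not the code in this PR: the `Catalog` trait, `Table`, `CatalogError`, and `ReadOnlyStorageCatalog` below are hypothetical, simplified stand-ins (the real upstream trait is async and has more methods). The sketch also returns an explicit "unsupported" error for the unimplemented interfaces, which is an assumption; leaving them as `todo!()` would be the other obvious choice.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a loaded table handle.
#[derive(Debug, Clone, PartialEq)]
pub struct Table {
    pub name: String,
    pub metadata_location: String,
}

/// Hypothetical, simplified error type (not the iceberg-rust error type).
#[derive(Debug, PartialEq)]
pub enum CatalogError {
    TableNotFound(String),
    Unsupported(&'static str),
}

/// Hypothetical, simplified catalog interface used only for illustration.
pub trait Catalog {
    fn load_table(&self, name: &str) -> Result<Table, CatalogError>;
    fn create_table(&mut self, name: &str, metadata_location: &str) -> Result<Table, CatalogError>;
    fn drop_table(&mut self, name: &str) -> Result<(), CatalogError>;
}

/// A read-only catalog: it maps table names to metadata locations
/// and supports nothing but loading.
pub struct ReadOnlyStorageCatalog {
    tables: HashMap<String, String>,
}

impl ReadOnlyStorageCatalog {
    pub fn new(tables: HashMap<String, String>) -> Self {
        Self { tables }
    }
}

impl Catalog for ReadOnlyStorageCatalog {
    // The only interface needed for the read path.
    fn load_table(&self, name: &str) -> Result<Table, CatalogError> {
        self.tables
            .get(name)
            .map(|location| Table {
                name: name.to_string(),
                metadata_location: location.clone(),
            })
            .ok_or_else(|| CatalogError::TableNotFound(name.to_string()))
    }

    // Write-side interfaces are deliberately left unimplemented for now.
    fn create_table(&mut self, _name: &str, _metadata_location: &str) -> Result<Table, CatalogError> {
        Err(CatalogError::Unsupported("create_table"))
    }

    fn drop_table(&mut self, _name: &str) -> Result<(), CatalogError> {
        Err(CatalogError::Unsupported("drop_table"))
    }
}
```

Returning an error instead of panicking keeps a caller that accidentally reaches a write path recoverable, at the cost of a little more boilerplate per method.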

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@ZENOTME ZENOTME requested a review from a team as a code owner June 18, 2024 08:04
@graphite-app graphite-app bot requested a review from a team June 18, 2024 08:04
@ZENOTME ZENOTME requested a review from chenzl25 June 18, 2024 08:05
Cargo.toml Outdated
Comment on lines 131 to 134
iceberg = { git = "https://github.com/risingwavelabs/iceberg-rust.git", rev = "e6ae6229dfd0c0a8793cde42a7d626c704ea9088" }
iceberg-catalog-rest = { git = "https://github.com/risingwavelabs/iceberg-rust.git", rev = "e6ae6229dfd0c0a8793cde42a7d626c704ea9088" }
Contributor

Any reason why we need to use a forked version?

Contributor Author

For now, the upstream reader easily leads to redundancy; see apache/iceberg-rust#398 for details.
We can refactor to use the upstream interface once the upstream PR apache/iceberg-rust#401 is merged.

Member

Can you add a comment in Cargo.toml explaining the reason for the fork?

Contributor

@chenzl25 chenzl25 left a comment

LGTM!

@ZENOTME ZENOTME force-pushed the zj/update_iceberg_1 branch 2 times, most recently from 71974f0 to e93dca0 Compare June 18, 2024 16:34
@ZENOTME ZENOTME enabled auto-merge June 19, 2024 02:23
@ZENOTME ZENOTME disabled auto-merge June 19, 2024 02:25
BugenZhao
BugenZhao previously approved these changes Jun 19, 2024
Cargo.lock Outdated
@@ -238,6 +238,29 @@ dependencies = [
"backtrace",
]

[[package]]
name = "apache-avro"
Member

We now have two different versions of avro. 😕

Contributor Author

I think this one is used by iceberg-rust, which will not expose any apache-avro interfaces.

Member

We have three; there is another one in icelake.

@ZENOTME ZENOTME force-pushed the zj/update_iceberg_1 branch 4 times, most recently from e958d2e to c3ad081 Compare June 21, 2024 09:05
@ZENOTME ZENOTME force-pushed the zj/update_iceberg_1 branch from c3ad081 to eca2f64 Compare June 21, 2024 09:05
@chenzl25 chenzl25 enabled auto-merge June 24, 2024 02:53
@chenzl25 chenzl25 disabled auto-merge June 24, 2024 05:55
@chenzl25 chenzl25 enabled auto-merge June 24, 2024 05:55
@chenzl25 chenzl25 requested a review from BugenZhao June 24, 2024 07:14
@chenzl25 chenzl25 requested a review from fuyufjh June 26, 2024 08:20
@chenzl25
Contributor

@BugenZhao @xxchan Any concerns on the Cargo.toml update?

@chenzl25 chenzl25 added this pull request to the merge queue Jun 26, 2024
Merged via the queue into main with commit 4e22820 Jun 26, 2024
29 of 30 checks passed
@chenzl25 chenzl25 deleted the zj/update_iceberg_1 branch June 26, 2024 08:47
@chenzl25
Contributor

chenzl25 commented Jul 3, 2024

#17548

4 participants