feat: add file-level scan api for iceberg source #15783

chenzl25 · 2024-03-19T08:37:39Z

Is your feature request related to a problem? Please describe.

Currently, the Iceberg source does scan planning on the frontend node and sends the list of files to be scanned to the compute nodes. In principle, the compute nodes could read those files through a file-level scan API. However, icelake lacks such an API, so we have to reuse the table-level API and filter its result down to the assigned files. I think we can add a file-level read API to icelake to avoid this redundant scan planning.
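To make the difference concrete, here is a minimal sketch of the two approaches. It does not use the icelake or iceberg-rust APIs: `DataFile`, `Table`, `plan_files`, `scan_via_table_api`, and `scan_via_file_api` are hypothetical names invented for illustration, and "reading" a file is reduced to summing its record count.

```rust
use std::cell::Cell;
use std::collections::HashSet;

/// Hypothetical stand-in for a data file produced by scan planning.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DataFile {
    pub path: String,
    pub record_count: u64,
}

/// Hypothetical stand-in for a table; it counts how often planning runs.
pub struct Table {
    files: Vec<DataFile>,
    plan_calls: Cell<usize>,
}

impl Table {
    pub fn new(files: Vec<DataFile>) -> Self {
        Table { files, plan_calls: Cell::new(0) }
    }

    /// Table-level planning: enumerates every data file in the table.
    pub fn plan_files(&self) -> Vec<DataFile> {
        self.plan_calls.set(self.plan_calls.get() + 1);
        self.files.clone()
    }

    pub fn plan_calls(&self) -> usize {
        self.plan_calls.get()
    }
}

/// Workaround described above: re-plan the whole table on the compute
/// node, then keep only the files that were assigned to it.
pub fn scan_via_table_api(table: &Table, assigned: &[DataFile]) -> u64 {
    let wanted: HashSet<&str> = assigned.iter().map(|f| f.path.as_str()).collect();
    table
        .plan_files()
        .iter()
        .filter(|f| wanted.contains(f.path.as_str()))
        .map(|f| f.record_count)
        .sum()
}

/// Proposed shape: read exactly the assigned files, with no planning step.
pub fn scan_via_file_api(assigned: &[DataFile]) -> u64 {
    assigned.iter().map(|f| f.record_count).sum()
}
```

Both functions return the same result for the same assigned files, but only `scan_via_table_api` increments `plan_calls`, which is the redundant work a file-level API would remove.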

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

ZENOTME · 2024-05-29T07:35:11Z

After apache/iceberg-rust#377, I think iceberg-rust can distribute file scan tasks to different compute nodes. iceberg-rust also has better read support and is under active development. I think we can add a new interface, such as load_table_v2, built on iceberg-rust, so that we can replace the icelake read implementation with iceberg-rust.

risingwave/src/connector/src/sink/iceberg/mod.rs

Line 421 in 234f657

pub async fn load_table(&self) -> ConnectorResult<Table> {
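As a rough illustration of distributing file scan tasks across compute nodes, the sketch below assigns tasks round-robin. This is an assumption made for illustration, not how RisingWave or iceberg-rust actually schedules work: `assign_round_robin` is a hypothetical helper, and each task is represented only by its file path.

```rust
/// Hypothetical helper: split planned file scan tasks across `num_nodes`
/// compute nodes in round-robin order. Each task is just a file path here.
pub fn assign_round_robin(tasks: Vec<String>, num_nodes: usize) -> Vec<Vec<String>> {
    assert!(num_nodes > 0, "need at least one compute node");
    let mut per_node: Vec<Vec<String>> = vec![Vec::new(); num_nodes];
    for (i, task) in tasks.into_iter().enumerate() {
        // Task i goes to node i mod num_nodes.
        per_node[i % num_nodes].push(task);
    }
    per_node
}
```

Each node would then read only its own list of files, without repeating the planning done on the frontend.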

github-actions · 2024-08-01T02:08:16Z

This issue has been open for 60 days with no activity.

If you think it is still relevant today, and needs to be done in the near future, you can comment to update the status, or just manually remove the no-issue-activity label.

You can also confidently close this issue as not planned to keep our backlog clean.
Don't worry if you think the issue is still valuable to continue in the future.
It's searchable and can be reopened when it's time. 😄

chenzl25 added the type/feature label Mar 19, 2024

chenzl25 assigned ZENOTME Mar 19, 2024

github-actions bot added this to the release-1.8 milestone Mar 19, 2024

chenzl25 removed this from the release-1.8 milestone Apr 8, 2024

github-actions bot added the no-issue-activity label Aug 1, 2024

chenzl25 mentioned this issue Aug 7, 2024

feat(iceberg): reduce iceberg catalog fetch rpc number for iceberg scan #17939

Merged

9 tasks

chenzl25 closed this as completed Aug 7, 2024
