iceberg sink: struct type with metadata doesn't work #16545

Closed
xxchan opened this issue Apr 29, 2024 — with Slack · 5 comments · Fixed by #18463
Member

xxchan commented Apr 29, 2024

Hi, I've recently been thinking about support for the Struct type in the Iceberg sink, since I'm testing whether I can utilise RisingWave at work and this functionality is a necessity. As of now, anyone who tries to sink struct data to an Iceberg catalog receives this error:
Field response's type not compatible, risingwave converted data type Struct([Field { name: "responseStatus", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "statusCode", data_type: Int32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }]), iceberg's data type: Struct([Field { name: "responseStatus", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {"PARQUET:field_id": "17", "column_id": "17"} }, Field { name: "statusCode", data_type: Int32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {"PARQUET:field_id": "18", "column_id": "18"} }])
Looking at this, it seems to be only a mismatch in the metadata of each Field. The code just does a left == right comparison: https://github.com/risingwavelabs/risingwave/blob/main/src/connector/src/sink/iceberg/mod.rs#L1046

Is Struct support in the Iceberg sink just a matter of a missing correct comparison, or is there more context to it?
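To make the mismatch concrete, here is a minimal sketch that reproduces it without any external crate. `DataType`, `Field`, `field`, `risingwave_side` and `iceberg_side` are hypothetical, simplified stand-ins written for this illustration; they are not the Arrow or RisingWave types, which carry more attributes (for example `dict_id`). The only assumption is that equality is derived over all members, metadata included, as the error message suggests.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified stand-ins for the Arrow-style types in the error
// message. Equality is derived, so it also compares `metadata`.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Utf8,
    Int32,
    Struct(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
    metadata: BTreeMap<String, String>,
}

// Builds a nullable field; when `field_id` is given, it is stored under the
// two metadata keys that appear on the Iceberg side of the error message.
fn field(name: &str, data_type: DataType, field_id: Option<&str>) -> Field {
    let mut metadata = BTreeMap::new();
    if let Some(id) = field_id {
        metadata.insert("PARQUET:field_id".to_string(), id.to_string());
        metadata.insert("column_id".to_string(), id.to_string());
    }
    Field {
        name: name.to_string(),
        data_type,
        nullable: true,
        metadata,
    }
}

// Struct type as converted from the RisingWave schema: empty metadata.
fn risingwave_side() -> DataType {
    DataType::Struct(vec![
        field("responseStatus", DataType::Utf8, None),
        field("statusCode", DataType::Int32, None),
    ])
}

// Struct type as reported for the Iceberg table: field ids in metadata.
fn iceberg_side() -> DataType {
    DataType::Struct(vec![
        field("responseStatus", DataType::Utf8, Some("17")),
        field("statusCode", DataType::Int32, Some("18")),
    ])
}
```

With these definitions, `risingwave_side() == iceberg_side()` evaluates to `false` even though names, types and nullability all agree, because the derived equality also compares the nested metadata maps.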


@github-actions github-actions bot added this to the release-1.9 milestone Apr 29, 2024
@fuyufjh
Member

fuyufjh commented May 14, 2024

cc. @chenzl25

@chenzl25
Contributor

@ZENOTME Could you please check whether we support the struct type in the iceberg sink? IIUC, after PR #16567, we could support it directly.

@ZENOTME
Contributor

ZENOTME commented May 14, 2024

> @ZENOTME Could you please check whether we support struct type in iceberg sink? IIUC, after this PR #16567, we could support it directly.

Sure, I'll test it later. BTW, there is also no test for the struct type in icelake, so I am not sure whether it's supported.

Contributor

This issue has been open for 60 days with no activity.

If you think it is still relevant today, and needs to be done in the near future, you can comment to update the status, or just manually remove the no-issue-activity label.

You can also confidently close this issue as not planned to keep our backlog clean.
Don't worry if you think the issue is still valuable to continue in the future.
It's searchable and can be reopened when it's time. 😄

@xxchan xxchan assigned xxchan and unassigned ZENOTME Sep 9, 2024
@xxchan
Member Author

xxchan commented Sep 10, 2024

This is not supported because metadata is included when the two struct types are compared.
In Iceberg, metadata such as {"PARQUET:field_id": "18"} will always be present, but in RW we don't have the field id. Even if we had one, it might not match the Iceberg one.
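
One possible way around this, shown only as a sketch, is a compatibility check that recurses into struct fields and skips metadata entirely. This is not necessarily what the linked fix does. `DataType`, `Field`, `same_type_ignoring_metadata` and `struct_type` are hypothetical names and simplified stand-ins for this illustration, not the Arrow or RisingWave API.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the Arrow-style types involved.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Utf8,
    Int32,
    Struct(Vec<Field>),
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
    metadata: HashMap<String, String>,
}

// Compares two types structurally. For structs it checks field count, names,
// nullability and (recursively) field types, but never looks at `metadata`.
fn same_type_ignoring_metadata(left: &DataType, right: &DataType) -> bool {
    match (left, right) {
        (DataType::Struct(l), DataType::Struct(r)) => {
            l.len() == r.len()
                && l.iter().zip(r).all(|(lf, rf)| {
                    lf.name == rf.name
                        && lf.nullable == rf.nullable
                        && same_type_ignoring_metadata(&lf.data_type, &rf.data_type)
                })
        }
        // Non-nested types hold no fields here, so plain equality is enough.
        _ => left == right,
    }
}

// Builds the two-field struct from the error message; each optional id is
// stored under "PARQUET:field_id" in that field's metadata.
fn struct_type(status_id: Option<&str>, code_id: Option<&str>) -> DataType {
    fn field(name: &str, data_type: DataType, id: Option<&str>) -> Field {
        let mut metadata = HashMap::new();
        if let Some(id) = id {
            metadata.insert("PARQUET:field_id".to_string(), id.to_string());
        }
        Field {
            name: name.to_string(),
            data_type,
            nullable: true,
            metadata,
        }
    }
    DataType::Struct(vec![
        field("responseStatus", DataType::Utf8, status_id),
        field("statusCode", DataType::Int32, code_id),
    ])
}
```

For example, `struct_type(None, None)` and `struct_type(Some("17"), Some("18"))` are unequal under `==` but compare as the same type under `same_type_ignoring_metadata`. Whether nullability should also be relaxed is a separate design choice that this sketch does not settle.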
