datalake: Convert protobuf repeated fields into Arrow lists #3

Draft
wants to merge 2 commits into
base: jcipar/proto-to-arrow
from


jcipar
Owner

@jcipar jcipar commented Jul 30, 2024

This change parses Protobuf repeated fields into Arrow lists.
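For readers unfamiliar with the target layout, here is a minimal sketch of how repeated values map onto an Arrow-style list column. It does not use the Arrow library and is not the code in this PR: `decoded_message` and `int32_list_column` are hypothetical names invented for illustration. It only models the variable-size list layout, where each row's elements are appended to one flat child array and an offsets array records where each row starts and ends.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one decoded protobuf message that has a single
// `repeated int32` field. This type is an assumption, not part of the PR.
struct decoded_message {
    std::vector<int32_t> values;
};

// Simplified model of a list<int32> column: all elements live in one flat
// child array, and row i covers child[offsets[i]] .. child[offsets[i + 1]).
struct int32_list_column {
    std::vector<int32_t> offsets{0}; // always rows() + 1 entries
    std::vector<int32_t> child;      // flattened elements of every row

    // Append one message as one list row. An empty repeated field yields
    // an empty list (two equal consecutive offsets), not a missing row.
    void append(const decoded_message& msg) {
        child.insert(child.end(), msg.values.begin(), msg.values.end());
        offsets.push_back(static_cast<int32_t>(child.size()));
    }

    std::size_t rows() const { return offsets.size() - 1; }

    // Copy out the elements of row i.
    std::vector<int32_t> row(std::size_t i) const {
        return std::vector<int32_t>(
          child.begin() + offsets[i], child.begin() + offsets[i + 1]);
    }
};
```

Appending messages with values `{1, 2, 3}`, `{}`, and `{4}` to this sketch gives offsets `0, 3, 3, 4` and a child array of `1, 2, 3, 4`. A real converter would presumably use Arrow's own list builder types rather than hand-rolled vectors.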

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

jcipar added 2 commits July 29, 2024 17:49
Introduces a protobuf_to_arrow converter that accepts messages in
protobuf format, parses them, and adds them to an Arrow table. The Arrow
table can be used to write a Parquet file.
This change parses Protobuf repeated fields into Arrow lists.
@jcipar jcipar force-pushed the jcipar/proto-to-arrow branch 2 times, most recently from 9a73355 to 003f82e Compare August 2, 2024 21:46