feat(python): Add StringView and BinaryView IO to Python bindings #637

Merged (6 commits) on Sep 30, 2024

Member

@paleolimbot paleolimbot commented Sep 27, 2024

This PR implements StringView and BinaryView support in the Python bindings. It is a thin wrapper around the C functions that were already added, although we should perhaps move some of the buffer info calculation into the C library, since I had to work around the same gap in the R bindings.

import nanoarrow as na

array = na.Array(["abc", "def", None, "longer than 12 bytes"], na.string_view())
array
#> nanoarrow.Array<string_view>[4]
#> 'abc'
#> 'def'
#> None
#> 'longer than 12 bytes'
array.buffers
#> (nanoarrow.c_buffer.CBufferView(bool[1 b] 11010000),
#>  nanoarrow.c_buffer.CBufferView(string_view[64 b] b'\x03\x00\x00\x00abc\x00\x00\x00\x00\x00\x00\x00\x00\x00'...),
#>  nanoarrow.c_buffer.CBufferView(string[20 b] b'longer than 12 bytes'),
#>  nanoarrow.c_buffer.CBufferView(int64[8 b] 20))
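
For readers unfamiliar with the layout, the 64-byte `string_view` buffer above holds four 16-byte views. The following is a minimal pure-Python sketch of decoding them, assuming the view layout from the Arrow columnar format: values of at most 12 bytes are stored inline after the length, and longer values store a 4-byte prefix, a buffer index, and an offset. The `decode_views` helper is hypothetical and not part of nanoarrow, and it ignores the validity bitmap.

```python
import struct

INLINE_MAX = 12  # views of at most 12 bytes keep their data inline


def decode_views(views: bytes, variadic: list) -> list:
    """Decode 16-byte Arrow string views (hypothetical helper, ignores validity)."""
    out = []
    for pos in range(0, len(views), 16):
        (length,) = struct.unpack_from("<i", views, pos)
        if length <= INLINE_MAX:
            # Inline: the data follows the length directly, zero padded
            data = views[pos + 4 : pos + 4 + length]
        else:
            # Out of line: 4-byte prefix, then buffer index and offset
            buf_index, offset = struct.unpack_from("<ii", views, pos + 8)
            data = variadic[buf_index][offset : offset + length]
        out.append(data.decode("utf-8"))
    return out


# Rebuild buffers equivalent to the example above by hand
long_value = b"longer than 12 bytes"
views = (
    struct.pack("<i12s", 3, b"abc")
    + struct.pack("<i12s", 3, b"def")
    + bytes(16)  # null slot; its contents are not meaningful
    + struct.pack("<i4sii", len(long_value), long_value[:4], 0, 0)
)
decoded = decode_views(views, [long_value])
```

Because the sketch skips the validity bitmap, the null slot decodes as an empty string rather than `None`; a real reader would consult the bitmap first.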

@paleolimbot paleolimbot marked this pull request as ready for review September 27, 2024 21:39
@paleolimbot
Member Author

@WillAyd If you have bandwidth, this could use a look!

Contributor

@WillAyd WillAyd left a comment

Nice work! I only have really minor comments

    and i == (2 + self._ptr.n_variadic_buffers)
):
    return (
        NANOARROW_BUFFER_TYPE_DATA,
Contributor

Checking the type of the variadic buffer is actually missing from the C++ test; we might want to go back and add this there too

Member Author

I'll follow this up with a C library PR!
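
The index check under review can be sketched in plain Python. This assumes the buffer order shown in the `array.buffers` output in the PR description: validity at index 0, the views at index 1, then `n_variadic_buffers` variadic data buffers, and finally one int64 buffer holding the variadic buffer sizes. The function name and the returned labels are hypothetical and not nanoarrow's API.

```python
def view_buffer_role(i: int, n_variadic_buffers: int) -> str:
    """Describe buffer i of a string_view/binary_view array (illustrative only)."""
    if i < 0:
        raise IndexError(f"buffer index {i} out of range")
    if i == 0:
        return "validity"
    if i == 1:
        return "views"
    if i < 2 + n_variadic_buffers:
        return "variadic_data"
    if i == 2 + n_variadic_buffers:
        # The trailing buffer of int64 sizes, one per variadic buffer
        return "variadic_sizes"
    raise IndexError(f"buffer index {i} out of range")


# The example array in the description has exactly one variadic buffer
roles = [view_buffer_role(i, 1) for i in range(4)]
```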

Two outdated review threads on python/src/nanoarrow/_array.pyx were resolved.
@paleolimbot paleolimbot merged commit d6ef480 into apache:main Sep 30, 2024
11 checks passed
@paleolimbot paleolimbot deleted the python-string-view branch September 30, 2024 15:27