[Enh]: Pass through the Arrow PyCapsule Interface #784

Closed
jonmmease opened this issue Aug 13, 2024 · 3 comments · Fixed by #786
Labels
enhancement (New feature or request), high priority

Comments

@jonmmease

We would like to learn about your use case. For example, if this feature is needed to adopt Narwhals in an open source project, could you please enter the link to it below?

The Arrow PyCapsule Protocol makes it possible to share Arrow data between libraries without requiring a dependency on pyarrow. Many of the DataFrame libraries that Narwhals wraps already support it (see apache/arrow#39195 for an ecosystem summary), so my thinking is that it wouldn't be too much work to support the protocol on the Narwhals DataFrame itself.
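For readers unfamiliar with the protocol, here is a minimal sketch of how a consuming library might detect support. The protocol's stream entry point is the `__arrow_c_stream__` method; the helper `supports_arrow_stream` and the `FakeArrowFrame` class are hypothetical names used only for illustration and are not part of Narwhals or VegaFusion.

```python
from typing import Any


def supports_arrow_stream(obj: Any) -> bool:
    """Return True if obj exposes the Arrow PyCapsule stream method."""
    return callable(getattr(obj, "__arrow_c_stream__", None))


class FakeArrowFrame:
    """Hypothetical stand-in for a DataFrame that implements the protocol."""

    def __arrow_c_stream__(self, requested_schema: object = None) -> object:
        # A real producer returns a PyCapsule named "arrow_array_stream";
        # this stand-in only exists so the detection helper has a target.
        raise NotImplementedError("illustration only")
```

Because detection relies only on the presence of the method, the consumer never needs to import pyarrow or the producing library.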

Please describe the purpose of the new feature or describe the problem to solve.

Vega-Altair recently adopted Narwhals as a way to support polars without a pyarrow dependency, and I'm looking at how I could do the same in VegaFusion. The next VegaFusion release is going to support ingesting DataFrames that support the PyCapsule Interface (WIP by @kylebarron in vega/vegafusion#501), so it would be nice to be able to pass the Narwhals DataFrame directly to VegaFusion and process it using this protocol.

Suggest a solution if possible.

No response

If you have tried alternatives, please describe them below.

No response

Additional information that may help us understand your needs.

No response

@MarcoGorelli
Member

thanks @jonmmease for the request! yup, definitely interested in this!

@MarcoGorelli
Member

Done, thanks for the suggestion! This should allow you to support older versions of pandas/Polars which didn't yet support the interface but which did support converting to PyArrow. So, PyArrow is only used as a fallback: if the library supports the PyCapsule Interface natively without requiring PyArrow, then that will be used directly 😇
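The "native first, PyArrow as fallback" behaviour described above could look roughly like the sketch below. This is not the actual Narwhals code: `FrameWrapper` is a hypothetical wrapper, and the fallback assumes the wrapped object has a `to_arrow()` method returning something that itself implements `__arrow_c_stream__` (as a PyArrow table does). The two stub classes stand in for real DataFrames so the example runs without any third-party dependency.

```python
from typing import Any


class FrameWrapper:
    """Hypothetical wrapper that passes the Arrow stream protocol through."""

    def __init__(self, native: Any) -> None:
        self._native = native

    def __arrow_c_stream__(self, requested_schema: object = None) -> object:
        native = self._native
        # Preferred path: the wrapped library already implements the protocol.
        if hasattr(native, "__arrow_c_stream__"):
            return native.__arrow_c_stream__(requested_schema=requested_schema)
        # Fallback (assumption): convert via a to_arrow() method first, then
        # export the stream from the converted object.
        if hasattr(native, "to_arrow"):
            table = native.to_arrow()
            return table.__arrow_c_stream__(requested_schema=requested_schema)
        raise TypeError(
            f"{type(native).__name__} supports neither __arrow_c_stream__ "
            "nor to_arrow()"
        )


class NativeStub:
    """Stub for a library version that supports the protocol directly."""

    def __arrow_c_stream__(self, requested_schema: object = None) -> object:
        return "native-capsule"  # placeholder for a real PyCapsule


class LegacyStub:
    """Stub for an older library version that can only convert to Arrow."""

    class _Table:
        def __arrow_c_stream__(self, requested_schema: object = None) -> object:
            return "fallback-capsule"  # placeholder for a real PyCapsule

    def to_arrow(self) -> "LegacyStub._Table":
        return LegacyStub._Table()
```

With this shape, a consumer calls `__arrow_c_stream__` on the wrapper and does not need to know which path was taken.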

@jonmmease
Author

Awesome, thanks for the remarkably fast turnaround on this!
