Explore using Narwhals in Plotly Express #4749

Closed
LiamConnors opened this issue Sep 3, 2024 · 5 comments · Fixed by #4790
Labels
feature something new P2 considered for next cycle task one-off task

Comments

@LiamConnors
Member

Narwhals is a compatibility layer between Polars, pandas, and other dataframes.
https://narwhals-dev.github.io/narwhals/

This issue is to explore what changes we would need to make to use Narwhals in Plotly Express.

@gvwilson gvwilson added feature something new P2 considered for next cycle task one-off task labels Sep 3, 2024
@firai

firai commented Sep 11, 2024

I'm glad that this is being explored! Especially since pandas 3.0 could still add a hard dependency on pyarrow (that seems to be their current plan unless the new PDEP delaying the dependency is approved), and since both pyarrow and pandas currently seem to be required for polars to work with Plotly Express.

Not sure if pointing this out here is kosher, and you may already be aware of this, but I understand that this is the (main) PR where altair switched to using narwhals, just for reference: vega/altair#3452. If it's not kosher, feel free to edit this out.

@FBruzzesi
Contributor

FBruzzesi commented Sep 24, 2024

Hey there! Thanks for considering Narwhals as an option to make the Plotly Express module more dataframe agnostic.
I started to take a look at what that would look like, and I believe that there are a couple of things worth mentioning:

  • narwhals is still missing a few features that would make the integration seamless (e.g. the DataFrame.unpivot and DataFrame.cast methods, both of which are work in progress)
  • when a trendline is requested, we can only go so far: at a certain point statsmodels is used and we need to pass it something it supports (i.e. pandas or numpy), so we will need to trigger a conversion for the user. I mention this just to raise awareness.

Time permitting, once those WIP features are merged and released, I will take a closer look again.

Edit: For progress updates 😁 branch I am working on

@ndrezn
Member

ndrezn commented Oct 7, 2024

Hey Francesco! It looks like you've made great progress so far. Is there anything the Plotly team can do to support what you're working on? We're very interested in taking this feature further. For now we're working on typed array support in #4470, and I can imagine that Narwhals support could take this integration even further. Cheers!

@FBruzzesi
Contributor

Yes, the PR is almost ready. I am able to run the entire test suite successfully with polars and pyarrow on a narwhals branch (with features from narwhals-dev/narwhals#1145). Other required features are also in main but not released just yet.

As soon as we make a new release I should be able to open the PR. I would expect it to be of a similar size to #4470 in terms of line changes. I was wondering if there is a good approach to make it easier to review, other than commenting on it in great detail.

Let me cc @MarcoGorelli as well into the thread 😁

@FBruzzesi
Contributor

Opening a PR as draft very soon, I am a bit unsure where to set the test dependencies and how CI is run 🙈

6 participants