RFC: Refactor functions into Expressions namespaces and functions on the Expr directly #876
Comments
My primary concerns about doing this are:
That being said, I have found that some of the functions I expected to be on |
I think this can work, but with the goal of deprecating the functions after X releases/months. One problem with PySpark codebases, for example, is that there is no consistency because of the aliases. I don't think constant change is an issue, as long as you document well how to migrate.
In some ways the current API might speak more to PySpark users, but arguably, and I think many will agree, if you come from polars or pandas the current API isn't close to what you are familiar with.
I can make a draft mapping Functions -> Expr, so we can get a full picture of how this will look.
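To make the deprecation idea concrete, here is a minimal sketch of how a free function could stay around for a transition period while pointing users at the method form. The `Expr` class and the `upper` function below are hypothetical toy stand-ins written for illustration, not the actual DataFusion Python API.

```python
import warnings


class Expr:
    """Hypothetical toy expression that only stores a readable string."""

    def __init__(self, text):
        self.text = text

    def upper(self):
        # New preferred form: a method directly on the expression.
        return Expr(f"upper({self.text})")


def upper(expr):
    """Hypothetical free function kept only for a transition period."""
    warnings.warn(
        "upper(expr) is deprecated; use expr.upper() instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    # Forward to the method so both forms build the same expression.
    return expr.upper()
```

Because the old function simply forwards to the method, both spellings produce the same expression, and the warning message doubles as the migration instruction.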
One thing I disliked a lot about PySpark was the Functions. Since most, if not all, of DataFusion.functions take in an expression and return an expression, we could rework them into the Expr namespaces. This should make finding the right expression much easier. Some examples:
col().dt.current_date()
col().list.ndims()
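A rough sketch of how such namespaces could be wired up, in the style of polars accessors. Everything here is an assumption for illustration: the toy `Expr` only records a string, and `ListNamespace`, `DateTimeNamespace`, `ndims`, and `date_part` are hypothetical names, not the existing DataFusion Python classes.

```python
from __future__ import annotations


class Expr:
    """Toy stand-in for an expression; it only records a readable string."""

    def __init__(self, text: str) -> None:
        self.text = text

    def __repr__(self) -> str:
        return self.text

    @property
    def list(self) -> ListNamespace:
        # Groups list/array functions under expr.list.<name>()
        return ListNamespace(self)

    @property
    def dt(self) -> DateTimeNamespace:
        # Groups date/time functions under expr.dt.<name>()
        return DateTimeNamespace(self)


class ListNamespace:
    def __init__(self, expr: Expr) -> None:
        self._expr = expr

    def ndims(self) -> Expr:
        return Expr(f"array_ndims({self._expr})")


class DateTimeNamespace:
    def __init__(self, expr: Expr) -> None:
        self._expr = expr

    def date_part(self, part: str) -> Expr:
        return Expr(f"date_part('{part}', {self._expr})")


def col(name: str) -> Expr:
    """Hypothetical column constructor for the toy Expr."""
    return Expr(name)


print(col("a").list.ndims())
print(col("ts").dt.date_part("year"))
```

Each namespace method still takes an expression and returns an expression, so calls chain the same way the free functions nest, while editor autocompletion on `expr.list.` or `expr.dt.` narrows the search to the relevant group.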
I am happy to start working on this and to create a full list of what the Expr methods and Expr namespaces could look like.