feat: support std and var with ddof !=1 in pandas-like group by #1645

FBruzzesi · 2024-12-22T13:06:30Z

What type of PR is this? (check all applicable)

Related issues

Related issue [Bug]: group_by context ignores expr arguments #1606

Checklist

Code follows style guide (ruff)
Tests added
Documented the changes

If you have comments or can explain your changes, please do so below

MarcoGorelli · 2024-12-22T13:52:54Z

narwhals/_pandas_like/group_by.py

+                        # Invert the dict to have root_name: output_name
+                        # TODO(FBruzzesi): Account for duplicates
+                        columns={v: k for k, v in output_to_root_name_mapping.items()},


🤔 in case someone does .agg(nw.col('a').std(ddof=1).alias('b'), nw.col('a').std(ddof=1).alias('c'))? hmm yeah i guess it's possible

btw, since cuDF has now introduced cudf.NamedAgg, we could use that if it makes all this logic a bit easier

Yes, correct, similarly to simple aggs.
To avoid an explosion in the number of lists to keep track of, maybe there is a better data structure. I will think about it a bit, but aside from that specific case, this PR should be ready.
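To make the duplicate case concrete, here is a minimal sketch in plain Python. Only `output_to_root_name_mapping` is a name from the diff; the aliases, the root column and the other variable names are hypothetical. It shows that inverting the mapping silently drops one alias when two outputs share a root column, and that keeping two parallel lists avoids the collision:

```python
# Two output aliases ("b" and "c") computed from the same root column "a".
# The values here are illustrative, not taken from the PR.
output_to_root_name_mapping = {"b": "a", "c": "a"}

# Inverting the dict keys it by root name, so the two entries collide
# and only the last alias survives.
inverted = {v: k for k, v in output_to_root_name_mapping.items()}

# Parallel lists keep every (root, output) pair, duplicates included.
root_names = list(output_to_root_name_mapping.values())
output_names = list(output_to_root_name_mapping.keys())

print(inverted)
print(root_names, output_names)
```

Parallel lists like these pair naturally with a positional rename, since position is the only thing that distinguishes two columns with the same root name.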

FBruzzesi · 2024-12-22T14:57:32Z

narwhals/_pandas_like/group_by.py

+                [
+                    grouped[std_root_names]
+                    .std(ddof=ddof)
+                    .set_axis(std_output_names, axis="columns", copy=False)


@MarcoGorelli how bad is it to use set_axis to rename the columns? What's the alternative? Our rename does a mapping, yet here we could do:

grouped[["b", "b"]].std(ddof=2).set_axis(["c", "d"], axis="columns")

For once we are exploiting some pandas weirdness 😁

Should be fine, let's just use copy=False
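A self-contained sketch of the trick discussed above, assuming a toy frame with made-up column names. Selecting the same root column twice yields two columns with identical labels, which a name-based `rename` mapping cannot tell apart, so the labels are assigned by position with `set_axis`. The `copy` argument is left out here because its handling differs across pandas versions:

```python
import pandas as pd

# Toy data; the group key, column and alias names are hypothetical.
df = pd.DataFrame(
    {
        "g": ["x", "x", "x", "y", "y", "y"],
        "b": [1.0, 2.0, 4.0, 3.0, 5.0, 10.0],
    }
)
grouped = df.groupby("g")

# Select the root column twice, aggregate once with a non-default ddof,
# then relabel the resulting columns by position rather than by name.
result = grouped[["b", "b"]].std(ddof=2).set_axis(["c", "d"], axis="columns")

print(result)
```

Both output columns hold the same values, since they come from the same root column and the same `ddof`.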

MarcoGorelli · 2024-12-22T17:05:30Z

oh nice

_______ test_group_by_depth_1_std_var[pandas_pyarrow_constructor-var-2] ________
[XPASS(strict)] 
_______ test_group_by_depth_1_std_var[pandas_pyarrow_constructor-var-0] ________
[XPASS(strict)]

MarcoGorelli

seriously awesome work here @FBruzzesi !

feat: support std and var with ddof !=1 in pandas-like group by

a104310

MarcoGorelli reviewed Dec 22, 2024

FBruzzesi mentioned this pull request Dec 22, 2024

feat: Add support for .shift(n).over('col') for pandas-like DataFrames #1627

Merged

10 tasks

FBruzzesi added 2 commits December 22, 2024 15:15

handle dups

dfd940c

set_columns

5cc5f46

FBruzzesi commented Dec 22, 2024

FBruzzesi marked this pull request as ready for review December 22, 2024 15:24

FBruzzesi changed the title ~~WIP, feat: support std and var with ddof !=1 in pandas-like group by~~ feat: support std and var with ddof !=1 in pandas-like group by Dec 22, 2024

improve error message, remove unnecessary xfail

638cbc9

MarcoGorelli approved these changes Dec 22, 2024

MarcoGorelli added 2 commits December 22, 2024 17:23

correct xfail

3b30876

fixup

fe72f9d

MarcoGorelli added the enhancement New feature or request label Dec 22, 2024

MarcoGorelli merged commit 9f09ea0 into main Dec 22, 2024
24 checks passed

FBruzzesi deleted the feat/group-by-specific-paths branch December 22, 2024 20:04
