ENH: Support kurtosis (kurt) in DataFrameGroupBy and SeriesGroupBy #60433

Open: 6 commits to be merged into base branch main

@snitish (Contributor) commented Nov 27, 2024

DataFrameGroupBy and SeriesGroupBy currently support mean, std, and skew (statistics based on the first three moments) but not kurtosis (based on the fourth moment). This change adds it. I implemented kurtosis in Cython, in a similar fashion to skewness, and verified that the output matches that of DataFrame.kurt().
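
To make the per-group quantity concrete, here is a minimal pure-Python sketch of a grouped kurtosis. It assumes the bias-corrected sample excess kurtosis estimator (which, as far as I understand, is what Series.kurt() computes). The helper names `sample_excess_kurtosis` and `grouped_kurtosis` are hypothetical, and this is not the Cython code from this PR.

```python
import math
from collections import defaultdict


def sample_excess_kurtosis(values):
    """Bias-corrected excess kurtosis of a list of floats.

    Assumption: this is the estimator behind Series.kurt().
    Fewer than 4 observations yields NaN.
    """
    n = len(values)
    if n < 4:
        return math.nan
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values)  # sum of squared deviations
    m4 = sum((x - mean) ** 4 for x in values)  # sum of fourth-power deviations
    denominator = (n - 2) * (n - 3) * m2 ** 2
    if denominator == 0:
        # No spread in the data; return 0.0 rather than dividing by zero.
        return 0.0
    numerator = n * (n + 1) * (n - 1) * m4
    adjustment = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return numerator / denominator - adjustment


def grouped_kurtosis(keys, values):
    """Split values by key, then reduce each group to its kurtosis."""
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        groups[key].append(value)
    return {key: sample_excess_kurtosis(vals) for key, vals in groups.items()}


keys = ["a"] * 4 + ["b"] * 5 + ["c"] * 3
values = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 10.0, 1.0, 2.0, 3.0]
result = grouped_kurtosis(keys, values)
```

With this input, group "a" comes out at approximately -1.2, group "b" at approximately 3.152, and group "c" is NaN because it has fewer than four observations.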

@rhshadrach (Member) left a comment

Very nice! A few test requests (I think these are not covered yet):

  • skipna with NA values in the data
  • Float64 and float64[pyarrow] dtypes
  • Constant data (e.g. [1, 1, 1, 1])

Resolved review threads:

  • pandas/_libs/groupby.pyx (2 threads, outdated)
  • pandas/core/groupby/generic.py (1 thread, outdated)
  • pandas/tests/groupby/methods/test_kurt.py (2 threads, 1 outdated)

@snitish (Contributor, Author) commented Dec 4, 2024

Thanks for the review @rhshadrach.

  • Addressed your comments
  • Added test case for skipna=False (the default is skipna=True)
  • Added test case for float64[pyarrow] (we already have one for float64)
  • Added test case for constant data. Note that the result here is 0.0, consistent with DataFrame.kurt() and Series.kurt()
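
The edge cases in the list above can be illustrated with a small pure-Python sketch. It assumes that skipna=True drops NaN before reducing, that skipna=False propagates NaN when any value is missing, and that constant data yields 0.0 as noted above. The function `kurt_sketch` is a hypothetical helper, not the groupby implementation.

```python
import math


def kurt_sketch(values, skipna=True):
    """Hypothetical helper illustrating the skipna and constant-data cases."""
    has_nan = any(math.isnan(x) for x in values)
    if has_nan and not skipna:
        # Assumed behaviour: a missing value poisons the result.
        return math.nan
    data = [x for x in values if not math.isnan(x)]
    n = len(data)
    if n < 4:
        return math.nan
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data)
    m4 = sum((x - mean) ** 4 for x in data)
    if m2 == 0:
        # Constant data: return 0.0 instead of dividing by zero.
        return 0.0
    scale = (n - 2) * (n - 3)
    return n * (n + 1) * (n - 1) * m4 / (scale * m2 ** 2) - 3 * (n - 1) ** 2 / scale


constant = kurt_sketch([1.0, 1.0, 1.0, 1.0])
skipped = kurt_sketch([1.0, 2.0, math.nan, 3.0, 4.0])
propagated = kurt_sketch([1.0, 2.0, math.nan, 3.0, 4.0], skipna=False)
```

Here `constant` is 0.0, `skipped` is approximately -1.2 (the NaN is ignored and four values remain), and `propagated` is NaN.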

Labels: Groupby; Reduction Operations (sum, mean, min, max, etc.)
Successfully merging this pull request may close this issue:

ENH: AttributeError: 'SeriesGroupBy' object has no attribute 'kurtosis'
3 participants