Commit

DOC: Added extra sentences to clarify series.GroupBy snippets in examples (pandas-dev#59331)

* Added messages for each relevant snippet

* some small corrections to clarify further

* removed trailing whitespace

* more formatting correction

* more cleanup

* reverting changes

* trying to format documentation correctly

* removed some part of added text

* testing if removing list works

* reverting some changes

* reverting changes

* checking if minor changes also lead to failures

* reverting all changes to pass the tests

* checking if small changes cause errors as well

* pushing the changes back
ApoorvApoorv authored Jul 31, 2024
1 parent 88a5668 commit 73b5578
Showing 1 changed file with 26 additions and 0 deletions.
26 changes: 26 additions & 0 deletions pandas/core/series.py
@@ -1815,14 +1815,30 @@ def _set_name(
Parrot 30.0
Parrot 20.0
Name: Max Speed, dtype: float64
We can pass a list of values to group the Series data by custom labels:
>>> ser.groupby(["a", "b", "a", "b"]).mean()
a 210.0
b 185.0
Name: Max Speed, dtype: float64
Grouping by numeric labels yields similar results:
>>> ser.groupby([0, 1, 0, 1]).mean()
0 210.0
1 185.0
Name: Max Speed, dtype: float64
We can group by a level of the index:
>>> ser.groupby(level=0).mean()
Falcon 370.0
Parrot 25.0
Name: Max Speed, dtype: float64
We can group by a condition applied to the Series values:
>>> ser.groupby(ser > 100).mean()
Max Speed
False 25.0
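The grouping variants added in this hunk can be reproduced outside the docstring with a short script. This is a minimal sketch, assuming the `ser` used by the docstring is the four-row "Max Speed" Series that the commit constructs further down; the variable names `by_labels`, `by_level`, and `by_condition` are illustrative only.

```python
import pandas as pd

# Same data as the docstring example: two falcons and two parrots.
ser = pd.Series(
    [390.0, 350.0, 30.0, 20.0],
    index=["Falcon", "Falcon", "Parrot", "Parrot"],
    name="Max Speed",
)

# A list of labels is matched to the rows by position, not by index value.
by_labels = ser.groupby(["a", "b", "a", "b"]).mean()

# The index itself can serve as the grouping key.
by_level = ser.groupby(level=0).mean()

# A boolean Series splits the rows into a False group and a True group.
by_condition = ser.groupby(ser > 100).mean()

print(by_labels, by_level, by_condition, sep="\n\n")
```

Because the list keys are positional, `"a"` collects the first and third rows (390 and 30), which is why its mean is 210 rather than a per-animal figure.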
@@ -1845,11 +1861,16 @@ def _set_name(
Parrot Captive 30.0
Wild 20.0
Name: Max Speed, dtype: float64
>>> ser.groupby(level=0).mean()
Animal
Falcon 370.0
Parrot 25.0
Name: Max Speed, dtype: float64
We can also group by the 'Type' level of the hierarchical index
to get the mean speed for each type:
>>> ser.groupby(level="Type").mean()
Type
Captive 210.0
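Grouping by a named level of a hierarchical index can be sketched as follows. The `MultiIndex` construction is an assumption, since the constructing lines fall outside the diff context: the Parrot values come from the hunk, and the Falcon values (390 and 350) are inferred from the means shown for `Falcon` and `Captive`.

```python
import pandas as pd

# Assumed construction; only the Parrot rows are visible in the diff.
index = pd.MultiIndex.from_arrays(
    [
        ["Falcon", "Falcon", "Parrot", "Parrot"],
        ["Captive", "Wild", "Captive", "Wild"],
    ],
    names=("Animal", "Type"),
)
ser = pd.Series([390.0, 350.0, 30.0, 20.0], index=index, name="Max Speed")

# Levels can be selected by position ...
by_animal = ser.groupby(level=0).mean()

# ... or by name, which stays valid if the level order changes.
by_type = ser.groupby(level="Type").mean()

print(by_animal, by_type, sep="\n\n")
```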
@@ -1865,12 +1886,17 @@ def _set_name(
b 3
dtype: int64
To include `NA` values in the group keys, set `dropna=False`:
>>> ser.groupby(level=0, dropna=False).sum()
a 3
b 3
NaN 3
dtype: int64
We can also group by a custom list with NaN values to handle
missing group labels:
>>> arrays = ['Falcon', 'Falcon', 'Parrot', 'Parrot']
>>> ser = pd.Series([390., 350., 30., 20.], index=arrays, name="Max Speed")
>>> ser.groupby(["a", "b", "a", np.nan]).mean()
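The effect of `dropna` on missing group keys can be checked with a small script. This is a sketch under assumptions: the first Series is a guess at the data behind the `a 3 / b 3 / NaN 3` output, because its constructing line is not part of the diff, and the result variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Assumed data: chosen to match the sums shown in the hunk.
ser = pd.Series([1, 2, 3, 3], index=["a", "a", "b", np.nan])

# By default, rows whose group key is NaN are left out of the result.
dropped = ser.groupby(level=0).sum()

# dropna=False keeps them as a separate NaN group.
kept = ser.groupby(level=0, dropna=False).sum()

# The same switch applies when the keys come from a list of labels.
speeds = pd.Series(
    [390.0, 350.0, 30.0, 20.0],
    index=["Falcon", "Falcon", "Parrot", "Parrot"],
    name="Max Speed",
)
labels = ["a", "b", "a", np.nan]
list_dropped = speeds.groupby(labels).mean()
list_kept = speeds.groupby(labels, dropna=False).mean()

print(dropped, kept, list_dropped, list_kept, sep="\n\n")
```

With the default `dropna=True`, the fourth row (20.0) has no group and does not contribute to any mean; with `dropna=False` it forms its own `NaN` group.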