Better Ensure reduce returns a NestedFrame #56

Merged
merged 7 commits into from
Sep 30, 2024

dougbrn
Collaborator

@dougbrn dougbrn commented Sep 27, 2024

Change Description

Resolves #54

  • My PR includes a link to the issue that I am addressing

Solution Description

The issue seems to be that when meta is passed using Dask's shorthand notation (dicts and tuples), the result is interpreted as a plain Dask result rather than a NestedFrame. This PR gets in front of Dask whenever these shorthands are used and casts them to npd.NestedFrame objects (even for a series result, which is consistent with Nested-Pandas).

I originally intended to wrap map_partitions (so ignore the branch name), but I believe this is unnecessary, as meta handling already works as expected when the meta is an npd.NestedFrame.
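
For readers unfamiliar with the shorthand, the casting step described above can be sketched roughly as follows. This is a minimal illustration, not the code added in this PR: the helper name `meta_shorthand_to_frame` and its `frame_cls` parameter are hypothetical, and plain pandas stands in for Nested-Pandas so the snippet runs on its own.

```python
import pandas as pd


def meta_shorthand_to_frame(meta, frame_cls=pd.DataFrame):
    """Cast Dask-style meta shorthand to an empty frame of type ``frame_cls``.

    Hypothetical helper for illustration only; it is not part of nested-dask.
    """
    # Already a pandas object (or subclass): just wrap it in the target class.
    if isinstance(meta, (pd.DataFrame, pd.Series)):
        return frame_cls(meta)
    # Tuple shorthand, (name, dtype), describes a single series-like result.
    if isinstance(meta, tuple) and len(meta) == 2 and isinstance(meta[0], str):
        meta = {meta[0]: meta[1]}
    # List-of-pairs shorthand, [(name, dtype), ...], describes several columns.
    elif isinstance(meta, list):
        meta = dict(meta)
    if not isinstance(meta, dict):
        raise TypeError(f"Unsupported meta shorthand: {type(meta).__name__}")
    # Build zero-row columns so that only names and dtypes are carried over.
    return frame_cls({name: pd.Series(dtype=dtype) for name, dtype in meta.items()})


# Dict shorthand becomes an empty two-column frame.
frame_meta = meta_shorthand_to_frame({"a": "f8", "b": "i8"})

# Tuple shorthand (a series result) also becomes a one-column frame.
series_meta = meta_shorthand_to_frame(("x", "f8"))
```

In the setting of this PR, `frame_cls` would presumably be `npd.NestedFrame` (assuming `nested_pandas` is imported as `npd`), so that Dask receives a NestedFrame-typed meta instead of inferring a plain Dask result from the shorthand.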

Code Quality

  • I have read the Contribution Guide
  • My code follows the code style of this project
  • My code builds (or compiles) cleanly without any errors or warnings
  • My code contains relevant comments and necessary documentation

Project-Specific Pull Request Checklists

Bug Fix Checklist

  • My fix includes a new test that breaks as a result of the bug (if possible)
  • My change includes a breaking change
    • My change includes backwards compatibility and deprecation warnings (if possible)

New Feature Checklist

  • I have added or updated the docstrings associated with my feature using the NumPy docstring format
  • I have updated the tutorial to highlight my new feature (if appropriate)
  • I have added unit/End-to-End (E2E) test cases to cover my new feature
  • My change includes a breaking change
    • My change includes backwards compatibility and deprecation warnings (if possible)

Documentation Change Checklist

Build/CI Change Checklist

  • If required or optional dependencies have changed (including version numbers), I have updated the README to reflect this
  • If this is a new CI setup, I have added the associated badge to the README

Other Change Checklist

  • Any new or updated docstrings use the NumPy docstring format.
  • I have updated the tutorial to highlight my new feature (if appropriate)
  • I have added unit/End-to-End (E2E) test cases to cover any changes
  • My change includes a breaking change
    • My change includes backwards compatibility and deprecation warnings (if possible)


github-actions bot commented Sep 27, 2024

| Before [f3b201e] <v0.2.1> | After [c9d94a8] | Ratio | Benchmark (Parameter) |
|---|---|---|---|
| 151M | 153M | 1.01 | benchmarks.NestedFrameAddNested.peakmem_run |
| 228±0.5ms | 230±2ms | 1.01 | benchmarks.NestedFrameAddNested.time_run |
| 151M | 154M | 1.01 | benchmarks.NestedFrameQuery.peakmem_run |
| 149M | 151M | 1.01 | benchmarks.NestedFrameReduce.peakmem_run |
| 475±2ms | 474±2ms | 1 | benchmarks.NestedFrameQuery.time_run |
| 234±1ms | 232±2ms | 0.99 | benchmarks.NestedFrameReduce.time_run |


@dougbrn dougbrn changed the title from "Better Ensure Reduce returns a NestedFrame" to "Better Ensure reduce returns a NestedFrame" Sep 27, 2024
@dougbrn dougbrn marked this pull request as ready for review September 27, 2024 18:18
Contributor

@smcguire-cmu smcguire-cmu left a comment

Looks good, thanks! Just one little thing to add to the tests

tests/nested_dask/test_nestedframe.py (resolved)
@dougbrn dougbrn merged commit 7b9c408 into main Sep 30, 2024
7 of 8 checks passed
@dougbrn dougbrn deleted the wrap_map_partitions branch September 30, 2024 16:56
@dougbrn dougbrn mentioned this pull request Sep 30, 2024
3 tasks
Development

Successfully merging this pull request may close these issues.

reduce returns a Dask DataFrame, not a NestedFrame
2 participants