-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add by_band
kwarg to batch, allow each lightcurve band to be processed independently
#327
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
by_band
kwarg to batch, allow each lightcurve band to be processed independently
9ba8df0
to
0efba3d
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #327 +/- ##
==========================================
+ Coverage 94.56% 94.61% +0.04%
==========================================
Files 24 24
Lines 1510 1540 +30
==========================================
+ Hits 1428 1457 +29
- Misses 82 83 +1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
I have few clarification questions and a request to make batch()
more readable with helper functions
src/tape/ensemble.py
Outdated
@@ -1032,20 +1032,26 @@ def batch(self, func, *args, meta=None, use_map=True, compute=True, on=None, lab | |||
the results. Overridden by TAPE for TAPE and `light-curve` | |||
functions. If none, attempts to coerce the result to a | |||
pandas.Series. | |||
by_band: `boolean` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add here a sentence about output column names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rewrote the sentence that tried to explain how the columns affected, let me know if that adds any clarity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for doing this, Doug!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look great, thank you so much!
Resolves #317
Resolves #241
A bit of scope creep on this PR, the main purpose is to address #317, giving more support to users looking to apply a function to each band within a lightcurve within the batch interface. This is done by adding the
by_band
kwarg toEnsemble.batch
, which makes sure that the band column is part of the groupby and also modifies the output to return an output column for each band. As a result, this also standardizes the output of batch to always return an EnsembleFrame object (even when this keyword is not used), where as previously in some cases it would return a series.I decided to go ahead with addressing #241 as part of this as well, removing the
compute
kwarg withinEnsemble.batch
, I opted not to replace it with persist, as I believe that batch should just return a lazy object that users can chain into a compute or persist command as they see fit.I also went ahead and added a batch showcase tutorial as part of this PR, in that I tried to outline how this change actually is used, and also tried to be transparent about some issues regarding dataframe outputs.