`Ensemble.batch` lets a user apply custom functions, and internally it does this through `pandas.DataFrame.apply`. In my opinion the current documentation leaves this relationship unclear, so users may not realize that they need to write their custom functions the same way they would for `apply`.
Example: A common use case for `batch` is to produce a result dataframe. However, because of how `apply` behaves, this only happens if the function returns a `pd.Series` with the output column values as its data and the column names as its index.
Alternatively, we could let the user's function simply return an iterable of the output frame's column values, and have `batch` pass `result_type='expand'` when it calls `apply`. Whether or not we take that approach, this behavior should be better documented.
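To make the difference concrete, here is a minimal sketch using plain pandas only. It does not call `Ensemble.batch`, and the column names (`flux`, `err`, `snr`, `double`) and function names are made up for illustration. It assumes `axis=1` (row-wise application); how `batch` actually invokes `apply` is not shown here.

```python
import pandas as pd

# Hypothetical input data; column names are illustrative only.
df = pd.DataFrame({"flux": [1.0, 2.0, 3.0], "err": [0.1, 0.2, 0.3]})


def as_series(row):
    # Returning a pd.Series: its index becomes the output column names.
    return pd.Series({"snr": row["flux"] / row["err"], "double": row["flux"] * 2})


def as_tuple(row):
    # Returning a plain iterable of values, with no column names attached.
    return (row["flux"] / row["err"], row["flux"] * 2)


# Series return value: apply expands it into a DataFrame with named columns.
res_series = df.apply(as_series, axis=1)

# Tuple return value, default result_type: a Series whose elements are tuples.
res_plain = df.apply(as_tuple, axis=1)

# Tuple return value with result_type="expand": a DataFrame whose columns
# are labeled by position (0, 1), since no names were supplied.
res_expand = df.apply(as_tuple, axis=1, result_type="expand")

print(type(res_series).__name__, list(res_series.columns))
print(type(res_plain).__name__)
print(type(res_expand).__name__, list(res_expand.columns))
```

Note that with `result_type='expand'` the output columns are labeled by position, so `batch` would presumably still need some way to assign meaningful column names.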
I'm hoping to address this as part of #327, where, for one, the output of `batch` will always be a dataframe. I'm also planning to add some kind of "batch showcase" tutorial that presents a variety of differently styled functions and shows how they interact with `batch`. The idea is that, in most cases, a user learning `batch` should be able to find a function in the showcase that looks like their own.
With #327 merged, `batch` should now always return a dataframe. Additionally, this PR added a batch function showcase, which walks through several custom function examples and how `batch` and its various kwargs interact with them. @wilsonbb let me know if you think these changes address this issue, or if there's still more to do on this.