Source

tl;dr

Integration with scikit-learn, sklearn.compose can be used to create a ColumnTransformer that plugs into an sklearn.pipeline Pipeline.
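A minimal sketch of that integration, assuming a toy DataFrame: the column names, the data, and the choice of StandardScaler, OneHotEncoder and LogisticRegression are illustrative and do not come from the talk.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data; column names and values are hypothetical.
df = pd.DataFrame({
    "age": [22, 35, 47, 51, 29, 63],
    "city": ["Paris", "Lyon", "Paris", "Nice", "Lyon", "Nice"],
    "churned": [0, 0, 1, 1, 0, 1],
})
features = df[["age", "city"]]

# Route each DataFrame column to its own preprocessing step, by column name.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])

# The column transformer is just another step of the pipeline.
model = Pipeline([
    ("preprocess", preprocess),
    ("classifier", LogisticRegression()),
])

model.fit(features, df["churned"])
predictions = model.predict(features)
```

The pipeline takes the DataFrame directly, so the preprocessing is fitted and applied together with the model.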
Extending pandas, cyberpandas is the official example repo showing how to extend pandas.
Fletcher, another example of a tool using the ExtensionArray interface; it lets you use Arrow columns in a pandas DataFrame, with multiple benefits (debug information, random data generation, etc.).

import fletcher as fr
import pandas as pd

df = pd.DataFrame({
    'str_column': fr.FletcherArray(['Test', None, 'Strings'])
})
df.info()

# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 3 entries, 0 to 2
# Data columns (total 1 columns):
# str_column    2 non-null string
# dtypes: string(1)
# memory usage: 108.0 bytes

Little assign trick, in Python 3 (3.6+) a column created in an assign call can be used by a later keyword argument of that same call.
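A short sketch of the trick, assuming pandas 0.23 or later on Python 3.6+; the column names and the 25% tax rate are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "quantity": [1, 2, 3]})

# Keyword arguments are evaluated in order, so the second lambda
# can read the "total" column created by the first one.
result = df.assign(
    total=lambda d: d["price"] * d["quantity"],
    total_with_tax=lambda d: d["total"] * 1.25,  # hypothetical tax rate
)
```

The lambdas receive the intermediate DataFrame, which is why `d["total"]` is available; the original `df` is left unchanged.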
Inplace param deprecation, to foster a more "functional" data pipeline style based on chaining assign, map, apply, etc.
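To make the contrast concrete, here is a sketch of the same cleaning steps written both ways. The data and the steps themselves are illustrative assumptions.

```python
import pandas as pd

raw = pd.DataFrame({
    "Name": ["ada", "bob", None, "eve"],
    "Score": [90.0, None, 70.0, 80.0],
})

# inplace style: each call mutates df and returns None, so nothing can be chained.
df = raw.copy()
df.rename(columns=str.lower, inplace=True)
df.dropna(inplace=True)
df["name"] = df["name"].map(str.title)

# Chained style: each method returns a new DataFrame, raw is never modified.
cleaned = (
    raw
    .rename(columns=str.lower)
    .dropna()
    .assign(name=lambda d: d["name"].map(str.title))
    .reset_index(drop=True)
)
```

Both versions produce the same rows; the chained one reads top to bottom as a single expression and leaves the input untouched.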
Lazy evaluation, pandas has none, so you have to pick the efficient operation yourself: use nlargest instead of sort_values followed by head for better performance.
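A small sketch comparing the two spellings, assuming a single numeric column named "value" (a made-up name) filled with a shuffled range so that there are no ties.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)
# A shuffled range gives unique values, so both approaches pick the same rows.
df = pd.DataFrame({"value": rng.permutation(100_000)})

# Sorts all 100,000 rows, then keeps only five of them.
top_sorted = df.sort_values("value", ascending=False).head(5)

# Selects the five largest rows without sorting the whole frame.
top_nlargest = df.nlargest(5, "value")
```

The results are identical here; the gain from nlargest is in the work avoided, which matters most when n is small compared to the number of rows.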
Mentions cool projects for efficient high-level operations, like Dask for distributed computation and Ibis, an abstraction layer over SQL and storage platforms.
Arrow, the future solution for pandas memory performance; Fletcher gives a nice introduction to what that future will look like.
Kernel Density Estimate, new pandas.Series.plot.kde work added during a sprint, which can help when deciding how to normalize inputs for supervised learning models such as SVMs.

Afterword

The pandas community recognizes the limitations of the current memory backend and is preparing a switch to Arrow in the future, but you can already use Fletcher in your project 😲.
Even if most of the API stays the same, don't forget to check for new features.
I got another nail in the coffin for my use of inplace 😬.
