Releases: epoch8/datapipe
v0.13.8
v0.13.7
- Add BytesFile adapter for TableStoreFiledir
- Add delete_stale argument to BatchGenerate
- Fix duplicated indices in BaseBatchTransformStep.get_full_process_ids
- Fix empty result in BaseBatchTransformStep.get_full_process_ids in a special case
- Fix pandas warning #286
- Fix SQLAlchemy 2.0 warnings
- Enable SQLAlchemy 2.0
- Optimize join in batch transform when an input has no keys intersecting with the transform keys
v0.13.6
- Add support for base64-encoded images in TableStoreFiledir PILFile adapter
v0.13.5
- Add create_engine_kwargs for DBConn
- Fix desc/asc order in batch transform when ordering by multiple columns
- Add logging of log_step_full for DatatableTransformStep
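The multi-column ordering fix above can be illustrated with a minimal sketch. This is not datapipe's implementation: the helper `order_clause` and the table `items` are hypothetical, and the sketch assumes that a single direction is meant to apply to every column in the ordering rather than only to the last one.

```python
import sqlite3


def order_clause(columns, order="asc"):
    """Build an ORDER BY clause that applies one direction to every column.

    Hypothetical helper for illustration only; not part of datapipe.
    """
    if order not in ("asc", "desc"):
        raise ValueError(f"unknown order: {order!r}")
    # Attach the direction to each column; a bare "a, b DESC" would sort
    # column a ascending and only column b descending.
    parts = [f'"{col}" {order.upper()}' for col in columns]
    return "ORDER BY " + ", ".join(parts)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 1), (2, 2)],
)

query = f"SELECT a, b FROM items {order_clause(['a', 'b'], 'desc')}"
rows_desc = conn.execute(query).fetchall()
conn.close()
```

With the direction attached to each column, both `a` and `b` are sorted descending, so the row with the largest `a` and largest `b` comes first.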
v0.13.4
- Fix TableStoreFiledir usage of auto_mkdir (enable only for "file://")
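A rough sketch of the idea behind this fix, under stated assumptions: fsspec's local filesystem accepts an `auto_mkdir` option, while other backends may not, so the option should be set only when the URL protocol is `file`. The helper name `filedir_fs_kwargs` and the example URLs are hypothetical and are not taken from datapipe.

```python
from urllib.parse import urlsplit


def filedir_fs_kwargs(url, fsspec_kwargs=None):
    """Return (protocol, kwargs), adding auto_mkdir only for local files.

    Hypothetical helper for illustration only; not part of datapipe.
    """
    kwargs = dict(fsspec_kwargs or {})
    # A path without a scheme is treated as a local file path here.
    protocol = urlsplit(url).scheme or "file"
    if protocol == "file":
        # Only the local filesystem gets the option; remote kwargs pass
        # through unchanged.
        kwargs.setdefault("auto_mkdir", True)
    return protocol, kwargs


local = filedir_fs_kwargs("file:///tmp/data/{id}.json")
remote = filedir_fs_kwargs("s3://bucket/{id}.json", {"anon": True})
```

Here `local` carries `auto_mkdir=True`, while `remote` keeps only the keyword arguments the caller supplied.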
v0.13.3
- Fix TableStoreFiledir ignoring fsspec_kwargs
- Added dropna and idx check for TransformMetaTable
v0.13.2-post.1
- Allow pandas >= 2 and numpy >= 1.21
v0.13.2
- Add GPU support for RayExecutor
- Add auto_mkdir to TableStoreFiledir, fixes issues with local filedir
- Add Python 3.11 support
v0.13.1
- Add api_key to QdrantStore constructor. Pipelines can now run with Qdrant authentication
- Fix TableStoreDB.update_rows method crashing when trying to store pandas none-types
v0.13.0
Changes
Core
- Add datapipe.metastore.TransformMetaTable. Now each transform gets its own meta table that tracks the status of each transformation
- Generalize BatchTransform and DatatableBatchTransform through BaseBatchTransformStep
- Add transform_keys to *BatchTransform
- Move changed idx computation out of DataStore to BaseBatchTransformStep
- Add column priority to transform meta table, sort work by priority
- Switch from vanilla tqdm to tqdm_loggable for better display in logs
- TableStoreFiledir constructor accepts new argument fsspec_kwargs
- Add filters, order_by, order arguments to *BatchTransformStep
- Add magic injection of ds, idx, run_config into the BatchTransform transform function via parameter introspection
- Add magic ds injection into BatchGenerate
- Split core_steps into step.batch_transform, step.batch_generate, step.datatable_transform, step.update_external_table
- Move metatable.MetaTable to datatable
- Enable WAL mode for sqlite database by default
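The "magic injection" of ds, idx and run_config listed above works through parameter introspection. The following is a minimal sketch of that general technique using the standard `inspect` module; it is not datapipe's code, and `call_with_injection` and both transform functions are hypothetical names used only for illustration.

```python
import inspect


def call_with_injection(func, inputs, *, ds, idx, run_config):
    """Call func with its inputs, passing ds/idx/run_config only if declared.

    Hypothetical helper for illustration only; not part of datapipe.
    """
    available = {"ds": ds, "idx": idx, "run_config": run_config}
    # Look at the function's declared parameter names.
    params = inspect.signature(func).parameters
    # Inject only the context values the function actually asks for.
    injected = {name: value for name, value in available.items() if name in params}
    return func(*inputs, **injected)


def plain_transform(df):
    # Declares no context parameters, so nothing is injected.
    return [x * 2 for x in df]


def transform_with_context(df, idx, run_config):
    # Declares idx and run_config, so both are injected by name.
    return [x for x in df if x in idx], run_config


context = dict(ds="store", idx=[1, 3], run_config={"mode": "test"})
plain_result = call_with_injection(plain_transform, [[1, 2, 3]], **context)
context_result = call_with_injection(transform_with_context, [[1, 2, 3]], **context)
```

The design choice this illustrates is that existing transform functions keep working unchanged, while a function opts in to extra context simply by naming the parameter.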
CLI
- Add step reset-metadata CLI command
- Add step fill-metadata CLI command that populates transform meta-table with all indices to process
- Add step run-idx CLI command
- CLI step run_changelist command accepts new argument --chunk-size
- New CLI command table migrate_transform_tables for 0.13 migration
- Add --start-step parameter to step run-changelist CLI
- Move --executor parameter from datapipe step to datapipe command
Execution
- Executors: datapipe.executor.SingleThreadExecutor, datapipe.executor.ray.RayExecutor
Bugfixes
- Fix QdrantStore.read_rows when no idx is specified
- Fix RedisStore serialization for Ray