Releases
v0.13.0
Changes
Core
Add datapipe.metastore.TransformMetaTable. Now each transform gets its own meta table that tracks the status of each transformation
Generalize BatchTransform and DatatableBatchTransform through BaseBatchTransformStep
Add transform_keys to *BatchTransform
Move changed idx computation out of DataStore to BaseBatchTransformStep
Add column priority to transform meta table, sort work by priority
Switch from vanilla tqdm to tqdm_loggable for better display in logs
TableStoreFiledir constructor accepts new argument fsspec_kwargs
Add filters, order_by, order arguments to *BatchTransformStep
Add magic injection of ds, idx, run_config into the transform function of BatchTransform via parameter introspection
Add magic injection of ds into BatchGenerate
Split core_steps into step.batch_transform, step.batch_generate, step.datatable_transform, step.update_external_table
Move metatable.MetaTable to datatable
Enable WAL mode for sqlite database by default
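
The injection by parameter introspection mentioned above for BatchTransform can be pictured with a short standard-library sketch. This is only an illustration of the general technique, not datapipe's actual implementation: the helper call_with_injection, the two sample transform functions, and the placeholder values in context are all hypothetical names introduced here. The idea is that the caller inspects the function signature and passes ds, idx, or run_config only when the function declares a parameter with that name.

```python
import inspect
from typing import Any, Callable


def call_with_injection(func: Callable[..., Any], *args: Any, **available: Any) -> Any:
    """Call func, passing only those `available` values whose names appear
    in func's signature. Hypothetical helper, not part of datapipe's API."""
    params = inspect.signature(func).parameters
    injected = {name: value for name, value in available.items() if name in params}
    return func(*args, **injected)


# A transform that declares `idx`, so only `idx` is injected.
def transform_with_idx(rows, idx):
    return [row for row in rows if row in idx]


# A transform that declares nothing extra, so nothing is injected.
def plain_transform(rows):
    return [row * 2 for row in rows]


# Placeholder values standing in for the real ds / idx / run_config objects.
context = {"ds": "datastore-placeholder", "idx": {1, 3}, "run_config": None}

filtered = call_with_injection(transform_with_idx, [1, 2, 3], **context)
doubled = call_with_injection(plain_transform, [1, 2, 3], **context)
```

With this approach, existing transform functions that take only their input data keep working unchanged, while functions that need extra context opt in simply by naming the parameter.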
CLI
Add step reset-metadata CLI command
Add step fill-metadata CLI command that populates transform meta-table with all indices to process
Add step run-idx CLI command
CLI step run_changelist command accepts new argument --chunk-size
New CLI command table migrate_transform_tables for 0.13 migration
Add --start-step parameter to step run-changelist CLI command
Move --executor parameter from datapipe step to datapipe command
Execution
Executors: datapipe.executor.SingleThreadExecutor, datapipe.executor.ray.RayExecutor
Bugfixes
Fix QdrantStore.read_rows when no idx is specified
Fix RedisStore serialization for Ray