v0.13.0-alpha.4
Pre-release
elephantum released this 19 Jul 20:26 · 408 commits to master since this release
WIP 0.13.0
Changes
Core
- Add `datapipe.metastore.TransformMetaTable`. Each transform now gets its own meta table that tracks the status of each transformation
- Generalize `BatchTransform` and `DatatableBatchTransform` through `BaseBatchTransformStep`
- Add `transform_keys` to `*BatchTransform`
- Move changed idx computation out of `DataStore` to `BaseBatchTransformStep`
- Add column `priority` to transform meta table, sort work by priority
- Switch from vanilla `tqdm` to `tqdm_loggable` for better display in logs
- `TableStoreFiledir` constructor accepts new argument `fsspec_kwargs`
- Add `filters`, `order_by`, `order` arguments to `*BatchTransformStep`
- Add magic injection of `ds`, `idx`, `run_config` into the transform function via parameter introspection
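
The injection by parameter introspection mentioned in the last item can be illustrated with a minimal sketch. This is not datapipe's implementation: `call_with_injection` and both sample transform functions are hypothetical names, and the sketch only assumes that the caller holds values named `ds`, `idx` and `run_config` and passes each one only to functions that declare a parameter with that name.

```python
import inspect
from typing import Any, Callable


def call_with_injection(func: Callable[..., Any], *args: Any, **available: Any) -> Any:
    """Call func with args, adding only those available values that func declares by name.

    Hypothetical helper for illustration; not part of datapipe.
    """
    params = inspect.signature(func).parameters
    # A function that declares **kwargs can receive every available value.
    accepts_var_kw = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    injected = {
        name: value
        for name, value in available.items()
        if accepts_var_kw or (name in params and params[name].kind in keyword_kinds)
    }
    return func(*args, **injected)


def plain_transform(rows):
    # Declares none of the injectable names, so it is called with the data only.
    return [r * 2 for r in rows]


def aware_transform(rows, idx, run_config):
    # Declares idx and run_config, so both are injected; ds is not requested.
    return [(i, r, run_config["label"]) for i, r in zip(idx, rows)]


# Stand-in values; in a real pipeline these would come from the framework.
context = {"ds": object(), "idx": [10, 11], "run_config": {"label": "demo"}}

plain_result = call_with_injection(plain_transform, [1, 2], **context)
aware_result = call_with_injection(aware_transform, [1, 2], **context)
```

With this approach existing transform functions keep working unchanged, and a function opts in to extra context simply by naming the parameter it needs.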
CLI
- Add `step reset-metadata` CLI command
- Add `step fill-metadata` CLI command that populates transform meta-table with all indices to process
- Add `step run-idx` CLI command
- CLI `step run_changelist` command accepts new argument `--chunk-size`
- New CLI command `table migrate_transform_tables` for `0.13` migration
Execution
- Executors: `datapipe.executor.SingleThreadExecutor`, `datapipe.executor.ray.RayExecutor`
Deployment
- Add helm chart for running regular loops in k8s as `CronJob`
Bugfixes
- Fix `QdrantStore.read_rows` when no idx is specified