Releases: epoch8/datapipe

v0.13.8

16 Jan 07:30
c4f248c

v0.13.7

02 Jan 18:16
  • Add BytesFile adapter for TableStoreFiledir
  • Add delete_stale argument to BatchGenerate
  • Fix duplicated indices in BaseBatchTransformStep.get_full_process_ids
  • Fix empty result in BaseBatchTransformStep.get_full_process_ids in a special case
  • Fix pandas warning #286
  • Fix SQLAlchemy 2.0 warnings
  • Enable SQLAlchemy 2.0
  • Optimize join in batch transform when an input shares no keys with the transform

v0.13.6

01 Nov 15:05
47bf44b
  • Add support for base64 encoded images in TableStoreFiledir PILFile adapter
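
As a rough illustration of what handling a base64 encoded image involves, the sketch below decodes a base64 string (optionally wrapped in a data URI) back into raw image bytes using only the standard library. The helper name decode_base64_image and the data URI handling are assumptions made for this example; they are not taken from the datapipe PILFile adapter, whose actual behavior may differ.

```python
import base64
import binascii

# First eight bytes of every PNG file, used here as stand-in image data.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_base64_image(payload: str) -> bytes:
    """Return raw image bytes from a base64 string or a data URI (hypothetical helper)."""
    # Drop an optional "data:image/png;base64," style prefix.
    if payload.startswith("data:"):
        _, _, payload = payload.partition(",")
    try:
        return base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("payload is not valid base64") from exc


# Round trip with a fake PNG header standing in for a real image.
raw = PNG_SIGNATURE + b"fake-image-body"
encoded = base64.b64encode(raw).decode("ascii")
assert decode_base64_image(encoded) == raw
assert decode_base64_image("data:image/png;base64," + encoded) == raw
```

The decoded bytes could then be wrapped in io.BytesIO and handed to an image library.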

v0.13.5

20 Oct 17:49
  • Add create_engine_kwargs for DBConn
  • Fix desc/asc order in batch transform when ordering by multiple columns
  • Add logging of log_step_full for DatatableTransformStep

v0.13.4

20 Sep 20:47
  • Fix TableStoreFiledir usage of auto_mkdir (enable only for "file://")
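
The rule described in this fix can be sketched as follows: auto_mkdir is added to the filesystem options only when the path uses the "file://" protocol, since remote filesystems may not accept that option. The function build_fs_kwargs is hypothetical and written for this example only; it does not reproduce the datapipe implementation.

```python
from typing import Any, Dict, Optional


def build_fs_kwargs(path: str, fsspec_kwargs: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return filesystem kwargs, enabling auto_mkdir only for file:// paths (hypothetical helper)."""
    # Copy so the caller's dict is never mutated.
    kwargs = dict(fsspec_kwargs or {})
    if path.startswith("file://"):
        # Respect an explicit user choice if one was already given.
        kwargs.setdefault("auto_mkdir", True)
    return kwargs


assert build_fs_kwargs("file:///tmp/data/item.json") == {"auto_mkdir": True}
assert build_fs_kwargs("s3://bucket/item.json", {"anon": True}) == {"anon": True}
```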

v0.13.3

20 Sep 11:30
  • Fix TableStoreFiledir ignoring fsspec_kwargs
  • Add dropna and idx check for TransformMetaTable

v0.13.2-post.1

19 Sep 14:16
ed23b40
  • Allow pandas >= 2 and numpy >= 1.21

v0.13.2

19 Sep 08:35
  • Add GPU support for RayExecutor
  • Add auto_mkdir to TableStoreFiledir, fixes issues with local filedir
  • Add Python 3.11 support

v0.13.1

05 Sep 16:11
d1a9fb0
  • Add api_key to QdrantStore constructor. Pipelines can now run against Qdrant with authentication
  • Fix TableStoreDB.update_rows method crashing when trying to store pandas none-types

v0.13.0

02 Sep 08:29

Changes

Core

  • Add datapipe.metastore.TransformMetaTable. Each transform now gets its own
    meta table that tracks the status of each transformation
  • Generalize BatchTransform and DatatableBatchTransform through
    BaseBatchTransformStep
  • Add transform_keys to *BatchTransform
  • Move changed idx computation out of DataStore to BaseBatchTransformStep
  • Add column priority to transform meta table, sort work by priority
  • Switch from vanilla tqdm to tqdm_loggable for better display in logs
  • TableStoreFiledir constructor accepts new argument fsspec_kwargs
  • Add filters, order_by, order arguments to *BatchTransformStep
  • Add magic injection of ds, idx, run_config to transform function via
    parameters introspection to BatchTransform
  • Add magic injection of ds into BatchGenerate
  • Split core_steps into step.batch_transform, step.batch_generate,
    step.datatable_transform, step.update_external_table
  • Move metatable.MetaTable to datatable
  • Enable WAL mode for sqlite database by default
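
The "magic injection" of ds, idx, and run_config mentioned in the list above relies on parameter introspection: the caller inspects the transform function's signature and passes only the extra arguments the function declares. The sketch below shows the general technique with inspect.signature. The helper call_with_injection and the placeholder values are assumptions for illustration and are not the datapipe implementation.

```python
import inspect
from typing import Any, Callable


def call_with_injection(func: Callable[..., Any], *args: Any, **available: Any) -> Any:
    """Call func, passing only those keyword values its signature names (hypothetical helper)."""
    params = inspect.signature(func).parameters
    injected = {name: value for name, value in available.items() if name in params}
    return func(*args, **injected)


def plain(df):
    # Declares no extra parameters, so nothing is injected.
    return df


def with_context(df, idx, run_config):
    # Declares idx and run_config, so both are injected; ds is not.
    return (df, idx, run_config)


# Placeholder values standing in for a real datastore, index, and run config.
context = {"ds": "datastore", "idx": [1, 2], "run_config": {"debug": True}}
assert call_with_injection(plain, "df", **context) == "df"
assert call_with_injection(with_context, "df", **context) == ("df", [1, 2], {"debug": True})
```

With this approach, existing transform functions that take only their input frames keep working unchanged, while new ones opt in simply by naming the parameter.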

CLI

  • Add step reset-metadata CLI command
  • Add step fill-metadata CLI command that populates transform meta-table with
    all indices to process
  • Add step run-idx CLI command
  • CLI step run-changelist command accepts new argument --chunk-size
  • New CLI command table migrate_transform_tables for 0.13 migration
  • Add --start-step parameter to step run-changelist CLI
  • Move --executor parameter from datapipe step to datapipe command

Execution

  • Executors: datapipe.executor.SingleThreadExecutor,
    datapipe.executor.ray.RayExecutor

Bugfixes

  • Fix QdrantStore.read_rows when no idx is specified
  • Fix RedisStore serialization for Ray