Releases: delta-io/delta-rs

python-v0.5.4

17 Nov 08:35
8a0475c
  • Clean up expired delta table commit logs after checkpoint (#484)
  • Add authorization options for azure storage backend (#486)
  • Bump arrow to 6.1.0 (#494)
  • Add DeltaTableError in Python binding. Add markers for integration tests with pytest. (#496)
  • Change Rust edition from 2018 to 2021 (#490)
  • Add docs for ADLS Gen2. (#492)
  • Add gt, gte, lt and lte partition filters. (#478)
  • Fix python build (#487)
  • Try to fix flaky rename under Windows (#485)
  • Update azure crates (#474)
  • Update README.adoc (#482)
  • Fix documentation for the DeltaStorageHandler (#483)
  • Throw an error when filter key is not in partitioned columns. (#475)
  • Add GCS feature to the Python Cargo.toml file (#476)
  • Make file storage backend's atomic rename async (#471)
  • materialize tables in python via native storage backend (#463)
  • Fix coverage of the Python tests (#467)
  • Support hash lookup by path string for Remove action (#462)
  • Add new module for DeltaTableState (#464)
  • Avoid table stats override in datafusion extension. (#459)
  • Fix action reconciliation for add after remove (#456)
  • Add pool_idle_timeout options for s3 and sts clients (#458)
  • Generate new session name on assume role credentials provider refresh (#451)
  • return lazy iterator in get tombstone methods (#452)
  • Support no tombstone loading & new table builder API (#445)
  • Fix broken tombstones metadata when extended_file_metadata is different between tombstones in state (#450)
  • README: mark Checkpoint creation as done for Rust (#449)
  • Add maturin develop command with extra (#448)
  • Run all tests under s3 feature flag (#447)
  • Update datafusion links (#446)
  • Batch-apply remove actions in tombstone handling (#444)
  • Fixing test to compare sorted vec (#443)
  • Add delete_lock and fix release_lock (#440)

Credits:
Liang-Chi Hsieh, Robert Pack, Mykhailo Osypov, Florian Valeye, Thomas Vollmer, Yuan Zhou, roeap, Denny Lee, Kelvin S. do Prado, QP Hou, Thomas Peiselt, Bruno Bigras, Akshay Ghiya
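
The partition filter entries in python-v0.5.4 (#478 and #475) can be pictured with a small, library-independent sketch. It is not the delta-rs implementation: the function name `matches`, the operator symbols, and the (key, op, value) tuple shape are assumptions made for illustration, and values are compared as given rather than parsed from partition strings.

```python
import operator

# Comparison operators modelled in this sketch; the symbols are assumptions.
_OPS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(partition_values, filters, partition_columns):
    """Return True if one file's partition values satisfy every filter.

    partition_values: dict of column name -> value for one file
    filters: list of (key, op, value) tuples, combined with AND
    partition_columns: names of the table's partition columns
    """
    for key, op, value in filters:
        # Models "Throw an error when filter key is not in partitioned columns".
        if key not in partition_columns:
            raise ValueError(f"filter key {key!r} is not a partition column")
        if op not in _OPS:
            raise ValueError(f"unsupported operator {op!r}")
        if not _OPS[op](partition_values[key], value):
            return False
    return True


# Hypothetical file listing keyed by path, with already-typed partition values.
files = {
    "year=2019/a.parquet": {"year": 2019},
    "year=2020/b.parquet": {"year": 2020},
    "year=2021/c.parquet": {"year": 2021},
}
selected = sorted(
    path
    for path, values in files.items()
    if matches(values, [("year", ">=", 2020), ("year", "<", 2021)], ["year"])
)
print(selected)
```

Rejecting unknown keys up front, rather than silently matching nothing, makes a mistyped column name fail fast instead of returning an empty result.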

python-v0.5.3

21 Sep 10:14
40d3d90
  • Add history command in delta-rs (#428)
  • reenable datafusion integration with temporary fork (#436)
  • Decode path in Add and Remove actions. (#434)
  • Optimize remove action apply with early iteration exit #424 (#431)
  • Clean up DeltaTransactionError (#432)
  • Add is_non_acquirable field to the dynamodb lock (#429)
  • Expose valid primitive type list to public doc (#430)
  • Support partition value string deserialization for timestamp/binary (#371)
  • Bump arrow to 6.0.0-SNAPSHOT and bring map support to schema (#375)
  • Update README.adoc (#426)
  • Introduce DeltaConfig and tombstones retention policy (#420)
  • Sync Action attributes with delta (#380)
  • Add LICENSE file in the Python binding and refer it in the pyproject.toml (#422)
  • Change checkpoint creation logs from info to debug (#423)
  • Add the Glue Data Catalog for reading the DeltaTable (#419)
  • Add S3StorageOptions to allow configuring S3 backend explicitly (#418)
  • BUGFIX: writes to gcs must include the content length header
  • Ensures that all table schemas are of StructType (#415)
  • Fix reading nullable action fields from parquet (#417)
  • Add filesystem argument for reading DeltaTable in Python binding (#414)
  • Add implementation for load_with_datetime in Python package. (#411)
  • Add a Makefile build task in the Python binding (#410)
  • Use update_incremental in update (#398)
  • Use tokio::fs::rename in put_obj. (#403)
  • Update python readme (#406)
  • Update pyproject definition in pyproject.toml (#405)
  • Add examples for reading delta table with Rust API. (#400)
  • Implement delete_objs in fs and s3 storage backends. (#395)
  • Remove version param from create_checkpoint_from_table (#399)
  • Google cloud storage backend (#355)
  • added initial commit info on create method for a DeltaTable (#387)
  • Upgrade to DataFusion 5.0 (#389)
  • additional error handling to atomic_rename (#386)
  • Reuse table/storage instances in checkpoints (#384)
  • Add sts assume role creds for S3 (#383)
  • Update datafusion and ballista links in README (#382)
  • Merge Cargo.toml into pyproject.toml (#381)
  • Implement consistent behavior in Windows with regard to swap parameter. (#379)
  • Refactoring of black, isort, mypy tools usages into pyproject.toml (#378)
  • Wrap DeltaTransactionError with DeltaTableError. (#374)
  • Allow filesystem backend put_obj to overwrite existing (#376)
  • Make Format.options to be required field (#370)
  • Implement atomic put_obj. (#367)
  • support partition value string deserialization for float/double/date (#363)
  • Add '.tmp' suffix to temporary file of prepared commit (#366)
  • cache cargo builds in CI (#359)

python-v0.5.2

09 Aug 07:38
7dc0c6c
  • new update_incremental API for streaming table update
  • fix a bug in load_version method causing duplicated data @zijie0
  • fix crash on table load caused by null partition value @zijie0
  • support filtering on null partition value in table load predicate @zijie0

python-v0.5.1

20 Jul 02:09
5ad31d0
  • added columns argument to to_pyarrow_table method to support projections on PyArrow Table conversion @zijie0
  • added to_pandas shortcut method to convert a DeltaTable directly to pandas dataframe @bramrodenburg

rust-v0.4.0

19 Jul 00:20
  • added primitive writer API
  • added new DeltaTable method get_file_paths_by_partitions
  • added new DeltaTable method get_active_add_actions
  • added new DeltaTable method update_incremental
  • renamed log_bytes_from_actions to log_entry_from_actions
  • avoided clone in DeltaTable's get_files method
  • renamed camelCased fields to snake_case thanks to @nfx
  • added multi-writer support for S3 backend
  • optimized vacuum operation
  • added checkpoint writer
  • added s3-rustls feature
  • added checkpoint lambda function
  • made lock optional when creating s3 backend
  • added missing partition component in parquet path thanks to @viirya
  • started using delta statistics (row nums, null count, total bytes) in datafusion integration, thanks to @Dandandan @viirya
  • avoided unnecessary object head in DeltaTable's get_latest_version method thanks to @viirya
  • fixed file content write flush in fs storage backend

python-v0.5.0

02 Jun 19:59
d922b31
  • Optimize vacuum operation
  • Manage empty delta table in Python binding
  • Improve Python binding development
  • Add repair of failed/expired rename in S3 backend
  • Fix renaming of camelCased fields to snake_case
  • Introduce the primitive writer API which will put a file and create the "add" action
  • Add commit_version method to DeltaTransaction
  • Add tests and Err restructure for commit_version
  • Inline JSON action join when creating delta log entry
  • Avoid commit loop call from commit_version
  • Rename log_bytes_from_actions to log_entry_from_actions
  • Add get_file_paths_by_partitions in Python bindings

python-v0.4.8

10 May 07:31
a3b27b3
  • Bump the version of pyarrow needed for Python bindings
  • Honor AWS_REGION env var for S3 endpoint override
  • Add valueContainsNull for the MapType in DeltaTable schema
  • Add Vacuum command in DeltaTable
  • Add "element" as the arrow list field name
  • Add Date, StructArray, and Map for Pyarrow types in Python schema bindings
  • Make list object streams send

python-v0.4.7

28 Apr 19:14
e9dc51b
  • Add createdTime in Metadata
  • Update arrow dependency to 4.0.0
  • Add dry_run vacuum command
  • Accommodate AWS_ENDPOINT_URL in the python binding to allow alternatives to S3
  • Add support for hive style partitioning when reading a table with to_pyarrow_dataset
  • Allow to_pyarrow_table() to take an optional list of partitions
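
For readers unfamiliar with the hive style partitioning mentioned above, the convention encodes partition values in directory names of the form column=value. The sketch below shows one way such a path could be parsed. It is an illustration only: the helper name `parse_hive_partitions` is hypothetical, it is not how to_pyarrow_dataset is implemented, and the percent-decoding of values is an assumption.

```python
from urllib.parse import unquote


def parse_hive_partitions(path):
    """Extract {column: value} pairs from the directory parts of a path."""
    partitions = {}
    # The last segment is the file name, so only the directories are inspected.
    for segment in path.split("/")[:-1]:
        if "=" in segment:
            key, _, value = segment.partition("=")
            # Assumption: values may be percent-encoded in the path.
            partitions[key] = unquote(value)
    return partitions


print(parse_hive_partitions("year=2021/month=4/part-00000.parquet"))
```

Values come back as strings; converting them to typed values would need the table schema, which this sketch leaves out.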

python-v0.4.6

15 Apr 15:51
8aa9551
  • Add documentation for Python bindings
  • Change the default Delta timestamp to Nanoseconds for Arrow
  • Add pyarrow floating point in Python bindings
  • Add Metadata in Python bindings

python-v0.4.5

05 Apr 20:50
c6d49ae
  • Add the functionality of filtering partitions when reading a partitioned DeltaTable
  • Improve the Python documentation using docstring
  • Enable date column support for delta to arrow schema conversion
  • Fix the struct data type generation in the Python schema method
  • Add the schema when reading DeltaTable parquet files with pyarrow in Python
  • Support custom endpoint URL for S3 backend