Changelog

Untagged

4.5.0 (2024-03-18)

Features

Support GCS files in configs list command (#1108) (b49e1c3)

Bug Fixes

Add table names to report results when source and/or target dataframes are empty (#1104) (812ed62)
Fixes issue casting Snowflake decimal with scale>0 to string (#1110) (34446a4)
force cast for aggregates (#1114) (44b60cf)
Teradata large timestamp handling (#1117) (842d8b7)

4.4.0 (2024-02-22)

Features

Add --url to Oracle connections add options (#1083) (2f078c2)
Add PostgreSQL OID support (#1076) (58f8fcb)
Add support to generate a JSON config file only for application purposes (#1089) (d463038)
set default oracle sql alchemy arraysize to 500 (#1088) (1672ac5)
Support for Kubernetes (#1058) (fdbdbe0)

Bug Fixes

Add support for cx_Oracle's DB_TYPE_LONG_RAW (#1095) (90547ef)
Better casts to string for binary floats/doubles (#1078) (15bfc4c)
case-insensitive comparison field support (#1103) (d28786f)
Fix merge issue for Teradata empty dataframes (#1100) (cc91fa2)
increase upper limit on recursion columns (#1090) (c599ebf)
Remove DDL automatically issued by Ibis for Postgres connections (#1067) (c2b660b)
Row validation primary key columns >64bit int/float are cast to string (#1080) (9e70e9e)
Spanner generate-partition to use BQ dialect (#1066) (f3cc565)
spanner hash function to return string instead of bytes (#1062) (722dff9)

Documentation

Add Airflow Kubernetes pod operator samples (#1087) (7d5ea91)
Updates on nested column limitations, contributing guide examples and incorrect example (#1082) (cc0f60a)

4.3.0 (2023-11-28)

Features

Adding Exclude columns flag for aggregations in column validations (#961) (faa32dc)
support query parameter for MSSQL connection (#1026) (48b0355)

Bug Fixes

--dry-run for SQLAlchemy clients with valid raw SQL (#1047) (c1e0e34)
Add Spanner RawSQL operation to enable filtering (#1054) (3a01503)
Adding credentials as parameter for Spanner (#1031) (367658e)
Adjust find-tables to properly get Oracle and Postgres schemas (#1034) (45fb40a)
Cast should treat nullable and non-nullables as the same (#1037) (5e5c5eb)
Fix --grouped-columns issue for Oracle validation (#1050) (3473a27)
Fix decimal separator to "." (dot) on Oracle (#1042) (14cc7ef)
Teradata SSLMODE issue fix (#1014) (e7aab6b)

Documentation

Add CLOB to Oracle BLOB validation document (#1029) (8c76c1b)
Update connections.md to add supported version of DB2 (#1030) (44b4be7)

4.2.0 (2023-09-28)

Features

Add more mappings to the allowlist configuration files for Oracle schema validations (#953) (0fed588)
Include date columns for min/max/sum validations (#984) (6de9921)
Include date columns in scope of wildcard_include_timestamp option (#989) (a4cf773)
Support BQ decimal precision and scale for schema validation (#960) (b1d4942)
Support standard deviation for column agg (#964) (bb81701)

Bug Fixes

Add exception handling for invalid value to cast a comparison field (#957) (703ca75)
Add missing SnowflakeDialect mapping for BINARY data type (#959) (9ad529a)
Add not-null string to accepted date types in append_pre_agg_calc_field() (#980) (76fcfc6)
Adjust setup of the random row batch size default value, which remains 10,000 (#986) (a20ccab)
custom query row validation failing when SQL contains upper cased columns (#994) (a9fed41)
Fix warning and precision detection when target precision higher than source (#965) (5f00ce1)
generate-table-partitions: fixes Issue 945 and Issue 950 (#962) (c53f2fc)
Prevent failure of column validation config generation if source column other than allow-list not present in target table. (#974) (40a073e)
Prevent Oracle blob throwing exceptions during column validation (#1005) (8df1cfa)
support for case insensitive PKs and Snowflake random row (#998) (1a157ae)
support for null columns, support for access locks (#976) (f54bb4d)
yaml validation files in gcs (#977) (bf0fa0a)

Documentation

Add a new sample code for row hash validation of Oracle BLOB (#997) (0bd48a2)

4.1.0 (2023-08-18)

Features

support timestamp aggregation for Oracle and TD (#941) (911bae8)

Bug Fixes

Issues with validate column for time zoned timestamps (#930) (ee7ae9a)
Schema validations ignore not null on Teradata and BigQuery (#935) (936744b)
Support casting TD PKs to VARCHAR (#946) (2171532)

4.0.0 (2023-08-02)

⚠ BREAKING CHANGES

Ibis Upgrade to 5.1.0 (#894)
Partition based on non-numeric and multiple keys (#889)

Features

Adding Random-Row support for Custom Query (#891) (fc42c61)
Adding RawSQL function for Redshift (#903) (c25d690)
Enhance validate schema to support time zoned timestamp columns (#919) (aed1505)
generate-table-partitions: Works on all 7 platforms - BigQuery, Hive, MySQL, Oracle, Postgres, SQL Server and Teradata. (#922) (aa84d7a)
Ibis Upgrade to 5.1.0 (#894) (b5db4c0)
Partition based on non-numeric and multiple keys (#889) (7b6a530)
Snowflake support (#921) (e1d590b)
Support allow list decimals having a range for precision and scale. Also add --allow-list-file. (#888) (7783beb)

Bug Fixes

Adding date and timestamp formatting for Hive (#876) (65a090a)
Adding enhancements to allow-list in schema validation (#881) (c83df2b)
Adding UTF encoding for Oracle hash generation (#878) (2e24eae)
No column filtering for csv/json text output. Reverts part of change for issue 753 (#890) (ba641e0)
redshift bug for custom query (#911) (f1018b5)
teradata NUMBER with no precision/scale, small doc fix after Ibis upgrade (#914) (f9db68f)
validate column sum/min/max issue for decimals with precision beyond int64/float64 (#918) (5a8d691)

Documentation

Add sample shell script and documentation to execute validations at a BigQuery dataset level (#910) (a84da45)

3.2.0 (2023-05-31)

Features

Add --dry-run option to validate. (#778) (8989350)
Add Impala flags for http_transport and http_path (#829) (d966b9e)
Add support for SQL Server's IMAGE, BINARY, VARBINARY, NCHAR, NTEXT, NVARCHAR data types (#859) (6ebece3)
Add support for SQL Server's MONEY data type (#837) (0749c9e)
Move source credentials to secret manager (#824) (1dd5fea)
Redshift integration for Normal row and Custom-Query Validation. (#817) (92ab215)

Bug Fixes

Add missing operations for SQL Server - ExtractEpochSeconds, ExtractDayOfYear, ExtractWeekOfYear (#870) (709dd4c)
Adding datetime and timestamp format logic (#840) (eb095c9)
dry-run bug when running configs, added CODEOWNERS, and docs (#865) (1779772)
handle numeric datatype mapping in teradata schema and fix int mapping as per teradata doc (#874) (333eadb)
split connection names from second last period instead of first from front (#864) (1462deb)
Support for sum/min/max included for oracle number greater than int64 (#809) (73bda66)

Documentation

Fix typos on README (#801) (14ddcc5)
update installation guide about Python 3.11 (#815) (88cd281)
Update our documentation about find-tables command and the score-cutoff parameter (#846) (54403e3)

3.1.0 (2023-04-21)

Features

add db2 hash and concat support (#800) (c16e2f7)
add Impala connection optional parameters (#743) (#790) (414d7f8)
added source_type in output while listing connections list (#803) (056275b)
Adding Custom-Query support for DB2. (#807) (a8085d3)
Option for simpler report output grid (#802) (b92eb91)

Bug Fixes

MySQL fix to support row hash validations, random row validation, and filter (#812) (ae07fa4)
schema validation fixes for Oracle/SQL Server float64 and SQL Server datetimeoffset (#796) (ad0e64f)

Documentation

add README for Airflow DAG sample, update code formatting in other docs (#722) (f4c3241)
score-cutoff changed to 1 (#779) (d3aabca)

3.0.0 (2023-03-28)

⚠ BREAKING CHANGES

issue673 optimize CLI tools arg parser (#701)

Features

✨ Add support for source/target inline sql queries for validate custom-query command (#734) (c5e7a37)
GCP Secret Manager support for DVT (#704) (d6c40f1)
ibis_bigquery strftime support for DATETIME columns (#737) (b1141de)

Bug Fixes

Add support for numeric and precision with length and precision in Postgres Custom Query (#723) (742b77e)
Adding Decimal datatype support for MSSQL custom query validation (#771) (0d5c5eb)
Better detection of Oracle client (#736) (efce0b8)
Cater for query driven comparisons in date format override code (#733) (0a22643)
issue 740 teradata strftime function (#747) (9fd102a)
issue673 optimize CLI tools arg parser (#701) (26bb8e9)
Protect column and row validation calculated column names from Oracle 30 character identifier limit (#749) (89413c1)
remove secret manager warnings (#781) (7e72bfd)

Documentation

formatting fixes and fix broken link (#739) (7306dfc)
oracleuserpriv (#746) (a7889bf)

2.9.0 (2023-02-16)

Features

Added Partition support to generate multiple YAML config files (#653) (Issue #619,#662) (f79c308)
added run_id to output (#708) (17720f2)
Divert cast of PostgreSQL decimal with scale>0 to to_char (#721) (3542851)
Use centralized date/time format in order to compare row data across engines (#720) (0de823b)

Bug Fixes

Error handling for batch processing of config files (#663) (21a26af)
Protect non-date columns from astype(str) date workaround (#726) (489ee27)
schema validation fix for different base names of source and destination data types (#710) (d7b44b0)

Documentation

updated Oracle parameter from user_name to user and changed underscores to hyphens across the document (#689) (8777e00)

2.8.0 (2023-01-19)

Features

Logic to add allow-list to support datatype matching with a provided list in case of mismatched datatypes between source and target (#643) (269f8dc)

Bug Fixes

making logmech as optional for TD connection (#665) (500caa3)

2.7.0 (2023-01-06)

Features

Add AlloyDB support (#645) (cfedc22)
Add Integration test for Oracle (#651) (de3bbcc)
Added custom query support for Oracle (#646) (3f8771a)
Added custom query support for PostgreSQL (#644) (88dcfd3)
extend TO_CHAR to cover date, time and timestamp types (#641) (e0c184f)
SQL Server custom query support (#640) (98ab010)
Support config directory for running validations and add multithreading for DB queries (#654) (c67b51a)
Support custom calculated fields (#637) (14b506b)

2.6.0 (2022-11-28)

Features

add random row support for MSSQL (#633) (3041bd1)
to_char support for Oracle and Postgres (#632) (78f1ce9)

Bug Fixes

bare data-validation command throws exception (#627) (7595c50)
column validation casing to allow for case-insensitive match (#626) (c694357)

2.5.0 (2022-10-18)

Features

adding scaffold for concatenate as a cli operation (#566) (ec4ef33)

Bug Fixes

Custom query validation throwing error with sql files ending with semicolon(;) (#591) (16a89ac)
Row validation optimization to avoid select all columns (#599) (de3758e)
update function to return non-unicode string (#615) (e334c65)

2.4.0 (2022-10-06)

⚠ BREAKING CHANGES

Add Python 3.10 support (#564)

Features

Add Python 3.10 support (#564) (38284a5)
New flag to filter results by status in all supported validations (#593) (97e8bb0)
Oracle random row (#588) (ac3460a)
Postgres row hash validation support (#589) (01765b3)

Miscellaneous Chores

release 2.4.0 (#600) (b704505)

2.3.0 (2022-09-15)

Features

Addition of log level as an argument for DVT logging and replac… (#577) (dbd9bc3)
Oracle row level validation support (#583) (489654c)

Bug Fixes

Add RawSQL support for Postgres and SQL Server (#576) (0693782)
fixing String to varchar for teradata (a979931)
random rows with filter option (#582) (da4faaf)
support NUMBER with no precision/scale (#572) (03219ba)
Teradata limit on column name, bug when casting to VARCHAR (#580) (c8700be)

Documentation

remove snowflake, add row supported DBs (#587) (1d923f5)

2.2.0 (2022-08-29)

⚠ BREAKING CHANGES

Added teradata custom query support (#547)

Features

Added teradata custom query support (#547) (97c3203)
Improve schema validation debugging, Support DATE for Hive validations (#558) (e67de5b)
Support for MSSQL row validation (#570) (61dabe0)

Bug Fixes

Issue422 replace print with logging (#543) (78222b4)

Miscellaneous Chores

release 2.2.0 (#571) (c29b4c1)

2.1.0 (2022-07-14)

Features

new flag to exclude columns from schema validation (#507) (53ac41a)
Remove dependency on tables list for custom query (#541) (7dca5bd)

Bug Fixes

added new result columns to schema validation (#512) (478bb2d)
close Teradata connection via object delete (#524) (181b865)
editing contributing.md (#509) (c01d730)
fixing teradata doc (#513) (6a10356)
issue-256-bug fixes to generate docker file (#531) (adc528e)
issue-256-Release docker image for dvt repo (#527) (e3d42cc)
issue-256-Release docker image for dvt repo (#529) (e87d0ef)
Oracle support for decimals (#530) (0d73207)
primary key casting (#521) (1a7667b)
support for cast to timestamp in TD, support for random row (#538) (f7ed739)

Documentation

fix typo on ibis_snowflake (#516) (de8a4bd)
supported hive version (#515) (923d4ff)

2.0.1 (2022-06-10)

Bug Fixes

Make schema validation column name comparison case-insensitive (#500) (ee8c542)

2.0.0 (2022-05-26)

⚠ BREAKING CHANGES

Add 'primary_keys' and 'num_random_rows' fields to result handler (#372)

Features

Add 'primary_keys' and 'num_random_rows' fields to result handler (#372) (b123279)
add a new DAG example to run DVT (#485) (e3dd7ed)
adding impala random function (#483) (93d2072)
Enable sum/avg/bit_xor for BigQuery datetime type (#488) (083de07)

Documentation

Alpha-order Connection Types (#491) (39e0dd8)
GA README updates (#492) (b63ef3b)

1.7.2 (2022-05-12)

⚠ BREAKING CHANGES

Adds custom query row level hash validation feature. (#440)

Features

Add example of BigQuery cast to NUMERIC, update chore release version (#476) (50fac28)
Adds custom query row level hash validation feature. (#440) (f057fe8)
Issue356 db2 test (#383) (70fb7bc)
Support cast to BIGINT before aggregation (#461) (ca598a0)
support float and decimal types in Hive (#470) (5936f60)

Bug Fixes

add get_ibis_table_schema (#410) (#411) (4093625)
only replaces datatypes and not column names (#453) (6143794)
supports NULL datetime/timestamps, fixes bug with validation_status in PR 455 (#460) (57896f4)
Updated schema validation logic to column as 'validation_status' (#455) (e30c337)
updating teradata docs for sha256 UDF and swapping string_join for concat (#457) (23dbf56)

1.7.1 (2022-04-14)

⚠ BREAKING CHANGES

Changed result schema 'status' column to 'validation_status' (#420)

Features

added timestamp to supported types for min and max (#431) (e8b4860)
Allow aggregation over length of string columns (#430) (201f0a2)
Changed result schema 'status' column to 'validation_status' (#420) (dfcd0d5)
hash filter support (#408) (46b3723)
Hash selective columns (#407) (88b6620)
Implement sum/avg/bit_xor aggs for Timestamp (#442) (51f3af3)
improve postgres tests (#443) (6a54527)
Random Sort for Pandas Queries (#404) (2051039)
Support for custom query (#390) (7a218d2)

Bug Fixes

bug introduced with new pr (#429) (a6cf3f0)
Hash all bug, noxfile updates (#413) (fc73e21)
Hive boolean nan to None, Unsupported ibis data types in structs and arrays (#444) (e94a1da)
ibis default sql option limits query results at 10k rows (#418) (7539efe)
Impala strings/objects now return None instead of NaN (#406) (9d3c5ec)
issue 265 add cloud spanner functionality (#394) (783cdf8)
support labels for schema validation (#260) (#381) (f787701)
Treat both source and target values being NULL as a success (#437) (c4da5ca)

Miscellaneous Chores

release 1.7.1 (#446) (99916ba)

1.7.0 (2022-03-23)

Features

deploy flask app via CLI (#344) (b1dc82a)
first class support for row level hashing (#345) (3d78ee5)
GCS support for validation configs (#340) (b09cd29)
Hive hash function support (#392) (0ca0ccf)
Hive partitioned tables support (#375) (8f1af27)
Issue339 ldap logmech (#347) (ad7f1fc)
Random Row Validation Logic (#357) (229d870)

Bug Fixes

add to_hex for bigquery hash (#400) (e5c7ded)
Comparison fields Key Error fix (#396) (a597b56)
ensure all statuses are success or fail, particularly after _join_pivots (#329) (#370) (310747d)
make status values consistent across validation types (#377) (#378) (5c08463)
Multiple updates (#359) (6b2614d)
revert change from #345 that causes filters, threshold and labels to be ignored for column validations (#376) (#379) (8b295cf)
Status when source and target agg values are 0 (#393) (6a41f68)
support schema validation for more clients (#355) (#380) (ed46295)
supporting non default schemas for mssql (#365) (100b3ea)
test for nan when calculating fail/success in combiner (#341) (#366) (a9720c2)
use an appropriate column filter list for schema validation (#350) (#371) (806151a)

Documentation

Add Hive as a supported data source to docs (#354) (be2a49d)

1.6.0 (2021-12-01)

Features

teradata hashing implementation (#324) (b74e03e)

Bug Fixes

Include StringIO into teradata ibis compiler.py (#336) (1dba63b)
Issue348 casting (#349) (1560c7e)

Documentation

add local development nox docs (#342) (80d26c6)

1.5.0 (2021-10-19)

Features

added kerberos service name flag for Impala connections, fixed bug in row validation with YAML (#320) (351994c)
Track DVT GCS connections (#326) (b384b1f)

Bug Fixes

Issue323 row hash (#328) (1a03ad7)

Documentation

add new release process (#332) (6015127)
Added python install commands (#264) (0936d84)

1.4.0 (2021-09-30)

Features

add state manager client (#311) (e893ea5)
Allow user to specify a format for stdout (#242) (#293) (f0a9fa1)
Allow user to specify a format for stdout T2 (#242) (#296) (ec1af22)
cast aggregates (#306) (e3da4c3)
Issue262 impala connect (#281) (eaa052f)
logic to deploy dvt on Cloud Run (#280) (9076286)
promote 3.9 to main version (as it is in Cloudtops now for local testing) and add a small unit test for personal use (#292) (eb0f21a)
Refactor CLI to fit Command Pattern (#303) (f6d2b9d)
Updated Cloud Functions sample (#297) (923413d)

Bug Fixes

updated code so that BQ target schema would not set to None for FileSystem to BQ validations (#309) (5016d65)

1.3.2 (2021-06-29)

Documentation

add secrets logic to ci (#273) (3c21ee5)
Issue263 Installation doc updates (#270) (0328c0e)

1.3.1 (2021-06-28)

Documentation

clean setup (#272) (08d393b)
Update docs with examples (#261) (fd90096)

1.3.0 (2021-06-28)

Features

add table matching score as a param in case adjustment is needed (#267) (b02aed5)
CI/CD Release to PyPi via Cloud Build (#258) (0870fc7)

Bug Fixes

correct issues blocking impala and hive (#266) (5110d1f)

1.2.0 (2021-05-27)

Features

add data source for Cloud Spanner (#206) (c63f68e)
added an optional beta flag in CLI (#249) (e8e75de)
Added FileSystem connection type (#254) (be7824d)

Bug Fixes

Cli tools bug fix (#253) (b41e625)
Remove JSON arguments in CLI (#247) (5a309f7)
Update connections.md (#248) (9c1ae40)
Update Readme.md (#257) (c968024)

1.1.8

Adding and documenting find-tables CLI feature with schema filter
Correct filter errors caused by SQL Alchemy errors
Adding beta calculated fields logic

1.1.7

Adding tests to validate BIGNUMERIC BQ type behavior

1.1.6

Minor fix for Teradata client after breaking Ibis changes

1.1.5

Add support for running raw queries against a connection
Upgraded Ibis to v1.4 with large client organizational and design changes
Added support for "use_no_lock_tables" Teradata config to optionally avoid table locking

1.1.4

Added an option to add key:value labels to validation runs
Oracle and SQL Alchemy now support RawSql filters
Add support for Cloud Functions in samples
Added schema information to result set

1.1.3

Release find-tables logic to help build table lists
Teradata client improvements
Move rarely used dependencies into extras

1.1.2

Teradata numeric column and general bug fixes
Fix Ibis query compilation order causing cross join

1.1.1

Bug fixes to support case insensitivity
Allow null values to be handled in grouped columns
Oracle client improvements

1.1.0

Added Row validations for cell level validation with primary keys
Client support for Oracle, SQL Server, Postgres, and GCS files

1.0

Support for Column and GroupedColumn validations
Allow custom filter via YAML config
BigQuery result handlers supported
Client support for BigQuery, MySQL, and Teradata

0.1.1 (release date TBD)

Bug Fixes

update BigQuery dependencies to fix group-by results handler #64

Documentation

remove references to unsupported validations from README #63
includes wheel file installation steps in README #57
add filters and data sources to README #56

Internal / Testing Changes

move ibis addons to third-party directory #61

0.1.0 (2020-07-16)

Initial alpha release.

Features

Add data-validation CLI, which can run from CLI arguments, store a configuration YAML file, or run from a run-config YAML file.
Add support for querying Teradata.
Add support for querying BigQuery.
Write report output to BigQuery.

Dependencies

To use Teradata support, you must manually install the teradatasql PIP package.

Documentation

See the README.md file for getting started instructions.
