4.5.0 (2024-03-18)
- Add table names to report results when source and/or target dataframes are empty (#1104) (812ed62)
- Fixes issue casting Snowflake decimal with scale>0 to string (#1110) (34446a4)
- force cast for aggregates (#1114) (44b60cf)
- Teradata large timestamp handling (#1117) (842d8b7)
4.4.0 (2024-02-22)
- Add --url to Oracle connections add options (#1083) (2f078c2)
- Add PostgreSQL OID support (#1076) (58f8fcb)
- Add support to generate a JSON config file only for applications purposes (#1089) (d463038)
- set default oracle sql alchemy arraysize to 500 (#1088) (1672ac5)
- Support for Kubernetes (#1058) (fdbdbe0)
- Add support for cx_Oracle's DB_TYPE_LONG_RAW (#1095) (90547ef)
- Better casts to string for binary floats/doubles (#1078) (15bfc4c)
- case-insensitive comparison field support (#1103) (d28786f)
- Fix merge issue for Teradata empty dataframes (#1100) (cc91fa2)
- increase upper limit on recursion columns (#1090) (c599ebf)
- Remove DDL automatically issued by Ibis for Postgres connections (#1067) (c2b660b)
- Row validation primary key columns >64bit int/float are cast to string (#1080) (9e70e9e)
- Spanner generate-partition to use BQ dialect (#1066) (f3cc565)
- spanner hash function to return string instead of bytes (#1062) (722dff9)
- Add Airflow Kubernetes pod operator samples (#1087) (7d5ea91)
- Updates on nested column limitations, contributing guide examples and incorrect example (#1082) (cc0f60a)
4.3.0 (2023-11-28)
- Adding Exclude columns flag for aggregations in column validations (#961) (faa32dc)
- support query parameter for MSSQL connection (#1026) (48b0355)
- --dry-run for SQLAlchemy clients with valid raw SQL (#1047) (c1e0e34)
- Add Spanner RawSQL operation to enable filtering (#1054) (3a01503)
- Adding credentials as parameter for Spanner (#1031) (367658e)
- Adjust find-tables to properly get Oracle and Postgres schemas (#1034) (45fb40a)
- Cast should treat nullable and non-nullables as the same (#1037) (5e5c5eb)
- Fix --grouped-columns issue for Oracle validation (#1050) (3473a27)
- Fix decimal separator to "." (dot) on Oracle (#1042) (14cc7ef)
- Teradata SSLMODE issue fix (#1014) (e7aab6b)
- Add CLOB to Oracle BLOB validation document (#1029) (8c76c1b)
- Update connections.md to add supported version of DB2 (#1030) (44b4be7)
4.2.0 (2023-09-28)
- Add more mappings to the allowlist configuration files for Oracle schema validations (#953) (0fed588)
- Include date columns for min/max/sum validations (#984) (6de9921)
- Include date columns in scope of wildcard_include_timestamp option (#989) (a4cf773)
- Support BQ decimal precision and scale for schema validation (#960) (b1d4942)
- Support standard deviation for column agg (#964) (bb81701)
- Add exception handling for invalid value to cast a comparison field (#957) (703ca75)
- Add missing SnowflakeDialect mapping for BINARY data type (#959) (9ad529a)
- Add not-null string to accepted date types in append_pre_agg_calc_field() (#980) (76fcfc6)
- Adjust setup of the random row batch size default value; the default remains 10,000 (#986) (a20ccab)
- custom query row validation failing when SQL contains upper cased columns (#994) (a9fed41)
- Fix warning and precision detection when target precision higher than source (#965) (5f00ce1)
- generate-table-partitions- fixes Issue 945 and Issue 950 (#962) (c53f2fc)
- Prevent failure of column validation config generation if source column other than allow-list not present in target table. (#974) (40a073e)
- Prevent Oracle blob throwing exceptions during column validation (#1005) (8df1cfa)
- support for case insensitive PKs and Snowflake random row (#998) (1a157ae)
- support for null columns, support for access locks (#976) (f54bb4d)
- yaml validation files in gcs (#977) (bf0fa0a)
4.1.0 (2023-08-18)
- Issues with validate column for time zoned timestamps (#930) (ee7ae9a)
- Schema validations ignore not null on Teradata and BigQuery (#935) (936744b)
- Support casting TD PKs to VARCHAR (#946) (2171532)
4.0.0 (2023-08-02)
- Adding Random-Row support for Custom Query (#891) (fc42c61)
- Adding RawSQL function for Redshift (#903) (c25d690)
- Enhance validate schema to support time zoned timestamp columns (#919) (aed1505)
- generate-table-partitions: Works on all 7 platforms - BigQuery, Hive, MySQL, Oracle, Postgres, SQL Server and Teradata. (#922) (aa84d7a)
- Ibis Upgrade to 5.1.0 (#894) (b5db4c0)
- Partition based on non-numeric and multiple keys (#889) (7b6a530)
- Snowflake support (#921) (e1d590b)
- Support allow list decimals having a range for precision and scale. Also add --allow-list-file. (#888) (7783beb)
- Adding date and timestamp formatting for Hive (#876) (65a090a)
- Adding enhancements to allow-list in schema validation (#881) (c83df2b)
- Adding UTF encoding for Oracle hash generation (#878) (2e24eae)
- No column filtering for csv/json text output. Reverts part of change for issue 753 (#890) (ba641e0)
- redshift bug for custom query (#911) (f1018b5)
- teradata NUMBER with no precision/scale, small doc fix after Ibis upgrade (#914) (f9db68f)
- validate column sum/min/max issue for decimals with precision beyond int64/float64 (#918) (5a8d691)
- Add sample shell script and documentation to execute validations at a BigQuery dataset level (#910) (a84da45)
3.2.0 (2023-05-31)
- Add --dry-run option to validate. (#778) (8989350)
- Add Impala flags for http_transport and http_path (#829) (d966b9e)
- Add support for SQL Server's IMAGE, BINARY, VARBINARY, NCHAR, NTEXT, NVARCHAR data types (#859) (6ebece3)
- Add support for SQL Server's MONEY data type (#837) (0749c9e)
- Move source credentials to secret manager (#824) (1dd5fea)
- Redshift integration for Normal row and Custom-Query Validation. (#817) (92ab215)
- Add missing operations for SQL Server - ExtractEpochSeconds, ExtractDayOfYear, ExtractWeekOfYear (#870) (709dd4c)
- Adding datetime and timestamp format logic (#840) (eb095c9)
- dry-run bug when running configs, added CODEOWNERS, and docs (#865) (1779772)
- handle numeric datatype mapping in teradata schema and fix int mapping as per teradata doc (#874) (333eadb)
- split connection names from second last period instead of first from front (#864) (1462deb)
- Support for sum/min/max included for oracle number greater than int64 (#809) (73bda66)
- Fix typos on README (#801) (14ddcc5)
- update installation guide about Python 3.11 (#815) (88cd281)
- Update our documentation about find-tables command and the score-cutoff parameter (#846) (54403e3)
3.1.0 (2023-04-21)
- add db2 hash and concat support (#800) (c16e2f7)
- add Impala connection optional parameters (#743) (#790) (414d7f8)
- added source_type in output while listing connections list (#803) (056275b)
- Adding Custom-Query support for DB2. (#807) (a8085d3)
- Option for simpler report output grid (#802) (b92eb91)
- Mysql fix to support row hash validations, random row validation, and filter (#812) (ae07fa4)
- schema validation fixes for Oracle/SQL Server float64 and SQL Server datetimeoffset (#796) (ad0e64f)
- add README for Airflow DAG sample, update code formatting in other docs (#722) (f4c3241)
- score-cutoff changed to 1 (#779) (d3aabca)
3.0.0 (2023-03-28)
- ✨ Add support for source/target inline sql queries for validate custom-query command (#734) (c5e7a37)
- gcp secret manager support for DVT (#704) (d6c40f1)
- ibis_bigquery strftime support for DATETIME columns (#737) (b1141de)
- Add support for numeric and precision with length and precision in Postgres Custom Query (#723) (742b77e)
- Adding Decimal datatype support for MSSQL custom query validation (#771) (0d5c5eb)
- Better detection of Oracle client (#736) (efce0b8)
- Cater for query driven comparisons in date format override code (#733) (0a22643)
- issue 740 teradata strftime function (#747) (9fd102a)
- issue673 optimize CLI tools arg parser (#701) (26bb8e9)
- Protect column and row validation calculated column names from Oracle 30 character identifier limit (#749) (89413c1)
- remove secret manager warnings (#781) (7e72bfd)
2.9.0 (2023-02-16)
- Added Partition support to generate multiple YAML config files (#653) (Issue #619,#662) (f79c308)
- added run_id to output (#708) (17720f2)
- Divert cast of PostgreSQL decimal with scale>0 to to_char (#721) (3542851)
- Use centralized date/time format in order to compare row data across engines (#720) (0de823b)
- Error handling for batch processing of config files (#663) (21a26af)
- Protect non-date columns from astype(str) date workaround (#726) (489ee27)
- schema validation fix for different base names of source and destination data types (#710) (d7b44b0)
- updated Oracle parameter from user_name to user and changed underscores to hyphens across the document (#689) (8777e00)
2.8.0 (2023-01-19)
- Logic to add allow-list to support datatype matching with a provided list in case of mismatched datatypes between source and target (#643) (269f8dc)
2.7.0 (2023-01-06)
- Add AlloyDB support (#645) (cfedc22)
- Add Integration test for Oracle (#651) (de3bbcc)
- Added custom query support for Oracle (#646) (3f8771a)
- Added custom query support for PostgreSQL (#644) (88dcfd3)
- extend TO_CHAR to cover date, time and timestamp types (#641) (e0c184f)
- SQL Server custom query support (#640) (98ab010)
- Support config directory for running validations and add multithreading for DB queries (#654) (c67b51a)
- Support custom calculated fields (#637) (14b506b)
2.6.0 (2022-11-28)
- add random row support for MSSQL (#633) (3041bd1)
- to_char support for Oracle and Postgres (#632) (78f1ce9)
- bare data-validation command throws exception (#627) (7595c50)
- column validation casing to allow for case-insensitive match (#626) (c694357)
2.5.0 (2022-10-18)
- Custom query validation throwing error with sql files ending with semicolon(;) (#591) (16a89ac)
- Row validation optimization to avoid select all columns (#599) (de3758e)
- update function to return non-unicode string (#615) (e334c65)
2.4.0 (2022-10-06)
- Add Python 3.10 support (#564) (38284a5)
- New flag to filter results by status in all supported validations (#593) (97e8bb0)
- Oracle random row (#588) (ac3460a)
- Postgres row hash validation support (#589) (01765b3)
2.3.0 (2022-09-15)
- Addition of log level as an argument for DVT logging and replac… (#577) (dbd9bc3)
- Oracle row level validation support (#583) (489654c)
- Add RawSQL support for Postgres and SQL Server (#576) (0693782)
- fixing String to varchar for teradata (a979931)
- random rows with filter option (#582) (da4faaf)
- support NUMBER with no precision/scale (#572) (03219ba)
- Teradata limit on column name, bug when casting to VARCHAR (#580) (c8700be)
2.2.0 (2022-08-29)
- Added teradata custom query support (#547) (97c3203)
- Improve schema validation debugging, Support DATE for Hive validations (#558) (e67de5b)
- Support for MSSQL row validation (#570) (61dabe0)
2.1.0 (2022-07-14)
- new flag to exclude columns from schema validation (#507) (53ac41a)
- Remove dependency on tables list for custom query (#541) (7dca5bd)
- added new result columns to schema validation (#512) (478bb2d)
- close Teradata connection via object delete (#524) (181b865)
- editing contributing.md (#509) (c01d730)
- fixing teradata doc (#513) (6a10356)
- issue-256-bug fixes to generate docker file (#531) (adc528e)
- issue-256-Release docker image for dvt repo (#527) (e3d42cc)
- issue-256-Release docker image for dvt repo (#529) (e87d0ef)
- Oracle support for decimals (#530) (0d73207)
- primary key casting (#521) (1a7667b)
- support for cast to timestamp in TD, support for random row (#538) (f7ed739)
2.0.1 (2022-06-10)
2.0.0 (2022-05-26)
- Add 'primary_keys' and 'num_random_rows' fields to result handler (#372) (b123279)
- add a new DAG example to run DVT (#485) (e3dd7ed)
- adding impala random function (#483) (93d2072)
- Enable sum/avg/bit_xor for BigQuery datetime type (#488) (083de07)
1.7.2 (2022-05-12)
- Add example of BigQuery cast to NUMERIC, update chore release version (#476) (50fac28)
- Adds custom query row level hash validation feature. (#440) (f057fe8)
- Issue356 db2 test (#383) (70fb7bc)
- Support cast to BIGINT before aggregation (#461) (ca598a0)
- support float and decimal types in Hive (#470) (5936f60)
- add get_ibis_table_schema (#410) (#411) (4093625)
- only replaces datatypes and not column names (#453) (6143794)
- supports NULL datetime/timestamps, fixes bug with validation_status in PR 455 (#460) (57896f4)
- Updated schema validation logic to column as 'validation_status' (#455) (e30c337)
- updating teradata docs for sha256 UDF and swapping string_join for concat (#457) (23dbf56)
1.7.1 (2022-04-14)
- added timestamp to supported types for min and max (#431) (e8b4860)
- Allow aggregation over length of string columns (#430) (201f0a2)
- Changed result schema 'status' column to 'validation_status' (#420) (dfcd0d5)
- hash filter support (#408) (46b3723)
- Hash selective columns (#407) (88b6620)
- Implement sum/avg/bit_xor aggs for Timestamp (#442) (51f3af3)
- improve postgres tests (#443) (6a54527)
- Random Sort for Pandas Queries (#404) (2051039)
- Support for custom query (#390) (7a218d2)
- bug introduced with new pr (#429) (a6cf3f0)
- Hash all bug, noxfile updates (#413) (fc73e21)
- Hive boolean nan to None, Unsupported ibis data types in structs and arrays (#444) (e94a1da)
- ibis default sql option limits query results at 10k rows (#418) (7539efe)
- Impala strings/objects now return None instead of NaN (#406) (9d3c5ec)
- issue 265 add cloud spanner functionality (#394) (783cdf8)
- support labels for schema validation (#260) (#381) (f787701)
- Treat both source and target values being NULL as a success (#437) (c4da5ca)
1.7.0 (2022-03-23)
- deploy flask app via CLI (#344) (b1dc82a)
- first class support for row level hashing (#345) (3d78ee5)
- GCS support for validation configs (#340) (b09cd29)
- Hive hash function support (#392) (0ca0ccf)
- Hive partitioned tables support (#375) (8f1af27)
- Issue339 ldap logmech (#347) (ad7f1fc)
- Random Row Validation Logic (#357) (229d870)
- add to_hex for bigquery hash (#400) (e5c7ded)
- Comparison fields Key Error fix (#396) (a597b56)
- ensure all statuses are success or fail, particularly after _join_pivots (#329) (#370) (310747d)
- make status values consistent across validation types (#377) (#378) (5c08463)
- Multiple updates (#359) (6b2614d)
- revert change from #345 that causes filters, threshold and labels to be ignored for column validations (#376) (#379) (8b295cf)
- Status when source and target agg values are 0 (#393) (6a41f68)
- support schema validation for more clients (#355) (#380) (ed46295)
- supporting non default schemas for mssql (#365) (100b3ea)
- test for nan when calculating fail/success in combiner (#341) (#366) (a9720c2)
- use an appropriate column filter list for schema validation (#350) (#371) (806151a)
1.6.0 (2021-12-01)
1.5.0 (2021-10-19)
- added kerberos service name flag for Impala connections, fixed bug in row validation with YAML (#320) (351994c)
- Track DVT GCS connections (#326) (b384b1f)
1.4.0 (2021-09-30)
- add state manager client (#311) (e893ea5)
- Allow user to specify a format for stdout (#242) (#293) (f0a9fa1)
- Allow user to specify a format for stdout T2 (#242) (#296) (ec1af22)
- cast aggregates (#306) (e3da4c3)
- Issue262 impala connect (#281) (eaa052f)
- logic to deploy dvt on Cloud Run (#280) (9076286)
- promote 3.9 to main version (as it is in Cloudtops now for local testing) and add a small unit test for personal use (#292) (eb0f21a)
- Refactor CLI to fit Command Pattern (#303) (f6d2b9d)
- Updated Cloud Functions sample (#297) (923413d)
- updated code so that BQ target schema would not set to None for FileSystem to BQ validations (#309) (5016d65)
1.3.2 (2021-06-29)
1.3.1 (2021-06-28)
1.3.0 (2021-06-28)
- add table matching score as a param in case adjustment is needed (#267) (b02aed5)
- CI/CD Release to PyPi via Cloud Build (#258) (0870fc7)
1.2.0 (2021-05-27)
- add data source for Cloud Spanner (#206) (c63f68e)
- added an optional beta flag in CLI (#249) (e8e75de)
- Added FileSystem connection type (#254) (be7824d)
- Cli tools bug fix (#253) (b41e625)
- Remove JSON arguments in CLI (#247) (5a309f7)
- Update connections.md (#248) (9c1ae40)
- Update Readme.md (#257) (c968024)
- Adding and documenting find-tables CLI feature with schema filter
- Correct filter errors caused by SQL Alchemy errors
- Adding beta calculated fields logic
- Adding tests to validate BIGNUMERIC BQ type behavior
- Minor fix for Teradata client from breaking Ibis changes
- Add support for running raw queries against a connection
- Upgraded Ibis to v1.4 with large client organizational and design changes
- Added support for "use_no_lock_tables" Teradata config to optionally avoid table locking
- Added an option to add key:value labels to validation runs
- Oracle and SQL Alchemy now support RawSql filters
- Add support for Cloud Functions in samples
- Added schema information to result set
- Release find-tables logic to help build table lists
- Teradata client improvements
- Remove rarely used dependencies into extras
- Teradata numeric column and general bug fixes
- Fix Ibis query compilation order causing cross join
- Bug fixes to support case insensitivity
- Allow null values to be handled in grouped columns
- Oracle client improvements
- Added Row validations for cell level validation with primary keys
- Client support for Oracle, SQL Server, Postgres, and GCS files
- Support for Column and GroupedColumn validations
- Allow custom filter via YAML config
- BigQuery result handlers supported
- Client support for BigQuery, MySQL, and Teradata
- update BigQuery dependencies to fix group-by results handler #64
- remove references to unsupported validations from README #63
- includes wheel file installation steps in README #57
- add filters and data sources to README #56
- move ibis addons to third-party directory #61
Initial alpha release.
- Add data-validation CLI, which can run from CLI arguments, store a configuration YAML file, or run from a run-config YAML file.
- Add support for querying Teradata.
- Add support for querying BigQuery.
- Write report output to BigQuery.
- To use Teradata support, you must manually install the teradatasql PIP package.
- See the README.md file for getting started instructions.