DataHub v0.15.0 Release Notes
User Experience
Structured Properties
- Added comprehensive support for managing structured properties, including creation, editing, deletion, and display preferences. Introduced timestamps for tracking creation and modification. [#12100, #11419]
- Enhanced property display options with badge styling, custom column types, and configurable visibility settings in asset sidebars and schema fields. [#12111, #12052]
- Added structured property filtering in UI with improved aggregation logic and entity metadata display. Introduced new property validators and display settings. [#12097, #12099]
UI Enhancements
- Enhanced container organization with parent hierarchy labels. [#11705]
- Added support for markdown in incident descriptions, enabling rich formatting capabilities. [#11759]
- Improved ingestion reporting with better visibility of successful ingestions with warnings. Enhanced browse paths display for business attributes and schema fields. [#11704, #11585]
- Added support for timeseries aspects in OpenAPI and customizable date range fields for Analytics charts. [#12096, #11366]
Authorization & Authentication
Metadata Ingestion
Ingestion Framework Improvements
- Enhanced Data Source Support: Expanded ingestion capabilities for multiple platforms, including Superset (with dataset entities, schema fields, and column-level lineage), Feast (supporting tags and owners ingestion), Neo4j, and Cassandra. Added stateful ingestion support for file sources. [#11688, #11784, #11804, #11526, #11822]
- SQL Processing Improvements: Replaced vulnerable sqlparse dependency with an in-house SQL parser, optimized CLL generation with reduced memory usage, and added special handling for MSSQL case sensitivity. Enhanced multi-query lineage support for Snowflake temporary tables. [#11645, #11708, #11920, #12020]
- CLI Enhancements: Introduced new commands for managing ingestion, including listing source runs with filtering capabilities, undoing soft deletes with platform filtering, and listing structured properties. Added an offline flag to the SQL parser CLI. [#11740, #11980, #12012, #12283, #11635]
- Ownership and Metadata Management: Extended ownership transformer capabilities across entities, improved glossary sync to preserve custom ownership types, and added support for multiple ownership types in glossaries and terms. Enhanced Forms CLI with additional filters for subtypes, platform instances, owners, tags, and glossary terms. [#11700, #11545, #12050, #10979]
- Core Infrastructure Improvements: Implemented unique URN generation for all entities, added support for efficient entity ingestion through `get_entity_as_mcps`, improved empty field handling, and introduced progress reporting during ingestion. Added execution request cleanup job and support for dropping duplicate schema fields. [#11676, #11425, #11613, #12117, #11765, #12308]
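Dropping duplicate schema fields can be pictured as keeping the first field seen for each field path. The following is a minimal sketch only, assuming fields are plain dicts with a `fieldPath` key; the function name `drop_duplicate_fields` is hypothetical and this is not DataHub's implementation.

```python
from typing import Dict, Iterable, List


def drop_duplicate_fields(fields: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first occurrence of each fieldPath, preserving input order."""
    seen = set()
    unique = []
    for field in fields:
        path = field["fieldPath"]
        if path in seen:
            continue  # a field with this path was already kept
        seen.add(path)
        unique.append(field)
    return unique


# Illustrative input: the third entry repeats the "id" path and is dropped.
fields = [
    {"fieldPath": "id", "type": "int"},
    {"fieldPath": "name", "type": "string"},
    {"fieldPath": "id", "type": "bigint"},
]
deduped = drop_duplicate_fields(fields)
```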
Source-Specific Ingestion Improvements
Airflow
- Upgraded infrastructure with support for Airflow 2.10, deprecated versions below 2.3, and improved template handling with Jinja support. Added configuration options for dag patterns and environment variables. [#11300, #11371, #11472, #11537, #11579, #12056]
- Enhanced error handling and debugging with improved logging, fixed plugin stability issues on EMR, and added support for AthenaOperator lineage extraction. Introduced ability to disable plugin without restart. [#11857, #11877, #11880, #12098]
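The DAG pattern option follows the usual allow/deny idea: a DAG is processed only if its id matches an allow pattern and no deny pattern. The sketch below is a standalone illustration of that rule, not the plugin's code; the `AllowDeny` class and the DAG ids are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class AllowDeny:
    """Hypothetical allow/deny filter; deny rules take precedence over allow rules."""

    allow: List[str] = field(default_factory=lambda: [".*"])
    deny: List[str] = field(default_factory=list)

    def allowed(self, name: str) -> bool:
        # re.match anchors each pattern at the start of the name.
        if any(re.match(pattern, name) for pattern in self.deny):
            return False
        return any(re.match(pattern, name) for pattern in self.allow)


dag_filter = AllowDeny(allow=[r"etl_.*"], deny=[r"etl_tmp_.*"])
dag_ids = ["etl_orders", "etl_tmp_scratch", "reporting"]
selected = [dag_id for dag_id in dag_ids if dag_filter.allowed(dag_id)]
```

Giving deny rules precedence is an assumption of this sketch; it makes it possible to carve exceptions out of a broad allow pattern.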
BigQuery
- Enhanced data modeling capabilities with support for foreign/primary keys, BigLake tables, and improved handling of external tables. Added support for region qualifiers and partition management. [#11686, #11728, #11874, #11940]
- Improved lineage tracking with GCS data source support and optimized query performance. Added platform resource entity generation from BigQuery labels. [#11442, #11492, #11534, #11602]
- Enhanced profiling and performance with better type handling and size limits. Fixed issues with tag synchronization and platform instance settings. [#11807, #12060]
Dagster
- Added support for skipping Asset ingestion, fixed input/output value formatting, and improved compatibility with latest Dagster versions (v1.9.6). Deprecated Python 3.8 support. [#11262, #11481, #12121, #12189]
dbt
- Improved performance and functionality with node_name_patterns for faster CLL processing, support for multiple test paths, and better handling of custom owner types. [#11450, #11460, #11848]
- Enhanced lineage handling by preventing cycles in SQL parsing and supporting multiple dataset assertions for tests. Added support for dbt Cloud's Explore page. [#11666, #11451, #12223]
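One way to think about cycle prevention in lineage is to reject any edge whose upstream is already reachable from its downstream. The sketch below illustrates that check on a small in-memory graph. It is an assumption-laden illustration, not the dbt source's actual logic, and `LineageGraph` and the table names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


class LineageGraph:
    """Hypothetical upstream -> downstream graph that refuses edges closing a cycle."""

    def __init__(self) -> None:
        self.downstreams: Dict[str, Set[str]] = defaultdict(set)

    def _reaches(self, start: str, target: str) -> bool:
        # Iterative depth-first search along downstream edges.
        stack, visited = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in visited:
                continue
            visited.add(node)
            stack.extend(self.downstreams.get(node, ()))
        return False

    def add_edge(self, upstream: str, downstream: str) -> bool:
        # The new edge closes a cycle if upstream is already reachable from downstream.
        if self._reaches(downstream, upstream):
            return False
        self.downstreams[upstream].add(downstream)
        return True


graph = LineageGraph()
results = [
    graph.add_edge("raw.orders", "stg.orders"),
    graph.add_edge("stg.orders", "mart.orders"),
    graph.add_edge("mart.orders", "raw.orders"),  # would close a cycle, rejected
]
```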
Snowflake
- Expanded support for various table types, including secure, dynamic, and hybrid tables. Enhanced lineage capabilities for renames, swaps, and external tables. [#11600, #12039, #12094, #12179]
- Improved authentication with OAuth support and token management. Added incremental property processing and structured property support for tags. [#11888, #12048, #12080, #12285]
- Enhanced error handling and logging with better parse failure reporting and dot handling in table names. [#12105, #12110, #12153]
Tableau
- Enhanced project management with new path pattern filtering and improved handling of hidden assets. Added support for access roles and group permissions. [#10855, #11157, #11559]
- Improved API integration with retry logic for various error codes (502, 504), better authentication handling, and consistent page size application. [#12213, #12216, #12233]
- Enhanced reporting and debugging capabilities while maintaining efficient performance and proper permission handling. [#12015, #12024, #12175]
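Retrying on gateway errors such as 502 and 504 can be sketched as a small loop around the request. This is an illustration only: `call_with_retry`, its parameters, and the exponential backoff are assumptions made for the example, not the Tableau source's implementation.

```python
import time
from typing import Callable, Tuple

RETRYABLE_STATUS = {502, 504}


def call_with_retry(
    request: Callable[[], Tuple[int, str]],
    max_attempts: int = 3,
    base_delay: float = 0.0,
) -> Tuple[int, str]:
    """Call `request` until it returns a non-retryable status or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        status, body = request()
        if status not in RETRYABLE_STATUS or attempt == max_attempts:
            return status, body
        time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff (assumed)
    raise ValueError("max_attempts must be at least 1")


# A fake request that fails twice with gateway errors, then succeeds.
responses = iter([(502, ""), (504, ""), (200, "ok")])
result = call_with_retry(lambda: next(responses))
```

The last response is returned as-is when attempts run out, so the caller still sees the final status code.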
PowerBI
- Improved M-query parsing with support for comments, better handling of quotes, and DatabricksMultiCloud native query functionality. [#12177, #11743, #11756]
- Enhanced workspace management with cross-workspace dataset linking and app ingestion support. Added timeouts for M-query parsing. [#11560, #11629, #11753]
- Improved error reporting and performance optimization with reduced type casting and better organization of responsibilities. [#11763, #12004]
Developer Experience
- Entity Management: Introduced entity versioning for Datasets and ML Models, with support for version set linking. Improved timeline functionality with better handling of primary key changes and rename events. Added data transformation logic models to enhance data processing capabilities. [#11819, #11843, #12166, #12198]
- Enhanced Configuration Management: Added new customization options through environment variables and Helm charts, including editable dataset names and configurable garbage collection scheduling. The bootstrap process has been optimized to reduce latency during installation. [#11391, #11518]
- Development Environment Updates: Added Git support to the ingestion-base image, enabling better source control integration for ingestion workflows. [#11477]
- Security Logging Enhancement: Improved security audit trails by adding actor URN tracking for unauthorized access attempts. [#12030]
NEW: Garbage Collection
- Comprehensive Metadata Cleanup: Introduced a new ingestion source, DataHubGC, which acts as a garbage collector for dataflows, data jobs, and data process instances, with configurable retention policies and deletion parameters. Added a dry-run mode for testing cleanup operations. [#11102, #11413]
- Performance Optimizations: Cut processing time from 1 hour to 15 minutes by implementing batch processing, optimizing queries, and removing unnecessary operations. Increased the default hard delete limit from 10k to 25k entities. [#11809, #12093, #12238]
- Reliability Improvements: Enhanced garbage collection stability with additional validation checks, improved error handling, and better process visibility through ingestion stage reporting. Fixed issues with entity deletion logic and reference handling to preserve critical lineage relationships. [#12011, #12013, #12027, #12049, #12124, #12226]
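The combination of batch processing, a hard delete limit, and a dry-run mode can be illustrated with a short sketch. It is not the DataHubGC source: `batched`, `cleanup`, their parameters, and the placeholder URN strings are hypothetical, and only the 25k default limit is taken from the notes above.

```python
from typing import Callable, Iterable, Iterator, List


def batched(urns: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most batch_size URNs."""
    batch: List[str] = []
    for urn in urns:
        batch.append(urn)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def cleanup(
    urns: Iterable[str],
    delete_batch: Callable[[List[str]], None],
    batch_size: int = 100,
    hard_delete_limit: int = 25_000,
    dry_run: bool = False,
) -> int:
    """Process URNs in batches, stop at the limit, and return how many were handled."""
    handled = 0
    for batch in batched(urns, batch_size):
        batch = batch[: hard_delete_limit - handled]  # never exceed the limit
        if not batch:
            break
        if not dry_run:
            delete_batch(batch)  # in dry-run mode nothing is deleted
        handled += len(batch)
    return handled


deleted: List[str] = []
candidates = [f"urn:example:{i}" for i in range(7)]  # placeholder URNs
count = cleanup(candidates, deleted.extend, batch_size=3, hard_delete_limit=5)
dry_count = cleanup(candidates, deleted.extend, batch_size=3, dry_run=True)
```

In dry-run mode the function still reports how many entities would be handled, which makes a cleanup run easy to preview before deleting anything.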
Thank You to Our Contributors!
First-Time Contributors
@AColocho, @alberttwong, @Alice-608, @Bumyu, @chakru-r, @chriscc2, @dejan2609, @donovan-acryl, @eagle-25, @hwmarkcheng, @k-bartlett, @kanavnarula, @kartikey-visa, @kevinkarchacryl, @kousiknandy, @kris48k, @llance, @margaridafernandes-trip, @mikeburke24, @raudzis, @ronybony1990, @ryota-cloud, @shepherd44, @siong-tcha, @ssidorenko, @tanguyantoine, @th0ger, @udays-visa, @udbhav-hbk, @vejeta
Repeat Contributors
@aviv-julienjehannet, @bda618, @bossenti, @darnaut, @deepgarg-visa, @DSchmidtDev, @dushayntAW, @eboneil, @ethan-cartwright, @feldjay, @githendrik, @haeniya, @Jorricks, @Masterchen09, @mkamalas, @Nbagga14, @nicholas-fwang, @noggi, @pankajmahato-visa, @pinakipb2, @rtekal, @sagar-salvi-apptware, @steffengr
DataHub Maintainers
@acrylJonny, @anshbansal, @asikowitz, @chriscollins3456, @david-leifker, @gabe-lyons, @hsheth2, @jayacryl, @jjoyce0510, @maggiehays, @mayurinehate, @pedro93, @RyanHolstien, @sakethvarma397, @sgomezvillamor, @shirshanka, @sid-acryl, @skrydal, @treff7es, @yoonhyejin
What's Changed
- fix(ingest): override setdefault in file-backed dict by @hsheth2 in #11359
- fix(ingest/airflow): simplify env configuration by @hsheth2 in #11371
- fix(airflow): added support for jinja template for datahub emitter operator by @dushayntAW in #11300
- fix(smoke-test): add wait for sync to smoke-test by @david-leifker in #11405
- fix(customSearch): apply query string interpolation to function score by @RyanHolstien in #11406
- fix(docs): Fix typo in bigquery permissions error by @gabe-lyons in #11401
- build(deps-dev): bump vite from 4.5.3 to 4.5.5 in /datahub-web-react by @dependabot in #11410
- feat(ingest/gc): Add dataflow and soft deleted entities cleanup by @treff7es in #11102
- feat(ingestion): adds env property in ContainerProperties by @sgomezvillamor in #11214
- feat(ingest/gc): Add dry run mode to gc recipe by @treff7es in #11413
- feat(kafka-setup): allow override KAFKA_HEAP_OPTS by @david-leifker in #11400
- chore(docs): update release notes v0.14.1 by @david-leifker in #11408
- Advance search - Added case sensitive flag for wildcard searches by @kanavnarula in #11272
- fix(changeGenerator): fixes schema change generator corner cases v2 by @RyanHolstien in #11386
- feat(docs-site) datahub homepage v2 by @jayacryl in #11342
- feat(structuredProps) Add created and lastModified timestamps to structured prop entity by @chriscollins3456 in #11419
- feat(docs-site) tours to open in a modal by @jayacryl in #11420
- feat(structuredProps) Add delete structured props endpoint and handle null props by @chriscollins3456 in #11418
- test(ingest/mcp_diff): Fallback to overwriting file on more complicated diffs by @asikowitz in #11407
- feat(docs-site) fixed home paddings on mobile site by @jayacryl in #11431
- feat(ingest): add `get_entity_as_mcps` method to client by @hsheth2 in #11425
- fix(siblings) Combine siblings in embedded search results by @chriscollins3456 in #11421
- fix(structuredProps) Fix adding new allowed types in updateStructuredProp endpoint by @chriscollins3456 in #11424
- build(deps): bump path-to-regexp from 1.8.0 to 1.9.0 in /datahub-web-react by @dependabot in #11356
- chore(bump): bump spring versions by @david-leifker in #11435
- chore(bump): pac4j version by @david-leifker in #11436
- fix(graphql/getDataset): Do not fetch parent for schema fields by @asikowitz in #11434
- build: allow gradle daemon by @hsheth2 in #11437
- fix(ingest/dbt): handle null index values by @hsheth2 in #11433
- build(deps): bump dompurify from 2.3.8 to 2.5.4 in /datahub-web-react by @dependabot in #11387
- build(deps): bump dset from 3.1.3 to 3.1.4 in /datahub-web-react by @dependabot in #11361
- fix(ingest/dbt): fix dbt catalog version check by @sid-acryl in #11350
- Add STARTS_WITH policy condition to allow for URN-wildcard-based policies by @githendrik in #11441
- fix(restoreIndices): fix bug in urn paginated restoreIndices exit code by @david-leifker in #11443
- feat(Analytics) Allow dateRangeField to be configurable for timeSeries chart by @mkamalas in #11366
- fix(SearchDocumentTransformer): Use correct variable to update ES by @pinakipb2 in #11430
- chore(version): bump protobuf version by @david-leifker in #11446
- fix(search): restore prefix phrase match on quoted search by @david-leifker in #11444
- fix(oidc): apply acr values to redirect url by @RyanHolstien in #11447
- refactor(ui/lineage): Replace FetchedEntities Object with Map by @asikowitz in #11440
- feat(ingest/sink): report datahub-rest sink mode by @hsheth2 in #11422
- docs(ingest): add docs on pydantic compatibility by @hsheth2 in #11423
- chore: use unique docker log artifact names by @hsheth2 in #11445
- test(graphql): fix searchFlags in searchAcrossLineage by @david-leifker in #11448
- Update pr-labeler.yml by @donovan-acryl in #11393
- fix(docs): fix layout in documentation after #11380 by @Masterchen09 in #11390
- Group Modal Css fix by @kanavnarula in #11403
- feat(graph): graph index soft-delete support by @david-leifker in #11453
- config(reindex): create reindex timeout configuration by @david-leifker in #11456
- fix(ingest): sort by last modified not working in the UI by @sid-acryl in #11343
- feat(data-contracts): support custom assertions in the data contracts builder by @jayacryl in #11454
- fix(ingest/sqlglot): Make detach_ctes more robust by @asikowitz in #11449
- fix(ingest/mode): add connection timeouts to avoid RemoteDisconnected errors by @sagar-salvi-apptware in #11245
- build(gradle): Update gradle.properties by @david-leifker in #11458
- build(yarn): increase yarn timeout and version bump by @david-leifker in #11461
- fix(ingest/dbt): allow custom owner types for dbt meta by @hsheth2 in #11460
- doc: fix typo by @anshbansal in #11464
- fix(ingestion/looker): skip personal folder independent looks by @sid-acryl in #11415
- bump(version): zookeeper by @david-leifker in #11465
- fix(ingest): do not cache temporary tables schema resolvers by @mayurinehate in #11432
- fix(structuredProps) Allow upserting structured props on schema fields that don't exist by @chriscollins3456 in #11466
- fix(docs-website): disable dark mode by @hsheth2 in #11468
- fix: fix broken global style by @yoonhyejin in #11470
- feat(ingest/dbt): speed up dbt CLL with node_name_patterns by @hsheth2 in #11450
- feat(ingest/dbt): produce multiple assertions for multi-table dbt tests by @hsheth2 in #11451
- feat(ingest): add `git` to ingestion-base image by @hsheth2 in #11477
- fix(ingest): include platform instance in looker usage urns by @hsheth2 in #11469
- fix(ingest/openapi): update recipe for DataHub OpenAPI with url_complement and bearer token by @sagar-salvi-apptware in #10980
- docs(aws): Update AWS docs to keep consistency with Docker docs by @AColocho in #11284
- feat: add second navbar by @yoonhyejin in #11471
- feat: CTA to live demos in cloud section and a few more case studies on home by @jayacryl in #11488
- fix(ingest/bq): do not query PARTITIONS for biglake tables by @mayurinehate in #11463
- config(rest-api): enable authentication and api authorization by default by @david-leifker in #11484
- feat(ingest/databricks): add usage perf report by @mayurinehate in #11480
- feat(ingestion/tableau): introduce project_path_pattern by @haeniya in #10855
- docs(ingest/dbt): update run result paths examples by @acrylJonny in #11138
- feat(ingest): support `DATAHUB_INCLUDE_ENV_IN_CONTAINER_PROPERTIES` by @hsheth2 in #11476
- refactor(criterion): refactor criterion construction by @david-leifker in #11486
- feat(auth) - Manage Children Glossary term authorization check for Owner, Domain, Remove link by @mkamalas in #11337
- fix(ingest/dagster): Fixing path to the dagster logo by @treff7es in #11489
- docker(config): Update docker profiles by @david-leifker in #11495
- fix(docs): fix form type by @deepgarg-visa in #11497
- fix(fragments.graphql): values is a correct parameter for FacetFilterInput by @bda618 in #11485
- feat: made datahub home page colors more consistent by @jayacryl in #11496
- fix(ingest/bigquery): optimise queries v2 session processing by @mayurinehate in #11492
- feat(ingest/elasticsearch): add api_key auth by @th0ger in #11475
- fix(doc): Update the documentation for Timeline API by @sakethvarma397 in #11504
- docs(upgrade): add note about downgrade limitation by @david-leifker in #11507
- fix(docs-website): corrected minor alignment issues on the home page by @jayacryl in #11508
- chore(cleanup): remove legacy bootstrap step by @david-leifker in #11494
- docs(search): example of entity weighting by @david-leifker in #11511
- feat(businessattribute): filter schema rows on business-attribute pro… by @deepgarg-visa in #11502
- fix(ingest/lookml): missing lineage for looker template -- if prod by @sid-acryl in #11426
- fix(ingestion/transformer): Add container support for ownership and domains by @sagar-salvi-apptware in #11375
- fix(ingestion/nifi): Improve nifi lineage extraction performance by @skrydal in #11490
- fix(dagster-plugin): Fix in/outs format and source config by @DSchmidtDev in #11481
- fix(ingest/elasticsearch): detect sub-properties in 'nested' type mapping by @Bumyu in #11338
- chore: update case studies by @hsheth2 in #11520
- feat: visa case study with a text label instead of logo by @jayacryl in #11521
- feat(docs-site) added snap logo by @jayacryl in #11522
- fix(ingest/looker) : Handle DeserializeError to improve error reporting. by @sagar-salvi-apptware in #11457
- build(misc): misc build, config, version updates by @david-leifker in #11527
- feat: add miro & foursquare & Deutsche Telekom by @yoonhyejin in #11516
- fix(ingestion/nifi): Fix for incremental lineage ingestion for nifi by @skrydal in #11517
- fix(tests): fix metadata-io tests by @david-leifker in #11530
- fix(search): Fix empty description by @david-leifker in #11514
- Block http trace method on mce and mce consumers by @udays-visa in #11513
- feat(bootstrap): bootstrap template mcps by @david-leifker in #11518
- add new CREATE and UPDATE privileges for USERS_AND_GROUPS by @githendrik in #11364
- feat(graphql/lineage): Support including ghost entities by @asikowitz in #11510
- fix(ingest/sigma): handle members api paginated response by @mayurinehate in #11535
- fix(metadata-service): Pass editableDatasetNameEnabled feature flag to the config. by @kris48k in #11391
- Added IEQUAL operator to support case insensitive searches by @Nbagga14 in #11501
- build(deps): bump rollup from 3.29.4 to 3.29.5 in /datahub-web-react by @dependabot in #11459
- build(deps): bump express from 4.19.2 to 4.20.0 in /docs-website by @dependabot in #11357
- build(deps): bump fast-loops from 1.1.3 to 1.1.4 in /datahub-web-react by @dependabot in #10885
- feat(fabric): Add sandbox as a possible environment variable by @pedro93 in #11491
- chore(bump): bump commons, misc versions by @david-leifker in #11538
- feat(dataProduct): add data product unset side effect by @RyanHolstien in #11512
- fix(search): Elasticsearch bool query should by @david-leifker in #11536
- fix(changeGenerator): fix logic around descriptions and make execution more efficient by @RyanHolstien in #11539
- feat(models): add support for generic platform resources by @shirshanka in #11531
- fix(ingest): fix UnboundLocalError in dataproduct transformer by @hsheth2 in #11528
- fix(ingest): add typeUrn in glossary sync source by @anshbansal in #11545
- fix(docs-site) cloud form copy spelling error by @jayacryl in #11455
- fix(propagation): show last modified propagated documentation by @shirshanka in #11498
- privileges(refactor): consolidate individual sys op privileges by @david-leifker in #11549
- feat(website): added built by acryl and linkedin to home hero by @jayacryl in #11548
- feat(ingest/bigquery): casing for tmp table check for queries v2 by @mayurinehate in #11534
- fix(ingestion/powerbi): fix for databricks lineage m-query pattern by @sid-acryl in #11462
- fix(datahub-gc): set system flag on ingestion source by @david-leifker in #11554
- fix(lint): fix yarn lint breakage from #11498 by @shirshanka in #11553
- chore(ci): bump trivy action version by @david-leifker in #11555
- feat(docker-profiles): add elasticsearch envs to docker profiles by @david-leifker in #11546
- sdk(platform-resource): add entity type for ease of use by @shirshanka in #11541
- docs(release): 0.3.6 - docs update by @david-leifker in #11506
- fix(datahub-gc): Update ingestion-datahub-gc.yaml by @david-leifker in #11556
- fix(ci): fix cassandra test flake by @david-leifker in #11557
- feat(scan): add scanning to setup images by @RyanHolstien in #11563
- Fix data search for fieldType = OBJECT for non-string fields by @udbhav-hbk in #11524
- docs(search): add structured search examples for structured properties by @david-leifker in #11565
- docs(datahub source): Add urn exclusions to docs by @eboneil in #11568
- docs(csv-enricher): Add warning for write_semantics by @eboneil in #11561
- fix(test): fix test specific for single query by @david-leifker in #11567
- feat(ingest/bigquery): Add way to reference existing DataHub Tag from a bigquery label by @treff7es in #11544
- chore: remove obsolete attribute from docker by @anshbansal in #11573
- fix(cli): update hard delete confirmation message in delete cli by @hsheth2 in #11550
- feat(ingest/stateful): omit irrelevant urns for deletion by @mayurinehate in #11558
- feat(ingest): add extra reporting for rest sink ASYNC_BATCH mode by @hsheth2 in #11562
- feat(models): support dashboards containing dashboards by @hsheth2 in #11529
- feat(ingest): support `__from_env__` special server value by @hsheth2 in #11569
- fix(test): Update EbeanAspectDaoTest.java by @david-leifker in #11580
- feat(airflow): add a `render_templates` config parameter by @hsheth2 in #11537
- feat(ingest): add preset source by @llance in #10954
- fix(ui): update soft-deletion banner message by @hsheth2 in #11571
- fix(ingest/looker): handle sdk error for folder_ancestors by @mayurinehate in #11575
- feat(airflow): add airflow 2.10 to test matrix by @hsheth2 in #11579
- docs(cypress): Update README.txt by @david-leifker in #11588
- fix(trivy): multi-repo, bump trivy version by @david-leifker in #11590
- fix(trivy): also add alternative java db by @david-leifker in #11591
- feat(openapi-v3): generic entities scroll by @david-leifker in #11564
- fix: display demo form modal on mobile by @yoonhyejin in #11581
- fix(ingest): ignore irrelevant urns from % change computation by @mayurinehate in #11583
- feat(ingest/powerbi): fix subTypes and add workspace_type_filter by @sid-acryl in #11523
- fix(ingest): gracefully handle missing system metadata in client by @hsheth2 in #11592
- docs(assertions): add example of fetching associated dataset to assertion docs by @gabe-lyons in #11566
- fix(ingest/sac): handle descriptions which are None correctly by @Masterchen09 in #11572
- fix(ingest/iceberg): Iceberg table name by @skrydal in #11599
- chore(frontend): force frontend protobuf version by @david-leifker in #11601
- fix(airflow): add dag AllowDenyPattern config by @dushayntAW in #11472
- docs(structured-properties): example to read structured properties fr… by @gabe-lyons in #11603
- fix(bootstrap): fix early bootstrap mcps by @david-leifker in #11605
- fix(spark-lineage-legacy): fix check jar script by @david-leifker in #11608
- fix(sdk): platform resource api for non existent resources by @shirshanka in #11610
- fix(ingestion/redshift): Fix for Redshift COPY-based lineage by @skrydal in #11552
- fix(ingest/delta-lake): skip file count if require_files is false by @mayurinehate in #11611
- fix(ingest/superset): parse postgres platform correctly by @ssidorenko in #11540
- feat(openapi-v3): support async and createIfNotExists params on aspect by @david-leifker in #11609
- fix(ingest/preset): Add skip_on_failure to root_validator decorator by @asikowitz in #11615
- docs(apis): update OpenAPI disclaimer by @hsheth2 in #11617
- docs: add docs on term suggestion by @hsheth2 in #11606
- fix(airflow): fix lint related to dag_run field by @hsheth2 in #11616
- fix(ingest): drop empty fields by @anshbansal in #11613
- fix(ci): ensure py 3.10 by @david-leifker in #11626
- feat(docs-site) brought back announcement banner by @jayacryl in #11618
- fix(search): make graphql query autoCompleteForMultiple to show exact matches first by @deepgarg-visa in #11586
- feat: add contributor pr open comment action by @yoonhyejin in #11487
- docs(ingestion): add architecture diagrams by @david-leifker in #11628
- feat(validations): Ingest and metadata schema validators by @david-leifker in #11619
- fix(ci): Update contributor-open-pr-comment.yml by @david-leifker in #11631
- fix(ci): add runtime limit by @david-leifker in #11630
- fix(ci): metadata-io req python by @david-leifker in #11632
- feat: add quickstart post by @yoonhyejin in #11623
- feat(ingest/bigquery): Generate platform resource entities for BigQuery labels by @treff7es in #11602
- fix(ingest): add preset deps by @hsheth2 in #11637
- feat(docker-profiles): allow version override for quickstartDebug by @david-leifker in #11643
- feat(ingest/powerbi): link cross workspace dataset into assets by @sid-acryl in #11560
- docs(custom-plugins): add overview image by @david-leifker in #11634
- fix(ci): fix build and test workflow by @david-leifker in #11644
- fix(ingest): remove default value from DatahubClientConfig.server by @hsheth2 in #11570
- chore(ingest): reorganize unit tests by @hsheth2 in #11636
- fix(ingest): run sqllineage in process by default by @hsheth2 in #11650
- feat(ingest): add offline flag to SQL parser CLI by @hsheth2 in #11635
- fix(ingest/redshift): reduce sequence limit for LISTAGG by @hsheth2 in #11621
- fix(ingest/bigquery): Not setting platform instance for bigquery platform resources by @treff7es in #11659
- fix(ingest/dbt): fix bug in CLL pruning by @hsheth2 in #11614
- fix(ingest/redshift): fix syntax error in temp sql by @hsheth2 in #11661
- docs(airflow): add known limitations for automatic lineage by @hsheth2 in #11652
- perf(ingest): streamline CLL generation by @hsheth2 in #11645
- feat(ingest): ensure sqlite file delete on clean exit by @mayurinehate in #11612
- fix(sdk): platform resource - support indexed queries when urns are i… by @shirshanka in #11660
- fix(ingest/dbt): Prevent lineage cycles when parsing sql of dbt models by @asikowitz in #11666
- fix(ingest/dagster): Fix JobSnapshot import is broken by @treff7es in #11672
- feat(ingest/transformer/domain): Add support for on conflict do nothing to dataset domain transformers by @asikowitz in #11649
- fix(ingest/looker): Remove bad imports from looker_common by @feldjay in #11663
- feat(ingest/looker): include project name in model/explore properties by @hsheth2 in #11664
- feat(ingest/fivetran): protect against high sync volume by @hsheth2 in #11589
- feat(sdk):platform-resource - complex queries by @shirshanka in #11675
- fix(docs): fix businessattributes doc by @deepgarg-visa in #11653
- feat(ingest/fivetran): add safeguards on table/column lineage by @hsheth2 in #11674
- fix(ui): show DataHub logo for DataHub sources in ingestion sources list by @Masterchen09 in #11658
- fix(ingest/prefect): Fix prefect mypy errors by @treff7es in #11680
- fix(docs-site): announcement bar responsive behavior by @jayacryl in #11681
- fix(misc): misc fixes by @david-leifker in #11678
- chore(version): confluent base image by @david-leifker in #11689
- feat(ingest): generate urn types for all entities by @hsheth2 in #11676
- fix(ingest): cache sql is_profiling_enabled method by @hsheth2 in #11665
- fix(ingest/bigquery): Fix tags urn/name ingestion for BigQuery by @skrydal in #11691
- doc(bigquery-sync): Add doc for BigQuery sync by @treff7es in #11577
- feat(ingest): use mainline sqlglot by @hsheth2 in #11693
- fix(ingest): add logging for mcp diff by @hsheth2 in #11683
- fix(ingestion/glue): manage table names from resource_links from nearest catalog correctly by @aviv-julienjehannet in #11578
- feat(ingest/fivetran): show connector filter reason by @hsheth2 in #11695
- feat(ingest/snowflake): support lineage via rename and swap using que… by @mayurinehate in #11600
- fix(ingest/mongodb): Add Collection Name as Dataset Name in MongoDB by @pankajmahato-visa in #11698
- fix(ui): show structured report context in pre by @hsheth2 in #11673
- feat(logs): add change event details to log context and improve some logs in MCL/MCP by @RyanHolstien in #11690
- feat(ingest/oracle): retire deprecated cx_oracle library by @acrylJonny in #11607
- fix(ingest/superset): Don't set schema/db for druid by @ssidorenko in #11682
- fix(ci): remove pip cache by @RyanHolstien in #11702
- feat(ingestion/dagster): Dagster assetless ingestion by @treff7es in #11262
- chore(deps): bump http-proxy-middleware from 2.0.6 to 2.0.7 in /docs-website by @dependabot in #11694
- feat(ingest/bigquery): Add support ingesting foreign keys and primary keys for BigQuery tables by @treff7es in #11686
- feat(ingest/fivetran): support overriding destination db by @hsheth2 in #11701
- feat(ingest/tableau): support ingestion of access roles by @haeniya in #11157
- fix(logging): minor modifications for logging by @RyanHolstien in #11703
- feat: multi-query lineage for temp upstreams by @mayurinehate in #11708
- docs(managed/v0.3.6): Add additional release notes for 0.3.6.8 by @asikowitz in #11715
- gradle(profiles): create compose base variable by @david-leifker in #11716
- fix(log): reduce log volume for ingestion and consumers by @darnaut in #11714
- misc(gradle): project name parameter by @david-leifker in #11717
- refactor(datahub-frontend): upgrade frontend pac4j by @david-leifker in #11709
- fix: business attribute empty bubble selection issue by @kartikey-visa in #11720
- feat(frontend): show browse paths for business attributes related entities - schema fields by @kartikey-visa in #11585
- feat(bigquery): support config for region qualifiers to fetch jobs by @mayurinehate in #11728
- feat(docs) steps on handling workspace admin approvals for slack installation by @jayacryl in #11726
- feat(docs-site) slack bot scopes by @jayacryl in #11727
- feat(docs) steps on how to troubleshoot the slack command not working by @jayacryl in #11723
- fix(docs-site) remove free trial notes by @jayacryl in #11729
- doc: add missed breaking change note by @anshbansal in #11725
- feat(ingest/fivetran): avoid duplicate table lineage entries by @hsheth2 in #11712
- fix(ingestion/bigquery-gcs-lineage): Add lineage extraction for BigQuery with GCS source by @sagar-salvi-apptware in #11442
- feat(ingestion/powerbi): ingest powerbi app by @sid-acryl in #11629
- feat(ingest/databricks): report unique query count from usage by @mayurinehate in #11576
- feat(ingest): unpin traitlets by @hsheth2 in #11731
- feat(ingest/transform): extend ownership transformer to other entities by @anshbansal in #11700
- feat(ingest): remove dep on `termcolor` by @hsheth2 in #11733
- fix(ingest/unity): remove redundant check by @hsheth2 in #11732
- feat(ingest): check ordering in SqlParsingAggregator tests by @hsheth2 in #11735
- feat(docs-website): init solution pages by @yoonhyejin in #11533
- fix(struct-prop): fix unintended struct prop ES mutation by @david-leifker in #11751
- fix(openapi-v3): fix mcp alternative validator & test by @david-leifker in #11744
- fix: add learn more in the homepage by @yoonhyejin in #11752
- docs(dbt-cloud): reference service accounts in docs by @hsheth2 in #11750
- fix(ingestion/powerbi): handle double quotes in M-query by @sid-acryl in #11743
- fix(ingest/bigquery): add missing path spec deps by @hsheth2 in #11748
- feat(ingest/datahub): Add way to filter soft deleted entities by @treff7es in #11738
- fix(ingest): pin teradata dep by @hsheth2 in #11760
- fix(ingest): reduce asyncio in check_upgrade by @hsheth2 in #11734
- feat(ingest/powerbi): add timeouts for m-query parsing by @hsheth2 in #11753
- fix(structuredProperties): fixes underscore behavior in structured property names by @RyanHolstien in #11746
- feat(ui): Support markdown for incident descriptions by @jjoyce0510 in #11759
- feat(ingestion): Add execution request cleanup job by @noggi in #11765
- feat(docs-site) polishes and improved responsiveness for the home and solutions pages by @jayacryl in #11770
- feat(ingest): unpin looker-sdk dependency by @chriscc2 in #11755
- feat(forms/cli): add additional filters to the forms CLI (subtypes, platform instances, owners, tags and glossary terms) by @Masterchen09 in #10979
- docs(spark): fix incorrect config option by @steffengr in #11119
- feat(ingest/powerbi): improve reporting around m-query parser by @hsheth2 in #11763
- docs(graphql): fix typo in entity.graphql by @vejeta in #11764
- doc(bigquery): Update setup.md by @tanguyantoine in #11769
- fix: install servicebell & add condition for markprompt by @yoonhyejin in #11532
- feat(ingest/cli): add undo soft delete command by @anshbansal in #11740
- fix(docs-site) improvements to home page by @jayacryl in #11783
- Make DatahubGC bootstrap MCP non-optional by @noggi in #11785
- fix(ingest/redshift): fix unload lineage in lineage_v2 by @hsheth2 in #11620
- fix(ingest/openapi): Fix openapi tests by @treff7es in #11789
- user should be able to pass custom mcp kafka topic by @ronybony1990 in #11767
- logging(template-mcp): adding more logging around templating by @david-leifker in #11786
- feat(entity-client): batch entity-client ingestProposals by @david-leifker in #11787
- feat(ingest/dremio): Dremio Source Ingestion by @sagar-salvi-apptware in #11598
- fix(entity-client): switch to caller runs for entity-client by @david-leifker in #11791
- bump(cli): Update ingestion-datahub-gc.yaml by @david-leifker in #11794
- fix(ingest/powerbi): change default for `use_powerbi_email` by @hsheth2 in #11742
- docs(breaking-changes): datahub-gc by @david-leifker in #11808
- fix(doc/bigquery-sync): Update bigquery sync documentation by @treff7es in #11805
- fix(ingest/gc): add limit, add actual loop for iterating over batches by @anshbansal in #11809
- fix(ingest/browsePathsV2): Emit Container aspect first, to avoid BrowsePathsV2 generation race condition by @asikowitz in #11813
- fix(ingest/fivetran): do not materialise upstream by @anshbansal in #11806
- fix(ingest/dremio): update dremio sql query to retrieve queried datasets in sql jobs by @acrylJonny in #11801
- fix(ingestion/powerbi): object has no attribute startswith by @sid-acryl in #11814
- fix(views): fix environment filter for views by @RyanHolstien in #11771
- feat(template-mcps): allow further control for helm by @david-leifker in #11816
- fix(timeline): fixes primary key change events by @RyanHolstien in #11819
- fix(ingest): ignore processed query_id from temp upstream by @mayurinehate in #11798
- feat(ingest): add stateful ingestion support for file source by @mayurinehate in #11804
- docs(relationship): update relationship docs by @david-leifker in #11820
- feat(model): add deprecation aspect to container by @anshbansal in #11824
- feat(web-react) improved webpage title generation logic by @jayacryl in #11773
- feat(ingest/gx): support gx version 0.18.0 by @mayurinehate in #11823
- feat(ingestion/powerbi): DatabricksMultiCloud native query support by @sid-acryl in #11756
- fix(ci): additional disk clean by @david-leifker in #11835
- feat(docs) better oidc setup docs by @jayacryl in #11793
- fix(config): add missing package by @RyanHolstien in #11842
- fix(timeline): fixes a renaming corner case by @RyanHolstien in #11843
- fix(ui): show ingested entities in ingestion report when ingestion succeeded with warnings by @Masterchen09 in #11704
- v0.3.7 release docs by @jjoyce0510 in #11836
- fix(ingest/dbt): handle multiple owners by @anshbansal in #11848
- fix(ingest): update GX version having name arg by @mayurinehate in #11849
- docs(release): helm chart version req 0.3.7 by @david-leifker in #11850
- refactor(run-id): refactor run id updates by @david-leifker in #11834
- fix(ingestion-web) sorting and filtering uses api by @jayacryl in #11844
- docs(automations): Add new doc for Glossary Term Propagation Automation, other docs cleanup by @jjoyce0510 in #11851
- fix(ingest/oracle): fix scheme for sqlalchemy < 2 by @mayurinehate in #11829
- fix(ingest/partition-executor): Fix deadlock by recomputing ready items by @asikowitz in #11853
- fix(ingest/dremio): Dremio software jobs retrieval SQL query fix query error by @acrylJonny in #11817
- fix(docs): Update v_0_3_7.md by @david-leifker in #11861
- fix(doc): fix link to doc, update cli recommendation by @anshbansal in #11866
- chore(version): bump netty version by @david-leifker in #11862
- chore(docs): Minor improvements to transformer docs and java example by @skrydal in #11859
- feat(ingest/cassandra): Add support for Cassandra as a source by @sagar-salvi-apptware in #11822
- search(entity-type): make searchable EntityTypeKey by @david-leifker in #11868
- docs(search): example tag AND condition by @david-leifker in #11870
- Update CSVInfo.tsx by @gabe-lyons in #11871
- chore(structured-properties): add cli validation for entity types by @shirshanka in #11863
- feat(py-sdk): add cli version to ingestion headers by @githendrik in #11847
- fix(ingest/airflow): Remove seems like not intended force debug mode which caused plugin fail on EMR by @treff7es in #11877
- fix(ingest/airflow): Add log to dag emit by @treff7es in #11880
- feat(ingest/iceberg): Iceberg performance improvement by @skrydal in #11182
- fix(ingest/lookml): replace class variable with instance variable for improved encapsulation by @raudzis in #11881
- docs(urn): urn encoding by @david-leifker in #11884
- fix(ingest/partitionExecutor): Fetch ready items for non-empty batch when _pending is empty by @asikowitz in #11885
- fix(ingest): upgrade msal by @mikeburke24 in #11883
- refactor(kafka): reconfigure consumers to allow different config by @david-leifker in #11869
- Update v_0_3_7.md by @david-leifker in #11895
- docs(structured props): fix a typo in structured property docs by @gabe-lyons in #11887
- feat(mcl-upgrade): implement resume & urn pagination by @david-leifker in #11889
- fix(ui) Fix merging siblings schema with mix of v1 & v2 fields by @chriscollins3456 in #11837
- fix(ingest): consider sql parsing fallback as failure by @hsheth2 in #11896
- feat(spark): OpenLineage 1.24.2 upgrade by @treff7es in #11830
- chore(cleanup): remove unused UrnUtils function by @david-leifker in #11897
- perf(ingest/redshift): limit copy lineage by @hsheth2 in #11662
- fix(ingest): add error handling by @anshbansal in #11905
- chore(docs): Update restli-overview.md by @david-leifker in #11908
- docs: add hudi to integrations by @yoonhyejin in #11901
- Display username while removing the user from the group by @kanavnarula in #11706
- fix(ingest/powerbi): m-query fixes by @sid-acryl in #11906
- fix(auth)- Fix Redirect url flow in OidcCallback by @mkamalas in #11878
- chore(ingest): start using explicit exports by @hsheth2 in #11899
- chore(ingest): bump black by @hsheth2 in #11898
- refactor(ingest/snowflake): move oauth config into snowflake dir by @hsheth2 in #11888
- fix(ingest/bigquery): increase logging in bigquery-queries extractor by @hsheth2 in #11774
- Update the AWS instructions with EBS CSI and IAM policy instructions by @alberttwong in #11872
- fix(ingest/sql): disable patch checker by @hsheth2 in #11910
- docs(ingest/sac): add additional permission for SAP Analytics Cloud source to docs by @Masterchen09 in #11903
- chore(ingest): always use urn creation helpers by @hsheth2 in #11911
- chore: update contributors list by @kevinkarchacryl in #11915
- fix(ts): Suppress ts errors on Editor.tsx by @pinakipb2 in #11275
- chore(deps): bump cross-spawn from 7.0.3 to 7.0.6 in /smoke-test/tests/cypress by @dependabot in #11890
- chore(deps): bump cross-spawn from 7.0.3 to 7.0.6 in /docs-website by @dependabot in #11919
- feat(ingest): handle mssql casing issues in lineage by @hsheth2 in #11920
- docs(ingest): Raise error on unsupported sqlite version by @asikowitz in #11921
- fix(analytics): look at userEditableInfo to populate cells by @kevinkarchacryl in #11909
- fix(ingestion/kafka): OAuth callback execution by @sid-acryl in #11900
- feat(ingest/mssql): include stored procedure lineage by @mayurinehate in #11912
- fix(ui): use correct docs link for csv enricher by @bossenti in #11917
- fix(UI): incorrect month showing in MAU by @kevinkarchacryl in #11918
- fix(batch-patch): fix patches in batches by @david-leifker in #11928
- fix(structuredProps) Add validation that ID and qualifiedName have no… by @david-leifker in #11930
- fix(rest.li): fix use of Criterion in rest.li filters by @david-leifker in #11932
- fix(validation): improved urn validation logic by @david-leifker in #11935
- feat(search): adjust schema field boost by @david-leifker in #11933
- fix(ingest/bigquery): ignore include constraints for biglake datasets by @mayurinehate in #11874
- feat(ingest/kafka): improve error handling of oauth_cb config by @mayurinehate in #11929
- feat(ingest/oracle): support profile limits for large tables by @mayurinehate in #11827
- fix: remove ai summit post by @yoonhyejin in #11826
- fix(ui/graphql) Handle groups in institutionalMemory aspect by @chriscollins3456 in #11934
- fix(browseDAO): Handle null browse path from ES in BrowseDAO by @pinakipb2 in #11875
- fix(airflow): remove trino to presto mapping by @shepherd44 in #11925
- docs(ingest/bigquery): add partition support capability by @mayurinehate in #11940
- fix(dataProduct): reduce write fan-out for unset side effect by @RyanHolstien in #11951
- fix(ingest/tableau): handle none database field by @hsheth2 in #11950
- fix(urn-validator): update urn validation logic by @david-leifker in #11952
- feat(ingest): add more logs for kafka polling by @mayurinehate in #11954
- fix(ingest/sigma): migrate sigma workbooks from container to dashboard by @sagar-salvi-apptware in #11939
- fix(ingest/bigquery): Fix performance issue with column profiling ignore by @treff7es in #11807
- docs(cloud): add documentation for Data Product side effect by @RyanHolstien in #11948
- feat(ingest/mssql): allow filtering by procedure_pattern by @mayurinehate in #11953
- fix(test): updates a couple tests to disregard list order by @rtekal in #11840
- fix(dataproduct): optimize data product sideeffect by @david-leifker in #11961
- fix(gms/patch): Fix Upstream lineage patching when path contained encoded slash by @treff7es in #11957
- fix(ingest): always send correct data for advanced section by @anshbansal in #11960
- feat(schematron): add java capabilities for schema translation by @shirshanka in #11963
- fix(shadowJar): fix shadowJar by @david-leifker in #11968
- fix(ingest): ensure sentry is initialized with graph tags by @hsheth2 in #11949
- fix(ingest): more error handling by @anshbansal in #11969
- feat(datahub-gc): add truncation days param by @david-leifker in #11967
- docs(release): Update v_0_3_7.md by @david-leifker in #11937
- fix(ci): fix build-and-test by @david-leifker in #11974
- refactor(ingest/powerbi): organize code within the module based on responsibilities by @sid-acryl in #11924
- fix(schematron): fix for jdk8 by @david-leifker in #11975
- fix(automations docs): Update snowflake-tag-propagation.md to include permissions required for the Automation by @jjoyce0510 in #11977
- chore(bump): bump version of akka for datahub-frontend by @david-leifker in #11979
- feat(ingestion): extend feast plugin to ingest tags and owners by @margaridafernandes-trip in #11784
- fix(validation): additional URN validation adjustments by @david-leifker in #11973
- feat(search): Update search_config.yaml by @david-leifker in #11971
- docs(release): update recommended CLI by @anshbansal in #11986
- fix(ingest/kafka): add poll for admin client for oauth_cb by @mayurinehate in #11985
- fix(ingestion/iceberg): Improvements to iceberg source by @skrydal in #11987
- feat(ingest): standardize sql type mappings by @hsheth2 in #11982
- feat(ingest): bump typing_extensions dep by @hsheth2 in #11965
- feat(ingest): add tests for colon characters in urns by @hsheth2 in #11976
- feat(ingest/athena): handle partition fetching errors by @hsheth2 in #11966
- fix: Add option for disabling ownership extraction by @sagar-salvi-apptware in #11970
- feat(ingest/dremio): Retrieve default_schema for SQL views by @acrylJonny in #11832
- fix(docs): fix sample business glossary by @acrylJonny in #11669
- fix(java-sdk): custom properties patch client by @shirshanka in #11984
- fix[ingest/build]: Disable preflight script as it is not needed anymore by @treff7es in #11989
- feat: connector for Neo4j by @k-bartlett in #11526
- fix(ingestion/dremio): Fixed lineage view for dremio EE by @sagar-salvi-apptware in #11990
- fix(ingest/gc): delete invalid dpis by @anshbansal in #11998
- feat(airflow): show dag/task logs in CI by @hsheth2 in #11981
- chore(ingest): remove deprecated calls to Urn.create_from_string by @hsheth2 in #11983
- fix(ingest): resolve missing numeric types for profiling by @mayurinehate in #11991
- fix(docs): Add spark.datahub.stage_metadata_coalescing to recommended configuration for databricks by @acrylJonny in #11800
- build(coverage): enable code coverage for java and python by @chakru-r in #11992
- chore(docs): Update v_0_3_7.md - v0.3.7.5 by @david-leifker in #12005
- feat(java-sdk): add utils classes to give equivalence with python uti… by @shirshanka in #12002
- fix(ingest/sagemaker): Gracefully handle missing model group by @treff7es in #12000
- fix(ingest/gc): typo fix, do not delete empty entities by @anshbansal in #12011
- fix(ingest/gc): do not cleanup empty job/flow by @anshbansal in #12013
- fix(test): fix metadata-io tests by @david-leifker in #12006
- fix(ingest/looker): Don't fail on unknown liquid filters by @treff7es in #12014
- feat(docs-website) fix links by @jayacryl in #12019
- fix(ci): fix datahub-client validatePythonEnv by @david-leifker in #12023
- test(urn-validation): additional test case by @david-leifker in #12001
- feat(hudi): add hudi platform to the list of default platforms by @shirshanka in #11993
- fix(airflow): fix AthenaOperator extraction by @steffengr in #11857
- feat(tableau): review reporting and debug traces by @sgomezvillamor in #12015
- fix(ingest/tableau): make `sites.get_by_id` call optional by @hsheth2 in #12024
- feat(cli): add platform filter for undo soft delete by @anshbansal in #12012
- feat(mcp): add kafka batch processing mode option (#4449) by @david-leifker in #12021
- chore: update label for team by @anshbansal in #12032
- fix(ui): Adding overflow handling (also goes to oss) by @jjoyce0510 in #12022
- fix(ingest/pulsar): handle missing/invalid schema objects by @Alice-608 in #11945
- fix(filters) Fix issues with structured properties filters by @chriscollins3456 in #11946
- fix(ingest): avoid bad IPython version by @hsheth2 in #12035
- feat(ingest/kafka): additional validation for oauth_cb signature by @mayurinehate in #11996
- fix(ingest/gc): Adding test and more checks to gc source by @treff7es in #12027
- fix(graph-edge): fix graph edge delete exception by @david-leifker in #12025
- feat(ingest): add urn validation test files by @hsheth2 in #12036
- chore(deps): bump cross-spawn from 7.0.3 to 7.0.6 in /datahub-web-react by @dependabot in #11978
- fix(datahub-client): prevent unneeded classes in datahub-client jar by @david-leifker in #12037
- fix(entity-service): no-op batches by @david-leifker in #12047
- docs(compliance-forms) update guide for creating form via UI by @maggiehays in #11936
- feat(snowflake): adding oauth token bypass to snowflake by @gabe-lyons in #12048
- fix(ingest): avoid shell entities during view lineage generation by @mayurinehate in #12044
- fix(logs): add actor urn on unauthorised by @anshbansal in #12030
- fix(ingest/snowflake): Add handling of Hybrid Table type for Snowflake ingestion by @siong-tcha in #12039
- fix(ingest/powerbi): reduce type cast usage by @hsheth2 in #12004
- refactor(ingest/sql): add _get_view_definition helper method by @hsheth2 in #12033
- feat(ingest/superset): initial support for superset datasets by @hwmarkcheng in #11972
- fix(ingest/sagemaker): Adding option to control retry for any aws source by @treff7es in #8727
- fix(ingest/gc): Additional dataprocess cleanup fixes by @treff7es in #12049
- feat(tableau): adds more reporting metrics to better understand lineage construction in tableau ingestion by @sgomezvillamor in #12008
- feat(ingestion/tableau): hidden asset handling by @haeniya in #11559
- feat(airflow): drop Airflow < 2.3 support + make plugin v2 the default by @hsheth2 in #12056
- fix(web) disallow deselecting all degrees on impact analysis view by @jayacryl in #12063
- feat: Add parent container hierarchy label to the container by @kanavnarula in #11705
- fix(py-sdk): DataJobPatchBuilder handling timestamps, output edges by @shirshanka in #12067
- fix(plugin-logging): adjust error logging in plugin registry by @david-leifker in #12064
- build(metadata-events): fix shell interpreter mismatch in build script by @chakru-r in #12066
- fix(entity-service): handle no-op system-metadata batches by @david-leifker in #12055
- build(coverage): rename python coverage reports by @chakru-r in #12071
- fix(ingest): replace sqllineage/sqlparse with our SQL parser by @sagar-salvi-apptware in #12020
- fix(entity-service): prevent mutation of systemMetadata on prev by @david-leifker in #12081
- build(datahub-frontend): enable code-coverage by @chakru-r in #12072
- build(ci): codecov integration by @chakru-r in #12073
- fix(openapi): adds in previously ignored keep alive value by @RyanHolstien in #12068
- feat(ui) Add alchemy component library to FE by @chriscollins3456 in #12054
- docs(structured properties) add guide by @maggiehays in #12070
- feat(ingest): allow max_workers=1 with ASYNC_BATCH rest sink by @hsheth2 in #12088
- fix(openapi): fix sort criteria parameter by @RyanHolstien in #12090
- feat(ingest/snowflake): allow option for incremental properties by @mayurinehate in #12080
- fix(cli): don't use /api in gms url by @anshbansal in #12083
- docs(ingest/athena): update recipe with aws key pair example by @mayurinehate in #12076
- fix(ingest/gc): minor tweak gc source by @anshbansal in #12093
- fix(ingest/abs): detect jsonl schema by @acrylJonny in #11775
- feat(ingest/kafka): Flag for optional schemas ingestion by @skrydal in #12077
- feat(structuredProperties) Add new settings aspect plus graphql changes for structured props by @chriscollins3456 in #12052
- fix(ingest/tableau): project_path_pattern use in _is_denied_project by @sid-acryl in #12010
- feat: Enrich superset ingestion by @hwmarkcheng in #11688
- fix(ui) Add backwards compatibility to the UI for old policy filters by @chriscollins3456 in #12017
- feat(structuredProps) Add frontend for managing structured props and filtering by them by @chriscollins3456 in #12097
- feat(ui) Add full support for structured properties on assets by @chriscollins3456 in #12100
- docs(champions): Update directory of DH Champions by @maggiehays in #12089
- feat(ingest/snowflake): ingest secure, dynamic, hybrid table metadata by @mayurinehate in #12094
- feat(spark): OpenLineage 1.25.0 by @Jorricks in #12041
- fix(ingest): always resolve platform for browse path v2 by @mayurinehate in #12045
- fix(ingest/sdk): report recipe correctly by @anshbansal in #12101
- feat(cli): add --workers arg in delete command by @anshbansal in #12102
- fix(ingest/snowflake): handle dots in snowflake table names by @hsheth2 in #12105
- fix(ingest/tableau): apply `page_size` regardless of object count by @sid-acryl in #12026
- docs(ingest/snowflake): update permissions for dynamic tables by @mayurinehate in #12074
- fix(ingestion/lookml): resolve CLL issue caused by column name casing. by @sid-acryl in #11876
- feat(glossary): support multiple ownership types by @kevinkarchacryl in #12050
- feat(datahub-client): additionally generates java8 artefacts by @sgomezvillamor in #12106
- fix(ui): dereference errors by @anshbansal in #12034
- feat(openapi-v3): add minimal timeseries aspect support by @david-leifker in #12096
- feat(forms) Clean up form prompts on structured property deletion by @chriscollins3456 in #12053
- fix(datahub-client): adds missing archiveAppendix to artifactid when publishing by @sgomezvillamor in #12112
- chore(deps): bump nanoid from 3.3.6 to 3.3.8 in /datahub-web-react by @dependabot in #12086
- chore(deps): bump nanoid from 3.3.7 to 3.3.8 in /docs-website by @dependabot in #12114
- feat(structuredProperties): add hide property and show as badge validators by @chriscollins3456 in #12099
- fix(ingest/snowflake): further improve dot handling by @hsheth2 in #12110
- feat(ingest): improve query fingerprinting by @hsheth2 in #12104
- docs(ingest): add docs on the SQL parser by @hsheth2 in #12103
- fix(ui): dereference issues by @anshbansal in #12109
- fix(datahub-client): avoid parallel execution of publish and publish-java8 by @sgomezvillamor in #12120
- fix(ingestion/dremio): Ignore filtered containers in schema allowdeny pattern by @acrylJonny in #11959
- fix(ingest/kafka-connect): update connection test url, handle api failures by @mayurinehate in #12082
- fix(ingest/dagster): Fix Dagster build by @treff7es in #12121
- fix(ingest/snowflake): improve warn message by @anshbansal in #12125
- fix(dataproduct): creator is assigned as owner by @anshbansal in #12127
- fix(mysql): index gap lock deadlock by @david-leifker in #12119
- feat(ingest): additional limits on ingestProposalBatch by @hsheth2 in #12130
- refactor(ingest): cleanup structured properties validation by @hsheth2 in #12115
- config(docker-profiles): clean-up by @david-leifker in #12051
- build(gradle): version change (Gradle and shadow plugin) by @dejan2609 in #11999
- feat(airflow): add `DatahubRestHook.make_graph` method by @hsheth2 in #12116
- tests(datahub-client): new tests for the AvroSchemaConverter by @sgomezvillamor in #12087
- feat(ingest/snowflake): secure view lineage without owner permissions by @mayurinehate in #12123
- chore(dep): exclude end of life dependency by @deepgarg-visa in #12007
- chore(version): bump kafka version by @chakru-r in #12136
- build(ci): fix vercel setup script by @chakru-r in #12143
- feat(ingest/airflow): Add way to disable Airflow plugin without a restart by @treff7es in #12098
- fix(ingestion/tableau): honor the key projectNameWithin in pagination by @sid-acryl in #12107
- fix(ingest/datahub): Use server side cursor instead of local one by @treff7es in #12129
- feat(ingestion/tableau): verify role assignment to user in `test_connection`. by @sid-acryl in #12042
- docs(ingest): fix sink recipe to correct config parameter by @kousiknandy in #12132
- feat(ui) Add finishing touches to the structured props feature by @chriscollins3456 in #12111
- feat(ingest/sqlite): Support sqlite < 3.24.0 by @asikowitz in #12137
- feat(cli): added cli option for ingestion source by @kevinkarchacryl in #11980
- fix(patch): Add Finegrained Lineage patch support for DatajobInputOutput (#4749) by @treff7es in #12146
- fix(ingest/s3): incorrectly parsing path in s3_uri by @eagle-25 in #12135
- feat(ingest/datahub): report progress on db ingestion by @hsheth2 in #12117
- build(ingest/sqlglot): Bump pin to support snowflake CREATE ... WITH TAG by @asikowitz in #12003
- fix(frontend): fix typo datahub-frontend logback.xml by @deepgarg-visa in #12134
- feat(git): add subdir support to GitReference by @hsheth2 in #12131
- fix(ui) Fix nesting logic in properties tab by @chriscollins3456 in #12151
- fix(ingest/snowflake): improve lineage parse failure logging by @hsheth2 in #12153
- fix(ingest/pulsar): handle Avro schema with missing namespace or name by @Alice-608 in #12058
- fix(cli/properties): allow structured properties without a graph instance by @hsheth2 in #12144
- fix(ingest/gc): more logging, error handling, explicit flag by @anshbansal in #12124
- fix(ingest/kafka): update dependency, tests by @mayurinehate in #12159
- feat(api): authorization extended for soft-delete and suspend by @david-leifker in #12158
- fix(env) Fix forms hook env var default config by @chriscollins3456 in #12155
- feat(ingest/mlflow): Support configurable base_external_url by @asikowitz in #12167
- fix(cli/properties): fix data type validation by @hsheth2 in #12170
- fix(pgsql): Postgres doesn't support UNION select with FOR UPDATE by @david-leifker in #12169
- refactor(ingest/kafka-connect): define interface for new connector impl by @mayurinehate in #12149
- feat(ingest): add looker meta extractor support in sql parsing by @sagar-salvi-apptware in #12062
- feat(ingest/iceberg): Improve iceberg connector by @skrydal in #12163
- feat(python): split out temp wheel builds by @hsheth2 in #12157
- docs(release): v0.3.7.7 by @david-leifker in #12091
- fix(docs): Add improvements in examples for PATCH documentation by @jjoyce0510 in #12165
- feat(graphql/ml): Add custom properties to ml entities by @asikowitz in #12152
- chore(bump): ingestion-base & actions by @david-leifker in #12171
- feat(mssql): platform instance aspect for dataflow and datajob entities by @sgomezvillamor in #12180
- fix(tableau): prevents warning in case of site admin creator role by @sgomezvillamor in #12175
- fix(tableau): restart server object when reauthenticating by @sgomezvillamor in #12182
- fix(dagster): support dagster v1.9.6 by @sgomezvillamor in #12189
- fix(graphql): add suspended to corpuserstatus by @kevinkarchacryl in #12185
- feat(ingest/snowflake): include external table ddl lineage for queries… by @mayurinehate in #12179
- fix(gms): Change names of charts in Analytics by @deepgarg-visa in #12192
- fix(ingest/databricks): Fix profiling by @skrydal in #12060
- refactor(ingestion/tableau): mark the `fetch_size` configuration as deprecated. by @sid-acryl in #12126
- test(ingest/tableau): add test for extract_project_hierarchy scenario by @sid-acryl in #12079
- docs(structured properties) fix entityTypes in creating structured property by @nicholas-fwang in #12187
- chore(bump): bump alpine and dockerize by @david-leifker in #12184
- docs update: Update v_0_3_7.md by @david-leifker in #12197
- feat(gradle): add quickstartPgDebug option by @david-leifker in #12195
- fix(ingest/powerbi): support comments in m-query grammar by @sid-acryl in #12177
- feat(ingestion/aws-common): improved instance profile support for ec2, ecs, eks, lambda, beanstalk, app runner and cft roles by @acrylJonny in #12139
- feat(ingestion/hive): Add lineage functionality for hive tables from/to file storage by @acrylJonny in #11841
- fix(mssql): adds missing containers for dataflow and datajob entities, required for browse paths v2 generation by @sgomezvillamor in #12194
- Revert "fix(mssql): adds missing containers for dataflow and datajob entities, required for browse paths v2 generation" by @anshbansal in #12201
- chore(bump): bump node version long term support release (build time … by @david-leifker in #12199
- fix(ingest): exclude aspect from migration by @anshbansal in #12206
- fix(ingest/snowflake): handle empty snowflake column upstreams by @mayurinehate in #12207
- fix(ui): null dereference by @anshbansal in #12193
- fix(ingest): quote asset urns in patch path by @hsheth2 in #12212
- feat(ingest): add sql parser trace mode by @hsheth2 in #12210
- fix(ingest): preserve certs when converting emitter to graph by @hsheth2 in #12211
- fix(ingest/mode): move sql logic to view properties aspect by @hsheth2 in #12196
- feat: update mlflow-related metadata models by @yoonhyejin in #12174
- feat(ingest/looker): Do not emit usage for non-ingested dashboards and charts by @asikowitz in #11647
- fix(tableau): retry on InternalServerError 504 by @sgomezvillamor in #12213
- fix(ingest/snowflake): always ingest view and external table ddl lineage by @mayurinehate in #12191
- fix(tableau): fixes wrong argument when reauthenticating by @sgomezvillamor in #12216
- fix(ingest/looker): Add flag for Looker metadata extraction by @sagar-salvi-apptware in #12205
- fix(ingest/mode): Handle 204 response and invalid json by @asikowitz in #12156
- fix(ingest/glue): Add additional checks and logging when specifying catalog_id by @asikowitz in #12168
- fix(ingest/gc): misc fixes in gc source by @anshbansal in #12226
- Parallelize smoke test by @chakru-r in #12225
- chore(bump): spring minor version bump 6.1.14 by @david-leifker in #12228
- fix(ingest/lookml): emit warnings for resolution failures by @hsheth2 in #12215
- chore(ingest): remove `enable_logging` helper by @hsheth2 in #12222
- feat(ingest/dbt): support "Explore" page in dbt cloud by @hsheth2 in #12223
- feat(ingest/snowflake): support email_as_user_identifier for queries v2 by @mayurinehate in #12219
- fix(tableau): retry if 502 error code by @sgomezvillamor in #12233
- ci: remove qodana by @anshbansal in #12227
- chore(tableau): adjust visibility of info message by @sgomezvillamor in #12235
- chore(python): test with python 3.11 by @hsheth2 in #11280
- feat(ingest): add parse_ts_millis helper by @hsheth2 in #12231
- fix(ingest): use `typing_extensions.Self` by @hsheth2 in #12230
- feat(businessAttribute): generate platform events on association/removal with schemaField by @deepgarg-visa in #12224
- fix(ingest/sql-common): sql_common to use SqlParsingAggregator by @sagar-salvi-apptware in #12220
- fix(ingest/gc): reduce logging, remove unnecessary sleeps by @anshbansal in #12238
- fix(docs-site) mobile site and artwork polish by @jayacryl in #12237
- feat(data transform): adding dataTransformLogic models by @gabe-lyons in #12198
- fix(tests): fixing QueryPropertiesMapperTest.java by @gabe-lyons in #12241
- feat(delete): delete logic non-strict monotonically increasing version by @david-leifker in #12242
- docs(graphql): create graphql best practices by @david-leifker in #12229
- fix(ci): further consolidate NODE_OPTIONS by @david-leifker in #12217
- chore: cleanup extra lines by @anshbansal in #12248
- fix(docs-site) hero image typo by @jayacryl in #12250
- fix(ingestion/aws_common): update iam role and aws access key tests to complete successfully when executed on EC2 instance by @acrylJonny in #12252
- fix(ingest): json serializable fix by @anshbansal in #12246
- fix(ingest/gc): soft deletion loop fix by @anshbansal in #12255
- fix(ingest/bigquery): All View generation when queries_v2 is turned off by @sagar-salvi-apptware in #12181
- test(ingest/athena): add connector integration tests by @sagar-salvi-apptware in #12256
- chore(ingest): refactor common pytest args by @hsheth2 in #12240
- fix(sample data): Update timestamps in bootstrap_mce.json to more recent by @pedro93 in #12257
- refactor(sdk/patch): improve patch implementation internals by @hsheth2 in #12253
- feat(auth): user.props authentication by @david-leifker in #12259
- docs(undo_by_filter): Document un-soft-delete commands in delete-metadata.md by @gabe-lyons in #12251
- fix(tableau): fixes some aspects being emitted multiple times by @sgomezvillamor in #12258
- fix(ingestion/redshift): Bumped redshift-connector dependency due to CVE-2024-12745 by @skrydal in #12265
- fix(ingest/gc): logging and stopping fix by @anshbansal in #12266
- fix(ingest): consistent fingerprint for sql parsing aggregator by @mayurinehate in #12239
- docs(queries_v2): set use_queries_v2 to true in snowflake_recipe.yml by @gabe-lyons in #12269
- feat(ingest/gc): truncate query usage statistics aspect by @anshbansal in #12268
- fix(ingest/tableau): retry on auth error for special case by @mayurinehate in #12264
- fix(ingest/gc): infinite loop query entities by @anshbansal in #12274
- fix(ingest/snowflake): use fast query fingerprint for lineage by @mayurinehate in #12275
- fix(spark): Finegrained lineage is emitted on the DataJob and not on the emitted Datasets. by @treff7es in #11956
- docs(tableau): clarify docs around tableau permissions by @hsheth2 in #12270
- feat(ingest): enable `EnsureAspectSizeProcessor` for all sources by @hsheth2 in #12262
- fix(ingestion/classifier): temporary measure to avoid deadlocks for classifier by @skrydal in #12261
- feat(ingest/datahub): use stream_results with mysql by @hsheth2 in #12278
- ci: fix shellcheck warnings, update actions by @anshbansal in #12281
- docs(business attribute): clarify support by @skrydal in #12260
- fix(airflow): fix tests with Airflow 2.4 by @hsheth2 in #12279
- fix(ingest): better correctness on the emitter -> graph conversion by @hsheth2 in #12272
- feat(ingest): configurable query generation in combined sources by @hsheth2 in #12284
- fix(javaEntityClient): correct config parameter by @david-leifker in #12287
- ci: upload test coverage to codecov by @anshbansal in #12291
- log(elastic/index builder): add est time remaining by @anshbansal in #12280
- fix(ingest/glue): don't fail on profile by @anshbansal in #12288
- fix(ingest/gc): also query data process instance by @anshbansal in #12292
- fix(cli): correct url ending with acryl.io:8080 by @anshbansal in #12289
- dev: add pre-commit hooks installed by default by @anshbansal in #12293
- fix(ingest/file-backed-collections): Properly set _use_sqlite_on_conflict by @asikowitz in #12297
- fix(doc): make folder_path_pattern usage more clear by @kevinkarchacryl in #12298
- dev: fix pre-commit passing filenames incorrectly by @anshbansal in #12304
- feat(sdk): structured properties - add support for listing by @shirshanka in #12283
- chore(tableau): set ingestion stage report and perftimers by @sgomezvillamor in #12234
- chore(version): bump jdbc drivers by @david-leifker in #12301
- build(coverage): fix carry-forward coverage by @chakru-r in #12306
- chore(deps): Migrate EOL vulnerability of javax.mail to jakarta.mail by @pankajmahato-visa in #12282
- chore(alpine): bump alpine images 3.21 by @david-leifker in #12302
- feat(ingest/datahub): support dropping duplicate schema fields by @hsheth2 in #12308
- feat(ci): add manual trigger for full build by @chakru-r in #12307
- fix(ci): make upload-artifact name unique by @chakru-r in #12312
- fix(ingestion/s3): groupby group-splitting issue by @eagle-25 in #12254
- feat(graphql): adds container aspect for dataflow and datajob entities by @sgomezvillamor in #12236
- docs(ingest/glue): add permissions for glue by @anshbansal in #12290
- fix(ingest/gc): add delete limit execution request by @anshbansal in #12313
- chore(deps): Migrate CVE-2024-52046 with severity >= 9 (severity = 9.3) vulnerability of org.apache.mina:mina-core:2.2.3 by @pankajmahato-visa in #12305
- fix(ci): fix artifact upload name by @chakru-r in #12319
- feat(sdk): support urns in other urn constructors by @hsheth2 in #12311
- fix(ingest): improve error reporting in emit_all by @hsheth2 in #12309
- docs(ingest): refactor docgen process by @hsheth2 in #12300
- fix(dockerfile) Remove all references to jetty from the docker file by @ryota-cloud in #12310
- Add more notifications docs about platform notifications and multiple channels by @ethan-cartwright in #10801
- fix(cli/delete): prevent duplicates in delete message by @hsheth2 in #12323
- feat(ingestion/iceberg): Improve iceberg connector logging by @skrydal in #12317
- fix(header): prevent clickjack/iframing by @david-leifker in #12328
- fix(ingest): tighten Source.create type annotations by @hsheth2 in #12325
- fix(ci): only upload metadata model on root repo by @hsheth2 in #12324
- feat(models): update mlflow-related mappers by @yoonhyejin in #12263
- fix(ingest): support async_flag properly in ingestProposalBatch by @hsheth2 in #12332
- feat(ingest/snowflake): Support ingesting snowflake tags as structured properties by @asikowitz in #12285
- fix(ingestion) fix snappy inconsistent version in ingestion by @ryota-cloud in #12321
- Super type dbt redshift by @kevinkarchacryl in #12337
- fix(docker) add new gradle profile for consumer debug purpose by @ryota-cloud in #12338
- feat(entityVersioning): initial implementation by @RyanHolstien in #12166
- feat(build): use remote gradle cache by @hsheth2 in #12344
- feat(docker-profiles): version mixing & docs by @david-leifker in #12342
- docs(async-api): addition to known issues by @david-leifker in #12339
- fix(ingest/gc): fix logging by @anshbansal in #12348
- design: revamp navbar dropdown by @yoonhyejin in #11864
New Contributors
- @kanavnarula made their first contribution in #11272
- @donovan-acryl made their first contribution in #11393
- @AColocho made their first contribution in #11284
- @th0ger made their first contribution in #11475
- @Bumyu made their first contribution in #11338
- @udays-visa made their first contribution in #11513
- @kris48k made their first contribution in #11391
- @udbhav-hbk made their first contribution in #11524
- @llance made their first contribution in #10954
- @ssidorenko made their first contribution in #11540
- @kartikey-visa made their first contribution in #11720
- @chriscc2 made their first contribution in #11755
- @vejeta made their first contribution in #11764
- @tanguyantoine made their first contribution in #11769
- @ronybony1990 made their first contribution in #11767
- @raudzis made their first contribution in #11881
- @mikeburke24 made their first contribution in #11883
- @alberttwong made their first contribution in #11872
- @kevinkarchacryl made their first contribution in #11915
- @shepherd44 made their first contribution in #11925
- @margaridafernandes-trip made their first contribution in #11784
- @k-bartlett made their first contribution in #11526
- @chakru-r made their first contribution in #11992
- @Alice-608 made their first contribution in #11945
- @siong-tcha made their first contribution in #12039
- @hwmarkcheng made their first contribution in #11972
- @dejan2609 made their first contribution in #11999
- @kousiknandy made their first contribution in #12132
- @eagle-25 made their first contribution in #12135
- @ryota-cloud made their first contribution in #12310
Full Changelog: v0.14.1...v0.15.0