As part of our effort to build a real-time table ingestion service for OMD, we're using the Ingestion Framework External Deployment with Trino Metadata Ingestion.
We see that some tables take very long to ingest, around 5-10 minutes, even though their schemas are very thin.
It seems that there is a preprocessing phase that retrieves table comments and filters only at the schema level: https://github.com/open-metadata/OpenMetadata/blob/main/ingestion/src/metadata/ingestion/source/database/trino/queries.py#L34
When ingesting only one table in an event-driven architecture, this seems like overkill.
Table comment retrieval should honor any table filters set under the tableFilterPattern key of the YAML configuration.
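A minimal sketch of what pushing the table filter down into the comments query could look like. This is not the current OpenMetadata implementation: the helper names `sql_literal` and `table_filter_predicate` are hypothetical, the include/exclude regex lists stand in for whatever is configured under tableFilterPattern, and the query shape (Trino's `system.metadata.table_comments` table plus `regexp_like`) is an assumption about how the comments are fetched. Note that `regexp_like` uses Java regex syntax and matches anywhere in the string, which may not line up exactly with how the ingestion framework evaluates its filter patterns.

```python
from typing import List, Optional


def sql_literal(value: str) -> str:
    """Quote a Python string as a SQL string literal, doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def table_filter_predicate(
    includes: Optional[List[str]] = None,
    excludes: Optional[List[str]] = None,
    column: str = '"table_name"',
) -> str:
    """Build a SQL predicate from include/exclude regex lists (hypothetical helper).

    A table passes when it matches at least one include pattern (if any are
    given) and matches none of the exclude patterns. Returns an empty string
    when no patterns are configured.
    """
    clauses = []
    if includes:
        any_include = " OR ".join(
            f"regexp_like({column}, {sql_literal(p)})" for p in includes
        )
        clauses.append(f"({any_include})")
    for pattern in excludes or []:
        clauses.append(f"NOT regexp_like({column}, {sql_literal(pattern)})")
    return " AND ".join(clauses)


# Assumed query shape; catalog and schema names are placeholders.
base_query = (
    'SELECT "comment", "schema_name", "table_name" '
    'FROM "system"."metadata"."table_comments" '
    "WHERE \"catalog_name\" = 'my_catalog' AND \"schema_name\" = 'my_schema'"
)

predicate = table_filter_predicate(includes=["^orders$"])
query = f"{base_query} AND {predicate}" if predicate else base_query
```

With a single-table include pattern such as `^orders$`, the comments lookup would be narrowed to that one table instead of scanning every table in the schema.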
Relevant Slack thread: https://openmetadata.slack.com/archives/C02B6955S4S/p1724075683883439