get table & column descriptions from information_schema metadata columns when generating yml #119
Comments
I have a similar ask. I have a bunch of source tables in BigQuery that already have descriptions, and I would like to port those over to the dbt model ymls as well. But when I create a base/source model yml using the generate_model_yaml function, there is no way to tell the function: hey, can you look for descriptions on the source table and, if present, automatically put those in those first models' ymls? I understand that once the descriptions are in the first model I can use upstream_descriptions in all downstream models, but I have 1000+ columns, which already have somewhat adequate descriptions, that I am not trying to copy-paste over.
@kellybh123 I am encountering the same issue. Note that in dbt-bigquery, the column class that codegen uses to retrieve columns from BigQuery does not have a description attribute.
I see we have some upvotes here and I'm interested also. Any objection from maintainers about adding descriptions to columns and tables if those can be discovered from the table/column metadata? Aka - if capacity opens up, would a contribution here be accepted? 😄
Thank you for opening this, @jakub-auger, and thanks to all of you who have shown interest in it! It's not a priority for us to add this to dbt-codegen at this time, so we won't be accepting contributions. Alternative approaches are sketched out here:
TL;DR: Upon the next release of dbt-osmosis, you should be able to do something like this:
    dbt docs generate
    dbt-osmosis yaml document --catalog-file target/catalog.json
Or you (or a different third-party tool) can use programmatic invocations to generate a Catalog artifact and use it to scaffold your YAML files with comments included.
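For anyone who wants to try the catalog route by hand, here is a rough sketch of the idea, not an official dbt or dbt-osmosis API. It assumes the catalog.json written by dbt docs generate keeps relations under "nodes" and "sources", with the table comment under metadata.comment and each column's comment under columns.<name>.comment. The function names descriptions_from_catalog and to_yaml_lines are made up for illustration, and the YAML is hand-rolled to avoid extra dependencies.

```python
import json


def descriptions_from_catalog(catalog: dict) -> dict:
    """Map each relation's unique_id to its table and column comments."""
    out = {}
    # Assumption: models/seeds live under "nodes", sources under "sources".
    for section in ("nodes", "sources"):
        for unique_id, entry in catalog.get(section, {}).items():
            meta = entry.get("metadata", {})
            columns = {
                name: col.get("comment") or ""  # missing/null comment -> ""
                for name, col in entry.get("columns", {}).items()
            }
            out[unique_id] = {
                "name": meta.get("name", ""),
                "description": meta.get("comment") or "",
                "columns": columns,
            }
    return out


def to_yaml_lines(relation: dict) -> list[str]:
    """Render one relation as a minimal YAML list item."""
    # json.dumps yields a double-quoted string, which YAML also accepts,
    # so colons or quotes inside a comment do not break the file.
    lines = [
        f"- name: {relation['name']}",
        f"  description: {json.dumps(relation['description'])}",
        "  columns:",
    ]
    for col_name, comment in relation["columns"].items():
        lines.append(f"    - name: {col_name}")
        lines.append(f"      description: {json.dumps(comment)}")
    return lines
```

To use it, load the artifact with json.load on target/catalog.json, pass the resulting dict to descriptions_from_catalog, and print the lines from to_yaml_lines for each relation you want to scaffold. You would still need to paste the output under the right sources: or models: key yourself.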
Describe the feature
As a dev, I want to leverage the metadata that already exists in my database, namely the table and column descriptions captured in the information_schema.tables and information_schema.columns metadata tables.
Describe alternatives you've considered
Manually scripting out the info on a per-table basis
Additional context
Current use case is on Databricks with Unity Catalog, but this would be useful elsewhere
Who will this benefit?
The source tables are very wide (100s of columns), so manually entering descriptions into a yml file in notepad isn't feasible.
It is more robust to enter the info into the table and columns via the Databricks UI. We then need to pull it out to expose this info in the dbt docs.
Are you interested in contributing this feature?