Allow using macros in the documentation YAML #3277

pcasteran · 2021-04-19T23:00:36Z

Describe the feature

Currently, only the doc macro is available in the Jinja rendering context used for doc generation.
It would be useful to add all the macros available in the project, as is the case during the compilation of the models.
The use case I'm trying to implement is inheriting column descriptions across all the model layers without having to copy-paste them (see this comment).

Describe alternatives you've considered

See the discussion on #2995.

Additional context

I have looked at the code. The reason we can't use macros in the documentation is that the DocsRuntimeContext class inherits from SchemaYamlContext, whereas the context used during model compilation inherits from ManifestContext, which is responsible for adding the macros to the context used during Jinja rendering.

As these classes are not documented, it's quite difficult to figure out their responsibilities and the reasons behind this type hierarchy. But maybe modifying SchemaYamlContext to inherit from ManifestContext would do the trick and enable the use of macros during the processing of the documentation.
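
The hierarchy described above can be illustrated with a deliberately simplified sketch. The class names are borrowed from this issue, but the bodies, the `to_dict` method, the `macros` and `docs` arguments, and the `BaseContext` and `ModelContext` classes are hypothetical stand-ins, not dbt's actual implementation. The point is only that a context exposes macros when something in its ancestry adds them.

```python
# Simplified, hypothetical stand-ins for the context classes named above.
# None of these bodies reflect dbt's real code.


class BaseContext:
    def to_dict(self):
        # Names available in every rendering context (none in this sketch).
        return {}


class SchemaYamlContext(BaseContext):
    """Assumed here to back the rendering of YAML properties."""


class ManifestContext(BaseContext):
    """Assumed here to be the class that knows about project macros."""

    def __init__(self, macros):
        self.macros = macros

    def to_dict(self):
        ctx = super().to_dict()
        ctx.update(self.macros)  # this step is what exposes the macros
        return ctx


class DocsRuntimeContext(SchemaYamlContext):
    """Adds only `doc`, so no project macro is ever visible."""

    def __init__(self, docs):
        self.docs = docs

    def to_dict(self):
        ctx = super().to_dict()
        ctx["doc"] = lambda name: self.docs[name]
        return ctx


class ModelContext(ManifestContext):
    """Stand-in for the context used during model compilation."""
```

In this sketch, `DocsRuntimeContext({...}).to_dict()` contains only `doc`, while `ModelContext({"my_macro": ...}).to_dict()` contains `my_macro`. Whether re-parenting the real classes is this simple is exactly the open question.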

Who will this benefit?

Anyone who wants to write "advanced" documentation using some cool features of dbt would benefit from this feature.
Inheriting column descriptions and tests from previous models is one example. There would also be other uses, such as advanced formatting of the documentation.

Are you interested in contributing this feature?

I would be interested in contributing but I would need some pointers on how to achieve this.


jtcohen6 · 2021-04-21T20:00:07Z

@pcasteran I really appreciate you opening this issue. I am going to ask you to be more specific :)

> Anyone who wants to write "advanced" documentation using some cool features of dbt would benefit from this feature.
> Inheriting column's description and test from previous models is one example. There would also be other usages for advanced formatting of the documentation.

What other advanced use cases do you have in mind? If we're both thinking that the primary use case would be extending/inheriting descriptions, as is described by #2995, then I'll share how I'm thinking about that problem, and why this isn't my preferred solution (at least for now).

Problem: There's a lot of duplicated code in projects today, defining resource properties on one object that apply equally to another object.

Desired future state: Users can extend resource properties from one model/column/etc to another, without:

- resorting to copy-paste
- needing to define those objects within the same file (a limitation of vanilla YAML anchors)

With that in mind, let's think about this specific proposal: expanding the context available when dbt renders description fields, to include many more user-defined macros.

Pros:

- It would be fairly straightforward to implement: dbt already uses Jinja rendering for description fields. The work would all be around the Jinja context available for that rendering.

Cons:

- By and large, dbt only Jinja-renders one property at a time. As such, this approach would not do a lot to help with the full use case of extending/inheriting multiple properties from one model to another: columns, descriptions, tests, tags, meta properties, etc. For this capability, we should be looking at solutions in YAML, not Jinja.
- Performance. Because Jinja is involved, description rendering is one of the slowest steps in dbt "mise-en-place" today, which is something we're working hard to speed up. Adding arbitrarily more macros to the rendering context risks slowing this down significantly.

So: I want to hear what other use cases you're thinking about. I agree completely with the desire for inheritable/extensible descriptions—I just don't think this is the right way there.

djbelknapaw · 2021-04-23T17:44:22Z

This adds a level of complexity, but it would be ideal if this ignored macros or functions inside a code tag within the markdown as well. For example, dbt tries to run this ref() right now when it's inside a code tag:

-- depends_on: {{ ref('my_model') }}

but it would be preferable if that code block was left as-is. Maybe this is just a nice-to-have after a first pass enabling more functionality.
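
One way to picture the behaviour asked for here is a pre-pass that splits the markdown into code and non-code parts and renders only the latter. The sketch below is an illustration under assumptions, not how dbt handles descriptions: `render_outside_code` and the `render` callback are hypothetical names, and the regular expression only covers backtick fences and simple inline code spans.

```python
import re

# Matches fenced blocks (```...```) first, then inline code spans (`...`).
CODE_PATTERN = re.compile(r"(```.*?```|`[^`\n]+`)", re.DOTALL)


def render_outside_code(markdown, render):
    """Apply `render` only to the parts of `markdown` that are not code."""
    parts = CODE_PATTERN.split(markdown)
    # With one capturing group, re.split puts the matched code at odd indexes.
    return "".join(
        part if index % 2 else render(part)
        for index, part in enumerate(parts)
    )
```

Here `render` stands in for whatever performs the Jinja rendering. Anything inside backticks is passed through untouched, so a `ref()` shown as sample code is never evaluated.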

I didn't open this issue but I'm interested in it as well. I really like the suggestions going on in #2995, and #1334 describes my original use case even though I didn't open that one either! Overall, having access to more functions/macros would be handy.

I'd love to contribute but I'm not at a point where I'd be very useful just yet on back-end stuff. And maybe the code-tag thing is a separate issue, I can split it out if that's helpful.

pcasteran · 2021-04-23T22:05:25Z

Thanks for taking the time to look at my proposal and for your detailed answer @jtcohen6.

First, I want to clarify the scope of this discussion. What I'm actually proposing is to enable more macros to be used during doc rendering; it is not a proposal for documentation inheritance. I may use it for that purpose in the short term, but the proposal is more general than that.
Going back to the initial topic: is there a reason why macros (other than doc) aren't currently supported during doc generation, or is it simply that there was no use for them yet?

Besides column doc inheritance, I would also like to access some attributes of other models in my documentation, like {{ ref('...').path.identifier }}, to display the name of the actual table in which the model is materialized.

--

Now regarding the more specific topic of documentation reuse, there are many possible ways to achieve it and maybe there should not be only one implemented.
Indeed, the solution I described in #2995 can't be the "official" recommended way to do it; it is only one approach made possible by allowing macros during doc processing. I totally agree it cannot be a complete solution, as it does not allow inheriting blocks of columns or anything other than column descriptions.

Performance-wise, I'm not sure there would be a penalty for using this kind of macro because, if I remember the code correctly, dbt caches the column descriptions and only performs Jinja rendering once per column.

--

Regarding the use of YAML anchors, is there a discussion about it other than #1790? I'm not really convinced this is the way to go, as it raises some concerns:

- Basically, it means creating a parallel model-dependency processing logic alongside the parsing already done by dbt.
- It introduces a dependency on the location/structure of the documentation inside the project, instead of relying on the model graph resulting from the parsing.
- Will it work when refactoring (fusing or merging) some YAML files without having to modify the ones depending on them?
- Will it work for models from other projects imported as packages?

jtcohen6 · 2021-09-02T10:47:33Z

@pcasteran Apologies for the delay in getting back to you! This is something I continue to think about, and I'd like to find the right way to do it, possibly next year. For now, this has to wait until after dbt-core v1.0, which is our top development priority for dbt-core for the rest of the year.

A similar idea just came in from @brittianwarner over in #3827, with the specific thought of more dynamic support in docs blocks (including "special" ones like __overview__). This got me thinking about the potential of expanded rendering contexts for descriptions, in lieu of arbitrary macro support. Those contexts could include things like {{ model }}, {{ column_name }}, vars, env vars, etc. Those make a lot of sense in a world with inherited documentation, and they'd get you the other piece you mention ({{ ref('...').path.identifier }}), while still preserving the guardrails we have about parse-time documentation rendering today.
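
To make the distinction concrete, an expanded but still closed context could be assembled from a fixed list of names instead of from every macro in the project. This is only a sketch of the idea: `build_description_context` and its parameters are hypothetical, and the `var` and `env_var` helpers here are simplified stand-ins, not dbt's own functions.

```python
import os


def build_description_context(model, column_name, project_vars, environ=None):
    """Return a fixed, allow-listed set of names for description rendering."""
    environ = os.environ if environ is None else environ

    def var(name, default=None):
        return project_vars.get(name, default)

    def env_var(name, default=None):
        return environ.get(name, default)

    # Only these names are exposed; user-defined macros are deliberately absent.
    return {
        "model": model,
        "column_name": column_name,
        "var": var,
        "env_var": env_var,
    }
```

Because the set of names is known up front, the cost and behaviour of rendering stay predictable, which is the guardrail that arbitrary macro support would give up.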

octocat1000 · 2022-01-14T18:17:02Z

Just came across a similar problem when trying to add dynamic content to the docs. I'd just assumed I could use macros in my docs. Found out otherwise then found this issue. My use case is that I'm trying to insert some data quality metrics into my docs by running a query and using the results in the docs. I'd like to be able to keep my query as a macro (organized with my macros for models) and then have it be called when I'm running docs generate to fill in the results.

pcasteran added the enhancement and triage labels Apr 19, 2021

jtcohen6 removed the triage label Apr 21, 2021

jtcohen6 mentioned this issue Sep 2, 2021

Use Variables and Macros in Markup Documentation (Like Overview) #3827

Closed

jtcohen6 added the discussion label Sep 2, 2021

dbt-labs locked and limited conversation to collaborators Apr 19, 2022

jtcohen6 converted this issue into discussion #5097 Apr 19, 2022


This issue was moved to a discussion.
