Improve clone reference page #4117

Open
1 task
runleonarun opened this issue Sep 22, 2023 · 0 comments
Labels
content Improvements or additions to content improvement Use this when an area of the docs needs improvement as it's currently unclear

Comments

@runleonarun
Collaborator

runleonarun commented Sep 22, 2023

Contributions

  • I have read the contribution docs, and understand what's expected of me.

Link to the page on docs.getdbt.com requiring updates

https://docs.getdbt.com/reference/commands/clone

What part(s) of the page would you like to see updated?

Feedback from the dbt Community:

  • The "clone command is useful for" section is A++. I love the way it helps me contextualize the feature and think about how I can take advantage of it. I wouldn't have thought clone -> blue/green by myself (immediately?) because of the nomenclature.
  • I can't think of how cloning an object would be needed to test downstream dependencies in your BI tool. Maybe because of environments? But that's a me problem, just something in my train of thought.
  • The code part of the page should have its own header and be expanded. I'd like to see something like "command A with B config will take C and make D."
  • You keep talking about data platforms (e.g. RDBMS) capable of "zero-copy cloning of tables." But which platforms are capable and which are not? I assume Snowflake but not Redshift from my other knowledge, but it reads very opaque to me as is. I could understand not wanting to have to maintain a list of what databases currently support it, but I think "At least Snowflake" and/or "Not Redshift as of x/x/xxxx" could work.
  • "simple pointer view": I know what you mean, though others may not. The thing I don't know from that bullet point, and cannot infer on my own, is the naming the objects will get. For instance, if I already have object X in my schema, what will the copy of object X at a specific state (in time) be named?
  • The link for "specified state" doesn't work.
  • Feature intersection: How does this interact with versioning of models?
  • This feature seems completely dependent on --state, but that has limits only called out on the description of the state page. It also doesn't seem to point out, though I infer, that you have to start saving manifest files instead of overwriting or deleting them. And were any of the docs about deleting the target folder as part of troubleshooting updated?
  • There will be confusion and problems around all the things that have to line up to make this work: having a manifest file that works on your dbt version, that works with all the versions of everything else in your repo (see macros, which I don't believe can be versioned), and the same data as that state (not as snapshotted or overwritten). You have to do a lot of prep and foundation work for this to be viable, and that prep has to start long before you can use this command.
  • This probably needs a warning that it's an advanced feature.
  • Honestly, after thinking through all of that I would not have called this dbt clone, I would have called it dbt recreate, but :woman-shrugging::skin-tone-2: Way outside the scope of a docs conversation!
  • Oh! zero copy cloning == aliasing. THAT's why it would be good for blue/green deployment even though it's recreating. The word alias being on this page would help.
  • "clone materialization" should be linked. When I search for "clone" in the docs I don't get anything about clone materialization. See screenshot. So now I have more questions.
  • Wow, with and without --full-refresh this command is a whole other level of version issues.
  • This seems like it's trying to go back in time, under perfect circumstances, to create a point-in-time materialization (or an alias to one) without pulling it from a snapshot. It's fascinating; to learn why this was built, I'm going to have to go read the GitHub discussions on this.
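
As a rough illustration of the "command A with B config will take C and make D" request above, here is a minimal sketch that only assembles `dbt clone` invocations without running them. The `build_clone_command` helper, the `path/to/artifacts` directory, and the `my_model` selector are hypothetical names for illustration; the `--state`, `--select`, and `--full-refresh` flags are the ones this feedback and the clone page discuss, and the exact behavior should be confirmed against the reference page.

```python
import shlex


def build_clone_command(state_dir, select=None, full_refresh=False):
    """Assemble (but do not run) a `dbt clone` invocation.

    state_dir:    directory holding the saved manifest to clone from
                  (hypothetical path; the artifacts must be kept, not overwritten).
    select:       optional node selector to clone only some models.
    full_refresh: if True, ask dbt to re-create clones that already exist.
    """
    cmd = ["dbt", "clone", "--state", state_dir]
    if select:
        cmd += ["--select", select]
    if full_refresh:
        cmd.append("--full-refresh")
    return cmd


# Command A (clone) with config B (--state, --select) takes C (the objects
# recorded in the saved manifest) and makes D (clones in the target schema).
print(shlex.join(build_clone_command("path/to/artifacts", select="my_model")))
print(shlex.join(build_clone_command("path/to/artifacts", full_refresh=True)))
```

Read this way, the first command would clone only `my_model` as recorded in the saved state, and the second would clone everything in that state and re-create clones that already exist in the target. Both readings assume the saved manifest is compatible with the dbt version in use.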

Additional information

For more information, you can see this dbt Community Slack discussion

@runleonarun runleonarun added content Improvements or additions to content improvement Use this when an area of the docs needs improvement as it's currently unclear labels Sep 22, 2023