-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate existing resources #25
Conversation
…t match the opinionated nature of this tool. It will now handle migration.
I am pondering how to add some tests to this. Any guidance would be appreciated. |
Something else of note. Currently this breaks if more than one thread is used. I expect this is due to the handling of the existing yaml files when they have more than one definition and multiple threads start operating on them. I need to give some consideration to the best path forward on that. |
@BeadW Thank you for submitting this PR! I appreciate you doing that. I want to let you know that I see it, and that I intend to find the time to review it in the coming days. I will also provide guidance on testing and will give some thought to your comment about multiple threads. |
…threw an error. Added a check that the columns key was present before trying to use it
Hello @BeadW, I apologize for the delay. I have had some time to look through the proposed changes and to think about your comments regarding multiple threads and testing. To provide some background, which you may have already discovered yourself, the motivation for using multiple threads comes from the If we could perform the migration without running What are your thoughts on the following setup?:
In terms of testing, we could include a multiple-resource-property-file in I've put together PR #26 which incorporates these ideas. I would love to have your feedback. |
Hi @robastel thanks for putting in all this effort. Really appreciate it. I think your logic here is pretty solid. I like the idea of having the migrate command be separate and can understand how that removes the bottleneck. The method of testing is in line with what I was thinking after I gave it some more thought so that is great. I think the implementation makes a couple of assumptions that might reduce the usefulness and that I believe were addressed by my implementation. PR #26 works (so far as I can tell) on the assumption that because there are two schools of thought people have adopted one or the other. One thing I found within our own repos (we have a lot of dbt happening) is that sometimes people have just made it up as they go along. Stuff like one yaml file per directory, one yaml file per set of similar models etc. Personally I think it is worth supporting the idea that "yaml might be in any state but we should be able to migrate to one yaml per model". I believe my implementation did that (I might be missing something though). I also think this opens up a future state where we can run dbt-invoke properties and have it handle more complex or differing structures of there is value in that. Do you think there is some hybrid of the two approaches we could achieve where perhaps we aim for the separation like you have achieved and support more complex use cases as I believe I did? Kind Regards |
Brad, thank you for the thoughtful feedback! That is a great point about my assumption that projects would not mix the two schools of thought or even take a totally different approach. I noticed that dbt Labs themselves now fall somewhere in between, using one yaml per folder (https://docs.getdbt.com/guides/best-practices/how-we-structure/5-the-rest-of-the-project#yaml-in-depth). I think you have made an excellent suggestion to try to find a hybrid of our two approaches that would both separate the migration logic and have the ability to handle more complex structures. My first thought is to take a look into how dbt itself locates and parses property files. Maybe we can import those functions and use them directly. What do you think? I'm open to any other ideas too! |
:return: Str or None | ||
""" | ||
node_unique_id = resource_dict['unique_id'] | ||
patch_path = _read_manifest(ctx.config['target_path'])['nodes'][node_unique_id]['patch_path'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@robastel this section of the manifest is the result of dbt having discovered the matching yaml for that node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is created when the dbt compile takes place from memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, great, I forgot you already did that!
Logistically, how would you like to go about working on our hybrid approach?
I'm happy to take an attempt at combining our two approaches. Also happy to review if you feel like you want to take a shot at it. I'm going to sign off for today, but I'll keep in touch!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have finished work for the week now. Happy to take a stab at this Monday if you haven’t already done it. 🙂
Hi @robastel I have done some work to bring the two approaches together. Testing locally with the sqllite db seemed to work fine. If you create a random set of model resource definitions in different files with strange names it migrates them properly. I am off work the rest of this week but happy to discuss further if required when I return. (one error I encountered was a cyclic dependency error when doing a dbt run for the orders object) That doesn't seem related to this change though it seems like the definition of that view is referring to itself. I put the changes against your branch so it was easier to review. |
It is common to have definitions of resources in yaml files that don't match the opinionated nature of this tool. It will now handle migration. Goes somewhat to #11