-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support custom hooks between full_task_set
and full_task_graph
phases
#424
Comments
So basically you want a custom hook that takes the parameters and the full set of generated tasks as input? I don't love making the generation process even more complicated, but I guess we could do that.. A possible (very hacky) way to accomplish this today might be to create a special kind that depends on every other kind in its (Note your link is pointing to the end of the "kind graph" rather than the task graph. I think you meant to link around here: taskgraph/src/taskgraph/generator.py Line 326 in 85c360e |
full_task_set
and full_task_graph
phases
Just to restate what I think I'm reading: It sounds like the idea here is that we'd generate task definitions in one stage (but explicitly not any dependencies, fetches, or anything else that requires information from other tasks), and then a subsequent stage would deal with all of the connections. That's obviously a valid way to generate a task graph, but it's almost an entirely different thing than what Taskgraph does, and I'm not even sure it plays well with how transforms work at the moment. For example, it would necessarily have to ignore I don't want to say that it can't or even shouldn't be done with Taskgraph, but if/when we support this formally I think we need to spend some time thinking about how it will work and interact with other parts of Taskgraph. |
I don't think it would necessarily need to ignore dependencies / kind-dependencies.. I think this could be as simple as moving this for loop: taskgraph/src/taskgraph/generator.py Line 329 in 85c360e
Into a standalone function. Then we can cover our eyes and say "I can't see you!" as
Actually my hacky special kind suggestion from my previous comment would play nicely with that.. just omit any kinds you don't want to post process from the special kind's |
That said if there are existing transforms that accomplish what we need already, my first choice would be to make those even better. Hide away some of the complexity, make them more generic, simplify the task interface etc |
I like the idea of giving this approach a try using a transform that can access all the tasks or a simple monkey-patching of a function first. I actually meant this place in the code, where it takes kind dependencies and builds taskgraph/src/taskgraph/generator.py Line 294 in 85c360e
We could modify them before that. The idea is some dependencies might not be specified in the yaml at all and populated only by the hook. I'm not sure if it's possible not to connect kinds and connect only tasks or connect both on |
This is unlikely to be a good place. At this point the only think we have is a graph of the relationships between the kinds. At this point we've read the raw yaml from the kinds, but we haven't run the transforms. That happens in load_tasks, and as @ahal mentioned earlier, after this is probably the best place to do any fiddling. I can help you get set up to experiment with this, if you like. |
This is true, but I guess my point is more that you could easily end up with situations where you have tasks from kind B which uses fetches from kind A, and then kind C uses this new hook, and removes or alters one or more of those fetches, leading to confusion if you're only looking at kinds A & B. I guess the general point is taskgraph is built in a way where kinds generally deal with all of their own data, and you can usually understand how a task is made by looking at a kind and its transforms. This change would mean you can't make any assumptions like that any more. (To be fair, optimizations and morphs also do this sort of graph level modification, but they are used in very specific ways, and explicitly are not supposed to alter the meaning of the graph.) To be perfectly clear: I'm not at all against the idea of a system where task definitions and task graph generation are segregated from one another - I'm just concerned that making it a very official part of taskgraph will lead to loads of confusion down the road. A generic "post task loading hook" (which I think is what you're suggesting @ahal?) is probably okay enough. |
We talked about this a bit this week, and indeed, this is one of the ideas being suggested. I have no issues with this fairly generic approach. |
Would it be possible to introduce a hook similar to transforms but for the full graph to run it before adding the edges? This would allow connecting the graph dynamically based on parameters. This is something that can be done in individual transforms by modifying dependencies but having it in one place would simplify the code significantly. I'm thinking it could be called here. It would also allow not specifying dependencies for some tasks in yaml, so they become optional unless included in the graph.
We have a pretty complex use case for graph modifications based on parameters, here is the full context.
The text was updated successfully, but these errors were encountered: