Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(controller): step group stuck on running when exit hook has illegal expression #14032

Merged
merged 1 commit into from
Dec 30, 2024

Conversation

chengjoey
Copy link
Contributor

Fixes #14031

Motivation

The error handling logic for stepGroup on exithook with illegal expressions should be consistent with Dag.

if taskNode.Fulfilled() {
if taskNode.Completed() {
hasOnExitNode, onExitNode, err := woc.runOnExitNode(ctx, dagCtx.GetTask(taskName).GetExitHook(woc.execWf.Spec.Arguments), taskNode, dagCtx.boundaryID, dagCtx.tmplCtx, "tasks."+taskName, scope)
if err != nil {
return node, err
}
if hasOnExitNode && (onExitNode == nil || !onExitNode.Fulfilled()) {
onExitCompleted = false
}
}
}
}

Modifications

@chengjoey chengjoey changed the title fix(controller): step group stuck on running when exit hook has invalid expression fix(controller): step group stuck on running when exit hook has illegal expression Dec 26, 2024
if err != nil {
return node, err
}
if hasOnExitNode && (onExitNode == nil || !onExitNode.Fulfilled()) {
// The onExit node is either not complete or has errored out, return.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be best to update the explanation in the comments as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@jswxstw
Copy link
Member

jswxstw commented Dec 26, 2024

I ran the example from #14031 and found that node StepGroup[1] still remains Running. This issue seems to be quite common when workflow is marked as Error directly.

@chengjoey chengjoey force-pushed the fix/exit-hook branch 3 times, most recently from 7851bd2 to a94ea8f Compare December 26, 2024 15:16
Copy link
Member

@jswxstw jswxstw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgNode, err := woc.executeStepGroup(ctx, stepGroup.Steps, sgNodeName, &stepsCtx)
if err != nil {
return nil, err
}

I think there is also need to mark sgNode as Error as I commented before: #14032 (comment).

if err != nil {
    return woc.markNodeError(sgNodeName, err), nil
}

@chengjoey
Copy link
Contributor Author

I think there is also need to mark sgNode as Error as I commented before: #14032 (comment).

thanks @jswxstw , done

@tczhao tczhao enabled auto-merge (squash) December 30, 2024 09:51
@tczhao tczhao merged commit 1d36956 into argoproj:main Dec 30, 2024
31 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Workflow got stuck due to condition expression evaluation error
3 participants