TEP-0158: pipeline fail fast #1162
Open: chengjoey wants to merge 1 commit into tektoncd:main from chengjoey:feat/tep-0157-fast-fail
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,160 @@ | ||
--- | ||
status: implementable | ||
title: Pipeline fail-fast | ||
creation-date: '2024-08-17' | ||
last-updated: '2024-08-17' | ||
authors: | ||
- '@chengjoey' | ||
--- | ||
|
||
# TEP-0158: Pipeline fail-fast | ||
|
||
<!-- toc --> | ||
- [Summary](#summary) | ||
- [Motivation](#motivation) | ||
* [Goals](#goals) | ||
* [use-cases](#use-cases) | ||
- [Proposal](#proposal) | ||
* [failFast](#failfast) | ||
* [config-defaults](#config-defaults) | ||
* [PipelineRun](#pipelinerun) | ||
* [Pipeline](#pipeline) | ||
- [Design Evaluation](#design-evaluation) | ||
* [Priority](#priority) | ||
* [Performance](#performance) | ||
- [References](#references) | ||
<!-- /toc --> | ||
|
||
|
||
## Summary | ||
|
||
This proposal adds support for stopping the execution of a Pipeline immediately when a Task in the Pipeline fails. | ||
A Pipeline may contain multiple parallel Tasks. When one of them fails, | ||
the final status of the Pipeline is Failed, but the other Tasks may keep running, | ||
which wastes resources. We therefore need a mechanism that | ||
stops Pipeline execution immediately when a Task fails. | ||
|
||
## Motivation | ||
|
||
When a Task in a Pipeline fails, the Pipeline status eventually changes to Failed, but other parallel Tasks continue to execute. | ||
When the final Pipeline status is Failed, users are likely to start a new Pipeline execution, | ||
because in many cases they want the Pipeline to end in Success. | ||
In this case, the Pipeline execution should be stopped as soon as possible. The fail-fast mechanism | ||
reduces wasted resources, and | ||
it lets users learn the result of the Pipeline sooner. | ||
|
||
### Goals | ||
|
||
- Support canceling the execution of the Pipeline immediately when a Task in the Pipeline fails. | ||
|
||
### use-cases | ||
|
||
Take the following PipelineRun as an example. When `fail-task` fails, | ||
the parallel `success1` and `success2` Tasks stop executing. | ||
The status of the PipelineRun is `Failed`, the status of `fail-task` is `Failed`, | ||
and the status of `success1` and `success2` is `RunCancelled`. | ||
|
||
```yaml | ||
apiVersion: tekton.dev/v1 | ||
kind: PipelineRun | ||
metadata: | ||
  name: pipeline-run | ||
spec: | ||
  failFast: "true" | ||
  pipelineSpec: | ||
    tasks: | ||
      - name: fail-task | ||
        taskSpec: | ||
          steps: | ||
            - name: fail-task | ||
              image: busybox | ||
              command: ["/bin/sh", "-c"] | ||
              args: | ||
                - exit 1 | ||
      - name: success1 | ||
        taskSpec: | ||
          steps: | ||
            - name: success1 | ||
              image: busybox | ||
              command: ["/bin/sh", "-c"] | ||
              args: | ||
                - sleep 360 | ||
      - name: success2 | ||
        taskSpec: | ||
          steps: | ||
            - name: success2 | ||
              image: busybox | ||
              command: ["/bin/sh", "-c"] | ||
              args: | ||
                - sleep 360 | ||
``` | ||
|
||
## Proposal | ||
|
||
### failFast | ||
|
||
The `failFast` property is of string type and supports the following values: | ||
1. `true`: When a Task in the Pipeline fails, the Pipeline execution is stopped immediately. | ||
2. `false`: This is consistent with the current behavior. When a Task in a Pipeline fails, the Pipeline status changes to Failed, but other Tasks continue to execute. | ||
|
||
### config-defaults | ||
|
||
Add a `default-fail-fast` field to the `config-defaults` ConfigMap to set the default fail-fast behavior for all Pipelines. | ||
|
||
```yaml | ||
apiVersion: v1 | ||
kind: ConfigMap | ||
metadata: | ||
  name: config-defaults | ||
  namespace: tekton-pipelines | ||
data: | ||
  # Default fail-fast attribute for all Pipelines | ||
  default-fail-fast: "true" | ||
``` | ||
|
||
### PipelineRun | ||
|
||
Add a `failFast` field to PipelineRun to set the fail-fast behavior of the PipelineRun at runtime. | ||
|
||
```yaml | ||
apiVersion: tekton.dev/v1 | ||
kind: PipelineRun | ||
metadata: | ||
  name: pipeline-run | ||
spec: | ||
  failFast: "true" | ||
  pipelineSpec: | ||
    ... | ||
``` | ||
|
||
### Pipeline | ||
|
||
Add a `failFast` field to Pipeline to set the fail-fast behavior of the Pipeline. | ||
|
||
```yaml | ||
apiVersion: tekton.dev/v1 | ||
kind: Pipeline | ||
metadata: | ||
  name: "demo.pipeline" | ||
spec: | ||
  failFast: "true" | ||
  tasks: | ||
    ... | ||
``` | ||
|
||
## Design Evaluation | ||
|
||
### Priority | ||
|
||
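The `failFast` priority is: PipelineRun > Pipeline > config-defaults. A value set on the PipelineRun overrides the Pipeline, which in turn overrides `default-fail-fast`. | ||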
|
||
### Performance | ||
|
||
When fail-fast is enabled and a Task fails, a cancel event is triggered to cancel the execution of the PipelineRun. | ||
The PipelineRun status is `Failed`, the failed Task's status is `Failed`, and the status of the other parallel Tasks is `RunCancelled`. | ||
|
||
## References | ||
|
||
* Implementation Pull Requests: | ||
* [Tekton Pipelines PR #7987 - support fail-fast for PipelineRun][pr-7987] | ||
* [Tekton Pipelines Issue #7880][issue-7880] |
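
The priority order in the Design Evaluation section (PipelineRun > Pipeline > config-defaults) can be read as a "first value that is set wins" lookup. The Python sketch below is illustrative only and is not taken from the implementation PR. The function name `resolve_fail_fast`, the use of `None` for an unset field, and the fallback to the current behavior when nothing is set are all assumptions.

```python
from typing import Optional


def resolve_fail_fast(
    pipeline_run: Optional[str],
    pipeline: Optional[str],
    config_default: Optional[str],
) -> bool:
    """Return the effective fail-fast setting.

    Each argument is the string value of `failFast` (or `default-fail-fast`)
    at that level, or None when the field is not set. The first level that
    sets a value wins: PipelineRun > Pipeline > config-defaults.
    """
    for value in (pipeline_run, pipeline, config_default):
        if value is not None:
            return value == "true"
    # Assumption: with nothing set anywhere, keep today's behavior.
    return False


if __name__ == "__main__":
    # The PipelineRun value overrides the Pipeline and the cluster default.
    print(resolve_fail_fast("false", "true", "true"))
    # The PipelineRun is unset, so the Pipeline value applies.
    print(resolve_fail_fast(None, "true", "false"))
    # Only config-defaults is set.
    print(resolve_fail_fast(None, None, "true"))
```

Treating an explicit `"false"` on the PipelineRun as an override (rather than as "unset") is what lets a single run opt out of a cluster-wide `default-fail-fast: "true"`.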
This sounds like we have fail-fast half implemented... I think the two options ideally should be:

- `true`: When something causes the pipeline run to fail, all running tasks are terminated immediately
- `false`: When something causes the pipeline run to fail, no new tasks are scheduled. Once all running tasks are finished, the pipeline is marked as failed

But I guess we could keep `false` to behave like it does today and perhaps change it as part of a separate effort.

It would be nice for this to have graceful termination of Tasks and we don't want to lose logs in case of fail fast
Graceful termination has been implemented by "feat/Cancel taskrun using entrypoint binary", and it will be promoted to default.
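
To make the difference between the two `failFast` values concrete, the Python sketch below models the Task statuses right after one Task fails, using the statuses named in the proposal (`Failed`, `RunCancelled`). It is a simplified model, not Tekton code: the function `statuses_after_failure`, the `Running` label, and the dictionary representation are assumptions made for illustration. `false` is modeled as today's behavior, where the other Tasks keep running.

```python
from typing import Dict


def statuses_after_failure(
    statuses: Dict[str, str], failed_task: str, fail_fast: bool
) -> Dict[str, str]:
    """Return Task statuses immediately after `failed_task` fails."""
    result = dict(statuses)  # copy so the caller's mapping is not modified
    result[failed_task] = "Failed"
    if fail_fast:
        # With fail-fast, every other Task that is still running is cancelled.
        for name, status in result.items():
            if name != failed_task and status == "Running":
                result[name] = "RunCancelled"
    return result


if __name__ == "__main__":
    running = {
        "fail-task": "Running",
        "success1": "Running",
        "success2": "Running",
    }
    print(statuses_after_failure(running, "fail-task", fail_fast=True))
    print(statuses_after_failure(running, "fail-task", fail_fast=False))
```

With `fail_fast=True` the two `success` Tasks end as `RunCancelled`, matching the use case in the proposal. With `fail_fast=False` they remain `Running` until they finish on their own.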