TEP-0069: Support retries for custom task in a pipeline - design.
ScrapCodes authored and tekton-robot committed Sep 21, 2021
1 parent 0fe439a commit 67acfd7
Showing 2 changed files with 44 additions and 22 deletions.
64 changes: 43 additions & 21 deletions teps/0069-support-retries-for-custom-task-in-a-pipeline.md
@@ -1,8 +1,8 @@
 ---
-status: proposed
+status: implementable
 title: Support retries for custom task in a pipeline.
 creation-date: '2021-05-31'
-last-updated: '2021-05-31'
+last-updated: '2021-07-26'
 authors:
 - '@Tomcli'
 - '@ScrapCodes'
@@ -41,7 +41,7 @@ This TEP is about, a pipeline task can be configured with a `retries` count
 for Custom tasks.

 Also, a `PipelineRun` already manages a retry for regular task
-by updating it's status. However, for custom task, a tekton owned controller
+by updating its status. However, for custom task, a tekton owned controller
 can signal a custom task controller, to retry. A custom task controller may
 optionally support it.

@@ -126,7 +126,7 @@ Proposed algorithm for performing a retry for custom task.
 to request a custom task to cancel.

 - Step 3. In addition to patching, the `pipelinerun` controller also enqueues a timer
-`EnqueueAfter(30*time.Second)` (configurable). On completion of timeout
+`time.After(30*time.Second)` (configurable). On completion of timeout
 (i.e. 30s), it checks whether `/spec/status` is still `RunRetry`; if so, it assumes
 that the custom task does not support retry.
 - a) if custom task does not support retry as above, it sets no. of `retry done`
@@ -216,23 +216,36 @@ performance requirements.

 ## Design Details

+Add an optional `Retries` field of type `int` to `RunSpec`.

-## Test Plan
+Add an optional `RetriesStatus` field of type `[]RunStatus` to `RunStatusFields`.

-<!--
-**Note:** *Not required until targeted at a release.*
+Add a config map entry (`default-short-timeout-seconds`) to `config-defaults` in
+order to make the short timeout configurable.

+```yaml
+# default-short-timeout-seconds contains the default number of
+# seconds to wait for a custom task to respond; on timeout it is assumed
+# that the custom task does not support the feature. Currently, it is used to
+# quickly time out a retry in a custom task.
+default-short-timeout-seconds: "30" # 30 seconds
+```
-Consider the following in developing a test plan for this enhancement:
-- Will there be e2e and integration tests, in addition to unit tests?
-- How will it be tested in isolation vs with other components?
+Introduce a new status `RunSpecStatusRetry RunSpecStatus = "RunRetry"` for
+`/spec/status` of a `Run`.

-No need to outline all of the test cases, just the general strategy. Anything
-that would count as tricky in the implementation and anything particularly
-challenging to test should be called out.
+The algorithm for performing a retry for a custom task is the same as in the
+[Proposal](#proposal) section.

-All code is expected to have adequate tests (eventually with coverage
-expectations).
--->
+## Test Plan
+
+Add unit tests and e2e integration tests for the following two cases.
+
+1. If the custom task does not support a retry, we wait until the configured shorter timeout
+(30 seconds by default) and exhaust all retries.
+
+2. If the custom task *does support* a retry, i.e., it clears its `spec.status`
+on each retry, verify that it performs the correct number of retries.
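
The design details and the two test cases above can be modelled in a small Go sketch. This is an illustration under stated assumptions, not the Tekton implementation: `Run`, `RunSpec`, `RunStatus` and `RunStatusFields` are trimmed-down stand-ins for the real API types, the helpers `shortTimeout`, `requestRetry`, `acknowledged` and `simulate` are hypothetical names, and recording each failed attempt in `RetriesStatus` is an assumption about how the field would be used. The short timeout is only parsed here, not actually waited on.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// RunSpecStatus and RunSpecStatusRetry follow the names given in the TEP.
type RunSpecStatus string

const RunSpecStatusRetry RunSpecStatus = "RunRetry"

// RunStatus is a hypothetical, trimmed-down status of a single attempt.
type RunStatus struct {
	Succeeded bool
}

// RunSpec carries only the fields relevant to this sketch.
type RunSpec struct {
	Retries int           // optional retry budget
	Status  RunSpecStatus // value of /spec/status; "" means nothing pending
}

// RunStatusFields holds the history of attempts that were retried.
type RunStatusFields struct {
	RetriesStatus []RunStatus
}

type Run struct {
	Spec   RunSpec
	Status RunStatusFields
}

// shortTimeout reads default-short-timeout-seconds from a config map,
// falling back to 30 seconds when the entry is missing or invalid.
func shortTimeout(cfg map[string]string) time.Duration {
	if s, err := strconv.Atoi(cfg["default-short-timeout-seconds"]); err == nil && s > 0 {
		return time.Duration(s) * time.Second
	}
	return 30 * time.Second
}

// requestRetry records the failed attempt and sets /spec/status to RunRetry.
// It returns false once the retry budget is used up.
func requestRetry(r *Run, failed RunStatus) bool {
	if len(r.Status.RetriesStatus) >= r.Spec.Retries {
		return false
	}
	r.Status.RetriesStatus = append(r.Status.RetriesStatus, failed)
	r.Spec.Status = RunSpecStatusRetry
	return true
}

// acknowledged is the check made after the short timeout: a custom task that
// supports retry has cleared /spec/status by then.
func acknowledged(r *Run) bool {
	return r.Spec.Status != RunSpecStatusRetry
}

// simulate drives a Run whose attempts always fail. clears states whether the
// custom task controller clears /spec/status on each retry request.
func simulate(retries int, clears bool) (picked, recorded int) {
	r := &Run{Spec: RunSpec{Retries: retries}}
	for requestRetry(r, RunStatus{Succeeded: false}) {
		if clears {
			r.Spec.Status = "" // custom task signals that it started retrying
		}
		if acknowledged(r) {
			picked++
		}
	}
	return picked, len(r.Status.RetriesStatus)
}

func main() {
	fmt.Println(shortTimeout(map[string]string{"default-short-timeout-seconds": "30"}))
	fmt.Println(simulate(3, false)) // case 1: custom task ignores RunRetry
	fmt.Println(simulate(3, true))  // case 2: custom task clears the status
}
```

In this model, a custom task that never clears the status still uses up the whole retry budget, which corresponds to test case 1; a custom task that clears it picks up every retry, which corresponds to test case 2.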

 ## Design Evaluation
 <!--
@@ -265,11 +278,20 @@ SIG to get the process for these resources started right away.

 ## Upgrade & Migration Strategy (optional)

-<!--
-Use this section to detail wether this feature needs an upgrade or
-migration strategy. This is especially useful when we modify a
-behavior or add a feature that may replace and deprecate a current one.
--->
+An upgrade strategy for existing custom controllers:
+
+1. The custom controller already supports a retry field.
+- It can deprecate the existing retry field and refer to `Run.spec.retries`.
+- Watch the status field, i.e. `/spec/status` of `Run`; if it is `RunRetry`,
+then start executing the retry and clear the status to let the `tektoncd`
+controller know that the custom task has started retrying.
+2. If the custom task does not already support retry and does not wish to support
+it, it can ignore the request and the `tektoncd` controller will be able to detect
+that.
+3. If the custom task does not already support retry and wishes to support it,
+it should start watching the `/spec/status` of `Run`. If it is
+`RunRetry`, it should start executing the retry and clear the status to let the
+`tektoncd` controller know that the custom task has started retrying.
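
For cases 1 and 3 above, the custom controller's side of the handshake might look like the following Go sketch. It is a simplified model rather than real controller code: `Run` is a hypothetical, flattened view of the object, and `handleRetry` and its `restart` callback are illustrative names that do not come from the TEP or from Tekton.

```go
package main

import "fmt"

// RunSpecStatus and RunSpecStatusRetry follow the names given in the TEP.
type RunSpecStatus string

const RunSpecStatusRetry RunSpecStatus = "RunRetry"

// Run is a hypothetical, flattened view of the fields a custom controller reads.
type Run struct {
	Retries int           // Run.spec.retries
	Status  RunSpecStatus // /spec/status
}

// handleRetry is what a retry-aware custom controller could do on each
// reconcile: if a retry was requested, re-execute the work and clear the
// status so the tektoncd controller can see that the retry has started.
// It reports whether a retry was started.
func handleRetry(r *Run, restart func()) bool {
	if r.Status != RunSpecStatusRetry {
		return false // nothing requested; a non-supporting controller never gets past this
	}
	restart()
	r.Status = ""
	return true
}

func main() {
	r := &Run{Retries: 2, Status: RunSpecStatusRetry}
	started := handleRetry(r, func() { fmt.Println("re-running custom task") })
	fmt.Println("retry started:", started)
}
```

A controller following case 2 simply never performs this check, so the status stays `RunRetry` until the short timeout elapses.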

 ## References (optional)

2 changes: 1 addition & 1 deletion teps/README.md
@@ -216,7 +216,7 @@ This is the complete list of Tekton teps:
 |[TEP-0063](0063-workspace-dependencies.md) | Workspace Dependencies | proposed | 2021-04-23 |
 |[TEP-0066](0066-dogfooding-tekton.md) | Dogfooding Tekton | proposed | 2021-05-16 |
 |[TEP-0067](0067-tekton-catalog-pipeline-organization.md) | Tekton Catalog Pipeline Organization | implementable | 2021-02-22 |
-|[TEP-0069](0069-support-retries-for-custom-task-in-a-pipeline.md) | Support retries for custom task in a pipeline. | proposed | 2021-05-31 |
+|[TEP-0069](0069-support-retries-for-custom-task-in-a-pipeline.md) | Support retries for custom task in a pipeline. | implementable | 2021-07-26 |
 |[TEP-0070](0070-tekton-catalog-task-platform-support.md) | Platform support in Tekton catalog | proposed | 2021-06-02 |
 |[TEP-0071](0071-custom-task-sdk.md) | Custom Task SDK | proposed | 2021-06-15 |
 |[TEP-0072](0072-results-json-serialized-records.md) | Results: JSON Serialized Records | implementable | 2021-07-26 |