diff --git a/teps/0090-looping.md b/teps/0090-looping.md new file mode 100644 index 000000000..856656a43 --- /dev/null +++ b/teps/0090-looping.md @@ -0,0 +1,393 @@ +--- +status: proposed +title: Looping +creation-date: '2021-10-13' +last-updated: '2021-10-13' +authors: +- '@jerop' +- '@pritidesai' +--- + +# TEP-0090: Looping + + +- [Summary](#summary) +- [Motivation](#motivation) + - [Goals](#goals) + - [Non-Goals](#non-goals) + - [Use Cases](#use-cases) + - [Parallel Kaniko Build](#parallel-kaniko-build) + - [Dynamic Parallel Docker Build](#dynamic-parallel-docker-build) + - [Fan Out Vault Reading](#fan-out-vault-reading) + - [Multiple Testing Strategies](#multiple-testing-strategies) + - [Requirements](#requirements) + - [Related Work](#related-work) + - [GitHub Actions](#github-actions) + - [Argo Workflows](#argo-workflows) + - [Ansible](#ansible) +- [References](#references) + + +## Summary + +Today, users cannot supply varying `Parameters` to the same `Task` or `Custom Task` - that is, fan out their `Tasks` or +`Custom Tasks`. In this TEP, we aim to provide a way to run the same `Task` or `Custom Task` with varying `Parameters` +by spinning up a `TaskRun` or `Run` for each `Parameter` in a loop. This looping construct is aimed at improving the +composability, scalability, flexibility and reusability of *Tekton Pipelines*. + +## Motivation + +Users can specify `Parameters`, such as an artifact's name, that they want to supply to [`Tasks`][tasks-docs] and +[`Custom Tasks`][custom-tasks-docs] at execution. However, they don't have a way to supply varying `Parameters` to +the same `Task` or `Custom Task`. + +Today, users would have to duplicate that `Task` or `Custom Task` in the `Pipeline` specification as many times as the +number of varying `Parameters` that they want to pass in. This creates some limitations and challenges: +- It is tedious and does not scale well because users have to add a `Task` entry to handle each additional `Parameter`. 
+- It is error-prone to duplicate the `Task` specifications, and those errors may be challenging to debug. +- It is not flexible enough to handle a dynamic set of `Parameters`, making it less reusable. + +A common scenario is that [a user needs to build multiple images][kaniko-example-1] from one repository using the +[kaniko][kaniko-task] `Task` from the *Tekton Catalog*. Let's assume it's three images. The user would have to specify +that `Pipeline` with the kaniko `Task` duplicated, as follows: + +```yaml +apiVersion: tekton.dev/v1beta1 +kind: Pipeline +metadata: + name: kaniko-pipeline +spec: + workspaces: + - name: shared-workspace + params: + - name: image-1 + description: reference of the first image to build + - name: image-2 + description: reference of the second image to build + - name: image-3 + description: reference of the third image to build + tasks: + - name: fetch-repository + taskRef: + name: git-clone + workspaces: + - name: output + workspace: shared-workspace + params: + - name: url + value: https://github.com/tektoncd/pipeline + - name: subdirectory + value: "" + - name: deleteExisting + value: "true" + - name: kaniko-1 + taskRef: + name: kaniko + runAfter: + - fetch-repository + workspaces: + - name: source + workspace: shared-workspace + params: + - name: IMAGE + value: $(params.image-1) + - name: kaniko-2 + taskRef: + name: kaniko + runAfter: + - fetch-repository + workspaces: + - name: source + workspace: shared-workspace + params: + - name: IMAGE + value: $(params.image-2) + - name: kaniko-3 + taskRef: + name: kaniko + runAfter: + - fetch-repository + workspaces: + - name: source + workspace: shared-workspace + params: + - name: IMAGE + value: $(params.image-3) +``` + +As shown in the above example, the limitations and challenges include: +- the user would have to add another `Task` entry if they need to build another image. +- the user can easily make errors while duplicating the `Task` specifications. 
+- the `Pipeline` cannot handle a dynamic set of images, making it less reusable. + +The `Parameters` used in the above example are user-defined. In some cases, the `Parameter` may be the `Result` of a +previous `Task` in the `Pipeline`. For example, a user [needs to build a dynamic set of images][kaniko-example-2] and +they share their current experience: + > "Right now I'm doing all of this by just having a statically defined single `Pipeline` with a `Task` and then + delegating to code/loops within that single `Task` to achieve the `N` things I want to do. This works, but then + I'd prefer the concept of a single Task does a single thing, rather than overloading it like this. Especially + when viewing it in the dashboard etc, things get lost" ~ [bitsofinfo][kaniko-example-2] + +We need to address these challenges and limitations to improve the composability, scalability, flexibility, +reusability and debuggability of *Tekton Pipelines*. + +**In this TEP, we aim to provide a way to run the same `Task` or `Custom Task` with varying `Parameters` by spinning up +a `TaskRun` or `Run` for each `Parameter` in a loop. This looping construct is aimed at improving the composability, +scalability, flexibility and reusability of *Tekton Pipelines*.** + +### Goals + +- Executing `Tasks` and `Custom Tasks` in a loop with varying `Parameter` values. +- Configuring whether the `TaskRuns` and `Runs` created in the loop execute sequentially or in parallel. +- Controlling the concurrency of `TaskRuns` or `Runs` created in a given loop. + +### Non-Goals + +- Terminating early when the `Tasks` or `Custom Tasks` are executed in parallel - in-progress `TaskRuns` and `Runs` have +to complete execution before termination. +- Ignoring a failure when the `Tasks` or `Custom Tasks` are executed sequentially - addressed in [TEP-0050][tep-0050]. 
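
The goals above can be illustrated with a minimal sketch of the intended semantics, written in Python rather than in any Tekton syntax, since this TEP does not yet propose one. The names `fan_out`, `build` and `max_concurrency` are hypothetical and exist only for illustration: one call is made per combination of the array parameter values, and the concurrency limit decides whether the calls run sequentially or in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def fan_out(task, params, max_concurrency=1):
    """Call `task` once per combination of the array parameter values.

    Hypothetical helper: `params` maps a parameter name to a list of values.
    max_concurrency=1 runs the calls one at a time; larger values allow
    that many calls to run at once.
    """
    names = list(params)
    # One dict of keyword arguments per combination of values.
    combos = [dict(zip(names, values)) for values in product(*params.values())]
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # Executor.map returns results in the order of the inputs.
        return list(pool.map(lambda kwargs: task(**kwargs), combos))


def build(IMAGE):
    # Stand-in for a single kaniko build; a real run would build and push.
    return f"built {IMAGE}"


results = fan_out(build, {"IMAGE": ["image-1", "image-2", "image-3"]}, max_concurrency=2)
```

With a single array parameter this yields one call per image; passing two arrays yields one call per pair of values.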
+ +### Use Cases + +#### Parallel Kaniko Build + +As a `Pipeline` author, I [need to build multiple images][kaniko-example-1] from one repository using the same `Task`. +I choose to use the [*kaniko*][kaniko-task] `Task` from the *Tekton Catalog*. Let's assume it's three images. I want to +pass in varying `Parameter` values for `IMAGE` to create three `TaskRuns`, one for each image. + +``` + clone + | + v + -------------------------------------------------- + | | | + v v v + ko-build-image-1 ko-build-image-2 ko-build-image-3 +``` + +In other circumstances, the `Parameter` values for `IMAGE` may be produced by a previous `Task` in the `Pipeline` +instead of being supplied by me. + +Read more in [user experience report #1][kaniko-example-1] and [user experience report #2][kaniko-example-2]. + +#### Dynamic Parallel Docker Build + +As a `Pipeline` author, I have several dockerfiles in my repository. + +``` +/ docker / Dockerfile + python / Dockerfile + Ubuntu / Dockerfile +... +``` + +I have a *clone* `Task` that fetches the repository to a shared `Workspace`. Then I have a *get-dir* `Task` that +produces a `Result` array with the directory names of the dockerfiles. Finally, I want to dynamically generate the +parallel *docker build* `Tasks` that each take a dockerfile and run docker build and push. + +``` + clone + | + v + get-dir + | + v + -------------------------------------------------- + | | | + v v v + docker-build-1 docker-build-2 docker-build-3 +``` + +Read more in the [user experience report][docker-example]. + +#### Fan Out Vault Reading + +As a `Pipeline` author, I have a file in my repository with several vault paths. + +```text +path1 +path2 +path3 +... +``` + +I have a *vault-read* `Task` that I need to run for every entry in the file and get the secrets in each of them. +As such, I need to fan out the *vault-read* `Task` N times, where N is the number of vault paths in my file. 
+ +``` + clone + | + v + get-vault-paths + | + v + -------------------------------------------------- + | | | + v v v + vault-read-1 vault-read-2 vault-read-3 +``` + +Read more in the [user experience report][vault-example]. + +#### Multiple Testing Strategies + +As a `Pipeline` author, I have a file configuring the test types that I want to run. + +```text +code-analysis +unit-tests +e2e-tests +... +``` + +I have a *test* `Task` that I need to run for each test type in the file - the `Task` runs tests based on a `Parameter`. +I need to run this *test* `Task` for multiple test types that are defined in my repository (fetched using the +*tests-selector* `Task`). + +``` + clone + | + v + tests-selector + | + v + -------------------------------------------------- + | | | + v v v + code-analysis unit-tests e2e-tests +``` + +### Requirements + +- Users should be able to pass in an array `Parameter` to a `Task` or `Custom Task` and generate as many `TaskRuns` or +`Runs` as the length of the array `Parameter`. +- Users should be able to pass in several array `Parameters` to a `Task` or `Custom Task` and generate as many `TaskRuns` +or `Runs` as the combinations of the array `Parameters`. +- Users should be able to configure whether the loop is executed sequentially or in parallel. +- Users should be able to control the concurrency limit (maximum `TaskRuns` or `Runs` executed at a time). + +### Related Work + +The looping construct is related to `for loops` which are available in most programming languages. In this section, we +explore related work on looping constructs in other continuous delivery systems. + +#### GitHub Actions + +GitHub Actions allows users to define a matrix of job configurations - which creates multiple jobs by substituting +variables in a single job definition. It also allows users to include or exclude combinations in the build matrix. 
+ +For example: + +```yaml +runs-on: ${{ matrix.os }} +strategy: + matrix: + os: [macos-latest, windows-latest, ubuntu-18.04] + node: [8, 10, 12, 14] + exclude: + # excludes node 8 on macOS + - os: macos-latest + node: 8 + include: + # includes node 15 on ubuntu-18.04 + - os: ubuntu-18.04 + node: 15 +``` + +GitHub Actions workflow syntax also allows users to: +- cancel in-progress jobs if one of the matrix jobs fails +- specify the maximum number of jobs to run in parallel + +Read more in the [documentation][github-actions]. + +#### Argo Workflows + +Argo Workflows allows users to iterate over: +- a list of items as static inputs +- a list of sets of items as static inputs +- parameterized list of items or list of sets of items +- dynamic list of items or lists of sets of items + +Here's an example from the [documentation][argo-workflows]: +```yaml +apiVersion: argoproj.io/v1alpha1 +kind: Workflow +metadata: + generateName: loops-param-result- +spec: + entrypoint: loop-param-result-example + templates: + - name: loop-param-result-example + steps: + - - name: generate + template: gen-number-list + # Iterate over the list of numbers generated by the generate step above + - - name: sleep + template: sleep-n-sec + arguments: + parameters: + - name: seconds + value: "{{item}}" + withParam: "{{steps.generate.outputs.result}}" + + # Generate a list of numbers in JSON format + - name: gen-number-list + script: + image: python:alpine3.6 + command: [python] + source: | + import json + import sys + json.dump([i for i in range(20, 31)], sys.stdout) + + - name: sleep-n-sec + inputs: + parameters: + - name: seconds + container: + image: alpine:latest + command: [sh, -c] + args: ["echo sleeping for {{inputs.parameters.seconds}} seconds; sleep {{inputs.parameters.seconds}}; echo done"] +``` + +Read more in the [documentation][argo-workflows]. + +#### Ansible + +Ansible allows users to execute a task multiple times using the `loop`, `with_` and `until` keywords. 
+ +For example: + +```yaml +- name: Show the environment + ansible.builtin.debug: + msg: " The environment is {{ item }} " + loop: + - staging + - qa + - production +``` + +Read more in the [documentation][ansible]. + +## References + +- [Task Loops Experimental Project][task-loops] +- Issues: + - [#2050: `Task` Looping inside `Pipelines`][issue-2050] + - [#4097: List of `Results` of a `Task`][issue-4097] + +[task-loops]: https://github.com/tektoncd/experimental/tree/main/task-loops +[issue-2050]: https://github.com/tektoncd/pipeline/issues/2050 +[issue-4097]: https://github.com/tektoncd/pipeline/issues/4097 +[tasks-docs]: https://github.com/tektoncd/pipeline/blob/main/docs/tasks.md +[custom-tasks-docs]: https://github.com/tektoncd/pipeline/blob/main/docs/pipelines.md#using-custom-tasks +[kaniko-example-1]: https://github.com/tektoncd/pipeline/issues/2050#issuecomment-625423085 +[kaniko-task]: https://github.com/tektoncd/catalog/tree/main/task/kaniko/0.5 +[kaniko-example-2]: https://github.com/tektoncd/pipeline/issues/2050#issuecomment-671959323 +[docker-example]: https://github.com/tektoncd/pipeline/issues/2050#issuecomment-814847519 +[vault-example]: https://github.com/tektoncd/pipeline/issues/2050#issuecomment-841291098 +[tep-0050]: https://github.com/tektoncd/community/blob/main/teps/0050-ignore-task-failures.md +[argo-workflows]: https://github.com/argoproj/argo-workflows/blob/7684ef4a0c5f57e8723dc8e4d3a17246f7edc2e6/examples/README.md#loops +[github-actions]: https://docs.github.com/en/actions/learn-github-actions/workflow-syntax-for-github-actions +[ansible]: https://docs.ansible.com/ansible/latest/user_guide/playbooks_loops.html#loops diff --git a/teps/README.md b/teps/README.md index b74cfc6fe..b76b7b535 100644 --- a/teps/README.md +++ b/teps/README.md @@ -223,3 +223,4 @@ This is the complete list of Tekton teps: |[TEP-0073](0073-simplify-metrics.md) | Simplify metrics | proposed | 2021-06-23 | 
|[TEP-0080](0080-support-domainscoped-parameterresult-names.md) | Support domain-scoped parameter/result names | implemented | 2021-08-19 | |[TEP-0084](0084-endtoend-provenance-collection.md) | end-to-end provenance collection | proposed | 2021-09-16 | +|[TEP-0090](0090-looping.md) | Looping | proposed | 2021-10-13 |