-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change ordering of Matrixed TaskRuns Produced to Not Be Deterministic #7056
Comments
Original discussion where this was introduced: #6312 (comment) |
Yes absolutely, ordering over @lbernick Please explain the use cases for your concerns. Thanks! |
I'd like to have the possibility of changing our matrix implementation in the future. If users expect that taskruns are fanned out in a deterministic order, they will likely come to rely on this behavior, making it much harder for us to change it if we want. We can always go from randomized ordering to deterministic ordering later, but we cannot go from deterministic order to randomized order if users depend on deterministic order. Therefore, I believe we should start with randomized order, and only introduce deterministic order if there's a strong use case/need for it. In terms of use cases: There's no use case I'm aware of for pipeline authors needing randomized ordering. The use case is for Tekton developers wanting to change implementation of fanned out tasks. I don't know what we might need to change, but here's a straw man example: Consider a matrixed taskrun that produces an array result, which is then fanned out into another matrixed taskrun. We will probably initially implement this to wait until all matrixed taskruns are complete, and then fan out the second task in the same order as the first task. But what if we want to improve performance, and start creating matrixed taskruns for the second task as soon as the string results from some of the previous matrixed taskruns are ready? If we guarantee deterministic ordering, we might not be able to make this change. While deterministic ordering makes testing easier, I'm not aware of any use cases for it. If we do want to support deterministic ordering as a feature, I think we should have clear use cases in mind, and make sure deterministic ordering is the best way to address them. I believe this is a beta blocker, since alpha features "may be dropped or backwards incompatible changes made at any time", while for beta features, "these features will not be dropped, though the details may change." If deterministic ordering is considered a feature, rather than an implementation detail, promoting matrix to beta could signify that we will not remove deterministic ordering. Therefore, I think we should make a decision on this before beta promotion. |
What could be the motivation for pipeline developers wanting to change implementation of fanned out tasks?
The use case you have included is not clear to me. Let's say we have two tasks in a pipeline.
Now, the use case you are describing includes scheduling an instance of
what kind testing are you referring to? Let me take some time to understand your concerns with Let's continue with the same example. - name: kaniko-build
taskRef:
name: kaniko
matrix:
params:
- name: IMAGE
value: [ "image-1", "image-2", "image-3" ] Now, the By following the standard programming language constructs, Now, making it a little bit more complex, let's say: - name: kaniko-build
taskRef:
name: kaniko
matrix:
params:
- name: IMAGE
value: [ "image-1", "image-2", "image-3" ]
- name: OS
value: ["linux", "windows"]
Ranging over an array Now, the first instance of this
Are you suggesting to randomize ranging over an array using |
It's hard for me to predict what changes we want to make in the future, but I think it's important to leave ourselves the option of doing so. We can always re-introduce sorting later, but it's hard to take away. The example I gave was just an attempt to brainstorm a reason we might want to change the implementation. I don't think we need to know that we will change the implementation; I think we just shouldn't lock ourselves into this implementation without a good reason.
Just tekton pipelines unit tests. However, I don't think easier testing (by itself) is a good reason to have deterministic ordering.
I agree that typical expectations are that array ordering is preserved. However, I don't think there are typical expectations when it comes to iteration over 2D (or 3D etc) arrays (e.g. should the next combo be "image-2" and "linux" or "image-1" and "windows"?), and it's not clear what the ordering should be if there's an
No, I'm just suggesting we remove our sorting logic and document that matrix ordering is not guaranteed. |
We are not reinventing here, this is a standard programming paradigm. The entire language specifications for
This sorting logic has no impact or influence on the matrix ordering. This sorting logic was introduced to order each combination object e.g. a combination with multiple parameters returns the same order of params such that comparing combinations does not cause any failures due to mismatch order. Let's take an existing example with the latest pipelines release: Here is the list of taskRuns created:
Now, let's see what were the combination used for each instance:
Note the combinations, first three taskRuns 0 through 2 has browser set to The next three taskRuns 3 through 5 has browser set to Now, I deleted the sorting logic and ran the same example:
The order is still the same. |
If it helps, Matrix in GitHub Actions inspired Matrix in Tekton. The ordering is also deterministic in GitHub Actions, and it is ordered in the same way as Tekton. Users of Tekton who are familiar with GitHub Actions would likely expect the feature to behave in similar ways. The blurb below the line below is lifted from this documentation: https://docs.github.com/en/actions/using-jobs/using-a-matrix-for-your-jobs#using-a-matrix-strategy jobs:
example_matrix:
strategy:
matrix:
version: [10, 12, 14]
os: [ubuntu-latest, windows-latest] The order of the variables in the matrix determines the order in which the jobs are created. The first variable you define will be the first job that is created in your workflow run. For example, the above matrix will create the jobs in the following order:
|
thanks @jerop yes it is an industry standard and common across many CI/CD platforms. |
Thank you for testing this out; I didn't realize this sorting logic was not what was causing the deterministic ordering. @jerop thanks for the info about GHA; do you happen to have any insight on why they chose to support this? (I understand this might not be publicly available info.) I am not opposed to deterministic ordering, but I want to understand the use cases that exist for it before users come to rely on it. |
Thanks a bunch for clarifying this. May be I misread this issue suggesting to change the deterministic order. Now that since you not opposed to it, can we please move this out as a blocker from #5265 (comment). Thanks!
Please clarify what exactly you would like to understand, I am happy to help! |
I still believe this is important to decide before beta, for the reasons I listed in this comment. Would a clearer title for this issue be "decide whether to preserve deterministic ordering of matrixed taskruns"? I see a few options:
I'd like to understand an example of a CI/CD workflow that would be much more difficult to implement if matrixed ordering was not deterministic. |
I cannot speak to why GitHub Actions chose to support it, but I can see the predictability of the matrix being useful for users Users can rely on the deterministic ordering to automate progressive deployments to different environments by using a matrix and limiting concurrency to one: https://cbrgm.net/post/2022-01-12-github-actions-environments/ ...
jobs:
deploy:
runs-on: ubuntu-latest
strategy:
matrix:
stage: ['development', 'integration', 'production']
fail-fast: true
max-parallel: 1
steps:
- name: do some stuff
uses: ...
with: ... In TEP-0090, we indicated interest in adding similar concurrency controls in Matrix in Tekton
While looking into the deterministic ordering in GHA matrix, I ran into reported issues with comments from frustrated users when the ordering was broken: actions/runner#703, actions/runner#483. The examples mentioned in the issues are depending on matrix and concurrency to automate progressive processes. # running two tests at a time, then progress to more expensive tests only when the previous were successful
...
jobs:
tests:
runs-on: ubuntu-latest
strategy:
matrix:
test: ['simple', 'groovy', 'filesystem', 'webflow', 'webflowconfig', 'attributes']
fail-fast: true
max-parallel: 2
steps:
- name: run the tests
uses: ...
with: ... # changing terraform workspace one at a time, starting with dev then staging
...
jobs:
deploy:
runs-on: ubuntu-latest
strategy:
matrix:
workspace: ['dev-us-east-1', 'stg-us-west-2']
fail-fast: true
max-parallel: 1
steps:
- name: change terraform workspace
uses: ...
with: ... |
Thanks very much for this info @jerop, this is exactly what I was looking for, super helpful! I agree this is an important use case to address, although I am not sure that deterministic ordering is the best way to address it. However, I agree that users who are used to GHA syntax may be surprised to find that ours differs. I am curious to hear what other maintainers think. Since it sounds like IBM has a strong need for this feature, and I don't have a strong stake in it, I am not opposed to moving forward. |
Thanks a bunch @EmmaMunley for your continuous support on making this feature possible! In addition to IBM, every small and big organization which has deployed Tekton for their CI/CD needs will be benefited with this feature. @lbernick please feel free to reopen this whenever you find some stake in it. |
This issue is tracking the discussion regarding the fact that TaskRuns are currently fanned out from a Matrixed PipelineTask in a deterministic order with a suggestion of wanting to change it to be non-deterministic so that users won't rely on the ordering. This was originally brought up when discussing TEP-140: Produce Results from Matrixed PipelineTasks: tektoncd/community#1047 (comment)
This issue should be resolved before promoting Matrix to Beta #5265
The text was updated successfully, but these errors were encountered: