Change tinkerbell e2e run algorithm to schedule longer/smaller tests concurrently #8627

rahulbabu95 · 2024-08-17T02:24:25Z

Description of changes:
The Tinkerbell E2E test runner sorts tests by hardware count and runs the tests that require the most hardware first; these tests typically also take the longest. Our test runners run concurrently and reserve hardware from the available pool. The concurrency is capped at 20 because we hit issues with the vCenter environment when we raise it. In the current setup, the longer tests reserve their hardware first and run for a long time, so the first few runners do not free their hardware until a couple of hours into the run. We do not know whether this is the best use of the hardware, since smaller tests such as single-node tests take only 20-30% of the runtime of the longer tests. We previously tried running all the smaller tests first, which took about as long as running the longer tests first, or even slightly longer. This change makes the algorithm prioritize tests with the lowest and the highest hardware counts at the same time by popping from both ends of the test queue. We will check experimentally whether this uses the hardware more efficiently and reduces the total test run time.
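For reviewers, here is a minimal sketch of the "pop from both ends" ordering described above. It is an illustration under stated assumptions, not the code in this PR: the `testCase` type, the `interleaveEnds` function, and the sample test names are hypothetical, and starting with the largest test (rather than the smallest) is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// testCase is a hypothetical stand-in for an E2E test and the
// number of hardware machines it needs to reserve.
type testCase struct {
	Name          string
	HardwareCount int
}

// interleaveEnds returns the order in which tests would be handed to
// runners: highest hardware count, then lowest, then next highest, and
// so on, by walking a sorted queue from both ends (two pointers).
func interleaveEnds(tests []testCase) []testCase {
	// Copy so the caller's slice is not reordered.
	sorted := make([]testCase, len(tests))
	copy(sorted, tests)

	// Sort descending by hardware count; stable so ties keep input order.
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].HardwareCount > sorted[j].HardwareCount
	})

	order := make([]testCase, 0, len(sorted))
	left, right := 0, len(sorted)-1
	for left <= right {
		// Pop the largest remaining test from the front.
		order = append(order, sorted[left])
		left++
		// Pop the smallest remaining test from the back, if any is left.
		if left <= right {
			order = append(order, sorted[right])
			right--
		}
	}
	return order
}

func main() {
	// Sample data is made up for illustration only.
	tests := []testCase{
		{Name: "single-node", HardwareCount: 1},
		{Name: "upgrade-3cp-3worker", HardwareCount: 6},
		{Name: "simple-flow", HardwareCount: 2},
		{Name: "multi-cluster", HardwareCount: 8},
		{Name: "scale-worker", HardwareCount: 4},
	}
	for _, t := range interleaveEnds(tests) {
		fmt.Printf("%s (%d)\n", t.Name, t.HardwareCount)
	}
}
```

With the sample data, the hardware counts come out in the order 8, 1, 6, 2, 4, so long and short tests start side by side instead of all the large tests holding the pool at once.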

Testing (if applicable):

Documentation added/planned (if applicable):

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

rahulbabu95 · 2024-08-17T02:41:19Z

/hold

codecov · 2024-08-17T04:01:23Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 73.53%. Comparing base (75d5c92) to head (43bfd8d).
Report is 9 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff            @@
##           main    #8627       +/-   ##
=========================================
+ Coverage      0   73.53%   +73.53%     
=========================================
  Files         0      578      +578     
  Lines         0    36557    +36557     
=========================================
+ Hits          0    26882    +26882     
- Misses        0     7955     +7955     
- Partials      0     1720     +1720


Signed-off-by: Rahul Ganesh <[email protected]>

sp1999 · 2024-08-25T08:17:36Z

/lgtm
/approve
/unhold

eks-distro-bot · 2024-08-25T08:17:39Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sp1999

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [sp1999]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

sp1999 · 2024-10-24T19:18:00Z

/cherry-pick release-0.20

eks-distro-pr-bot · 2024-10-24T19:18:38Z

@sp1999: new pull request created: #8902

In response to this:

/cherry-pick release-0.20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

eks-distro-bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Aug 17, 2024

eks-distro-bot added do-not-merge/hold size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Aug 17, 2024

rahulbabu95 changed the title ~~Change tinkerbell e2e run algorithm to use 2 pointer technique~~ Change tinkerbell e2e run algorithm to schedule longer/smaller tests concurrently Aug 19, 2024

Change tinkerbell e2e run algorithm to use 2 pointer technique

43bfd8d

Signed-off-by: Rahul Ganesh <[email protected]>

rahulbabu95 force-pushed the e2e/tink-e2e-runners branch from bbb047e to 43bfd8d Compare August 24, 2024 01:41

eks-distro-bot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 24, 2024

eks-distro-bot removed the do-not-merge/hold label Aug 25, 2024

eks-distro-bot assigned sp1999 Aug 25, 2024

eks-distro-bot added the lgtm label Aug 25, 2024

eks-distro-bot added the approved label Aug 25, 2024

eks-distro-bot merged commit c83d043 into aws:main Aug 25, 2024
12 checks passed

eks-distro-pr-bot mentioned this pull request Oct 24, 2024

[release-0.20] Change tinkerbell e2e run algorithm to schedule longer/smaller tests concurrently #8902

Merged

sp1999 mentioned this pull request Oct 24, 2024

[release-0.20] Remove "max-instances" from e2e binary #8897

Merged
