Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build requests piling up on workers #250

Open
vitalybuka opened this issue Aug 12, 2024 · 9 comments · Fixed by #266
Open

Build requests piling up on workers #250

vitalybuka opened this issue Aug 12, 2024 · 9 comments · Fixed by #266
Assignees

Comments

@vitalybuka
Copy link
Contributor

https://lab.llvm.org/staging/#/builders/sanitizer-x86_64-linux-bootstrap-msan

@vitalybuka
Copy link
Contributor Author

I believe previously, maybe a few month ago, it was never more than 1-2 requests.

@vitalybuka
Copy link
Contributor Author

vitalybuka commented Aug 16, 2024

Another example
https://lab.llvm.org/buildbot/#/builders/49

image

It just started to build 3 days old patch.
I would expect it picks the most recent one and discard the rest.

@gkistanova
Copy link
Contributor

This has been discussed in discourse starting from this message - https://discourse.llvm.org/t/staging-buildmaster-no-longer-collapses-requests/80500/3. In short - a few days ago we had unusually large number of commits with changes to cmake files. For such commits we request clean builds to make sure if there is an issue it will not be hidden by incremental builders. Build requests with different sets of properties do not collapse.

If this would become a regular problem, I'll rethink of the combining logic.

I quickly checked and this is the case for the builder mentioned here as well.

@vitalybuka
Copy link
Contributor Author

Is there an option to disable this for a particular builder?
For out bots, with sanitizers, many of them 4+ Hour

https://lab.llvm.org/buildbot/#/waterfall?tags=sanitizer

Also they don't run incremental builds, but instead, as suggested, use ccache. So scheduling clean build unnecessary,

@gkistanova
Copy link
Contributor

None of them is marked as doing clean builds.

Could you prepare a patch which either explicitly sets clean=True for these builders, or changes the default in getSanitizerBuildFactory to True, please?

And I'll think of changing the collapse logic to use that information.

vitalybuka added a commit that referenced this issue Aug 23, 2024
vitalybuka added a commit that referenced this issue Aug 23, 2024
DavidSpickett added a commit that referenced this issue Sep 4, 2024
This is step 1 to addressing the rapidly expanding build queues we're
seeing (#250).

The bot is currently on silent, and the plan is:
* Mark all builds as clean (we use ccache so it shouldn't be a big
impact).
* Clear the existing requests.
* See what happens to the queue.

It's possible we will need a follow up to make collapsing work, but this
is the first step either way.
@DavidSpickett
Copy link
Contributor

And I'll think of changing the collapse logic to use that information.

We (Linaro) are working on changes to do this.

@DavidSpickett
Copy link
Contributor

DavidSpickett commented Sep 20, 2024

We've been able to implement this new rule and it appears to be working. We're going to test it more over the weekend and have changes for review next week.

DavidSpickett added a commit to DavidSpickett/llvm-zorg that referenced this issue Sep 24, 2024
This is to address llvm#250.
Where build requests were piling up on slower builders because
they were clean/don't clean/clean/ don't clean etc. and we could
not merge those requests because of that difference.

Galina suggested that if we knew that the builder was going to
clean the build first anyway, we could merge them. This commit
implements that.

We do this by first setting a "clean" attribute on the builder's
factory object, that we can find later via in the collapse callback.

So far only ClangBuilder passes this on but any builder can opt
in by copying what it does.

In the collapse function when comparing properties we delete the
"clean_obj" key from both sets of properties if we know that the
builder's factory is always clean.

With some logging (not included in this commit) we can see it now
collapses a clean/non-clean pair of requests:
```
collapseRequests
selfBuildSetProperties: {'scheduler': ('main:clang,clang-tools-extra,compiler-rt,flang,lld,llvm,mlir', 'Scheduler')}
otherBuildSetProperties: {'scheduler': ('main:clang,clang-tools-extra,compiler-rt,flang,lld,llvm,mlir', 'Scheduler'), 'clean_obj': (True, 'Change')}
Removing clean_obj from build set properties.
Build set properties were the same.
```

This has beeen tested by my colleague Omair Javaid on a test build
master. Of course we only tested it with one of Linaro's builders,
so it's possible we see other side effects in production.
asb added a commit that referenced this issue Oct 10, 2024
…R_RT_BUILD_SANITIZERS=OFF

clean=True due to <#250>

Not updating other riscv builders as they are about to be redeployed
with a different config (PR forthcoming).
vitalybuka added a commit that referenced this issue Nov 4, 2024
vitalybuka added a commit that referenced this issue Nov 4, 2024
vitalybuka added a commit that referenced this issue Nov 4, 2024
For #250.

This bots can't catch-up with `clean_obj` requests, but they do clean
build anyway.
Maybe we should do the same for all sanitizer builders, but seems they
are OK as is.
vitalybuka added a commit that referenced this issue Nov 5, 2024
vitalybuka added a commit that referenced this issue Nov 5, 2024
@vitalybuka vitalybuka reopened this Nov 19, 2024
@vitalybuka
Copy link
Contributor Author

It's still happening.

https://lab.llvm.org/buildbot/#/builders/94

image

@DavidSpickett
Copy link
Contributor

I expect that the buildmaster(s) have not received the change yet. This timing of that is up to @gkistanova .

vitalybuka added a commit that referenced this issue Nov 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants