TASK: adjust ELNBuildSync wait criteria between batches? #182

Open
yselkowitz opened this issue Mar 13, 2024 · 4 comments
Labels
bug (Something isn't working), stalled (This has been back-burnered)

Comments

@yselkowitz
Member

yselkowitz commented Mar 13, 2024

Background

perl-Compress-Raw-Lzma hardcodes (and verifies?) the version of liblzma (xz-libs) against which it was built, and therefore the RPM has a hardcoded version-specific dependency on that xz-libs version. As a result, they need to be built in the same side tag.

xz-5.6.1-1.fc41 and perl-Compress-Raw-Lzma-2.209-5.fc41 were correctly built together in f41-build-side-85449, so the latter correctly picked up an xz-libs = 5.6.1 dependency. However, bodhi apparently took too long processing the side tag: it tagged the xz build into f41 at 14:26:27 and the perl-Compress-Raw-Lzma build at 14:26:35, 8 seconds apart. It was a quiet weekend with no other builds in the interim, and with more than 5 wall-clock seconds between the two tags, the builds ended up in separate batches (eln-build-side-85453 and eln-build-side-85455), each by itself.

Furthermore, xz going first didn't help. The next batch (with perl-Compress-Raw-Lzma) started mere seconds later, while the xz build was still working its way through bodhi. xz was therefore not yet in the buildroot for the second batch, so perl-Compress-Raw-Lzma was mistakenly built against the previous xz-5.6.0. In fact, the updates for the two batches went stable in ELN almost simultaneously (fwiw, xz is probably critical path).

What does the ELN SIG need to do?

So a few things went wrong here:

  • First, it seems that 5 seconds is not sufficient to guarantee that everything from a single rawhide side tag gets into a single ELN batch.
  • Secondly, the builds from a completed batch do not actually end up in the buildroot for the next batch, because of the lag time going through bodhi before being marked stable.
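To make the timing concrete, here is a minimal sketch of quiet-window batching. It is a simplified model and not actual EBS code: the function `group_into_batches` is hypothetical, and it assumes a batch closes as soon as the gap since the previous tag event exceeds the window. The two tag times are the ones reported above, expressed as offsets from midnight; the 30-second window is just an illustrative larger value.

```python
from datetime import timedelta


def group_into_batches(events, quiet_window):
    """Group (time, build) tag events into batches.

    A new batch starts whenever the gap since the previous event is
    longer than `quiet_window`. Simplified model, not EBS code.
    """
    batches = []
    last_time = None
    for when, build in sorted(events):
        if last_time is None or when - last_time > quiet_window:
            batches.append([])  # gap exceeded the window: start a new batch
        batches[-1].append(build)
        last_time = when
    return batches


# Tag times from the incident, as offsets from midnight.
events = [
    (timedelta(hours=14, minutes=26, seconds=27), "xz-5.6.1-1.fc41"),
    (timedelta(hours=14, minutes=26, seconds=35),
     "perl-Compress-Raw-Lzma-2.209-5.fc41"),
]

split = group_into_batches(events, timedelta(seconds=5))
merged = group_into_batches(events, timedelta(seconds=30))
print(split)   # two batches of one build each
print(merged)  # one batch holding both builds
```

Under this model the 8-second gap splits the side tag with a 5-second window, while any window of 8 seconds or more would keep the two builds together.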

Therefore, we need to consider:

  1. If 5 seconds is not long enough to assure that all builds from a side-tag get into a single batch, how long is?
  2. Does EBS need to (and can it even) wait until the previous batch goes through testing and is marked stable before starting the next batch?
  3. Alternatively, should eln-updates-candidate be in the eln-build inheritance to avoid this wait? (Probably a bad idea, at least not without limiting who can build directly to eln.)

Also, to fix this particular issue in the short-term, either we'll need a dist-tag bump, or another revbump and rebuild to perl-Compress-Raw-Lzma.

@yselkowitz
Member Author

FYI this is why perl-Module-Build FTBFS: https://koji.fedoraproject.org/koji/taskinfo?taskID=114812020

@sgallagher
Member

> Therefore, we need to consider:
>
> 1. If 5 seconds is not long enough to assure that all builds from a side-tag get into a single batch, how long is?

This is a balancing act and probably requires looking into tagging statistics for Rawhide. We want it to be long enough that we capture everything but not so long that we never end up starting a batch because the timeout never hits. Maybe we switch to 30s and see how that goes?
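As a rough sketch of what "looking into tagging statistics" could mean, the snippet below measures the largest gap between consecutive tag events within each side tag and picks the smallest candidate window that covers it. Everything here is an assumption for illustration: the helper names are made up, the input format (side tag name, seconds since midnight) is invented, and `hypothetical-side-tag` is fabricated sample data. Only the f41-build-side-85449 times come from this issue.

```python
from collections import defaultdict


def max_gap_per_side_tag(events):
    """Return {side_tag: largest gap in seconds between consecutive tags}."""
    times = defaultdict(list)
    for side_tag, seconds in events:
        times[side_tag].append(seconds)
    gaps = {}
    for side_tag, stamps in times.items():
        stamps.sort()
        # A side tag with a single build has no gap to measure.
        gaps[side_tag] = max(
            (later - earlier for earlier, later in zip(stamps, stamps[1:])),
            default=0,
        )
    return gaps


def smallest_sufficient_window(gaps, candidates):
    """Pick the smallest candidate window that covers every observed gap."""
    worst = max(gaps.values(), default=0)
    for window in sorted(candidates):
        if window >= worst:
            return window
    return None  # no candidate is long enough


events = [
    ("f41-build-side-85449", 14 * 3600 + 26 * 60 + 27),  # xz
    ("f41-build-side-85449", 14 * 3600 + 26 * 60 + 35),  # perl-Compress-Raw-Lzma
    ("hypothetical-side-tag", 100),  # made-up sample data
    ("hypothetical-side-tag", 103),
    ("hypothetical-side-tag", 104),
]

gaps = max_gap_per_side_tag(events)
print(gaps)
print(smallest_sufficient_window(gaps, [5, 30]))
```

With real data, a high percentile of the gaps might be a better guide than the maximum, since one slow side tag would otherwise dictate the window for everyone.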

> 2. Does EBS need to (and can it even) wait until the previous batch goes through testing and is marked stable before starting the next batch?

There are a couple ways to do this, none of them ideal.

As one approach, we could monitor for tag events on all of the builds and only start a new batch once all of them have been tagged into eln. The other option would be for us to become Bodhi-aware: instead of directly tagging all of the builds to eln-updates-candidate, we would create a Bodhi side-tag merge event and wait until it completes.

The problem with that is that if even one package gets stuck in gating, we won't trigger another batch. We can set a maximum wait time to work around that, but I worry that this could lead to significant delays. With the Bodhi approach, we wouldn't even have any of them in the buildroot, so I think we can exclude that approach entirely. At least with the direct tagging and a timeout, we'd pick up whatever made it through gating.
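The direct-tagging-plus-timeout idea could look roughly like the sketch below. This is not EBS code: `wait_for_tagged` and its parameters are hypothetical, and `is_tagged` stands in for whatever check would actually be used (for example a Koji query or a tag-event listener). The clock and sleep functions are injectable so the logic can be exercised without real waiting.

```python
import time


def wait_for_tagged(builds, is_tagged, max_wait, poll_interval=30,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll until every build is tagged or `max_wait` seconds have passed.

    Returns (tagged, still_pending) so the caller can start the next
    batch with whatever made it through gating.
    """
    pending = set(builds)
    tagged = set()
    deadline = clock() + max_wait
    while True:
        done = {build for build in pending if is_tagged(build)}
        tagged |= done
        pending -= done
        if not pending or clock() >= deadline:
            return tagged, pending
        sleep(poll_interval)


# Simulated run: "xz" lands after 60 s, "stuck-pkg" never passes gating.
fake_now = [0]
landed_at = {"xz": 60, "stuck-pkg": None}


def fake_is_tagged(build):
    landed = landed_at[build]
    return landed is not None and fake_now[0] >= landed


def fake_sleep(seconds):
    fake_now[0] += seconds  # advance the simulated clock instead of sleeping


tagged, pending = wait_for_tagged(
    landed_at, fake_is_tagged, max_wait=300,
    clock=lambda: fake_now[0], sleep=fake_sleep,
)
print(tagged, pending)
```

In the simulated run the call gives up at the 300-second deadline and reports the stuck build as pending, which is the "pick up whatever made it through gating" behavior described above.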

> 3. Alternatively, should `eln-updates-candidate` be in the `eln-build` inheritance to avoid this wait? (Probably a bad idea, at least not without limiting who can build directly to eln.)

Definitely not in eln-build, but I wonder if we could get a separate build target created that does this. Then, instead of generating a side-tag from eln-build, we could have EBS generate it from eln-build-ebs (bikeshed here!) which does that inheritance.

It would allow people to still directly build against the eln target, but EBS could get the extra inheritance.
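A toy model of the idea, with loud caveats: this only walks parent tags and ignores Koji's inheritance priorities, filters, and "latest build wins" resolution. The assumption that `eln-build` inherits from `eln`, the shortened build names, and the function `visible_builds` are all illustrative, and `eln-build-ebs` is just the placeholder name from above.

```python
def visible_builds(tag, parents, tagged):
    """Collect builds visible from `tag` by walking its parent tags.

    `parents` maps a tag to its list of parent tags; `tagged` maps a tag
    to the builds tagged directly into it. Toy model only.
    """
    seen = set()
    builds = set()
    stack = [tag]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against visiting a tag twice
        seen.add(current)
        builds |= set(tagged.get(current, ()))
        stack.extend(parents.get(current, ()))
    return builds


parents = {
    "eln-build": ["eln"],  # assumed for illustration
    "eln-build-ebs": ["eln-build", "eln-updates-candidate"],
}
tagged = {
    "eln": ["xz-5.6.0"],                    # already stable
    "eln-updates-candidate": ["xz-5.6.1"],  # still going through bodhi
}

print(visible_builds("eln-build", parents, tagged))
print(visible_builds("eln-build-ebs", parents, tagged))
```

In this model only the EBS-specific tag sees the not-yet-stable xz build, while direct builds against `eln-build` keep seeing stable content only.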

> Also, to fix this particular issue in the short-term, either we'll need a dist-tag bump, or another revbump and rebuild to perl-Compress-Raw-Lzma.

For a single package, I'd prefer not to bump dist-tag. Can you ask the maintainer whether they mind?

@yselkowitz
Member Author

Filed https://src.fedoraproject.org/rpms/perl-Compress-Raw-Lzma/pull-request/2 for the rebuild.

yselkowitz changed the title from "TASK: adjust ELNBuildSync wait criteria between branches?" to "TASK: adjust ELNBuildSync wait criteria between batches?" on Mar 13, 2024
@yselkowitz
Member Author

This happened again; perl-Compress-Raw-Lzma-2.212-3.eln140 was built with xz 5.4.6 instead of 5.6.2, which was built in an earlier batch but didn't get into the buildroot in time.

yselkowitz added the bug (Something isn't working) label on Aug 16, 2024
yselkowitz added the stalled (This has been back-burnered) label on Jan 10, 2025