tasks: Lots of cleanup #590

martinpitt · 2024-03-08T05:17:05Z

Rendered README: https://github.com/cockpit-project/cockpituous/blob/cleanups/tasks/README.md

Tests blocked on Convert .format() to f-strings bots#6035
Blocked on finding a proper solution for #6045 , i.e. getting rid of the old machinery
Rebase after Properly fix podman.socket permissions in tasks containers #598, it's going to conflict (textually, not conceptually)
Refresh the container . ~~Blocked on Convert .format() to f-strings bots#6035 and test: Drop rhel-8* from supported OSes, fix for ruff 0.3.1 (fixes failing "tox" test) cockpit#20149~~
Deploy this on OpenShift and validate
Deploy this to rhos/e2e and validate

allisonkarlitskaya

commit message: consistentcy

commit message: we're also running tests-scan and friends directly in the main container (in addition to statistics)

allisonkarlitskaya · 2024-03-12T19:10:46Z

ansible/roles/tasks-systemd/tasks/main.yml

          # qcow overlays on tmpfs
          '--tmpfs=/tmp:size=14g',
          '--env=TEST_OVERLAY_DIR=/tmp',
+
+          # image cache
+          '--env=COCKPIT_IMAGES_DATA_DIR=/cache/images',


I feel like I'm missing the corresponding change in run-local.sh

It's right here, in this very commit.

Ah, I didn't understand how this works differently (ie: args given to outside container and then forwarded inside, with job-runner.toml created by shellscript).

I feel like it would be easier to understand (in run-local.sh) if the never-changing parts of the run-args are listed directly in the toml file and we only use the variable substitution for the parts we change when we're mocking.

Oh, of course -- this also should go into job-runner, not just the "queue run" tasks container, so that it reflects what's going on in production. Also agreed about the "never-changing parts", I cleaned this up.

ansible/roles/tasks-systemd/tasks/main.yml

tasks/cockpit-tasks

tasks/install-service

martinpitt · 2024-03-13T07:51:55Z

OK, this works now. I adjusted the commit messages, too. However, I'd like to queue it after #598, as it's gonna conflict (a little bit), and is less important.

martinpitt · 2024-03-13T08:51:10Z

@allisonkarlitskaya This is the "first stage" review -- it may likely get some minor fixups when I deploy it, I just wanted us to agree before I change anything in prod

allisonkarlitskaya

I think these are all nice cleanups — but let's see how the rollout goes. Please see the comments about how job-runner.toml in run-local.sh could be easier to understand...

One thing I'd wish for is less user-setup stuff in the containerfile.

allisonkarlitskaya · 2024-03-13T09:27:26Z

ansible/roles/tasks-systemd/tasks/main.yml

          # qcow overlays on tmpfs
          '--tmpfs=/tmp:size=14g',
          '--env=TEST_OVERLAY_DIR=/tmp',
+
+          # image cache
+          '--env=COCKPIT_IMAGES_DATA_DIR=/cache/images',


Ah, I didn't understand how this works differently (ie: args given to outside container and then forwarded inside, with job-runner.toml created by shellscript).

I feel like it would be easier to understand (in run-local.sh) if the never-changing parts of the run-args are listed directly in the toml file and we only use the variable substitution for the parts we change when we're mocking.

Replace them with env variables in `job-config.toml`/OpenShift. `$TEST_OVERLAY_DIR` was already set. Also clean up the job-runner.toml construction in run-local.sh: Move the constant parts of `container.run-args` into the template, and only use `$run_args` for test specific adjustments.

With job-runner, all qemu jobs moved to separate containers. So we don't need the libvirt related cleanup any more. Note: We still need to retain the image-prune call. issue-scan/image-refresh only prunes S3, not the local cache. We may want to teach it to at some point.

These now just do `{tests,issue}-scan`, `run-queue`, `job-runner`, and test statistics. None of these uses a lot of resources, the expensive ones (tests and image refreshes) moved to job-runner.toml. So drop all the memory/shm configuration and obsolete env vars. Also drop the custom network. If inter-container socket leakage is still an issue, it'll need to be done to the job containers, but hopefully that was fixed long ago.🤞 Drop the remaining `Environment=` indirection. The env vars can just be expanded in-place when writing the service file. This was already done for many of them. Still keep importing /dev/kvm into the container. It doesn't strictly need this, but current bots `run-queue` uses the presence of this as an indication whether or not it can consume tasks from the public/rhel queues. This needs to be fixed first (which needs some thought). So dropping that will happen later.

This is a deployment script on the host, it has no place inside the container.

This separates files which affect the cockpit/tasks container image (and hence need an image refresh) from the files for deployment and tests. This simplifies and robustifies the "did the container change?" check in the tests workflow.

This is hopelessly outdated and useless. Production does not use that, and local developers just use toolbox or podman-run -- they wouldn't have any of these secrets directories anyway.

Adjust it to current reality of secrets, job-runner, and CentOS CI resources. Drop the remaining `$` prefixes from commands for consistency and working copy&paste. Move up the "using with toolbox" section, as it belongs to the deployment part. The bottom is documentation how our system works.

martinpitt · 2024-03-13T11:03:01Z

Updated for two issues that I overlooked. The /dev/kvm was a funny one.. I deployed this now, and sent cockpit-project/bots#6079 to confirm both an image refresh and a test run, and it's happy now.

Update: I sent a queue-run fix to cockpit-project/bots#6080 . If that lands soon, I'm happy to revert the revert 😁

Update 2: Ah, this has to land first. Me refreshing the tasks container now broke bots tests, as they use the new container with the old run-local.sh code. I'll drop /dev/kvm as a follow-up.

With job-runner, the bits that actually need /dev/kvm don't run in the "queue monitor" task container any more, but in separate containers. We want to drop importing /dev/kvm into the monitor container [1], so extend the check to also accept the presence of a job-runner configuration. With that we can drop the check for `COCKPIT_TESTMAP_INJECT` -- that was just a hack, and the cockpituous integration tests define `JOB_RUNNER_CONFIG` already. [1] cockpit-project/cockpituous#590

allisonkarlitskaya

Thanks for the cleanup.

Let's get this in.

With job-runner, the bits that actually need /dev/kvm don't run in the "queue monitor" task container any more, but in separate containers. We want to drop importing /dev/kvm into the monitor container [1], so extend the check to also accept the presence of a job-runner configuration. With that we can drop the check for `COCKPIT_TESTMAP_INJECT` -- that was just a hack, and the cockpituous integration tests define `JOB_RUNNER_CONFIG` already. [1] cockpit-project/cockpituous#590

martinpitt added the blocked label Mar 8, 2024

martinpitt force-pushed the cleanups branch 4 times, most recently from e8c5abc to d21fb87 Compare March 8, 2024 07:40

martinpitt requested review from allisonkarlitskaya and removed request for allisonkarlitskaya March 8, 2024 07:40

martinpitt self-assigned this Mar 12, 2024

martinpitt force-pushed the cleanups branch from d21fb87 to 6cdcb02 Compare March 12, 2024 17:12

allisonkarlitskaya requested changes Mar 12, 2024

View reviewed changes

martinpitt force-pushed the cleanups branch 2 times, most recently from 859e8a0 to f41df47 Compare March 13, 2024 07:26

martinpitt removed the blocked label Mar 13, 2024

martinpitt force-pushed the cleanups branch from f41df47 to 8cb314b Compare March 13, 2024 08:47

martinpitt marked this pull request as ready for review March 13, 2024 08:47

martinpitt requested a review from allisonkarlitskaya March 13, 2024 08:47

allisonkarlitskaya reviewed Mar 13, 2024

View reviewed changes

martinpitt added 2 commits March 13, 2024 11:06

martinpitt force-pushed the cleanups branch from 8cb314b to 942164f Compare March 13, 2024 10:08

martinpitt temporarily deployed to quay.io March 13, 2024 10:15 — with GitHub Actions Inactive

martinpitt added 5 commits March 13, 2024 11:59

tasks: Drop "install-services" from container

bf2a0f8

This is a deployment script on the host, it has no place inside the container.

Makefile: Drop tasks-shell

d0b2a48

This is hopelessly outdated and useless. Production does not use that, and local developers just use toolbox or podman-run -- they wouldn't have any of these secrets directories anyway.

martinpitt force-pushed the cleanups branch from 942164f to b41ad9f Compare March 13, 2024 11:01

martinpitt requested a review from allisonkarlitskaya March 13, 2024 11:03

martinpitt mentioned this pull request Mar 13, 2024

run-queue: Extend queue capability check to job-runner cockpit-project/bots#6080

Merged

allisonkarlitskaya approved these changes Mar 13, 2024

View reviewed changes

allisonkarlitskaya merged commit 9ca717a into main Mar 13, 2024
3 checks passed

allisonkarlitskaya deleted the cleanups branch March 13, 2024 12:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

tasks: Lots of cleanup #590

tasks: Lots of cleanup #590

martinpitt commented Mar 8, 2024 •

edited

Loading

allisonkarlitskaya left a comment

allisonkarlitskaya Mar 12, 2024

martinpitt Mar 13, 2024 •

edited

Loading

allisonkarlitskaya Mar 13, 2024

martinpitt Mar 13, 2024

martinpitt commented Mar 13, 2024

martinpitt commented Mar 13, 2024

allisonkarlitskaya left a comment

allisonkarlitskaya Mar 13, 2024

martinpitt commented Mar 13, 2024 •

edited

Loading

allisonkarlitskaya left a comment

tasks: Lots of cleanup #590

tasks: Lots of cleanup #590

Conversation

martinpitt commented Mar 8, 2024 • edited Loading

allisonkarlitskaya left a comment

Choose a reason for hiding this comment

allisonkarlitskaya Mar 12, 2024

Choose a reason for hiding this comment

martinpitt Mar 13, 2024 • edited Loading

Choose a reason for hiding this comment

allisonkarlitskaya Mar 13, 2024

Choose a reason for hiding this comment

martinpitt Mar 13, 2024

Choose a reason for hiding this comment

martinpitt commented Mar 13, 2024

martinpitt commented Mar 13, 2024

allisonkarlitskaya left a comment

Choose a reason for hiding this comment

allisonkarlitskaya Mar 13, 2024

Choose a reason for hiding this comment

martinpitt commented Mar 13, 2024 • edited Loading

allisonkarlitskaya left a comment

Choose a reason for hiding this comment

martinpitt commented Mar 8, 2024 •

edited

Loading

martinpitt Mar 13, 2024 •

edited

Loading

martinpitt commented Mar 13, 2024 •

edited

Loading