Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tasks: Lots of cleanup #590

Merged
merged 7 commits into from
Mar 13, 2024
Merged

tasks: Lots of cleanup #590

merged 7 commits into from
Mar 13, 2024

Conversation

martinpitt
Copy link
Member

@martinpitt martinpitt commented Mar 8, 2024

Rendered README: https://github.com/cockpit-project/cockpituous/blob/cleanups/tasks/README.md

@martinpitt martinpitt force-pushed the cleanups branch 4 times, most recently from e8c5abc to d21fb87 Compare March 8, 2024 07:40
@martinpitt martinpitt requested review from allisonkarlitskaya and removed request for allisonkarlitskaya March 8, 2024 07:40
@martinpitt martinpitt self-assigned this Mar 12, 2024
Copy link
Member

@allisonkarlitskaya allisonkarlitskaya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commit message: consistentcy

commit message: we're also running tests-scan and friends directly in the main container (in addition to statistics)

# qcow overlays on tmpfs
'--tmpfs=/tmp:size=14g',
'--env=TEST_OVERLAY_DIR=/tmp',

# image cache
'--env=COCKPIT_IMAGES_DATA_DIR=/cache/images',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like I'm missing the corresponding change in run-local.sh

Copy link
Member Author

@martinpitt martinpitt Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's right here, in this very commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I didn't understand how this works differently (ie: args given to outside container and then forwarded inside, with job-runner.toml created by shellscript).

I feel like it would be easier to understand (in run-local.sh) if the never-changing parts of the run-args are listed directly in the toml file and we only use the variable substitution for the parts we change when we're mocking.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, of course -- this also should go into job-runner, not just the "queue run" tasks container, so that it reflects what's going on in production. Also agreed about the "never-changing parts", I cleaned this up.

ansible/roles/tasks-systemd/tasks/main.yml Outdated Show resolved Hide resolved
tasks/cockpit-tasks Show resolved Hide resolved
tasks/install-service Show resolved Hide resolved
@martinpitt martinpitt force-pushed the cleanups branch 2 times, most recently from 859e8a0 to f41df47 Compare March 13, 2024 07:26
@martinpitt
Copy link
Member Author

OK, this works now. I adjusted the commit messages, too. However, I'd like to queue it after #598, as it's gonna conflict (a little bit), and is less important.

@martinpitt martinpitt marked this pull request as ready for review March 13, 2024 08:47
@martinpitt
Copy link
Member Author

@allisonkarlitskaya This is the "first stage" review -- it may likely get some minor fixups when I deploy it, I just wanted us to agree before I change anything in prod

Copy link
Member

@allisonkarlitskaya allisonkarlitskaya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these are all nice cleanups — but let's see how the rollout goes. Please see the comments about how job-runner.toml in run-local.sh could be easier to understand...

One thing I'd wish for is less user-setup stuff in the containerfile.

# qcow overlays on tmpfs
'--tmpfs=/tmp:size=14g',
'--env=TEST_OVERLAY_DIR=/tmp',

# image cache
'--env=COCKPIT_IMAGES_DATA_DIR=/cache/images',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I didn't understand how this works differently (ie: args given to outside container and then forwarded inside, with job-runner.toml created by shellscript).

I feel like it would be easier to understand (in run-local.sh) if the never-changing parts of the run-args are listed directly in the toml file and we only use the variable substitution for the parts we change when we're mocking.

Replace them with env variables in `job-config.toml`/OpenShift.
`$TEST_OVERLAY_DIR` was already set.

Also clean up the job-runner.toml construction in run-local.sh: Move the
constant parts of `container.run-args` into the template, and only use
`$run_args` for test specific adjustments.
With job-runner, all qemu jobs moved to separate containers. So we don't
need the libvirt related cleanup any more.

Note: We still need to retain the image-prune call.
issue-scan/image-refresh only prunes S3, not the local cache. We may
want to teach it to at some point.
These now just do `{tests,issue}-scan`, `run-queue`, `job-runner`, and
test statistics. None of these uses a lot of resources, the expensive
ones (tests and image refreshes) moved to job-runner.toml. So drop all
the memory/shm configuration and obsolete env vars.

Also drop the custom network. If inter-container socket leakage is still an
issue, it'll need to be done to the job containers, but hopefully that was
fixed long ago.🤞

Drop the remaining `Environment=` indirection. The env vars can just be
expanded in-place when writing the service file. This was already done for many
of them.

Still keep importing /dev/kvm into the container. It doesn't strictly
need this, but current bots `run-queue` uses the presence of this as an
indication whether or not it can consume tasks from the public/rhel
queues. This needs to be fixed first (which needs some thought). So
dropping that will happen later.
This is a deployment script on the host, it has no place inside the
container.
This separates files which affect the cockpit/tasks container image (and
hence need an image refresh) from the files for deployment and tests.

This simplifies and robustifies the "did the container change?" check in
the tests workflow.
This is hopelessly outdated and useless. Production does not use that,
and local developers just use toolbox or podman-run -- they wouldn't
have any of these secrets directories anyway.
Adjust it to current reality of secrets, job-runner, and CentOS CI resources.

Drop the remaining `$` prefixes from commands for consistency and
working copy&paste.

Move up the "using with toolbox" section, as it belongs to the
deployment part. The bottom is documentation how our system works.
@martinpitt
Copy link
Member Author

martinpitt commented Mar 13, 2024

Updated for two issues that I overlooked. The /dev/kvm was a funny one.. I deployed this now, and sent cockpit-project/bots#6079 to confirm both an image refresh and a test run, and it's happy now.

Update: I sent a queue-run fix to cockpit-project/bots#6080 . If that lands soon, I'm happy to revert the revert 😁

Update 2: Ah, this has to land first. Me refreshing the tasks container now broke bots tests, as they use the new container with the old run-local.sh code. I'll drop /dev/kvm as a follow-up.

martinpitt added a commit to cockpit-project/bots that referenced this pull request Mar 13, 2024
With job-runner, the bits that actually need /dev/kvm don't run in the
"queue monitor" task container any more, but in separate containers.
We want to drop importing /dev/kvm into the monitor container [1], so
extend the check to also accept the presence of a job-runner
configuration.

With that we can drop the check for `COCKPIT_TESTMAP_INJECT` -- that was
just a hack, and the cockpituous integration tests define
`JOB_RUNNER_CONFIG` already.

[1] cockpit-project/cockpituous#590
martinpitt added a commit to cockpit-project/bots that referenced this pull request Mar 13, 2024
With job-runner, the bits that actually need /dev/kvm don't run in the
"queue monitor" task container any more, but in separate containers.
We want to drop importing /dev/kvm into the monitor container [1], so
extend the check to also accept the presence of a job-runner
configuration.

With that we can drop the check for `COCKPIT_TESTMAP_INJECT` -- that was
just a hack, and the cockpituous integration tests define
`JOB_RUNNER_CONFIG` already.

[1] cockpit-project/cockpituous#590
Copy link
Member

@allisonkarlitskaya allisonkarlitskaya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the cleanup.

Let's get this in.

@allisonkarlitskaya allisonkarlitskaya merged commit 9ca717a into main Mar 13, 2024
3 checks passed
@allisonkarlitskaya allisonkarlitskaya deleted the cleanups branch March 13, 2024 12:15
allisonkarlitskaya pushed a commit to cockpit-project/bots that referenced this pull request Mar 13, 2024
With job-runner, the bits that actually need /dev/kvm don't run in the
"queue monitor" task container any more, but in separate containers.
We want to drop importing /dev/kvm into the monitor container [1], so
extend the check to also accept the presence of a job-runner
configuration.

With that we can drop the check for `COCKPIT_TESTMAP_INJECT` -- that was
just a hack, and the cockpituous integration tests define
`JOB_RUNNER_CONFIG` already.

[1] cockpit-project/cockpituous#590
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants