Automated e2e testing in a4x2 #6749

Open
1 of 7 tasks
rcgoodfellow opened this issue Oct 2, 2024 · 4 comments

@rcgoodfellow
Contributor

rcgoodfellow commented Oct 2, 2024

Omicron #4585 introduces very basic automated testing for the a4x2 topology in CI. However, this needs to expand beyond simple "does it work" tests.

Update: automated testing needed to be removed because it was overloading CI resources. We'll need to solve that problem to make forward progress here.

The following is a list of things we should cover.

Networking

  • Probe all sleds on rack startup.
  • Scrimlet cold boot.
  • Adding and removing switch port/link configuration.
  • Adding, removing and updating BFD configuration.
  • Adding, removing and updating BGP configuration.
  • Adding, removing and updating static routes.
  • Ensuring mupdate works across versions.

Tests from other categories to be added here.

@davepacheco
Collaborator

Related: https://github.com/oxidecomputer/omicron/tree/main/live-tests

I think @faithanalog has been looking at some steps toward running these in CI.

@faithanalog
Contributor

faithanalog commented Oct 3, 2024

Indeed I have, though I wasn't aware of the existing efforts in #4585. That'll be helpful.

@faithanalog faithanalog self-assigned this Oct 3, 2024
@faithanalog
Contributor

I've been working on what is essentially a Rust version of a4x2-prepare.sh, with the intent to then create what would essentially be a Rust version of a4x2-deploy.sh.

I think doing this (but saving myself some effort by referencing the existing scripts) is probably the thing to keep doing, particularly because I can make them easy to run on a local dev workstation as well, so someone trying changes can get the same results without having to round trip through buildomat while iterating. That someone being me for now; I don't feel like roundtripping through buildomat while getting stuff working :P

@faithanalog
Contributor

https://github.com/oxidecomputer/omicron/tree/artemis/a4x2-package

This is the branch I've been working on the Rust port in. It's not ready for review yet; I'm just tying it to this issue.

I have it building an a4x2 bundle and deploying it with a working control plane, all locally on my machine, so that's pretty nice. Next steps are:

  • get it running the live-tests & commtest
  • get a CI runner to do it successfully
    • But we will be turning this back off until we solve the CI resource contention problem with more workers or a nightly run option

We'll be able to improve the xtasks while running locally, even without CI running the tests. Should help out people who are interested in trying things on a4x2 but haven't found the time to sit down and learn it in depth.
