add hhfab switch reinstall and power subcommands #256

Merged
merged 13 commits into from
Jan 15, 2025

@pau-hedgehog pau-hedgehog commented Dec 9, 2024

Adds switch power|reinstall suboptions to vlab in hhfab.

  • vlab switch power enables power management via external PDU servers. For example:
hhfab vlab switch --verbose power --yes --all ON
hhfab vlab switch --verbose power --yes --name leaf-01 CYCLE

A new annotation structure (power:) is supported:

core@control-1 ~ $ kubectl get switch leaf-01 -o=jsonpath='{.metadata.annotations}' | jq
{
  "power.hhfab.githedgehog.com/psu1": "http://192.168.101.10/outlet/1"
}

The PDU credentials are configured in a YAML file named .pdu.yaml, located in the same folder as fab.yaml, with the following structure:

$ cat .pdu.yaml 
pdus:
  192.168.101.10:
    user: user10
    password: password10
  192.168.101.11:
    user: user11
    password: password11
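
To illustrate how the annotation and the credentials file fit together, here is a minimal sketch that matches each power annotation URL against the `pdus` map by host. It is not the hhfab implementation: the helper name `resolve_outlets` is hypothetical, and a plain dict stands in for the parsed .pdu.yaml so the example needs no YAML library.

```python
from urllib.parse import urlparse

# Annotation prefix as shown in the kubectl output above.
POWER_PREFIX = "power.hhfab.githedgehog.com/"

# Stand-in for the parsed contents of .pdu.yaml (same structure as above).
PDU_CONFIG = {
    "pdus": {
        "192.168.101.10": {"user": "user10", "password": "password10"},
        "192.168.101.11": {"user": "user11", "password": "password11"},
    }
}

EXAMPLE_ANNOTATIONS = {
    "power.hhfab.githedgehog.com/psu1": "http://192.168.101.10/outlet/1",
}


def resolve_outlets(annotations, config):
    """Hypothetical helper: pair each power annotation with PDU credentials.

    Returns a list of (psu_name, outlet_url, user, password) tuples.
    Raises KeyError if an outlet URL points at a PDU with no credentials.
    """
    resolved = []
    for key, url in sorted(annotations.items()):
        if not key.startswith(POWER_PREFIX):
            continue  # ignore annotations unrelated to power management
        host = urlparse(url).hostname
        creds = config.get("pdus", {}).get(host)
        if creds is None:
            raise KeyError(f"no credentials configured for PDU {host}")
        psu_name = key[len(POWER_PREFIX):]
        resolved.append((psu_name, url, creds["user"], creds["password"]))
    return resolved


if __name__ == "__main__":
    for entry in resolve_outlets(EXAMPLE_ANNOTATIONS, PDU_CONFIG):
        print(entry)
```

Keying the credentials by PDU address, as the YAML above does, means a switch with several PSUs on the same PDU only needs one credentials entry.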
  • vlab switch reinstall enables automated NOS reinstall of switches. For example:
hhfab-power vlab switch -v reinstall --yes --all --mode hard-reset
hhfab-power vlab switch -v reinstall --yes --name leaf-01

The default mode (reboot) logs into the switch console and reboots the switch into ONIE NOS install mode; the switch then reboots once more and is left ready to be managed by the control node.

The hard-reset mode relies on the power CYCLE operation to cut power briefly for the given switch(es).
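
The difference between the two modes described here can be sketched as a small lookup that validates the requested mode and lists the action taken for each switch. This is only an illustration under stated assumptions: `plan_reinstall` and the step descriptions are hypothetical and do not mirror the actual hhfab code.

```python
# Modes described above; the text of each step is a paraphrase, not hhfab output.
REINSTALL_MODES = {
    "reboot": "log into the switch console and reboot into ONIE NOS install mode",
    "hard-reset": "power CYCLE the switch through its PDU outlet(s)",
}
DEFAULT_MODE = "reboot"


def plan_reinstall(switches, mode=DEFAULT_MODE):
    """Hypothetical helper: return one planned action string per switch."""
    if mode not in REINSTALL_MODES:
        valid = ", ".join(sorted(REINSTALL_MODES))
        raise ValueError(f"unknown reinstall mode {mode!r}; expected one of: {valid}")
    return [f"{name}: {REINSTALL_MODES[mode]}" for name in switches]


if __name__ == "__main__":
    for step in plan_reinstall(["leaf-01"], mode="hard-reset"):
        print(step)
```

Rejecting unknown modes up front avoids power cycling or rebooting some switches before discovering that the request was malformed.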

Depends On githedgehog/fabric#689

@pau-hedgehog pau-hedgehog self-assigned this Dec 10, 2024
@pau-hedgehog pau-hedgehog force-pushed the power_helper branch 8 times, most recently from 6c20b0e to 417d628 Compare December 12, 2024 20:17
@pau-hedgehog pau-hedgehog marked this pull request as ready for review December 12, 2024 21:21
pkg/pdu-mgt/go.mod Outdated
pkg/pdu-mgt/utils/utils.go Outdated
pkg/pdu-mgt/utils/utils.go Outdated
pkg/pdu-mgt/utils/utils.go Outdated
cmd/hhfab/main.go Outdated
cmd/hhfab/main.go Outdated
cmd/hhfab/main.go Outdated
pkg/pdu-mgt/go.sum Outdated
@pau-hedgehog pau-hedgehog force-pushed the power_helper branch 2 times, most recently from 09bf259 to cf15cf1 Compare December 18, 2024 16:11
@pau-hedgehog pau-hedgehog marked this pull request as draft December 19, 2024 15:03
@pau-hedgehog pau-hedgehog force-pushed the power_helper branch 2 times, most recently from 224d6be to ed33f61 Compare December 22, 2024 09:49
@pau-hedgehog pau-hedgehog marked this pull request as ready for review December 22, 2024 10:20
@pau-hedgehog pau-hedgehog force-pushed the power_helper branch 2 times, most recently from 1c4a0c2 to 5b326db Compare December 23, 2024 19:06
@pau-hedgehog pau-hedgehog force-pushed the power_helper branch 2 times, most recently from c1618fd to 7d00253 Compare December 31, 2024 14:16
@pau-hedgehog pau-hedgehog marked this pull request as draft January 2, 2025 21:59
@pau-hedgehog pau-hedgehog marked this pull request as ready for review January 2, 2025 22:30
@pau-hedgehog pau-hedgehog requested a review from Frostman January 4, 2025 00:48
@pau-hedgehog pau-hedgehog marked this pull request as draft January 13, 2025 23:19

pau-hedgehog commented Jan 13, 2025

After discussing, the default behavior needs to change and reinstall must leave the switches in the ONIE install loop.

This way one can trigger the reinstall and then run vlab up. Once the control node is ready, the switches will finally be reinstalled.

The initial behavior will be preserved with a flag --wait-ready that will be relayed to the GRUB script.
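
The proposed behavior can be sketched as follows: trigger the reinstall on every switch and return immediately by default, or poll until each switch reports ready when waiting is requested. This is a rough sketch, not the hhfab implementation; the `reinstall` function, the `trigger` and `is_ready` callbacks, and the timeout values are all hypothetical.

```python
import time


def reinstall(switches, trigger, is_ready, wait_ready=False, timeout=5.0, poll=0.1):
    """Hypothetical flow: trigger reinstall, optionally wait for readiness.

    Returns the switches confirmed ready (empty when wait_ready is False,
    since the switches are simply left in the ONIE install loop).
    """
    for name in switches:
        trigger(name)
    if not wait_ready:
        return []

    ready = []
    pending = list(switches)
    deadline = time.monotonic() + timeout
    while pending:
        still_pending = []
        for name in pending:
            if is_ready(name):
                ready.append(name)
            else:
                still_pending.append(name)
        pending = still_pending
        if not pending:
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(f"switches not ready in time: {pending}")
        time.sleep(poll)
    return ready


if __name__ == "__main__":
    # Fake callbacks standing in for the real console or PDU interactions.
    print(reinstall(["leaf-01"], trigger=print, is_ready=lambda name: True,
                    wait_ready=True))
```

Passing the trigger and readiness checks as callbacks keeps the waiting logic independent of whichever reinstall mode is in use.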

@pau-hedgehog pau-hedgehog mentioned this pull request Jan 13, 2025
@pau-hedgehog pau-hedgehog changed the title add hhfab options for power management add hhfab switch reinstall and power subcommands Jan 14, 2025
@pau-hedgehog pau-hedgehog marked this pull request as ready for review January 14, 2025 00:35

pau-hedgehog commented Jan 14, 2025

After discussing, the default behavior needs to change and reinstall must leave the switches in the ONIE install loop.

This way one can trigger the reinstall and then run vlab up. Once the control node is ready, the switches will finally be reinstalled.

The initial behavior will be preserved with a flag --wait-ready that will be relayed to the GRUB script.

I tried this but hhfab requires vlab up:

ubuntu@env-1:~/hhfab$ ./hhfab-pow-reinst vlab switch reinstall --all --yes 
07:37:04 INF Hedgehog Fabricator version=v0.32.1-21-gec342fd8-dirty-a72344
Enter username: admin
Enter password: 
07:37:10 INF Wiring hydrated successfully mode=if-not-present
07:37:10 ERR reinstall failed: preparing VLAB: VLAB directory does not exist: "/home/ubuntu/hhfab/vlab"

Shall we go with the original approach, @Frostman ?

@pau-hedgehog pau-hedgehog force-pushed the power_helper branch 2 times, most recently from 66499ee to c6ca534 Compare January 14, 2025 22:22
pau-hedgehog and others added 9 commits January 14, 2025 23:03
add vlab switch power suboptions for PDU
power management based on annotations

the pdu IPs and credentials are
stored in a file named .pdu.yaml
in the same folder as fab.yaml

Signed-off-by: Pau Capdevila <[email protected]>
different reinstall modes are available,
reload mode uses credentials to log into
the switch (current default)
soft-reset uses an agent based power reset
hard-reset uses a PDU based power reset

the verbose option allows monitoring the parallel
reinstall process; it opens in byobu if available

Signed-off-by: Pau Capdevila <[email protected]>
Co-authored-by: Emanuele Di Pascale <[email protected]>
Signed-off-by: Pau Capdevila <[email protected]>
rename reload mode to reboot

Signed-off-by: Pau Capdevila <[email protected]>
Reinstall now will not wait until switches are ready
This allows requesting reinstall and letting switches
enter the ONIE discovery loop

This way you can trigger reinstall and then do vlab up

Original behavior is preserved using the --wait-ready flag

Signed-off-by: Pau Capdevila <[email protected]>
Signed-off-by: Sergei Lukianov <[email protected]>
Use Opts Struct Pattern and improve error handling

Signed-off-by: Pau Capdevila <[email protected]>
@Frostman Frostman force-pushed the power_helper branch 2 times, most recently from 3f8983c to 4f79e2b Compare January 15, 2025 08:19
pau-hedgehog and others added 3 commits January 15, 2025 09:35
Adds capability to run hhfab whole hardware test:

hhfab vlab up -v --ready switch-reinstall --ready \
setup-vpcs --ready test-connectivity --ready exit

switch-reinstall is equivalent to running hhfab vlab
switch reinstall --all --yes --mode hard-reset

Signed-off-by: Pau Capdevila <[email protected]>
Signed-off-by: Sergei Lukianov <[email protected]>
Signed-off-by: Sergei Lukianov <[email protected]>
@pau-hedgehog pau-hedgehog force-pushed the power_helper branch 2 times, most recently from 121b0f0 to 5f54e35 Compare January 15, 2025 15:06
pkg/hhfab/pdu/go.sum Outdated
.github/workflows/ci.yaml Outdated
cmd/hhfab/main.go Outdated
cmd/hhfab/main.go Outdated
@Frostman Frostman merged commit 462abdb into master Jan 15, 2025
26 checks passed
@Frostman Frostman deleted the power_helper branch January 15, 2025 19:41
2 participants