Packer SSH Communicator Fails to Iterate Default KEX-algo list #12917
Comments
Hi @ferricoxide, Looking into this now, from what I can see, if undefined, the kex algorithm list defaults to what the go crypto library exports, so in the current state (testing on Packer main, which relies on
If you're using Packer 1.8.7 (Note: plugin versions may be more relevant to this issue as they're the ones establishing the SSH connection), it's possible that you get another Judging by this, the second algorithm should work in order to get an algo that is FIPS-compatible, I'm not sure yet why/if the other algos are tried by Packer/Plugins/SDK/Crypto (pick the guilty one), but I'll continue to dig. By any chance, would you be able to provide a template I can play with for testing and troubleshooting this? Something as minimal as possible would be greatly appreciated in order to dig into this. I'll try to build something in the meantime. Thanks in advance! |
By "template", are you referring to the Amazon Machine Image or the Packer builder? In the former case, any of the AWS CONUS Region AMIs owned by |
Hi @ferricoxide, In Packer lingo, "templates" are the configuration files, so yes, the spel HCL templates would be it. I've tested locally on a CentOS 9 Stream qemu VM, and I'm unable to reproduce the problem (yet). It seems that Packer does negotiate an algorithm for kex and is able to connect in my case. For reference:
Note: the kex logs are ones I've manually added to a local copy of the crypto lib that I compile the qemu plugin with; they are not in the standard logs. According to the SSH code from the library we use, the first algorithm from the client's list that is supported by the server will be used, so in my case I would think I'll continue to dig into this, but it's still not clear to me where the problem lies. |
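To illustrate the selection rule described above (the first entry in the client's list that the server also supports wins), here is a minimal Python sketch. It is not the Go crypto library's code; `negotiate_kex` is a hypothetical helper and both algorithm lists are made-up examples rather than values captured from a real session.

```python
from typing import Optional, Sequence


def negotiate_kex(client_prefs: Sequence[str],
                  server_supported: Sequence[str]) -> Optional[str]:
    """Return the first client-preferred algorithm the server also supports."""
    server_set = set(server_supported)
    for algo in client_prefs:
        if algo in server_set:
            return algo
    return None  # no overlap at all: the handshake would fail


# Illustrative lists only (assumptions, not taken from Packer or sshd output).
client = ["curve25519-sha256", "ecdh-sha2-nistp256", "ecdh-sha2-nistp384"]
fips_server = ["ecdh-sha2-nistp256", "ecdh-sha2-nistp384", "ecdh-sha2-nistp521"]

print(negotiate_kex(client, fips_server))  # prints ecdh-sha2-nistp256
```

Under this rule, a client entry the server does not offer is simply skipped during negotiation, so a failure would be expected only when the two lists share nothing.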
Update: I tested with the spel template for AWS (in which I removed the
I have tested with the latest AWS plugin, on the latest Packer. I'll see if 1.8.7 has the same behaviour, but if it doesn't, this could maybe be linked to the host environment? I'm testing on an Ubuntu 22.04, without additional restrictions. |
Other update: tested with Packer 1.8.7, with the AWS plugin v1.3.2 and the one I manually compiled with the extra crypto logs, same behaviour here. If you have time @ferricoxide, @eemperor or @lorengordon, I'd like to have a chat with one of you to narrow down the issue. As I'm unable to reproduce the problem, it'll be hard (impossible even) for me to troubleshoot and fix it. |
Just for clarity: you tested with an image that has FIPS mode enabled? The reason I'd linked to the Current Published Images section of the document is that the images we produce with Packer _have_ FIPS mode enabled, whereas most of the other base images we've tried over the years don't have FIPS enabled (and only recently started having SELinux enabled "out of the box"). Just want to make sure that the reason you're not able to reproduce the issue isn't because you're using a non-FIPS base image. |
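One quick way to confirm the point raised above is to read the kernel's FIPS flag on the target. The sketch below assumes a Linux target that exposes `/proc/sys/crypto/fips_enabled`; the helper names `parse_fips_flag` and `fips_enabled` are hypothetical, and a missing file is treated as "not enabled", which is an assumption rather than documented behaviour.

```python
from pathlib import Path


def parse_fips_flag(raw: str) -> bool:
    """Interpret the flag file's content: "1" means FIPS mode is on."""
    return raw.strip() == "1"


def fips_enabled(path: str = "/proc/sys/crypto/fips_enabled") -> bool:
    """Return True when the kernel reports FIPS mode as enabled."""
    flag_file = Path(path)
    if not flag_file.exists():
        # Assumption: kernels without the flag file are not in FIPS mode.
        return False
    return parse_fips_flag(flag_file.read_text())


if __name__ == "__main__":
    print("FIPS enabled:", fips_enabled())
```

Running this on both the base image and the built image would show whether the two sides of the comparison really differ in FIPS state.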
For the CentOS I have locally yes, I'm using a FIPS-enabled image; here's the template I used for reference:
# Copyright (c) HashiCorp, Inc.
# SPDX-License-Identifier: MPL-2.0
packer {
  required_plugins {
    qemu = {
      version = ">= 1.0.1"
      source  = "github.com/hashicorp/qemu"
    }
  }
}
build {
  sources = ["source.qemu.example"]
  provisioner "shell" {
    inline = ["fips-mode-setup --check 2>&1 | grep enabled"]
  }
}
source "qemu" "example" {
  iso_url      = "./CentOS-Stream-9-latest-x86_64-boot.iso"
  iso_checksum = "none"
  headless     = "false"
  memory       = "4096"
  cpu_model    = "host"
  boot_steps = [
    ["<up><tab> fips=1 inst.text<enter>", "Setup fips/text mode installation"],
    ["<wait40>", "wait for install prompt"],
    ["2<enter><wait80>", "select text mode install"],
    ["3<enter><wait10>3<enter><wait>1<enter>", "choose network for package installation"],
    ["<wait20>r<enter>", "wait for main menu refresh and mirror detection"],
    ["4<enter>3<enter>c<enter>c<enter><wait2>r<enter>", "select minimal software installation"],
    ["5<enter>c<enter>c<enter><wait>1<enter><wait>c<enter>", "setup standard partitioning scheme, use all disk"],
    ["9<enter>1<enter>2<enter>packer<enter><wait>", "create packer user"],
    ["5<enter>packer<enter><wait3>packer<enter><wait3>yes<enter><wait>", "setup password for packer"],
    ["6<enter>c<enter><wait>", "setup user as admin and leave"],
    ["b<enter>", "start installation"],
    ["<wait400><enter>", "wait until it ends to continue and reboot"]
  ]
  boot_key_interval = "15ms"
  disk_size         = "10G"
  format            = "qcow2"
  boot_wait         = "5s"
  ssh_password      = "packer"
  ssh_username      = "packer"
  ssh_wait_timeout  = "20m"
  vm_name           = "centos9_fips"
  output_directory  = "centos9_fips-out"
}
No problems with SSH in this case, as for the spel templates, I've tested those from the repo directly, running this exact command: |
@lbajolet-hashicorp The spel templates, currently, are set to use a non-FIPS "bootstrap" builder: https://github.com/plus3it/spel/blob/master/spel/minimal-linux.pkr.hcl#L118 This issue came about because @ferricoxide originally attempted to FIPS-enable those bootstraps, which failed. To start with a FIPS-enabled image, you could perhaps try instead to pass |
Hi @lorengordon, Thanks for the hint. Sorry to say, though, that even with this change I am unable to hit an SSH kex issue with the build from spel. Here are the changes I made for reference:
diff --git a/build/build.sh b/build/build.sh
index 7897655..7f156cb 100644
--- a/build/build.sh
+++ b/build/build.sh
@@ -5,11 +5,12 @@ set -u -o pipefail
echo "==========STARTING BUILD=========="
echo "Building packer template, spel/minimal-linux.pkr.hcl"
-packer build \
+PACKER_LOG=1 packer build \
-only "${SPEL_BUILDERS:?}" \
-var "spel_identifier=${SPEL_IDENTIFIER:?}" \
-var "spel_version=${SPEL_VERSION:?}" \
- spel/minimal-linux.pkr.hcl
+ -var 'aws_source_ami_filter_centos9stream_hvm={"name":"spel-minimal-centos-9stream-hvm-*.x86_64-gp3","owners"=["125523088429","174003430611","216406534498"]}' \
+ spel/minimal-linux.pkr.hcl 2>&1 | tee output.log
BUILDEXIT=$?
@@ -34,7 +35,7 @@ if [[ -n "${SUCCESS_BUILDS:-}" ]]
then
SUCCESS_BUILDERS=$(IFS=, ; echo "${SUCCESS_BUILDS[*]}")
echo "Successful builds being tested: ${SUCCESS_BUILDERS}"
- packer build \
+ PACKER_LOG=1 packer build \
-only "${SUCCESS_BUILDERS//amazon-ebssurrogate./amazon-ebs.}" \
-var "spel_identifier=${SPEL_IDENTIFIER:?}" \
-var "spel_version=${SPEL_VERSION:?}" \
diff --git a/spel/minimal-linux.pkr.hcl b/spel/minimal-linux.pkr.hcl
index ebc7545..5e32211 100644
--- a/spel/minimal-linux.pkr.hcl
+++ b/spel/minimal-linux.pkr.hcl
@@ -847,14 +847,6 @@ source "amazon-ebssurrogate" "base" {
ssh_pty = true
ssh_timeout = "60m"
ssh_username = var.spel_ssh_username
- ssh_key_exchange_algorithms = [
- "ecdh-sha2-nistp521",
- "ecdh-sha2-nistp256",
- "ecdh-sha2-nistp384",
- "ecdh-sha2-nistp521",
- "diffie-hellman-group14-sha1",
- "diffie-hellman-group1-sha1"
- ]
subnet_id = var.aws_subnet_id
tags = { Name = "" } # Empty name tag avoids inheriting "Packer Builder"
temporary_security_group_source_cidrs = var.aws_temporary_security_group_source_cidrs
Invoked with
Anything else I can test out so I can reproduce this behaviour? |
No, our user groups tend to equate "hardened" with "the full DISA STIG has been applied". These images are just the basis for that and are not intended to apply it entirely, so we attempt to indicate as much in the description. Perhaps poorly.
Not that I can think of. You've already gone above and beyond. We'll have to revalidate ourselves and come up with a better reproduction case, if we're still able to reproduce the problem ourselves. |
Sounds good, thanks for the update! I'll wait for an update on your part regarding this, if the problem's solved itself that'd be great tbh 😄 |
Community Note
Overview of the Issue
When using the ssh-communicator to provision a FIPS-enabled target, the SSH communicator hangs and eventually times out. If one logs into the target systems and reviews the system logs, one finds errors about:
In /var/log/secure.
The problem may be worked around by using the ssh_key_exchange_algorithms parameter to specify an algorithm-list that omits [email protected]. However, this seems like it shouldn't be necessary. Since the documentation indicates that there's already a list of algorithms to try, Packer notionally should attempt to iteratively renegotiate the connection to use one of the other ones in the list. This seems to not be the actual behavior.
Would request that any of:
Be implemented.
While I did notice there were other communicator issues around FIPS-enabled systems, the nature/focus of those tickets seemed to be different.
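As a sketch of the workaround described in this overview, the Python below filters a candidate list and renders the result as an ssh_key_exchange_algorithms attribute for a template. The helpers `kex_allow_list` and `to_hcl` are hypothetical, and the excluded name `curve25519-sha256` is only a stand-in: it is an assumption for illustration, not necessarily the algorithm named in this issue.

```python
from typing import Iterable, List


def kex_allow_list(candidates: Iterable[str], excluded: Iterable[str]) -> List[str]:
    """Keep candidates in order, dropping excluded names and duplicates."""
    blocked = set(excluded)
    seen = set()
    result = []
    for algo in candidates:
        if algo in blocked or algo in seen:
            continue
        seen.add(algo)
        result.append(algo)
    return result


def to_hcl(algos: Iterable[str]) -> str:
    """Render the list as an HCL attribute assignment."""
    body = ",\n".join(f'  "{a}"' for a in algos)
    return f"ssh_key_exchange_algorithms = [\n{body}\n]"


# Illustrative input only (assumption): one excluded entry, one duplicate.
candidates = [
    "curve25519-sha256",
    "ecdh-sha2-nistp256",
    "ecdh-sha2-nistp384",
    "ecdh-sha2-nistp521",
    "ecdh-sha2-nistp256",
]
print(to_hcl(kex_allow_list(candidates, ["curve25519-sha256"])))
```

Preserving the input order matters here because the client's list order expresses preference during negotiation.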
Reproduction Steps
Steps to reproduce this issue
/var/log/secure file: messages similar to the above will be found
Packer version
1.8.7 (yes, I know that this is elderly but a few of our older job-defs won't work with newer versions: we're planning to remove them when Red Hat deprecates RHEL 7 in early summer)
Simplified Packer Template
If the file is longer than a few dozen lines, please include the URL to the
gist of the log or use the Github detailed
format
instead of posting it directly in the issue.
Operating system and Environment details
OS, Architecture, and any other information you can provide about the
environment.
Packer-executor host(s):
Packer target(s):
Log Fragments and crash.log files
Cc'ing: @lorengordon & @eemperor