Potential ways to get around networking sandbox #32

Closed
mrcnski opened this issue Aug 28, 2023 · 5 comments · Fixed by #34
Labels
high priority This should be addressed immediately

Comments

mrcnski commented Aug 28, 2023

Hello, I very much appreciate your work! I tried to implement a networking sandbox here inspired by birdcage. However, I was informed that there are at least a couple of ways to get around it:

cd-work (Collaborator) commented Aug 28, 2023

Thanks for reaching out!

The current seccomp network filters on Linux certainly could use some love. Initially I was hoping that landlock-based network filters would be available sooner, but since this will probably take a while longer (especially until the kernels are propagated through various distros), I think it makes sense to put some effort into a robust seccomp network filter.

The best path forward is likely to rewrite the seccomp code to use a whitelist rather than a blacklist. This is mandatory for secure forward-compatibility and will also likely help with syscalls that get missed accidentally.

In this change, the io_uring syscalls probably need to just get disabled completely. While this is unfortunate for applications hoping to speed up their filesystem access through io_uring, I don't think there's a better way to handle this with seccomp. I believe Google has also completely disabled io_uring on Android and ChromeOS due to security issues, so application developers are hopefully somewhat aware of potential failure here anyway.
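To illustrate why a whitelist helps with syscalls that get missed, here is a minimal toy model of the filter decision. It is only a sketch and does not install a real seccomp filter: `Policy`, `Verdict`, `network_blacklist` and `minimal_whitelist` are hypothetical names made up for this example, and the syscall numbers are the x86_64 ones. A blacklist that only knows about `socket` and `connect` lets `io_uring_setup` through, while a whitelist rejects it because it was never listed.

```rust
use std::collections::HashSet;

/// Decision the toy filter returns for a single syscall.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Allow,
    Deny,
}

/// Hypothetical policy type: deny only what is listed, or allow only what is listed.
enum Policy {
    Blacklist(HashSet<u32>),
    Whitelist(HashSet<u32>),
}

impl Policy {
    /// Anything a blacklist does not mention is allowed;
    /// anything a whitelist does not mention is denied.
    fn check(&self, syscall_nr: u32) -> Verdict {
        match self {
            Policy::Blacklist(denied) => {
                if denied.contains(&syscall_nr) { Verdict::Deny } else { Verdict::Allow }
            }
            Policy::Whitelist(allowed) => {
                if allowed.contains(&syscall_nr) { Verdict::Allow } else { Verdict::Deny }
            }
        }
    }
}

// Syscall numbers for x86_64 (an assumption of this sketch; other architectures differ).
const SYS_READ: u32 = 0;
const SYS_WRITE: u32 = 1;
const SYS_SOCKET: u32 = 41;
const SYS_CONNECT: u32 = 42;
const SYS_IO_URING_SETUP: u32 = 425;

/// A blacklist written before io_uring was considered.
fn network_blacklist() -> Policy {
    Policy::Blacklist(HashSet::from([SYS_SOCKET, SYS_CONNECT]))
}

/// A deliberately tiny whitelist, just enough for the illustration.
fn minimal_whitelist() -> Policy {
    Policy::Whitelist(HashSet::from([SYS_READ, SYS_WRITE]))
}

fn main() {
    let calls = [("socket", SYS_SOCKET), ("io_uring_setup", SYS_IO_URING_SETUP)];
    for (name, nr) in calls {
        println!(
            "{}: blacklist={:?} whitelist={:?}",
            name,
            network_blacklist().check(nr),
            minimal_whitelist().check(nr)
        );
    }
}
```

The same reasoning applies to syscalls added by future kernels: a blacklist has to be updated for each one, whereas a whitelist denies them by default.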

Repurposing existing sockets is probably the more difficult issue to solve, mainly because I'm not entirely sure whether this is a bug or a feature. With seccomp especially, there are numerous examples that recommend acquiring "allowed" file descriptors first and then locking everything down in a sandbox to prevent all new filesystem/network access, while still allowing access to the previously opened ones. So I don't plan on making any concrete changes to accommodate this use case, but if you have ideas that might work for all of these use cases, I'm always open to suggestions.

cd-work added a commit that referenced this issue Aug 28, 2023
This patch blocks all `io_uring` syscalls when the sandbox does not have
full networking permissions.

Closes #32.
cd-work added the high priority label Aug 28, 2023
@louislang

@mrcnski Thanks for the issue! If you'd like some Phylum swag as a thanks, shoot me your email ([email protected]) and I'll make it happen!

@alindima

> In this change, the io_uring syscalls probably need to just get disabled completely.

This sounds about right and is the safest thing to do. I'm not familiar with how a user would typically use birdcage but io_uring has support for something called restrictions.

They're essentially a way of filtering the possible actions on an io_uring fd, similar to seccomp.
If one really wants to use io_uring in the safest way possible, it could open the io_uring fd before installing a seccomp filter that blacklists io_uring_setup, then register the restrictions and whitelist io_uring_enter; the kernel would then filter new submission entries based on the configured restrictions.
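The order of steps described above can be sketched with a small toy model. This is not the kernel interface (as far as I know, the real mechanism goes through `io_uring_register` on a ring that was created in a disabled state); `Ring`, `Op`, `RingError` and `restricted_ring` are hypothetical names invented for this illustration, and the model assumes that restrictions can only be registered before the ring is enabled.

```rust
use std::collections::HashSet;

/// A few operation kinds, loosely named after io_uring opcodes.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Op {
    Read,
    Write,
    Connect,
    Socket,
}

#[derive(Debug, PartialEq, Eq)]
enum RingError {
    /// Restrictions were registered after the ring was already enabled.
    AlreadyEnabled,
    /// Nothing may be submitted while the ring is still disabled.
    Disabled,
    /// The operation is not covered by the registered restrictions.
    NotPermitted(Op),
}

/// Toy stand-in for an io_uring instance that starts out disabled.
struct Ring {
    enabled: bool,
    allowed_ops: Option<HashSet<Op>>,
}

impl Ring {
    fn new_disabled() -> Self {
        Ring { enabled: false, allowed_ops: None }
    }

    /// Step 2: limit the ring to a fixed set of operations while it is still disabled.
    fn register_restrictions(&mut self, ops: &[Op]) -> Result<(), RingError> {
        if self.enabled {
            return Err(RingError::AlreadyEnabled);
        }
        self.allowed_ops = Some(ops.iter().copied().collect());
        Ok(())
    }

    /// Step 3: from here on the restrictions can no longer be changed.
    fn enable(&mut self) {
        self.enabled = true;
    }

    /// Every later submission is checked against the registered restrictions.
    fn submit(&self, op: Op) -> Result<(), RingError> {
        if !self.enabled {
            return Err(RingError::Disabled);
        }
        match &self.allowed_ops {
            Some(allowed) if !allowed.contains(&op) => Err(RingError::NotPermitted(op)),
            _ => Ok(()),
        }
    }
}

/// Step 1 to 3 in order: create the ring, restrict it to file I/O, enable it.
fn restricted_ring() -> Ring {
    let mut ring = Ring::new_disabled();
    ring.register_restrictions(&[Op::Read, Op::Write])
        .expect("ring is still disabled at this point");
    ring.enable();
    ring
}

fn main() {
    let ring = restricted_ring();
    for op in [Op::Read, Op::Write, Op::Connect, Op::Socket] {
        println!("{:?} -> {:?}", op, ring.submit(op));
    }
}
```

In this model the network-related operations are rejected at submission time, which is the role the restrictions would play once the seccomp filter only lets io_uring_enter through.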

mrcnski (Author) commented Aug 29, 2023

> Initially I was hoping that landlock-based network filters would be available sooner, but since this will probably take a while longer

Yeah, I'm also looking forward to Landlock support for this. It seems like some work is already underway, see here and here, but may not be available for a while yet.

> The best path forward is likely to rewrite the seccomp code to use a whitelist rather than a blacklist. This is mandatory for secure forward-compatibility and will also likely help with syscalls that get missed accidentally.

I'm curious how you plan to get the list of allowed syscalls. For us it is a rather tricky problem. :)

Not super familiar with Linux myself, but is there really no way to restrict a single application with a simple firewall? 🤔

cd-work added a commit that referenced this issue Aug 29, 2023
This patch blocks all `io_uring` syscalls when the sandbox does not have
full networking permissions.

Closes #32.
cd-work (Collaborator) commented Aug 29, 2023

> I'm curious how you plan to get the list of allowed syscalls. For us it is a rather tricky problem. :)

My plan was to start by looking at Docker's seccomp filter and the syscall table and just go through them by hand. Docker has a somewhat different purpose, but it should show which syscalls might not be necessary for things to work.

> Not super familiar with Linux myself, but is there really no way to restrict a single application with a simple firewall? 🤔

With elevated permissions you can do some more stuff, but I'm not familiar with any mechanism to self-restrict that would be easier.
