
Add support for Interpartition Communication Drivers #14

Closed
dadada opened this issue Dec 6, 2022 · 13 comments

dadada (Collaborator) commented Dec 6, 2022

The hypervisor should support initializing and attaching a device driver to a partition, as specified by ARINC 653P5-1. The device driver should be accessible to other partitions through sampling- or queueing-ports and execute inside the partition it is attached to.

In the case of apex-linux, a device driver that exposes the receive / send system calls of a UDP socket could be useful for development purposes. The semantics of UDP sockets can be translated pretty much directly to the semantics of sampling ports, since both are message-based. The following semantics should be implemented by a partition that is handed such a UDP "device driver".

Messages that are received on a UDP socket should be available immediately on the associated sampling port. This can be achieved by writing them to the sampling port source as soon as they are received, and keeping the contents of the port unchanged until another message is received on the UDP socket.

For sending via UDP, each write to the sampling port should lead to exactly one successful send on the UDP socket. If the content of the sampling port does not change (i.e. no newer message has been written to it), no further UDP message should be sent.
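To make the intended mapping concrete, here is a minimal sketch of such a bridging loop. The SamplingPortSource / SamplingPortDestination traits are hypothetical stand-ins for whatever port API the hypervisor ends up exposing to the driver partition; only std::net::UdpSocket is real, and a real driver would block on both sides instead of busy-polling.

```rust
use std::net::UdpSocket;

// Hypothetical partition-side port handles (placeholders, not the real API).
trait SamplingPortSource {
    fn write(&self, msg: &[u8]);
}
trait SamplingPortDestination {
    /// Returns Some(msg) only if a message newer than the last read is present.
    fn read_if_new(&self) -> Option<Vec<u8>>;
}

fn drive(
    udp: &UdpSocket,
    rx_port: &dyn SamplingPortSource,      // network -> partition(s)
    tx_port: &dyn SamplingPortDestination, // partition(s) -> network
    peer: &str,
) -> std::io::Result<()> {
    let mut buf = [0u8; 1500];
    udp.set_nonblocking(true)?;
    loop {
        // Receive path: every datagram overwrites the sampling port source,
        // so the latest message is immediately visible to consumers.
        match udp.recv_from(&mut buf) {
            Ok((len, _src)) => rx_port.write(&buf[..len]),
            Err(e) if e.kind() == std::io::ErrorKind::WouldBlock => {}
            Err(e) => return Err(e),
        }
        // Send path: exactly one datagram per new message written to the port;
        // an unchanged port triggers no further sends.
        if let Some(msg) = tx_port.read_if_new() {
            udp.send_to(&msg, peer)?;
        }
    }
}
```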

[Diagram attachment: Untitled-2022-12-06-2003]

cvengler self-assigned this Dec 9, 2022
cvengler (Member) commented Dec 9, 2022

I will try to see how this can be accomplished through namespaces.

cvengler (Member) commented Dec 9, 2022

I've begun working on this within the network_namespaces branch. See 2d46f8a

cvengler (Member) commented Dec 9, 2022

I have investigated this a bit further and it looks more complicated than I originally thought, but it's definitely doable. 😄

It looks like netlink(7) sockets are what Linux uses to "modify the routing tables (both IPv4 and IPv6), IP addresses, link parameters, neighbor setups, queueing disciplines, traffic classes, and packet classifiers".
There is a popular Rust crate for this (1,000,000+ downloads).

The fishy 🐟 thing, however, is that after the call to clone3(2) has been made and the process has split in two, the child and parent both need to coordinate in order to create a veth(4) interface for the child and for the parent. More precisely, the parent needs to create the veth(4) pair and move one end into the child's network namespace. After that, it needs to tell the child to proceed with the preparations for running the binary.
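As a rough illustration of that parent-side sequence, here is a sketch using the rtnetlink crate (one candidate for the crate mentioned above; the method names follow its older 0.x API and may differ in current releases, so treat them as an assumption rather than the final interface):

```rust
use futures::TryStreamExt;
use rtnetlink::new_connection;

// Create a veth pair and move the peer end into the child's network namespace.
// Interface names are placeholders.
async fn setup_veth(child_pid: u32) -> Result<(), rtnetlink::Error> {
    let (connection, handle, _) = new_connection().expect("netlink socket");
    tokio::spawn(connection);

    // Equivalent of: ip link add hv0 type veth peer name part0
    handle
        .link()
        .add()
        .veth("hv0".to_string(), "part0".to_string())
        .execute()
        .await?;

    // Equivalent of: ip link set part0 netns <child_pid>
    let mut links = handle.link().get().match_name("part0".to_string()).execute();
    if let Some(link) = links.try_next().await? {
        handle
            .link()
            .set(link.header.index)
            .setns_by_pid(child_pid)
            .execute()
            .await?;
    }
    Ok(())
}
```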

Starting next week, I'll spend my two work days working on a netlink module within the core crate. It should be capable of creating veth(4) pairs, moving interfaces between namespaces, and setting up the loopback interface. The IPC between parent and child is fairly simple: a socket pair is enough, and the child will wait until it sees EOF because the other FD has been closed.
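A tiny self-contained sketch of that handshake, with a thread standing in for the cloned child (in the hypervisor, the two ends of the socket pair would be split across clone3(2)):

```rust
use std::io::Read;
use std::os::unix::net::UnixStream;
use std::thread;

fn main() -> std::io::Result<()> {
    let (child_end, parent_end) = UnixStream::pair()?;

    let child = thread::spawn(move || {
        let mut child_end = child_end;
        let mut buf = [0u8; 1];
        // Blocks until the parent closes its end; read() returning 0 is EOF.
        while child_end.read(&mut buf).unwrap() != 0 {}
        // ... proceed with preparing and exec'ing the partition binary ...
    });

    // Parent: create the veth pair and move one end into the child's netns
    // here, then signal completion by closing its end of the socket pair.
    drop(parent_end);

    child.join().unwrap();
    Ok(())
}
```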

The only thing I am asking myself right now: is veth(4) a good choice? AFAIK, it can only be used to create a new interface on both ends, not to connect to an existing interface. Besides this, am I overlooking anything else?

/cc @wucke13 @dadada

dadada (Collaborator, author) commented Dec 9, 2022

That sounds great! I think for testing and development purposes, a veth is sufficient or even more practical than a physical network interface. If we want to use the other side (on the host / hypervisor) for integration testing, we should be able to do that with Linux. For example, we might add the interface to a software bridge where multiple hypervisors are connected.

What information does the veth expose to the partition (e.g. timings of individual transmissions, whether a frame was sent, whether the interface is busy)? I have a use-case where that information might be relevant.

dadada (Collaborator, author) commented Dec 10, 2022

The partition process would probably also need CAP_NET_RAW to send raw Ethernet frames on the interface. The hypervisor would have to allow it in the capability bounding set. The effective capabilities could then be controlled for example by the file capabilities. How the effective capabilities are calculated is documented in capabilities(7).

P'(permitted) = (P(inheritable) & F(inheritable)) |
                (F(permitted) & cap_bset)

P'(effective) = F(effective) ? P'(permitted) : 0

P'(inheritable) = P(inheritable)    [i.e., unchanged]
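As a concrete (made-up) example of plugging into those rules: if the hypervisor keeps CAP_NET_RAW in cap_bset and the partition binary carries cap_net_raw in F(permitted) with the file effective bit set, then

P'(permitted) ⊇ F(permitted) & cap_bset = { CAP_NET_RAW }
P'(effective) = P'(permitted)            [since F(effective) is set]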

dadada (Collaborator, author) commented Dec 10, 2022

Currently, all partitions seem to be started with =ep, which, if I understand cap_from_text(3) correctly, means that the process has no effective or permitted capabilities.

Another question would be how we might specify which capabilities are allowed for which partition. Maybe we could specify this in the configuration file?

dadada (Collaborator, author) commented Dec 10, 2022

Setting the file capabilities for the hypervisor partition probably won't work, since the executable is copied to the file system of the partition process.

dadada (Collaborator, author) commented Dec 10, 2022

> Setting the file capabilities for the hypervisor partition probably won't work, since the executable is copied to the file system of the partition process.

I've checked. If I understand correctly, we would have to either set the file capabilities when copying the executable to the partition process's file system root, or add CAP_NET_RAW to the inheritable set (cap_net_raw=i) and set cap_net_raw=ep in the partition process.
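For the first option, a sketch of what the hypervisor could do right after copying the binary into the partition root (the helper name and path handling are hypothetical; this shells out to setcap(8) instead of using a capability crate):

```rust
use std::path::Path;
use std::process::Command;

/// Grant CAP_NET_RAW as a permitted + effective file capability on the copied
/// partition binary.
fn grant_net_raw(binary: &Path) -> std::io::Result<()> {
    let status = Command::new("setcap")
        .arg("cap_net_raw+ep")
        .arg(binary)
        .status()?;
    if !status.success() {
        return Err(std::io::Error::new(
            std::io::ErrorKind::Other,
            "setcap failed",
        ));
    }
    Ok(())
}
```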

dadada (Collaborator, author) commented Dec 18, 2022

@emilengler: @wucke13 and I were wondering if it would be better to use UDP sockets for this at first, just to get things working initially. It's probably a lot easier than working with capabilities and veth, although being able to directly use Ethernet would be more useful, since it is closer to the production environment.

cvengler (Member) commented:

> @emilengler: @wucke13 and I were wondering if it would be better to use UDP sockets for this at first, just to get things working initially. It's probably a lot easier than working with capabilities and veth, although being able to directly use Ethernet would be more useful, since it is closer to the production environment.

What do you mean by UDP sockets exactly? A socket to communicate from the host to the partition and vice-versa?

dadada (Collaborator, author) commented Dec 18, 2022

> What do you mean by UDP sockets exactly? A socket to communicate from the host to the partition and vice-versa?

No, what I mean is a socket that can be used to send or receive datagrams from the network. The hypervisor may create such a socket and pass it to the partition process using an AF_UNIX socket.
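A sketch of that hand-off using SCM_RIGHTS over the AF_UNIX socket, assuming the nix crate's sendmsg API (around nix 0.26/0.27; signatures vary between versions, so treat the details as illustrative):

```rust
use std::io::IoSlice;
use std::net::UdpSocket;
use std::os::fd::AsRawFd;
use std::os::unix::net::UnixStream;

use nix::sys::socket::{sendmsg, ControlMessage, MsgFlags, UnixAddr};

/// Pass an already-bound UDP socket to the partition process over an AF_UNIX
/// stream socket using SCM_RIGHTS.
fn pass_udp_socket(channel: &UnixStream, udp: &UdpSocket) -> nix::Result<()> {
    let fds = [udp.as_raw_fd()];
    let cmsgs = [ControlMessage::ScmRights(&fds)];
    // One byte of payload so the control message is not sent on an empty write.
    let iov = [IoSlice::new(b"u")];
    sendmsg::<UnixAddr>(channel.as_raw_fd(), &iov, &cmsgs, MsgFlags::empty(), None)?;
    Ok(())
}
```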

dadada (Collaborator, author) commented Dec 28, 2022

Here is a draft for this using UDP sockets. #24

dadada (Collaborator, author) commented Feb 29, 2024

We have something like that now by sending TCP and UDP sockets to partitions.

dadada closed this as completed Feb 29, 2024