- Status: rejected
- Type new: feature
- Related components: crust
- Start Date: 22-09-2015
- RFC PR: #46
- Issue number: Active - #49
In this text we describe an implementation of UDP hole punching process to allow for P2P connections of nodes behind NATs in a fully decentralised network.
Most users of the Crust library are expected to run the software on home PCs connected to the internet through a router. Routers usually implement some kind of a firewall to prevent anyone from the outside internet to connect to user's PC directly. As Crust is a library that concentrates at P2P connections, it should try available techniques to circumvent these firewall restrictions and UDP hole punching is one such technique.
Additionally, traditional UDP hole punching process involves a well known server to tell each peer's their external UDP endpoints. Since our intent is a fully decentralised system, this text also describes how nodes to which we're already connected can be used as a replacement for such server.
In the following text the reader is expected to know basic NAT types and how they work. The section 'Methods of translation' on wikipedia gives a nice overview.
The UDP hole punching process described here consists of two steps:
- Finding an external mapping of a UDP socket
- Do the actual hole punching.
Suppose, we have two nodes A
and B
, where each one is possibly behind a NAT and
they want to connect to each other. Routing needs to send A
an endpoint of B
and
vice versa. But A
and B
don’t know their public
endpoints a priori (they may know their IP addresses from previous connection establishments,
but not their ports). Each node therefore needs to create a UDP socket and use it to ask some
other node it is connected to what the public endpoints of the UDP socket is. We’ll refer
to such helping nodes C(A)
and C(B)
(it needs not be the same node for A
and B
).
We'll call U(C(X))
the UDP port on which X can contact C(X)
in order to find its public
endpoint. X
finds out about U(C(X))
through the initial handhsake, that is, when
X
connects to C(X)
.
In the following text we’ll describe steps taken at node A
, but the same steps need to
be taken at node B
.
For A
to find out its external address A
first needs to create a UDP socket S(A)
with an arbitrary local port. A
will then use S(A)
to periodically send
datagrams to U(C(A))
containing a MAGIC number and the ID of the request. Each time
C(A)
receives such datagram, it sends back a datagram containing public endpoint
of S(A)
together with the request ID.
The above protocol shall be initiated from upper layers by calling
Service::get_mapped_udp_socket(&self, result_token: u32)
And the result of such call shall be an event OnUdpSocketMapped
holding a structure
struct MappedUdpSocket {
result_token: u32,
udp_socket: UdpSocket,
public_addresses: BTreeSet<SocketAddr>, // of node A
}
Once upper layers receive such event, they can send/route MappedUdpSocket::public_addresses
to node B
. Once B
does the same, and upper layers receive B
’s public endpoint, upper
layers are ready for the actual hole punching.
The act of hole punching shall be initiated by a function with the following signature:
Service::udp_punch_hole(&self,
result_token: u32,
udp_socket : UdpSocket,
secret: Option<[u8; 4]>,
peer_addrs : mut BTreeSet<SocketAddr> /* of node B */)
This call will initiate reading on the udp_socket
and will also
start periodically sending small datagrams to the peer_addr
.
These datagrams will be of type:
struct HolePunch {
secret: Option<[u8; 4]>,
ack: bool,
}
At this stage, we may receive the above datagram either from peer_addr
or from some other endpoint. If the later, it can mean one of two things:
- The other end is behind a Symmetric NAT and we're behind a Full-cone or no NAT.
This is OK but we need to adjust the
peer_addr
variable to point to this new peer. - Some other application is sending us irrelevant or malicious packets. We
can distinguish this case by matchin the
secret
passed to the function as an argument withdatagram.secret
. If secrets don't match, we ignore the datagram.
Once we receive a datagram from the remote peer, we know our hole has been punched. Since the other end is waiting for our datagram as well, we need to make a "best effort" to ensure the other end receives our packet as well. That is, we need to make sure at least one of the following conditions is true:
- We sent our datagram to
peer_addr
K
times (for some predefinedK
). - We receive
HolePunch
packet with theack
field set to true.
Note that the condition ((1) or (2)) will always be satisfied eventually,
so the decision whether hole punching was successfull or not shall
be based purely on the fact whether we have received a datagram or not, that is,
once ((1) or (2)) is true, we check whether we've received a datagram,
if we did, we call the callback
with (udp_socket, Ok(peer_addr))
,
otherwise we call it with an appropriate io::Error
.
Result of this call shall be sent to the user as an Event::OnHolePunched
holding
the following structure:
struct HolePunchResult {
result_token: u32,
udp_socket: UdpSocket,
peer_addr: io::Result<SocketAddr>,
}
After a successful hole-punch, it is assumed that the two peers can communicate with each other over the unreliable UDP protocol. At this point we'd like to use that UDP socket with protocols such as uTP for added reliability.
Unfortunatelly, the rust-utp library doesn't currently support rendezvous connections, so this functionality will need to be added.
The alternative approach described in the next section is simpler in that it doesn't require two async calls, instead, it uses only one. Other than that, the simpler approach is limited in number of NAT types it can successfully punch through.
Another approach would work with the use of multiplexing. That is, if we had a UDP socket
which is already connected to a remote peer, we could also use it to communicate
with other peers as the hole has already been punched. Problem with this approach is
that it would only help with Full-cone NAT
types because it is the only NAT
type where a hole punched to one peer/host can be reused with another peers/hosts.
Here are the available options:
-
Enum events over channels. This is what Routing is currently using. It is an improvement from the visitor pattern we had before, but is still very hard to use. The main disadvantage is that with this approach it is most inconveniet to combine two or more async calls into one. E.g. in our current context, we'd like to combine the
get_mapped_udp_socket
call with theudp_punch_hole
call. Using event enums approach, user needs to hold a state outside of the Crust library to know what should happen afterget_mapped_udp_socket
event arrives. -
Coroutines: these are nice primitives to work with, they do allow for two or more coroutines to be combined together. However they are not generic enough to express all different combinations asynchronous programming has to offer. I.e. coroutines only allow for sequentinal combination of coroutines, more advanced patterns (such as "start many async actions and continue when all of them finish") need to be complemented with other approaches.
-
Futures: these (AFAIK) have bigger combinatorial power than coroutines and many libraries already ship with these patterns abstracted (think of functions like
when_any
orwhen_all
). That said, they have the disadvantage that they combine blocking and non blocking paradigms, resulting in slower code (read this for more info on the topic). -
Continuation Passing Style: Callbacks are ligthweight, fast and offer biggest combinatorial power. This comes with the price that they are often referred to as hard to read. On the other hand, the combinatorial power allows for complex and hard-to-read patterns to be hidden behind other callback taking functions with more descriptive names.
One criterion for C(X)
is that it should not be on the same local network as X
.
For this we could use the getifaddrs
function, it gives us SocketAddrs
of our network interfaces, plus the netmasks they use. This information should
be enough to determine whether C(X)
is in the same subnet as a given
interface.
If we allow for C(X)
to be more than one node, we could get a reponse quicker
and more reliable. Additionally getting more than one response can reveal
the information whether we're behind a Symmetric NAT as two different
C(X)
nodes will disagree in port number in such cases.
On the other hand, if X is behind a Symmetric NAT, then contacting
multiple C(X)
s would disable port prediction.
/// Generate a new number each time this function is called, the new
/// number shall allways be (1+previous_number) and the first
/// number shall be generated by random. Once the maximum number is
/// reached, next number will be 0.
fn State::generate_request_id(&mut self: State) -> u32
/// Returns a vector of SocketAddrs pointing to U(C(X)),
/// the vector should be sorted so that addresses that
/// are not on our LAN are first (use `getifaddr` here).
fn State::sort_helping_nodes_by_preference(&self) -> Vec<SocketAddr>
struct PeriodicSender {
udp_socket: UdpSocket,
payload: Vec<u8>,
times_sent: u32,
// Specifies how many times the payload should be sent.
// If set to None, sends indefinitely.
times_to_send: Option<u32>,
}
impl PeriodicSender {
/// Starts periodicaly sending `what` `times_to_send` times.
fn start<T: Serializable>(&mut self, where: SockAddr, times_to_send: Option<u32>, what: T);
fn stop();
/// Resets the payload, `times_sent` will remain unchainged.
fn reset_payload<T: Serializable>(&mut self, what: T);
/// Wait til times_sent == times_to_send.unwrap()
/// (panic if times_to_send == None and this function is called)
fn block_until_finished(&self);
}
/// Stop sending when the sender is dropped.
impl Drop for PeriodicSender;
fn blocking_get_mapped_udp_socket(request_id: u32, helper_nodes: Vec<SocketAddr>)
-> (UdpSocket, BTreeSet<SocketAddr>)
const TIMES_TO_SEND = 5;
let udp_socket = UdpSocket::bind("0.0.0.0:0");
let mut addrs = BTreeSet::new();
let periodic_sender = PeriodicSender::new(udp_socket);
// This happens when `sort_helping_nodes_by_preference` returns an empty
// vector. An empty vector is given when only nodes from the same network are
// found.
if helper_nodes.len() == 0 {
// use `getifaddrs` to return the addresses of our interfaces
}
for cx in helper_nodes {
periodic_sender.start(cx, TIMES_TO_SEND, GetExtAddr::new(request_id));
loop {
let our_ext_address = match udp_socket.timed_blocking_read(2000ms) {
Err(_) => {
break;
},
Ok(datagram) => {
if datagram.request_id != request_id { continue }
if datagram.from != cx { continue }
datagram.ext_address
}
}
addrs.insert(our_ext_address);
return (udp_socket, addrs);
}
}
(udp_socket, addrs)
}
// The non blocking version that users of the Crust library will use.
pub fn Service::get_mapped_udp_socket(&self, result_token: u32) {
Self::push(&self.cmd_sender, move |state| {
let request_id = generate_request_id();
let event_sender = state.event_sender.clone();
let helpers = self.sort_helping_nodes_by_preference();
thread::spawn(move || {
let (socket, mapped_addresses) = blocking_get_mapped_udp_socket(request_id, helpers);
event_sender.send(Event::OnUdpSocketMapped(MappedUdpSocket::new(result_token, socket, mapped_addresses)));
});
});
}
fn blocking_udp_punch_hole(udp_socket : UdpSocket,
secret: Option<[u8; 4]>,
peer_addr : mut SocketAddr /* of node B */)
-> (UdpSocket, Result<SocketAddr> /* peer's address */) {
const TIMES_TO_SEND = 5;
let mut periodic_sender = PeriodicSender(udp_socket);
periodic_sender.start(peer_addr,
TIMES_TO_SEND,
HolePunch::new(secret, false));
loop {
match udp_socket.timed_blocking_read(2000ms) {
Ok(datagram) => {
if datagram.secret != secret { continue }
if datagram.ack {
// He received our packet and we received his, we're done.
return (udp_socket, Ok(datagram.from));
}
if datagram.from != peer_addr {
peer_addr = datagram.from;
periodic_sender.start(peer_addr,
TIMES_TO_SEND,
HolePunch::new(secret, true));
}
else {
// New payload with `received` set to true.
periodic_sender.reset_payload(HolePunch::new(secret, true));
}
periodic_sender.block_until_finished();
break;
},
Err(what) => {
return (udp_socket, Err(what));
},
}
}
(udp_socket, Ok(peer_addr))
}
// The non blocking version that users of the Crust library will use.
pub fn Service::udp_punch_hole(&self,
result_token: u32,
udp_socket: UdpSocket,
secret: Option<[u8,4]>,
peer_addrs: mut BTreeSet<SocketAddr> /* of node B */) {
Self::push(&self.cmd_sender, move |state| {
let event_sender = state.event_sender.clone();
thread::spawn(move || {
for peer_addr in peer_addrs {
let (socket, peer_addr) = blocking_udp_punch_hole(udp_socket,
secret,
peer_addr);
event_sender.send(Event::OnHolePunched(HolePunchResult::new(result_token,
udp_socket,
peer_addr)));
}
});
});
}
When X
connects to C(X)
, the handshake exchanged contains U(C(X))
. The
handshake represents U(C(X))
like the following:
struct Handshake {
ucx: u16 // `U(C(X))`
}
There is a scenario where this representation may cause trouble. This scenario
is when the running device contains more than one network interface, and each of
these interfaces was connected to a different gateway (NAT). In this scenario,
it's possible that UPnP IGD is running only on one interface, but X
is
connected to C(X)
using the other interface, then the wrong address is
assumed.
The problem is that port doesn't replace the socket address. It's unlikely that
this problem would be solved just by the replacement of u16
with
SocketAddr
. We'd need to properly identify our public address (using UPnP IGD,
the internal Crust messaging infrastructure or other techniques) and the process
may fail (so it'd be good if we keep the possibility of just sending the port).
Also, this update would give us a list of public endpoints to report, not just a
single one. Then we also need to update ucx
to be a list of addresses, not
just a single address. Some of the addresses might be unusable, but we report
them all anyway because it's too costy to detect which ones are undesirable.
This problem is not solved in this RFC and we go with the structure presented previously that might cause trouble. The problem happens in a scenario that is really uncommon for PCs. So it's fine to keep it for now and increase robustness later.
Before we do enhance the protocol/behaviour, we need to have all corner cases sorted.