Optimistic Provide to replace legacy provide operation #43
Labels
kind/enhancement
New feature or request
scope/nicetohave
Nice to have feature going beyond go-libp2p-kad-dht
This issue describes an alternative provide process making use of Optimistic Provide and mitigating CID eclipsing attacks.
Current Provide Operation
The current IPFS/libp2p provide operation (as described in the spec) consists in finding the
K
(K
has multiple meanings, andK=20
in most IPFS implementations) closest peer in XOR distance to the provided content, or in other words theK
peers whose Kademlia's identifier is the closest to the CID's Kademlia identifier. Once theseK
peers are identified, the content provider sends them aADD_PROVIDER
RPC, asking them to record that the provider is providingCID
.Some problems of the current provide operation
It is harder than it seems to implement correctly. One of the reasons, is that among the 20 closest peers, some of them will be unreachable/offline. The
GET_PROVIDERS
RPC, used to find the 20 closest peers to a Kademlia key preimage (e.g apeer.ID
orcid.Cid
) can only return the 20 closest peers to a key. There are no guarantee that it will discover the 25 closest peers. And if the 20 closest peers contain unreachable nodes, then the provider record can not be allocated to 20 peers, or at least not to the 20 closest peers that are reachable.Moreover, it will be inefficient to wait for the lookup query to be over before sending the
ADD_PROVIDER
RPC. The lookup process may take a very long time if we need to wait for the timeout of unreachable peers. Hence, a user needs to wait for seconds before the provider record is actually sent to DHT servers.Improvement suggestion
Given a network size estimator (like the one included in Optimistic Provide), we can compute the average distance from any Kademlia key, in which there will be on average$2^{256}$ in IPFS/libp2p) multiplied by
K
peers, namely there areK
peers located in the range fromkey
tokey+dist
in the XOR keyspace. If the network size estimator can exclude the unreachable peers, or the percentage of unreachable peers in the network is known, we can determinedist
such that for all Kademlia keys there are on averageK
reachable peers in a range of lengthdist
.dist
is computed as the maximal XOR distance (K
divided by the number of (reachable) peers in the network.The query process to find the closest peers to the provided Kademlia
key
(defined ashash(cid)
in IPFS), should discover all peers betweenkey
andkey+dist
, even if it means looking up multiple Kademlia keys (each lookup can return AT MOST 20 reachable peers). A method to discover all peers within a given Kademlia range is described here.As soon as a peer within
dist
ofkey
answers to a lookup, we can immediately send it aADD_PROVIDER
RPC.This technique is also an elementary mitigation against CID eclipse attacks. If an attacker creates 20 nodes located around a victim
key
in the keyspace, the provider ofkey
will allocate on average 20+20=40 provider records, because it doesn't care about the absolute 20 closest peers, but rather about the distance to thekey
. So there should always be honest peers storing provider records, it may be harder for the requester to find them during an attack though.However, the distribution of the peers in the keyspace isn't exactly uniform, and there may be more or less than
K
peers within thedist
from akey
. It is thus important to have new peer ids as uniformly distributed as possible, as described in the Power of two choices. In most cases it is not a problem if there are18
or22
provider record replicas instead of20
.This allows provider records to be available earlier to the DHT, and improves security against CID eclipsing attacks.
Lookup termination
Lookup termination when looking for the
K
closest peers to a key, or when looking for akey
that isn't found immediately is also challenging. We do not want to query all peers that we discover during the lookup, because some of them will be very far away from thekey
and will not provide useful answers. However, there are some peers that we should query, that are not discovered by the lookup, that would provide useful results.A good candidate for lookup termination would be to fully explore the keyspace between
key
andkey+dist
. This might require looking up for multiple keys, and can be achieves as described here. Once this part of the keyspace has been fully explored, and if the provider records hasn't been found, the lookup can be cancelled, there is only a negligible probability that the provider record is available outside of this zone.This lookup termination mechanism is simple and well defined.
Provide Interface
It would be great in the context of a multi thread application (e.g kubo/boxo) to have a
Provide(cid.Cid) chan peer.ID
function. As provider records are allocated to remote peers, theProvide
function write thepeer.ID
of the DHT servers storing the provider record on the channel. The channel is closed when the provide operation is over. This allows the caller to know how many (and which) peers are storing the provider record so far, and doesn't need to wait for the provide operation to be finished before returning.How about
IPNS
?We could use the same technique for the
PUT
RPC used byIPNS
.References
The text was updated successfully, but these errors were encountered: