DHT Server transaction timeout #3

Open
arpitjindal97 opened this issue Jul 23, 2022 · 12 comments

Comments

@arpitjindal97

arpitjindal97 commented Jul 23, 2022

Following the code from this file:
https://github.com/xgfone/bt/blob/master/dht/dht_server_test.go

Everything else is the same; main looks like this:

func main() {
	pm := newTestPeerManager()
	server1, err := newDHTServer(metainfo.NewRandomHash(), "0.0.0.0:9001", pm)
	if err != nil {
		fmt.Println(err)
		return
	}
	defer server1.Close()

	server2, err := newDHTServer(metainfo.NewRandomHash(), "0.0.0.0:9002", nil)
	if err != nil {
		fmt.Println(err)
		return
	}
	defer server2.Close()

	go server1.Run()
	go server2.Run()
	time.Sleep(time.Second * 5)
	server1.Bootstrap([]string{"router.bittorrent.com:6881"})
	server2.Bootstrap([]string{"127.0.0.1:9001"})

	infohash := metainfo.NewHashFromString("4a42c5712ec4b810b162460f40cce1c6072b37ad")

	for {
		time.Sleep(time.Second * 10)
		fmt.Print("Server1 Node4Num: ")
		fmt.Println(server1.Node4Num())

		fmt.Print("Server2 Node4Num: ")
		fmt.Println(server2.Node4Num())

		server2.GetPeers(infohash, func(r dht.Result) {
			if len(r.Peers) == 0 {
				fmt.Printf("no peers for %s\n", infohash)
			} else {
				for _, peer := range r.Peers {
					fmt.Printf("%s: %s\n", infohash, peer.String())
				}
			}
		})

	}
}

output:

Server1 Node4Num: 2
Server2 Node4Num: 1
127.0.0.1:9002 is searching 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
2022/07/23 19:54:28 transaction '2' timeout: query=find_node, raddr=34.206.39.153:6881
Server1 Node4Num: 2
Server2 Node4Num: 30
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
2022/07/23 19:54:38 transaction 'l' timeout: query=get_peers, raddr=36.228.238.179:63962
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
2022/07/23 19:54:38 transaction 'a' timeout: query=get_peers, raddr=125.167.51.189:10758
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
2022/07/23 19:54:38 transaction 'e' timeout: query=get_peers, raddr=176.39.2.120:6881
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
2022/07/23 19:54:38 transaction 'h' timeout: query=get_peers, raddr=87.249.198.3:13426
no peers for 4a42c5712ec4b810b162460f40cce1c6072b37ad
2022/07/23 19:54:38 transaction 'q' timeout: query=get_peers, raddr=37.151.139.67:6881

Two things to note from the above output:

  • Server2 is not able to reach other nodes (timeout errors). Why?
  • Server1, which was bootstrapped with one address and later had server2 added, keeps the same node count. Why?

I'm trying to build a DHT router node and expect server1's routing table to fill up with other nodes. There are plenty of peers for the above infohash.

@arpitjindal97
Author

arpitjindal97 commented Jul 23, 2022

One thing I can tell from the source code is that server1 only adds a node when it receives an announce_peer/get_peers query from a client, because a router DHT node has nothing to query for itself.

Ref: https://github.com/xgfone/bt/blob/master/dht/dht_server.go#L435

@xgfone
Owner

xgfone commented Jul 24, 2022

Server2 is not able to connect to other nodes (timeout error). Why?

It may be blocked by the remote peer. I suggest starting only one DHT server per IP; otherwise the DHT server may be blocked by the peer. With a single DHT server, it works. If you have two IPs, starting two DHT servers may work.

Server1 which was bootstrapped with 1 IP and server2 got added later, remains same. Why?

Because Server1 returns the K closest nodes. If the routing table holds fewer than K nodes, it returns all of them, so Server2 and Server1 may have the same nodes at startup.
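
To illustrate the "K closest nodes" behaviour described above, here is a minimal sketch of selecting nodes by XOR distance as in BEP 5. The names `NodeID`, `xorDistance` and `closestK` are hypothetical and are not taken from the library; the node IDs are made up.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// NodeID is a 160-bit DHT node identifier, as in BEP 5.
// The type name is an assumption made for this sketch.
type NodeID [20]byte

// xorDistance returns the XOR distance between two IDs.
func xorDistance(a, b NodeID) (d NodeID) {
	for i := range a {
		d[i] = a[i] ^ b[i]
	}
	return d
}

// closestK returns up to k IDs from table, ordered by XOR distance to target.
// When the table holds fewer than k entries, all of them are returned.
func closestK(table []NodeID, target NodeID, k int) []NodeID {
	sorted := append([]NodeID(nil), table...) // copy, so the caller's slice keeps its order
	sort.Slice(sorted, func(i, j int) bool {
		di, dj := xorDistance(sorted[i], target), xorDistance(sorted[j], target)
		return bytes.Compare(di[:], dj[:]) < 0
	})
	if len(sorted) > k {
		sorted = sorted[:k]
	}
	return sorted
}

func main() {
	// A freshly bootstrapped table with only two entries (made-up IDs).
	table := []NodeID{{0x01}, {0xF0}}
	got := closestK(table, NodeID{0xFF}, 8)
	fmt.Println(len(got)) // fewer than K entries exist, so the whole table comes back
}
```

With so few entries, any node that asks receives the entire table, which is consistent with two freshly started servers ending up with the same nodes.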

2022/07/23 19:54:28 transaction '2' timeout: query=find_node, raddr=34.206.39.153:6881
2022/07/23 19:54:38 transaction 'a' timeout: query=get_peers, raddr=125.167.51.189:10758

There was a bug in handling find_node and get_peers responses: when no nodes or peers were found and other nodes had to be queried recursively, the finished transaction was not terminated. As a result, when that transaction later timed out, the DHT server only printed the timeout error. It has been fixed; please use v0.4.3.

@arpitjindal97
Author

This bug still exists. I'm using v0.4.3. Below is a small code sample with logs.

main

	pm := router.NewTestPeerManager()
	server, err := router.NewDHTServer(metainfo.NewRandomHash(), ":9001", pm)
	if err != nil {
		fmt.Println(err)
		return ""
	}
	defer server.Close()
	server.Bootstrap([]string{"router.bittorrent.com:6881"})
	go server.Run()
	time.Sleep(time.Second * 10)


	fmt.Println("Searching for " + infohash)
	server.GetPeers(metainfo.NewHashFromString(infohash), func(r dht.Result) {
		if len(r.Peers) == 0 {
			fmt.Printf("no peers for %s\n", infohash)
		} else {
			for _, peer := range r.Peers {
				fmt.Printf("%s: %s\n", infohash, peer.String())
			}
		}
	})
	time.Sleep(time.Minute * 5)

logs

no peers for c8b5ec93ace754eb0f7461a4fcf38c51969ed91e
no peers for c8b5ec93ace754eb0f7461a4fcf38c51969ed91e
no peers for c8b5ec93ace754eb0f7461a4fcf38c51969ed91e
no peers for c8b5ec93ace754eb0f7461a4fcf38c51969ed91e
no peers for c8b5ec93ace754eb0f7461a4fcf38c51969ed91e
no peers for c8b5ec93ace754eb0f7461a4fcf38c51969ed91e
2022/07/30 19:11:41 transaction 'd' timeout: sid=268599ea40031d9424336de9201dab1e729d8b62, q=get_peers, qid=c8b5ec93ace754eb0f7461a4fcf38c51969ed91e, laddr=0.0.0.0:9001, raddr=176.210.30.176:23362
no peers for c8b5ec93ace754eb0f7461a4fcf38c51969ed91e
2022/07/30 19:11:41 transaction '3' timeout: sid=268599ea40031d9424336de9201dab1e729d8b62, q=get_peers, qid=c8b5ec93ace754eb0f7461a4fcf38c51969ed91e, laddr=0.0.0.0:9001, raddr=195.135.215.11:20526
no peers for c8b5ec93ace754eb0f7461a4fcf38c51969ed91e
2022/07/30 19:11:41 transaction 'f' timeout: sid=268599ea40031d9424336de9201dab1e729d8b62, q=get_peers, qid=c8b5ec93ace754eb0f7461a4fcf38c51969ed91e, laddr=0.0.0.0:9001, raddr=212.149.204.231:46061
no peers for c8b5ec93ace754eb0f7461a4fcf38c51969ed91e
2022/07/30 19:11:41 transaction 'e' timeout: sid=268599ea40031d9424336de9201dab1e729d8b62, q=get_peers, qid=c8b5ec93ace754eb0f7461a4fcf38c51969ed91e, laddr=0.0.0.0:9001, raddr=150.129.206.85:8039
no peers for c8b5ec93ace754eb0f7461a4fcf38c51969ed91e
2022/07/30 19:11:41 transaction 'c' timeout: sid=268599ea40031d9424336de9201dab1e729d8b62, q=get_peers, qid=c8b5ec93ace754eb0f7461a4fcf38c51969ed91e, laddr=0.0.0.0:9001, raddr=180.191.65.37:41842
no peers for c8b5ec93ace754eb0f7461a4fcf38c51969ed91e
2022/07/30 19:11:41 transaction 'a' timeout: sid=268599ea40031d9424336de9201dab1e729d8b62, q=get_peers, qid=c8b5ec93ace754eb0f7461a4fcf38c51969ed91e, laddr=0.0.0.0:9001, raddr=172.58.221.141:45465

@xgfone
Owner

xgfone commented Jul 30, 2022

  1. Please ensure that your host is reachable on port 9001 from the public network. You could try a cloud host with a public IP, such as AWS, Azure, etc. NOTICE: the firewall must allow access to port 9001.
  2. Please ensure that you can reach these IPs from the host running the DHT server. In my case, I cannot reach 172.58.221.141, 180.191.65.37, 150.129.206.85, 212.149.204.231, etc.
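
One way to check the second point is to send a single KRPC ping (BEP 5) to a remote node and see whether anything comes back. The sketch below is independent of the library; `buildPing` and `probe` are hypothetical helpers, and the node ID is a placeholder. It uses an ephemeral local port, so it only tests the outbound path and says nothing about whether port 9001 is reachable from outside.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// buildPing returns a bencoded KRPC "ping" query in the format given in BEP 5.
// tid is the transaction ID that the remote node echoes back.
func buildPing(nodeID [20]byte, tid string) []byte {
	return []byte(fmt.Sprintf("d1:ad2:id20:%se1:q4:ping1:t%d:%s1:y1:qe", nodeID[:], len(tid), tid))
}

// probe sends one ping to addr over UDP and waits up to timeout for a reply.
func probe(addr string, nodeID [20]byte, timeout time.Duration) error {
	conn, err := net.Dial("udp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()
	if _, err := conn.Write(buildPing(nodeID, "aa")); err != nil {
		return err
	}
	if err := conn.SetReadDeadline(time.Now().Add(timeout)); err != nil {
		return err
	}
	buf := make([]byte, 1500)
	n, err := conn.Read(buf)
	if err != nil {
		return err // a timeout error here means no reply arrived in time
	}
	fmt.Printf("reply from %s: %q\n", addr, buf[:n])
	return nil
}

func main() {
	// Example invocation: go run probe.go 176.210.30.176:23362
	if len(os.Args) < 2 {
		fmt.Println("usage: probe host:port")
		return
	}
	var id [20]byte
	copy(id[:], "abcdefghij0123456789") // placeholder node ID for this sketch
	if err := probe(os.Args[1], id, 5*time.Second); err != nil {
		fmt.Println("no reply:", err)
	}
}
```

A missing reply does not prove the node is unreachable, since DHT nodes come and go, but if none of the addresses from the log ever answer, a network-level block becomes more likely.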

@xgfone
Owner

xgfone commented Jul 30, 2022

2022/07/30 19:11:41 transaction 'a' timeout: sid=268599ea40031d9424336de9201dab1e729d8b62, q=get_peers, qid=c8b5ec93ace754eb0f7461a4fcf38c51969ed91e, laddr=0.0.0.0:9001, raddr=172.58.221.141:45465

It is not a bug. There are two possible reasons for the timeout:

  • You cannot reach these IPs. Your ISP or the peer's ISP may be blocking them.
  • The response may be blocked by your NAT gateway.

@arpitjindal97
Author

arpitjindal97 commented Jul 30, 2022

I'm behind a NAT firewall so my client is not directly reachable from remote peers.

But like most torrent clients, it should try to use the UDP hole punching technique to open the port and make it reachable. Can this technique be implemented?

@xgfone
Owner

xgfone commented Jul 30, 2022

Yes, it could be implemented, but it is outside the scope of BT. I don't know much about the UDP hole punching technique. You may refer to https://github.com/kkdai/ri?

@xgfone
Owner

xgfone commented Jul 30, 2022

To establish whether it is really a bug, change fmt.Printf("no peers for %s\n", infohash) to fmt.Printf("%s: no peers for %s\n", r.Addr.String(), infohash).

If both the timeout error and "no peers" are printed for the same peer IP, it may be a bug.
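
With the modified Printf in place, the check can be automated over captured output. The sketch below is a hypothetical helper, not part of the library. It assumes that `r.Addr.String()` prints as `ip:port` and that timeout lines keep the `raddr=...` field seen in this thread; the "no peers" lines and the second address in `main` are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

var (
	// Remote address in the timeout log lines shown in this thread.
	timeoutRe = regexp.MustCompile(`timeout: .*raddr=(\S+)`)
	// Output of the modified callback: "<addr>: no peers for <infohash>".
	noPeersRe = regexp.MustCompile(`^(\S+): no peers for `)
)

// suspects returns the addresses that appear both in a timeout line and in
// a "no peers" line, i.e. nodes that answered yet were reported as timed out.
func suspects(output string) []string {
	timedOut := map[string]bool{}
	answered := map[string]bool{}
	for _, line := range strings.Split(output, "\n") {
		if m := timeoutRe.FindStringSubmatch(line); m != nil {
			timedOut[m[1]] = true
		} else if m := noPeersRe.FindStringSubmatch(line); m != nil {
			answered[m[1]] = true
		}
	}
	var both []string
	for addr := range timedOut {
		if answered[addr] {
			both = append(both, addr)
		}
	}
	sort.Strings(both)
	return both
}

func main() {
	// Illustrative input: the timeout line comes from this thread, the
	// "no peers" lines assume the modified format and are made up.
	output := strings.Join([]string{
		"172.58.221.141:45465: no peers for c8b5ec93ace754eb0f7461a4fcf38c51969ed91e",
		"2022/07/30 19:11:41 transaction 'a' timeout: sid=268599ea40031d9424336de9201dab1e729d8b62, q=get_peers, qid=c8b5ec93ace754eb0f7461a4fcf38c51969ed91e, laddr=0.0.0.0:9001, raddr=172.58.221.141:45465",
		"203.0.113.7:6881: no peers for c8b5ec93ace754eb0f7461a4fcf38c51969ed91e",
	}, "\n")
	fmt.Println(suspects(output))
}
```

An empty result suggests the timeouts come from nodes that never answered, which points to the network rather than to the transaction handling.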

@arpitjindal97
Author

I found a few threads related to NAT traversal.

I have a hunch that implementing BEP 11 & 55 might solve it.

I looked at various libraries for UDP hole punching and UPnP, but had no luck.

@xgfone
Owner

xgfone commented Jul 31, 2022

BEP 11 is the Peer Exchange (PEX) extension of the peer protocol, not of DHT. (The UDP-based tracker protocol is BEP 15.)

BEP 55 is the holepunch extension, but it applies to the peer protocol over uTP, not to DHT.

The UDP hole punching technique is the same, but the implementations differ, so it may be difficult to provide a unified and consistent implementation.

@xgfone
Owner

xgfone commented Jul 31, 2022

In general, hole punching requires a shared introducer. But how should one communicate with the introducer? I haven't found a standard.

@xgfone
Owner

xgfone commented Jul 31, 2022

In general, therefore, a DHT server is not run behind NAT, although a peer server may be fine there. DHT (BEP 5) supports peers running behind NAT.
