Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reset peer store after stop #617

Closed
wants to merge 2 commits into from
Closed

Conversation

betamos
Copy link
Contributor

@betamos betamos commented Apr 27, 2020

  • Clear up has/get to accept PeerInfo for parity
  • Clear peer store after stop, to avoid lingering peers upon reconnect
  • Add tests

Didrik Nordström added 2 commits April 27, 2020 00:14
Add test coverage of existing and new surface.
Prior, dirty peers from last run interferes with discovery and
new connections.
@jacobheun
Copy link
Contributor

Clear peer store after stop, to avoid lingering peers upon reconnect

What's the reasoning for clearing peers here? We are currently working on a large update to the PeerStore, #582, which is planned to land in the next few weeks in 0.28. Part of this work includes PeerStore persistence, as it can greatly reduce the time to reconnect to new peers. I'd like to understand why you don't want to keep a record of peers on restart.

@betamos
Copy link
Contributor Author

betamos commented Apr 27, 2020

Oh I see, interesting. I use TCP and mdns modules, and had issues with peers not being rediscovered properly upon reconnection. As I understand it, the libp2p peer discovery is using the existing peerStore to filter discovery events.

One particular pain point was that if peer B goes offline and online again, peer A was notified of an old address (in my case - it was the old port number, which fails to connect). When the working address arrives seconds later, it didn't emit a second discovery event necessary to establish a connection. My workaround was to manipulate the peerStore manually, by removing peers that failed connection, so that they could be rediscovered and inserted properly.

I guess that the high level goals that I'm trying to achieve in my application are:

  1. All peers should be persistently connected to one another as soon as they are present.
  2. ...using a single connection (I have a duplex client/server RPC muxer). autoDial establishes two connections so I can't use that. There are also edge cases e.g. what constitutes a full connection – is it valid if only one connection is established?
  3. And they should be eventually be disconnected if they disappear. I currently use a custom ping message with a timer. (Node's Socket.setKeepAlive doesn't work well at all and don't disconnect properly)

There's definitely room for a design that works for both cases -- I don't know exactly how to fit it with your ongoing plans yet. Do you think my goals are general enough to be supported either automatically or with some manual logic around?

@jacobheun
Copy link
Contributor

There's definitely room for a design that works for both cases -- I don't know exactly how to fit it with your ongoing plans yet. Do you think my goals are general enough to be supported either automatically or with some manual logic around?

Yes I do. I think this aligns with the Topology registration system we are creating. It still needs better integration with the connection manager but you can see the interface and general usage at https://github.com/libp2p/js-libp2p-interfaces/tree/master/src/topology.

My workaround was to manipulate the peerStore manually, by removing peers that failed connection, so that they could be rediscovered and inserted properly.

The PeerStore emits a change:multiaddrs event whenever a peers multiaddrs change. So you could listen for this event instead of the peer discovery event. We should probably be checking for changes when we do an update to an existing peer and emit the discovery again if a change did happen. cc @vasco-santos

libp2p.peerStore.on('change:multiaddrs', ({ peerInfo, multiaddrs }) => {
  // ...
})

There are also edge cases e.g. what constitutes a full connection – is it valid if only one connection is established?

Are you hitting edge cases? A single connection should be all that's needed. We try to avoid creating multiple connections to a peer, but we currently don't have any protocol for converging on a single connection if two are simultaneously created, as often happens with mdns.

And they should be eventually be disconnected if they disappear. I currently use a custom ping message with a timer. (Node's Socket.setKeepAlive doesn't work well at all and don't disconnect properly)

I think this could be valuable to add to Connection Manager once the priority of Topology peers is being handled. We could automatically ping priority peers we have connections to in order to maintain them and track disconnects better.

@vasco-santos
Copy link
Member

We should probably be checking for changes when we do an update to an existing peer and emit the discovery again if a change did happen. cc @vasco-santos

I am not sure we should go this way. If the peer was already discovered and in the PeerStore, I would not expect to listen a new peer:discovery event. We could add a peer:changed event though, if users do not care about what changed, just that something is new.

I think this could be valuable to add to Connection Manager once the priority of Topology peers is being handled. We could automatically ping priority peers we have connections to in order to maintain them and track disconnects better.

This would be nice to add. We should create an issue tracking all the things we would like to achieve with the { ConnectionManager, Registrar, Topology }, so that someone can work on this in the future

@betamos
Copy link
Contributor Author

betamos commented May 5, 2020

The PeerStore emits a change:multiaddrs event whenever a peers multiaddrs change. So you could listen for this event instead of the peer discovery event.

I wasn't aware, so I should definitely try that before moving forward.

But nevertheless there's still one piece of the puzzle that isn't solved: I still need to connect to the old peers when the libp2p bundle is restarted.

Either

  1. I manually traverse libp2p.peerStore and attempt connecting to each peer, or
  2. I manually clear the libp2p.peerStore, or
  3. I recreate the libp2p bundle altogether

If it helps, in my case the network interface may have changed in-between, yielding all previous peers irrelevant. So (2) or (3) is going to be more reliable.

We try to avoid creating multiple connections to a peer, but we currently don't have any protocol for converging on a single connection if two are simultaneously created, as often happens with mdns.

Ooh interesting, then you might be interested in this workaround that I've been using:

libp2p.on('peer:discovery', async (other) => {
  if (other.id.toB58String() > me.id.toB58String) {
      return // Other has right of way
  }
  ..
}

This has been very reliable for me and I've tested on all major OS:s. However, I'm using an interval so it'd need to be tested with a less chatty approach.

I think this aligns with the Topology registration system we are creating.

It looks promising, but an example would really help. I'm not completely sure I'm affected by it (which wouldn't necessarily be a bad thing).

@vasco-santos
Copy link
Member

But nevertheless there's still one piece of the puzzle that isn't solved: I still need to connect to the old peers when the libp2p bundle is restarted.

Do you mean when you do libp2p.stop() and then libp2p.start()?

@betamos
Copy link
Contributor Author

betamos commented May 5, 2020

Yes.

@vasco-santos
Copy link
Member

When a node is started, it will try to dial the peers from the peerstore according to the autodial configuration:

https://github.com/libp2p/js-libp2p/blob/master/src/index.js#L425-L428

The same should happen in 0.28: https://github.com/libp2p/js-libp2p/blob/0.28.x/src/index.js#L430-L433

@betamos
Copy link
Contributor Author

betamos commented May 5, 2020

Maybe I should reconsider autoDial then. Do you think we can solve the issue of duplicate connections, e.g. by introducing an asymmetry like in the comment above?

@vasco-santos
Copy link
Member

Maybe I should reconsider autoDial then. Do you think we can solve the issue of duplicate connections, e.g. by introducing an asymmetry like in the comment above?

@betamos sorry totally missed this!
I do not understand what you mean with the asymmetry like suggestion. Can you explain me?

Also, the new PeerStore work is available in 0.28.0-rc.0. Let me know if this is working as you were expecting

@vasco-santos
Copy link
Member

Closing this as we iterated on the Peerstore in [email protected]. @betamos if you still have a problem, please open an issue so that we discuss it there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants