This repository has been archived by the owner on Jan 6, 2022. It is now read-only.

Replication may be pushing too many feeds into the connection #39

Open
pfrazee opened this issue Jan 4, 2017 · 7 comments

Comments

@pfrazee
Collaborator

pfrazee commented Jan 4, 2017

If you look in hypercore-archiver, the replication code adds all stored feeds to the connection. (Its current usage, in archiver-server, does not set passive to false.)

I'm guessing this means that hypercloud will, at minimum, announce all currently stored archives at the time of connect. That can't scale. Shouldn't the hypercloud sit and wait for requests, passively?
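
For context, what that boils down to is roughly the following (just a sketch, not the actual hypercore-archiver code; storedFeeds is a stand-in for whatever the archiver has on disk):

```js
// Rough sketch of the non-passive behavior described above, NOT the actual
// hypercore-archiver source. Assumes the hypercore 6-era API, where extra
// feeds can join an existing replication stream via the `stream` option.
// `storedFeeds` stands in for every hypercore the archiver has on disk.
function replicateEverything (storedFeeds) {
  if (!storedFeeds.length) return null

  // The first feed creates the connection's replication stream.
  var stream = storedFeeds[0].replicate({ live: true })

  // Every other stored feed is then added to the same connection up front,
  // which is the "announce all archives at connect time" problem.
  for (var i = 1; i < storedFeeds.length; i++) {
    storedFeeds[i].replicate({ stream: stream, live: true })
  }

  return stream
}
```

That loop is the part that can't scale once the number of stored archives gets large.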

@pfrazee
Collaborator Author

pfrazee commented Jan 4, 2017

@maxogden I think this might relate to your remarks earlier about the archiver-server being passive. The current archiver-bot does set passive to true when it replicates. There's definitely a scaling issue there.

But, if passive is true, then the public peer won't ask other public peers for anything. If I'm understanding this correctly, we'll need some kind of middle ground: an algorithm for asking for updates with proper throttling.
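
One possible shape for that middle ground (purely a sketch, not an existing API; the batch size, the delay, and the addFeedToStream helper are all made up here):

```js
// Hypothetical throttle: instead of adding every stored feed to the
// connection at once, actively ask about them in small batches over time.
// BATCH_SIZE, BATCH_DELAY and addFeedToStream are illustrative only.
var BATCH_SIZE = 16
var BATCH_DELAY = 5000 // ms between batches

function throttledSync (feeds, stream, addFeedToStream) {
  var queue = feeds.slice()

  function next () {
    if (!queue.length || stream.destroyed) return
    queue.splice(0, BATCH_SIZE).forEach(function (feed) {
      // e.g. feed.replicate({ stream: stream, live: true })
      addFeedToStream(feed, stream)
    })
    setTimeout(next, BATCH_DELAY)
  }

  next()
}
```

The point is just that the active side bounds how fast it adds feeds to a connection, instead of dumping everything at once.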

@max-mapper

Clarifying question on that code (it's hard for me to follow due to the vague method/variable names): is this the line that 'adds' a feed to a connection? https://github.com/mafintosh/hypercore-archiver/blob/dd34d62253d56604c94d8785e5e39b83816fb30f/index.js#L194 So the issue is that the archiver will call .replicate many times over one connection?

Why is it doing that in the first place? Can't we just only call .replicate() for the hypercore that the connection is asking for?
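
In other words, something like this fully passive version (a sketch; getFeedByDiscoveryKey is a hypothetical lookup into the archiver's storage, and it assumes the hypercore-protocol stream emits a 'feed' event with the remote's discovery key):

```js
// Passive sketch: never announce anything proactively; only open feeds the
// remote side explicitly asks for. getFeedByDiscoveryKey is a hypothetical
// lookup into whatever the archiver has stored. Assumes the hypercore 6-era
// API where a feed can join an existing hypercore-protocol stream.
var protocol = require('hypercore-protocol')

function createPassiveStream (getFeedByDiscoveryKey) {
  var stream = protocol()

  // hypercore-protocol emits 'feed' with a discovery key when the remote
  // opens a channel we have not opened ourselves.
  stream.on('feed', function (discoveryKey) {
    getFeedByDiscoveryKey(discoveryKey, function (err, feed) {
      if (err || !feed) return // we don't have it, ignore
      feed.replicate({ stream: stream, live: true })
    })
  })

  return stream
}
```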

@pfrazee
Collaborator Author

pfrazee commented Jan 4, 2017

Why is it doing that in the first place? Can't we just only call .replicate() for the hypercore that the connection is asking for?

As I understand it, you need to call feed.replicate() for every feed you want to sync.

I believe the issue is that we only have two modes: 1) ask to sync every feed we have stored locally, or 2) don't ask to sync anything and let the peer make the feed.replicate() calls.

The latter is passive-mode. If two passive-mode peers connect, no transfer will occur. That's the problem you remarked on earlier.

However, non-passive-mode will have a scaling problem at some point. You'll ask to sync too many feeds for the connection.

@max-mapper

What if we just used 1 connection per .replicate()?

@pfrazee
Collaborator Author

pfrazee commented Jan 4, 2017

No, that wouldn't solve the problem. Basically the problem is that hyperclouds are interested in too many hypercores. A peer will show up and the hypercloud will ask "you have anything new for 10mm cores?" Too thirsty.

We do want the hypercloud to ask about some of their cores. Just not all of them, every time.
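
One way to make "some of their cores" concrete (again just a sketch; using recency as the selection policy, and the entry shape, are assumptions on my part):

```js
// Hypothetical selection policy: when a connection comes up, only actively
// ask about the N most recently updated cores; everything else stays passive
// and is served only if the remote requests it. The entry shape is made up.
var MAX_ACTIVE = 100

function pickActiveFeeds (entries) {
  // entries: [{ feed: <hypercore>, lastUpdated: <ms timestamp> }, ...]
  return entries
    .slice()
    .sort(function (a, b) { return b.lastUpdated - a.lastUpdated })
    .slice(0, MAX_ACTIVE)
    .map(function (entry) { return entry.feed })
}
```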

@joehand
Collaborator

joehand commented Jan 4, 2017

I'm guessing this means that hypercloud will, at minimum, announce all currently stored archives at the time of connect. That can't scale. Shouldn't the hypercloud sit and wait for requests, passively?

Important to note that announcing is separate from opening the feed. In archiver-server, there is a random timeout to avoid flooding all of those announcements, but it's still likely a problem.

But both are issues: 1) having many feeds open and 2) announcing too many things at once.

pfrazee: jhand: to clarify, there are two places where a flood could happen. The one you linked to is announcing on the discovery network. The other one, which max and I are discussing, is announcing feeds once a connection is established between peers.

Ah!
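
For the discovery-network side of that, the random timeout mentioned above amounts to something like this (a sketch; swarm.join stands in for whatever discovery mechanism archiver-server actually uses):

```js
// Sketch of jittered announcements on the discovery network: spread the
// joins for stored archives over a window instead of announcing them all
// at the same instant. swarm.join stands in for whatever discovery
// mechanism archiver-server actually uses.
var MAX_JITTER = 30 * 1000 // spread announcements over ~30s

function announceWithJitter (swarm, discoveryKeys) {
  discoveryKeys.forEach(function (discoveryKey) {
    setTimeout(function () {
      swarm.join(discoveryKey)
    }, Math.floor(Math.random() * MAX_JITTER))
  })
}
```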

@pfrazee
Collaborator Author

pfrazee commented Jan 4, 2017

(Max and I clarified our points in IRC)

garbados pushed a commit to garbados/hypercloud that referenced this issue Aug 14, 2017