MAKE P2P CONNECTION MORE RESILIENT #43

Open
g00dnatur3 opened this issue Jun 2, 2019 · 7 comments

@g00dnatur3

I have a ZEN pool with p2p enabled.

The first two or three p2p connection attempts always fail:

34|pool_zen | 2019-06-01 21:36:57 [Pool] [horizen] (Thread 1) p2p had a socket error {"errno":"ECONNRESET","code":"ECONNRESET","syscall":"read"}

To fix this issue and make the p2p connection more resilient, I modified pool.js as follows:

    let retryCount = 1000; // reconnect attempts left

    function SetupPeer() {
        if (!options.p2p || !options.p2p.enabled)
            return;

        if (options.testnet && !options.coin.peerMagicTestnet) {
            emitErrorLog('p2p cannot be enabled in testnet without peerMagicTestnet set in coin configuration');
            return;
        }
        else if (!options.coin.peerMagic) {
            emitErrorLog('p2p cannot be enabled without peerMagic set in coin configuration');
            return;
        }

        _this.peer = new peer(options);
        _this.peer.on('connected', function () {
            emitLog('p2p connection successful');
        }).on('connectionRejected', function () {
            emitErrorLog('p2p connection failed - likely incorrect p2p magic value');
            retryCount--;
            if (retryCount > 0) {
                // Retry after 2 seconds until the attempts run out.
                emitLog('p2p trying to connect again soon... retryCount: ' + retryCount);
                setTimeout(SetupPeer, 2000);
            }
        }).on('disconnected', function () {
            emitWarningLog('p2p peer node disconnected - attempting reconnection...');
        }).on('connectionFailed', function (e) {
            emitErrorLog('p2p connection failed - likely incorrect host or port');
        }).on('socketError', function (e) {
            emitErrorLog('p2p had a socket error ' + JSON.stringify(e));
        }).on('error', function (msg) {
            emitWarningLog('p2p had an error ' + msg);
        }).on('blockFound', function (hash) {
            _this.processBlockNotify(hash, 'p2p');
        });
    }

As you can see, I have added a retry.

This implementation is obviously NOT perfect, but it works in practice.

Before, I would have 0 p2p connections: maybe 1 at the beginning, then they would all die.

With the code above I am able to maintain p2p connections.

FYI, my Horizen daemon has TLS enabled.

Please include some form of retry in your code, as I have laid out.

Thanks

Cheers!

@egyptianbman
Member

egyptianbman commented Jun 3, 2019

For my pool, I ended up modifying this to always reconnect -- with no retry limit. I didn't want to make the change here until I dug into it to identify why the nodes were getting disconnected so frequently. My guess is there is some sort of timeout that's being hit somewhere that needs to be addressed.

@g00dnatur3
Author

g00dnatur3 commented Jun 4, 2019

This is NOT a timeout issue, or at least net.Socket doesn't support it.

I put this code:

        client.setTimeout(10000, () => client.destroy());
        client.once('connect', () => client.setTimeout(0));

inside the Connect() function of the peer.js file.

Reference: https://github.com/nodejs/node/issues/5757

I still had the SAME issue:

34|pool_zen  | 2019-06-04 02:35:55 [Switching]	[Setup] (equihash) Setting proxy difficulties after pool start
34|pool_zen  | 2019-06-04 02:35:55 [Pool]	[horizen] (Thread 3) p2p had a socket error {"errno":"ECONNRESET","code":"ECONNRESET","syscall":"read"}
34|pool_zen  | 2019-06-04 02:35:55 [Pool]	[horizen] (Thread 3) p2p connection failed - likely incorrect p2p magic value
34|pool_zen  | 2019-06-04 02:35:56 [Switching]	[Setup] (Thread 4) Loading last proxy state from redis
34|pool_zen  | 2019-06-04 02:35:56 [Pool]	[horizen] (Thread 4) Share processing setup with redis (127.0.0.1:6379)
34|pool_zen  | 2019-06-04 02:35:56 [Pool]	[horizen] (Thread 4) No rewardRecipients have been setup which means no fees will be taken
34|pool_zen  | 2019-06-04 02:35:56 [Pool]	[horizen] (Thread 4) Stratum Pool Server Started for horizen [ZEN] {equihash}
34|pool_zen  | 2019-06-04 02:35:56 [Switching]	[Setup] (equihash) Setting proxy difficulties after pool start
34|pool_zen  | 2019-06-04 02:35:56 [Pool]	[horizen] (Thread 4) p2p connection successful

As you can see, Thread 3 never tries to reconnect again.

If you can find a better solution than a retryCount, I am all ears.

@g00dnatur3
Author

g00dnatur3 commented Jun 4, 2019

Here is the log with the retry count in place. It maintains ALL p2p connections:

34|pool_zen  | 2019-06-04 02:41:40 [Pool]	[horizen] (Thread 2) p2p connection successful
34|pool_zen  | 2019-06-04 02:41:40 [Master]	[PoolSpawner] Spawned 1 pool(s) on 4 thread(s)
34|pool_zen  | 2019-06-04 02:41:40 [Switching]	[Setup] (Thread 3) Loading last proxy state from redis
34|pool_zen  | 2019-06-04 02:41:40 [Pool]	[horizen] (Thread 3) Share processing setup with redis (127.0.0.1:6379)
34|pool_zen  | 2019-06-04 02:41:40 [Pool]	[horizen] (Thread 3) No rewardRecipients have been setup which means no fees will be taken
34|pool_zen  | 2019-06-04 02:41:40 [Pool]	[horizen] (Thread 3) Stratum Pool Server Started for horizen [ZEN] {equihash}
34|pool_zen  | 2019-06-04 02:41:40 [Switching]	[Setup] (equihash) Setting proxy difficulties after pool start
34|pool_zen  | 2019-06-04 02:41:40 [Pool]	[horizen] (Thread 3) p2p had a socket error {"errno":"ECONNRESET","code":"ECONNRESET","syscall":"read"}
34|pool_zen  | 2019-06-04 02:41:40 [Pool]	[horizen] (Thread 3) p2p connection failed - likely incorrect p2p magic value
34|pool_zen  | 2019-06-04 02:41:40 [Pool]	[horizen] (Thread 3) p2p trying to connect again soon... retryCount: 999
34|pool_zen  | 2019-06-04 02:41:40 [Switching]	[Setup] (Thread 4) Loading last proxy state from redis
34|pool_zen  | 2019-06-04 02:41:40 [Pool]	[horizen] (Thread 4) Share processing setup with redis (127.0.0.1:6379)
34|pool_zen  | 2019-06-04 02:41:40 [Pool]	[horizen] (Thread 4) No rewardRecipients have been setup which means no fees will be taken
34|pool_zen  | 2019-06-04 02:41:40 [Pool]	[horizen] (Thread 4) Stratum Pool Server Started for horizen [ZEN] {equihash}
34|pool_zen  | 2019-06-04 02:41:40 [Switching]	[Setup] (equihash) Setting proxy difficulties after pool start
34|pool_zen  | 2019-06-04 02:41:40 [Pool]	[horizen] (Thread 4) p2p connection successful
34|pool_zen  | 2019-06-04 02:41:41 [Pool]	[horizen] (Thread 1) p2p had a socket error {"errno":"ECONNRESET","code":"ECONNRESET","syscall":"read"}
34|pool_zen  | 2019-06-04 02:41:41 [Pool]	[horizen] (Thread 1) p2p connection failed - likely incorrect p2p magic value
34|pool_zen  | 2019-06-04 02:41:41 [Pool]	[horizen] (Thread 1) p2p trying to connect again soon... retryCount: 998
34|pool_zen  | 2019-06-04 02:41:42 [Pool]	[horizen] (Thread 3) p2p connection successful
34|pool_zen  | 2019-06-04 02:41:42 [Pool]	[horizen] (Thread 1) Authorized znoz9Dxs4FRCFpVpEqvrgXq4c1TJ449Zg8d:x [::ffff:73.223.74.177]
34|pool_zen  | 2019-06-04 02:41:43 [Pool]	[horizen] (Thread 2) Authorized znoz9Dxs4FRCFpVpEqvrgXq4c1TJ449Zg8d:x [::ffff:73.223.74.177]
34|pool_zen  | 2019-06-04 02:41:43 [Pool]	[horizen] (Thread 3) Authorized znoz9Dxs4FRCFpVpEqvrgXq4c1TJ449Zg8d:x [::ffff:73.223.74.177]
34|pool_zen  | 2019-06-04 02:41:43 [Pool]	[horizen] (Thread 1) p2p had a socket error {"errno":"ECONNRESET","code":"ECONNRESET","syscall":"read"}
34|pool_zen  | 2019-06-04 02:41:43 [Pool]	[horizen] (Thread 1) p2p connection failed - likely incorrect p2p magic value
34|pool_zen  | 2019-06-04 02:41:43 [Pool]	[horizen] (Thread 1) p2p trying to connect again soon... retryCount: 997
34|pool_zen  | 2019-06-04 02:41:45 [Pool]	[horizen] (Thread 1) p2p connection successful

I'm not making this up; this literally happens. Look at the log.

I know the implementation needs improvement, because it will stop maintaining connections after a long enough time, once retryCount runs out.

The p2p connections come and go. I guess peers go offline.

Also, the initial connections don't complete the handshake the first time around, for some reason.

Again, I am using the latest ZEN release with TLS enabled, as a secure node.

The node tracker is happy, so my node must be set up correctly.

I have successfully mined multiple blocks as well.

@g00dnatur3
Author

g00dnatur3 commented Jun 4, 2019

I think a good middle ground could be to use the backoff package: https://www.npmjs.com/package/backoff

You don't want to burden the system by attempting p2p connections forever; maybe the magic value really is wrong. So the delay between retries can grow, using either the exponential or the linear strategy, until it eventually stops retrying after a long while.

This way you can avoid an extra config option.

If a successful connection occurs, the backoff is reset (re-initialized).
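The growing-delay idea can be sketched without any dependency. The helper below is hypothetical (`createBackoff` and its parameters are not part of pool.js or of the backoff package); it doubles the delay on each failure up to a cap, gives up after a fixed number of attempts, and starts over after a successful connection:

```javascript
// Hypothetical helper, not part of the pool code: tracks the delay between
// reconnect attempts and resets it after a successful connection.
function createBackoff(initialMs, maxMs, maxAttempts) {
    var attempts = 0; // failures since the last successful connection
    return {
        // Returns the delay in ms before the next attempt, or -1 to give up.
        next: function () {
            if (attempts >= maxAttempts) return -1;
            var delay = Math.min(initialMs * Math.pow(2, attempts), maxMs);
            attempts++;
            return delay;
        },
        // Call on a successful connection so later failures start over.
        reset: function () {
            attempts = 0;
        }
    };
}
```

In SetupPeer this would presumably replace the fixed 2000 ms timer: call `next()` in the 'connectionRejected' handler and schedule `setTimeout(SetupPeer, delay)` only when the result is not -1, then call `reset()` in the 'connected' handler.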

OR, KEEP IT SIMPLE:

Change the existing retryCount to 10, and set it back to 10 on every successful connection.

I like this implementation the best:

    let retryCount = 10; // reconnect attempts left

    function SetupPeer() {
        if (!options.p2p || !options.p2p.enabled)
            return;

        if (options.testnet && !options.coin.peerMagicTestnet) {
            emitErrorLog('p2p cannot be enabled in testnet without peerMagicTestnet set in coin configuration');
            return;
        }
        else if (!options.coin.peerMagic) {
            emitErrorLog('p2p cannot be enabled without peerMagic set in coin configuration');
            return;
        }

        _this.peer = new peer(options);
        _this.peer.on('connected', function () {
            retryCount = 10; // reset the budget after a successful connection
            emitLog('p2p connection successful');
        }).on('connectionRejected', function () {
            emitErrorLog('p2p connection failed - likely incorrect p2p magic value');
            retryCount--;
            if (retryCount > 0) {
                // Retry after 2 seconds until the attempts run out.
                emitLog('p2p trying to connect again soon... retryCount: ' + retryCount);
                setTimeout(SetupPeer, 2000);
            }
        }).on('disconnected', function () {
            emitWarningLog('p2p peer node disconnected - attempting reconnection...');
        }).on('connectionFailed', function (e) {
            emitErrorLog('p2p connection failed - likely incorrect host or port');
        }).on('socketError', function (e) {
            emitErrorLog('p2p had a socket error ' + JSON.stringify(e));
        }).on('error', function (msg) {
            emitWarningLog('p2p had an error ' + msg);
        }).on('blockFound', function (hash) {
            _this.processBlockNotify(hash, 'p2p');
        });
    }

you follow?

@egyptianbman
Member

In my experience, some nodes (i.e. genx) don't accept p2p connections from the pool stratum at all. Most other nodes stay connected for a while but eventually disconnect. On my stratums, I just have them keep trying forever.

With that being said, I've been running into some issues with one of my ZER nodes. It tends to get into a p2p reconnect loop that eats up a lot of CPU. I end up having to restart the node and stratum to get everything working again.

I think the best solution is one where if disconnects are excessive, it adds a small delay to reconnecting -- but there is no limit. If you have p2p enabled in the config, it should always reconnect.
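A rough sketch of that policy, with no retry limit but a delay once disconnects become excessive, might look like the following. `createReconnectThrottle`, the time window, and the thresholds are all hypothetical names and values chosen for illustration; none of them exist in the pool code:

```javascript
// Hypothetical helper: no retry limit, but reconnects slow down when
// disconnects pile up inside a short time window.
function createReconnectThrottle(windowMs, maxQuick, slowDelayMs) {
    var recent = []; // timestamps (ms) of recent disconnects
    return {
        // Record a disconnect at time `now` (ms) and return how long to
        // wait (ms) before reconnecting.
        delayFor: function (now) {
            recent.push(now);
            // Drop disconnects that have fallen out of the window.
            while (recent.length > 0 && now - recent[0] > windowMs) {
                recent.shift();
            }
            // More than maxQuick disconnects in the window: slow down.
            return recent.length > maxQuick ? slowDelayMs : 0;
        }
    };
}
```

A disconnect handler could then call `setTimeout(SetupPeer, throttle.delayFor(Date.now()))`, which reconnects immediately in the normal case and only waits during a reconnect loop like the one described for the ZER node.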

@TheComputerGenie

#53 will fix the disconnects that happen once the connection is up and running. PONG is always the proper response to PING; no PONG = disconnect after a time.

@g00dnatur3
Author

g00dnatur3 commented Feb 7, 2021

FYI, I updated my fork and added:

    ...
    var commands = {
        ...
        ping: commandStringBuffer('ping')
    };

    // then down below
    .....
            case commands.ping.toString():
                SendMessage(commandStringBuffer('pong'), Buffer.alloc(0));
                break;

This is based on the PR you provided.

Still the same issue: it does not fix the problem.
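One detail that may be worth checking: the pong above is sent with an empty payload. In Bitcoin's p2p protocol since BIP 31, a ping carries an 8-byte nonce and the pong is expected to echo it back. Whether the Horizen daemon enforces this is an assumption here, not something confirmed in this thread, and it may be unrelated to the failures during the initial connection. A minimal sketch of building the reply payload, using a hypothetical `buildPongPayload` helper:

```javascript
// Sketch only: build a pong payload from a received ping payload.
// Assumes BIP 31 semantics (8-byte nonce echoed back); older peers send
// an empty ping, in which case the pong payload is empty too.
function buildPongPayload(pingPayload) {
    if (!Buffer.isBuffer(pingPayload) || pingPayload.length < 8) {
        return Buffer.alloc(0);
    }
    // Copy so later changes to the receive buffer cannot alter the reply.
    return Buffer.from(pingPayload.subarray(0, 8));
}
```

The ping case would then pass the result as the second argument to SendMessage instead of `Buffer.alloc(0)`, provided the received message payload is available at that point in peer.js (also an assumption).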

3 participants