-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not give up on HOST_CANDIDATE / SERVER_REFLEXIVE_CANDIDATE #53
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this was some attempt at assuming the relay connection was the most reliable and other types would block it.
Good catch!
There is a bug on Windows / some AV causing sockets to be randomly closed when you open to many, which is more likely to occur with multiple network interfaces. To reduce the number of sockets created, the ice adapter will disable all host candidates after the first try and all reflective candidates after the second try. The only guaranteed method that should work for everybody is the relay connection, which is why that is kept enabled. Do not merge this unless you want 5% of your player base to be unable to connect. |
Well there's currently a big amount of playerbase who cannot (re)connect right now - even on relay connections - and this was identified as one of the causes and we are looking desperate for causes. So I get the problem and wonder if there is alternative approaches not to spam to many sockets.
So alternative ideas:
|
Do you still have a link to some kind of documentation of this bug so that we can look at it and see if there is a better mitigation today. |
Hey @Geosearchef! Good to see you around :) Do you happen to have any idea why (during reconnects) there doesn't seem to be any Relay candidates offered? Currently indeed it seems to reject srflx (and hosts?) candidate after 1 failed attempt, but from some relatively simple but reproducible reconnection testing, we find that it never comes with any RELAY candidates. Thus we end up with an infinite sequence of empty candidate lists after the first attempt. During initial lobby connection it does have RELAY candidates I call it Failure mode B. So currently I had the following questions we'd ideally understand to fix reconnecting: |
At the end of ice development, we had issues on a few windows systems that we couldn't explain. The system was randomly closing ice's sockets. This occurred on around 3-10% of windows systems. I somehow managed to reproduce it on my win7 machine. Investigating the issue with a debugger lead me through the Java stdlib down to its native code. Going through Openjdk code, and watching it using a debugger, the socket close wasn't issued in any part of the JVM but instead occurred after the WinAPI syscall in Windows Code. Temporarily, I could fix the issue by enabling! a VPN connection. Other people have sometimes seen better results by disabling unused network adapters. By disabling host and srflx, I got a connection in almost all of those instances, which is why this workaround exists. What is the current issue? To me it sounds like the problem is with the relay, not the connectivity establishment. @Brutus5000 One more idea: Instead of completely disabling the fallback mechanism if host succeeded once, how about you do both, first all, if that fails, relay only, if that fails, host/srflx only, if that fails, back to the start. (@Ravandel95 I will take a look later.) |
Seems like it gives up HOST_CANDIDATE after one failed connection attempt and then don't try that again until 2 minutes have passed
(same with SERVER_REFLEXIVE_CANDIDATE after 2 attempts)
I wonder why that logic is needed, shouldn't ice4j have the chance to try all methods at the time of reconnection, regardless of earlier attempts? For instance when someone temporarily lost their internet access but got it back.