
Permanently drops writing to UDP socket on WLAN reconnect #30

Open
easyvictor opened this issue Aug 27, 2019 · 6 comments

Comments

@easyvictor

I have kplex configured to write to a UDP stream on my wlan interface. My host machine is permanently on and I expect UDP to be published nonstop. However, every one or two days the machine running kplex drops the WiFi carrier and reconnects. After this, kplex seems to be permanently "stuck" and never publishes any more data. I have to manually restart the program at this point. It's as if the socket it has opened enters a bad state and kplex cannot detect the problem and reopen the socket. I apologize for any ignorance on this topic, as I'm not an expert on sockets in Linux.
Also, thanks for this program, overall it has been fantastic for me.

@stripydog
Owner

Thanks for reporting. I've had one other anecdotal report of something similar but this gives me more info on replicating. I'm a bit busy over the next few days but will try and replicate this when I get a chance

@easyvictor
Author

Sounds good. I have linux syslogs from the issue that causes this; in short what I see is a "wlan0 carrier lost" then a quick reconnect. I have kplex logging the data to a file as well as UDP. After the drop, if I look at the output log file, it is empty, instead of being filled with data. And there is nothing being published over the UDP socket. Those are the symptoms. Thanks again.

@easyvictor
Author

easyvictor commented Apr 1, 2020

> Thanks for reporting. I've had one other anecdotal report of something similar but this gives me more info on replicating. I'm a bit busy over the next few days but will try and replicate this when I get a chance

Hey, any chance you are in a place to help look at this? I run kplex for my boat on a Raspberry Pi, and every now and then the RPi's WiFi interface recycles. Specifically, it starts with "wlan0 carrier lost" in the syslog, followed by a reconnect (a web search suggests this is common), and I have not been able to prevent it. It seems like it could be at the router or WiFi hardware level, or simply a bad WiFi signal. Anyway, when this happens, I lose network broadcast of my data. Here are the more specific symptoms:

  1. When logged into the RPi, if I look at the network traffic on the UDP port (I use "netcat -lku 2000"), it is still being published.
  2. From any other computer on the network, if I look at the UDP port, I see nothing, either from any app or using "netcat -lku 2000". When everything is working normally, the data shows up there.
  3. I cannot fix the problem by simply restarting kplex. To reliably solve the issue, I need to stop kplex, restart the wifi interface on the RPi, then start kplex again:
    echo "Stopping seaplex..."; ./stopSeaplex.sh; echo "Restarting wifi..."; ifconfig wlan0 down && ifconfig wlan0 up; sleep 15; echo "Starting Seaplex..."; ./startSeaplex.sh
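
A minimal sketch of the sequence above as a reusable shell function, so it can be called from a cron job or other watchdog. The variable names (IFACE, SETTLE_SECS, STOP_CMD, START_CMD) and the function name are illustrative assumptions; the defaults simply mirror the scripts and the 15-second wait from the one-liner, and the interface restart needs root.

```shell
#!/bin/sh
# Sketch only: wraps the manual recovery steps in one function.
# All names below are assumptions; the defaults mirror the one-liner above.
IFACE="${IFACE:-wlan0}"                       # wireless interface to bounce
SETTLE_SECS="${SETTLE_SECS:-15}"              # wait for the link to come back
STOP_CMD="${STOP_CMD:-./stopSeaplex.sh}"      # stops kplex
START_CMD="${START_CMD:-./startSeaplex.sh}"   # starts kplex

recover_udp_output() {
    echo "Stopping kplex..."
    $STOP_CMD || return 1
    echo "Restarting $IFACE..."
    # Abort without starting kplex if the interface cannot be bounced.
    ifconfig "$IFACE" down && ifconfig "$IFACE" up || return 1
    sleep "$SETTLE_SECS"
    echo "Starting kplex..."
    $START_CMD
}
```

Calling recover_udp_output runs the steps in order and stops early if the stop script or the interface restart fails.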

This is somewhat concerning to me, as I'm traveling and living on my boat, and I consume this data for watching key instruments like depth, wind, and location (for anchor alarms) on my devices. It's concerning to wake up and see that all my monitoring is gone because the wifi connection recycled and I have to manually restart everything. I'm just trying to get to a state of reliable operation where I know the data will continue to be broadcast if my wlan0 recycles. Any help appreciated.

@kuttkutt

kuttkutt commented Jun 19, 2020

Hello,

let me jump into this topic.
I have also noticed a lot of drops. The NMEA data is piped into kplex at a constant rate, but on the receiving side the data flow occasionally stops for up to 30 secs.

Shame on me, I still don't use a recent version of kplex, since an apt upgrade always brings a lot of problems. So I won't upgrade if I know that it won't resolve this problem.

At the moment I use a udp broadcast option to pipe to the IP of my wlan0 device. The docs say:

Broadcast Interfaces

Broadcast interfaces are now deprecated and will be removed from a future version of kplex. Use udp interfaces instead if possible.

This method involves nmea sentences encapsulated within UDP datagrams sent to a broadcast address.

The question now is: does this change resolve the dropping issue when broadcasting?
(I need the broadcast option, since we use multiple phones, tablets and laptops - all assigned via dhcp)

My lines now are:

# ---- OUTPORTS TO SLAVES ---- 
[udp]
  address=10.10.11.255
  type=broadcast   
  port=10110
  direction=out

I got a similar one for eth0 (subnet 10.10.10.0)

Can you tell me what this should look like using the non-deprecated functions?

@stripydog
Owner

@kuttkutt , assuming your subnet mask is 255.255.255.0 then your config should work, although "port=10110" is the default (so is not needed in the config file) and kplex should be able to work out that 10.10.11.255 is a broadcast address IF that is the wireless LAN's broadcast address (so "type=" should be unnecessary). A simple config might just be:
[udp]
device=wlan0
direction=out
type=broadcast

...but what you have should work.

...however, I don't think this will fix your data transmission problem: there's no reason why kplex should stop broadcasting for 30 seconds. Per easyvictor's issue, a change to dhcp address will stop a broadcast interface. My thought here is packet loss due to some other interference. I've seen awful packet loss (~90%) using an old Raspberry Pi and a cheap USB dongle. Broadcast/multicast over 802.11 doesn't have the datalink layer reliability that unicast 802.11 has, so it can be really lossy. If you want to test it out, try doing a tcpdump on the wlan interface, e.g.:
tcpdump -i wlan0 udp port 10110

Do the same on another computer on your wireless network. If you see data happily being broadcast on the pi but not being received by another computer on the network, this is not a kplex issue
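
To put a rough number on the comparison, the two captures can be saved to files and counted. This is a sketch under assumptions: the file names and helper functions are hypothetical, both captures are taken over the same time window, and tcpdump is run at its default verbosity so that it writes one line per packet to stdout.

```shell
#!/bin/sh
# Sketch: estimate broadcast loss from two tcpdump captures.
# Capture on each machine for the same period (needs root), e.g.:
#   timeout 60 tcpdump -l -n -i wlan0 udp port 10110 > capture.txt
# File names and function names here are illustrative assumptions.

count_packets() {
    # One line per packet, so the line count is the packet count.
    wc -l < "$1" | tr -d ' '
}

loss_percent() {
    # $1 = packets seen on the sender, $2 = packets seen on the receiver.
    sent=$1
    received=$2
    if [ "$sent" -le 0 ] || [ "$received" -ge "$sent" ]; then
        echo 0
    else
        # Integer percentage, rounded down.
        echo $(( (sent - received) * 100 / sent ))
    fi
}
```

With the sender's capture copied over as sender.txt and the local one saved as receiver.txt, `loss_percent "$(count_packets sender.txt)" "$(count_packets receiver.txt)"` prints the approximate percentage of datagrams that never arrived.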

@stripydog
Owner

@easyvictor . So sorry for dropping the ball on this issue. Now revisiting kplex issues and profusely apologising for not having responded for a year. I did look at this last year and the issue others were reporting seemed associated with dhcp changing addresses. If that happens a udp interface will fail and there's no mechanism within kplex to restart it. The proposed solution was an external one: using a dhclient exit hook to restart kplex on a change of IP address.
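
As an illustration of that external approach, here is a minimal sketch of a dhclient exit hook. It assumes a Debian-style layout where hook files under /etc/dhcp/dhclient-exit-hooks.d/ are sourced by dhclient-script with the variables reason, interface, old_ip_address and new_ip_address set. The file name, the KPLEX_* variables and the restart command are assumptions; adjust them to however kplex is started on your system.

```shell
# Hypothetical file: /etc/dhcp/dhclient-exit-hooks.d/restart-kplex
# Sourced by dhclient-script, so it must not call exit.
KPLEX_IFACE="${KPLEX_IFACE:-wlan0}"                                # assumption
KPLEX_RESTART_CMD="${KPLEX_RESTART_CMD:-systemctl restart kplex}"  # assumption

restart_kplex_on_address_change() {
    # Ignore DHCP events for other interfaces.
    [ "$interface" = "$KPLEX_IFACE" ] || return 0
    case "$reason" in
        BOUND|RENEW|REBIND|REBOOT)
            # Restart only when the lease carries a different address.
            if [ "$new_ip_address" != "$old_ip_address" ]; then
                $KPLEX_RESTART_CMD
            fi
            ;;
    esac
}

restart_kplex_on_address_change
```

Renewals that keep the same address leave kplex alone; only an actual address change triggers the restart.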

However, it sounds like that is not your problem if you need to down-up the interface. Is this purely a problem with kplex? If you run ping -b from the box running kplex and run tcpdump looking for icmp traffic on another machine on the network, can you see it?
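
A sketch of how that check might be run, assuming the iproute2 `ip` tool is available and the wireless interface is wlan0. The helper name is hypothetical. Note that many Linux hosts ignore broadcast echo requests by default, so missing replies on the sender are not conclusive; what matters is whether the requests show up in tcpdump on the other machine.

```shell
#!/bin/sh
# Sketch: find the interface's broadcast address, then test with ping -b.
# extract_broadcast and IFACE are illustrative assumptions.
IFACE="${IFACE:-wlan0}"

extract_broadcast() {
    # Reads `ip -o -4 addr show` output on stdin and prints the
    # address that follows the "brd" keyword on the first matching line.
    awk '{ for (i = 1; i < NF; i++) if ($i == "brd") { print $(i + 1); exit } }'
}

# On the machine running kplex:
#   bcast=$(ip -o -4 addr show dev "$IFACE" | extract_broadcast)
#   ping -b -c 5 "$bcast"
# On another machine on the same network (needs root):
#   tcpdump -n -i wlan0 icmp
```

If the echo requests appear on the other machine while kplex's datagrams do not, the network path is passing broadcasts and the problem is more likely on the kplex side.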
