Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESOCKETTIMEDOUT and ETIMEDOUT #15

Closed
deanlyoung opened this issue Nov 10, 2016 · 46 comments
Closed

ESOCKETTIMEDOUT and ETIMEDOUT #15

deanlyoung opened this issue Nov 10, 2016 · 46 comments

Comments

@deanlyoung
Copy link

Not really sure what to make of these, but they appear in the error log when I attempt to change the lights. Using the iOS 10 Home app, I get a "Failed" state on Philips Hue bulbs when attempting to tap a scene. From what I can tell, it looks like this usually occurs for bulbs that are trying to change more than one parameter (e.g. hue, saturation, brightness) within on scene.

[11/10/2016, 11:31:59 AM] [Hue] Philips hue: hue bridge communication error Error: ESOCKETTIMEDOUT
[11/10/2016, 11:33:59 AM] [Hue] Philips hue: hue bridge communication error Error: ESOCKETTIMEDOUT
[11/10/2016, 11:58:02 AM] [Hue] Philips hue: hue bridge communication error Error: ESOCKETTIMEDOUT
[11/10/2016, 12:05:58 PM] [Hue] Philips hue: hue bridge communication error Error: ESOCKETTIMEDOUT
[11/10/2016, 12:10:58 PM] [Hue] Philips hue: hue bridge communication error Error: ESOCKETTIMEDOUT
[11/10/2016, 12:20:14 PM] [Hue] Philips hue: hue bridge communication error Error: ETIMEDOUT

I've also noticed in the log file that bulbs appear to be changed from "0" to "false" (or vice versa), and "1" to "true". Seems like the type of a variable isn't set correctly, but I could be wrong...

[11/10/2016, 10:28:48 AM] [Hue] Bedroom - Tall: set homekit power from false to 0
[11/10/2016, 10:28:48 AM] [Hue] Bedroom - Medium: set homekit power from false to 0
[11/10/2016, 10:28:48 AM] [Hue] Bedroom - Medium: set homekit color temperature from 2000K to 3984K
[11/10/2016, 10:28:48 AM] [Hue] Bedroom - Short: set homekit power from false to 0
[11/10/2016, 10:28:48 AM] [Hue] Bedroom - Short: set homekit color temperature from 2000K to 3984K

I am running Homebridge on a Raspberry Pi 2 B+ and I have the 1st Generation Hue Hub. This is my config file:

{
	"bridge": {
		"name": "Homebridge",
		"username": "<<REDACTED>>",
		"port": <<REDACTED>>,
		"pin": "<<REDACTED>>"
	},

	"description": "Description - not very helpful...",

	"platforms": [
		{
			"platform": "Hue",
			"name": "Hue",
			"host": "<<REDACTED>>",
			"users": {
				"<<REDACTED>>": "<<REDACTED>>"
			},
			"heartrate": 5,
			"timeout": 5,
			"lights": true,
			"alllights": true,
			"groups": false,
			"sensors": true,
			"schedules": false,
			"rules": false
		}
	],
	
	"accessories": [
		{
			"accessory": "WUNDERGROUND",
			"name": "Weather Conditions",
			"device": "Raspberry Pi",
			"key": "<<REDACTED>>",
			"city": "<<REDACTED>>"
		},
		{
			"accessory": "people",
			"name": "People",
			"people": [
				{"name": "<<REDACTED>>", "target": "<<REDACTED>>"},
				{"name": "<<REDACTED>>", "target": "<<REDACTED>>"}
			],
			"threshold": 15
		},
		{
			"accessory": "HttpTemphum",
			"name": "Living Room Conditions",
			"url": "<<REDACTED>>",
			"sendimmediately": "",
			"http_method": "GET"
		}
	]
}
@ebaauw
Copy link
Owner

ebaauw commented Nov 12, 2016

Hi @deanlyoung,

The first series of errors are network errors, timeouts homebridge-hue gets when trying to communicate with the hue bridge. Typically these network timeouts happen when the network connection to the hue bridge is poor, or when the bridge is too busy. If you run homebridge -D, homebridge-hue logs the requests it sends to the hue bridge, so you have an idea when the timeouts occur. You could try increasing the timeout parameter in config.json. Is your Raspberry Pi connected through a network cable or trough WiFi?

As to your second question: homekit defines the power or On characteristic as a Boolean, but uses values 0 and 1 instead of false and true. To wit, if you switch off power from the iOS 10 Home app, you see a log message "homekit power changed from 1 to 0". However some Homekit applications use false and true instead of 0 and 1, and instead you would see "homekit power changed from 1 to false" in the log. Long story short: don't worry about this. I'll see if I can suppress the messages in a future release.

Regards,
Erik.

@deanlyoung
Copy link
Author

Thanks for the reply, @ebaauw! I'll look into the logs some more and try out your timeout suggestion. I have my RPi connected via Ethernet.

@deanlyoung
Copy link
Author

deanlyoung commented Nov 13, 2016

Running homebridge -D gets me this error after initializing everything. I've seen this come up before, but that unhandled error is a tricky one to track down.


events.js:154
      throw er; // Unhandled 'error' event
      ^

Error: listen EADDRINUSE :::51826
    at Object.exports._errnoException (util.js:856:11)
    at exports._exceptionWithHostPort (util.js:879:20)
    at Server._listen2 (net.js:1231:14)
    at listen (net.js:1267:10)
    at Server.listen (net.js:1363:5)
    at EventedHTTPServer.listen (/usr/local/lib/node_modules/homebridge/node_modules/hap-nodejs/lib/util/eventedhttp.js:58:19)
    at HAPServer.listen (/usr/local/lib/node_modules/homebridge/node_modules/hap-nodejs/lib/HAPServer.js:148:20)
    at Bridge.Accessory.publish (/usr/local/lib/node_modules/homebridge/node_modules/hap-nodejs/lib/Accessory.js:468:16)
    at Server._publish (/usr/local/lib/node_modules/homebridge/lib/server.js:92:16)
    at Server.<anonymous> (/usr/local/lib/node_modules/homebridge/lib/server.js:351:14)

On mobile, so I couldn't figure out how to copy the logs from my terminal app, but here are some screenshots of the errors when attempting to update xy bri etc.

I could be wrong, but it looks like updating certain aspects of the lights aren't being executed in the correct order. So for example, the current state (on/off) should be checked, if "off" → "on", and then change other values, like brightness/hue/saturation.

img_0412
img_0413

@ebaauw
Copy link
Owner

ebaauw commented Nov 13, 2016

Hi Dean, @deanlyoung

Hmm, I think that first error is thrown by homebridge itself, not by homebridge-hue. I think 51826 is the port you masked from your config.json. On startup, homebridge creates at web server at that port, for your iOS devices to connect and interact with the homebridge bridge accessory. EADDRINUSE means homebridge cannot create this web server, because the port is already used by another program. Are you running multiple instances of homebridge?

I'm a bit puzzled by the logs from the first screenshot. It looks like you switched light 3 on from homekit (that message should have appeared just before these), and homebridge-hue sends a put of /lights/3/state to the hue bridge to mirror that change. However that doesn't seem to reach the bridge, due to a connection reset error. Then the bridge justly complains that it cannot change the brightness nor colour, as the light is still off. Now, there's more going on, the communication error occurs twice, so another command to the bridge has failed as well. That's the joy of asynchronous I/O: the error message doesn't necessarily apply to the latest communication attempt. I really need more of the logs. Could you start homebridge -D > logfile.log 2>&1 to capture the logs? You can still monitor the logs by tail -f logfile.log from another terminal session.
The real question, of course, is what causes the ECONNRESET? It could be a network error, an error on the Pi, or an error in the hue bridge giving up because it's flooded with requests. The first seems unlikely since you use Ethernet. I'm unfamiliar with the Pi (as I have a mac server running), but in theory you run into a limit on number of open network sockets. Are you running many other programmes on the Pi, under the same username. What load do the other homebridge plug-ins generate? For the third: are you running any other apps or programs communicating with the hue bridge? Do you have many rules on the bridge? I really need the full logfile to assertain whether homebridge-hue is flooding the bridge.

The logs from the second screenshot look fine to me. Homebridge-hue does a get /lights to receive the updated light states from the hue bridge and then sets the homekit characteristics to match any changes.

@deanlyoung
Copy link
Author

Ah, I think running homebridge -D while my /etc/init.d/homebridge instance is running is clogging port 51826. However, that's not the root issue here because even when only one instance of homebridge is running I still get errors changing the lights. Specifically, changing multiple characteristics at once using a scene command in the Home app.

It is possible that the other plugins are causing too much of a load on the connection... They seem to update pretty frequently, like every 20-30 seconds. I'm looking at [Living Room Conditions] as an example. That value shouldn't be changing as frequently though. I have a cron job running that generates that data from sensors every 5 minutes, so I can look into reducing the load from the HttpTemphum accessory.

In terms of rules on the bridge, I think there is only one, and it affects only the bathroom because the motion sensor is in there. The rest of my automation tasks are done through the Home app. How do I view what has been saved on the bridge itself? (This is the Hue Bridge, no Homebridge, correct?) I should also probably point out that I only have a Gen 1 Hue Bridge. Maybe there is a different behavior because of that?

The only issue I see in the second screenshot above is that of the three lights, "Tall" does not show 3 commands, while the other two do. It's missing brightness, which threw an error in screenshot one. To show you an example of outcome, I've attached a few photos below, executed via "scenes" in the Home app.

Some example taps in the Home app: (not necessarily a representation of the logfile)

  • start at Off
  • tap "Bedtime" (turn lights to 1% brightness, change color to an off-white), receive error
  • tap "Bedtime" again, "Bedtime" button becomes highlighted in app
  • tap "Good morning" (turn lights to 30% brightness, same color so no change there technically), receive error, only 1-2 lights have turned brightness up)
    tap "Good morning" again, all lights turned up now, but "Good morning" button does not become highlighted in the app
  • tap "It's getting late" (turn lights to 1% brightness, change color to red), receive error, only 1 light turns red, brightness levels vary
  • tap "It's getting late" again, throws error, brightness is correct across all lights, but 1-2 remain off-white
  • tap "It's getting late" again, all lights are correct, button highlights in the app

From the above example, pretty consistently, the most errors get thrown (requires the most taps/retries) when the color is changing as opposed to only changing hue or saturation. Given that I could try to bump up the timeout from 10 to 15 (was previously at the default 5). The Home app reports back an error ("Failed!") well before the 10 seconds are up though.

Let me know if there is any more information I can provide. I'm sure there could always be something I'm overlooking...

Turn on to full brightness
Turn on to full brightness

Turn down lights and make them red, error
Turn down lights and make them red, error

Tap the same scene one more time, still throwing an error
Tap the same scene one more time, still throwing an error

logfile.txt

@ebaauw
Copy link
Owner

ebaauw commented Nov 14, 2016

Hi Dean, @deanlyoung

Nice lamps!

The logfile is very insightful, thanks. When activating a homekit scene, the homekit app sends an event to homebridge-hue for each characteristic in the scene. That's three characteristics for each light (brightness, hue, and saturation, even though the app might show them as one colour). On each event, homebridge-hue checks whether the characteristic value has changed and if so, issues a put request to the hue bridge. By design, homebridge-hue sends these requests asynchronously, that is, it doesn't wait for the response from the hue bridge, but continues handling the next event. Consequently, the hue bridge receives the next request(s) when it's still processing the previous one(s).

I've done some testing on my old gen1 bridge, and I consistently get ECONNRESET when the homekit scene contains 8 characteristics. With 7 characteristics, everything works fine. With three lights times three characteristics, the math doesn't really work in our favour.
I tried the same on my gen2 bridge, with up to 24 characteristics per scene: no problem. I read the gen1 bridge was less powerful than the gen2 bridge, but I didn't expect such a huge difference.

So what could we do about this (short from upgrading to a gen2 bridge)?

  1. I double-checked the source for the other plugin, homebridge-philipshue. As far as I can tell, it actually works the same as homebridge-hue: sending a separate put request to the bridge for each characteristic. It does however check specifically for ECONNRESET and, instead of logging the error, it silently (!) resends the put request after 300ms. As a result, the whole scene would eventually be activated, but the lights wouldn't change at the same time (I also noticed that when testing). I could implement something similar in homebridge-hue (but I would probably still log the ECONNRESET).
  2. Alternatively, I could make homebridge-hue throttle the requests to the hue bridge, making sure there's no more than seven the outstanding requests to a gen1 bridge, preventing the ECONNRESET in the first place. Again this would cause the lights to react at different times.
  3. As mentioned in issue Wiping out Home app settings #16, I could make homebridge-hue collect all the changes to a single light and combine them in one request to the hue bridge. That would help your case with only three lights in a homekit scene, but the problem would still occur for scenes with more than seven lights.
  4. You could create a group on the hue bridge, containing all lights of the homekit scene. Change homebridge-hue's config.json to expose groups and put that group in the homekit scene instead of the individual lights. This is actually the reason why homebridge-hue supports groups in the first place: to limit the interaction with the hue bridge. Unfortunately, neither the old (v1) nor new (v2) official Philips Hue app supports creating groups. You would have to use another app, or do that manually using the bridge's built-in API debugger, see http://www.developers.meethue.com/documentation/getting-started
  5. You could use a scene on the hue bridge, instead of a homekit scene. To activate that scene from homekit, you would create a CLIPGenericFlag sensor on the hue bridge, and a rule that activates the scene when the sensor's flag changes. Homebridge-hue exposes the CLIPGenericFlag as a Switch to homekit. This is the setup I use at home, actually, using a flag per room. I'm setting the same flag from the room's Hue Dimmer switch and/or Hue Motion Sensor. This is quite elaborate configuration on the hue bridge. I'm actually using a series of shell scripts to maintain the groups, sensors, and rules (posting those on GitHub might very well be my next project).

Please let me know what you think. 4 is probably easiest and quickest to implement, if you're comfortable typing json by hand. Implementing 1 or 2 would be relatively easy for me, but I'd rather focus on the dynamic accessory model (and issue #16). I do think I should implement 2 eventually anyway, probably making the number of parallel requests a config parameter.

ebaauw pushed a commit that referenced this issue Nov 14, 2016
Resend bridge request after 300ms on ECONNRESET
@ebaauw
Copy link
Owner

ebaauw commented Nov 14, 2016

I stand corrected: 1 was easiest to implement, just published version 0.0.12. @deanlyoung could you see if that version solves this issue for you?

@ebaauw
Copy link
Owner

ebaauw commented Nov 14, 2016

One more thing, @deanlyoung , I wouldn't worry about the other plugins. Once every 20-30 seconds is peanuts compared to the load to the hue bridge.

For something completely different: how about creating a CLIPTemperature and a CLIPHumidity sensor on the hue bridge? Homebridge-hue exposes these CLIP sensors to homekit. These could then be updated periodically from Wunderground from a cron job. This job would be a simple script doing an http get on wunderground.com, extracting the temperature and humidity values from the response, and two http puts to send these values the hue bridge. No need for an additional homebridge plug-in.

@deanlyoung
Copy link
Author

deanlyoung commented Nov 14, 2016

@ebaauw it's definitely more stable, but that "It's getting late" scene is being really stubborn. I forgot to mention that that scene does in fact control more than just the 3 hue color lights. It also turns every other light off connected to the bridge, so 5 others, in addition to controlling the 3 color ones, 8 total. However, when I remove control of all other lights (except turning off an additional bedroom light) it still performs the same way. It just doesn't show the ECONNRESET now in the logs, which is a step in the right direction!

Another upside is that the other scenes seem to work properly now and show up highlighted in the Home app after tapping them. Yay! (whereas, the "It's getting late" scene highlights, then un-highlights when it realizes the scene hasn't actually finished)

Unfortunately, I can't seem to get homebridge -D > logfile.log 2>&1 to restart homebridge after stopping it at this time (so as not to clog the port again). Not sure what's going on, but another thing for me to look into... I was able to copy this from Terminal though:

I've also noticed, maybe anecdotally, that only the "Tall" light is usually the one that doesn't turn red, but tapping the scene again eventually gets it to change, and yet still no recognition by the Home app that it's in fact completing the scene now.

Anyway, that's all to say that I think I can live with it the way it is now, versus the extra effort/debugging/development it would take to track down the issue more. Happy to keep trying things, but I don't want to waste too much of your time :)

THANK YOU!

As for the other plugins, I was mistaken. 20-30 seconds was actually how often I was going back into the Home app while testing, and the HttpTemphum plugin only runs when you open the app (I noticed from watching the logs). And as to converting them to CLIP sensors... well, they've actually sensors hard-wired to my RPi, so I have to run a Python script to pull the data and expose the data to the web (in my case). I'll read up on CLIP more, but these are sensors attached via GPIO. What you suggested might also be beyond my understanding at this point, and I could just be missing your point.

Edit: @ebaauw sorry, I missed a lot of your earlier comment above the most recent 2! which I only picked up on after seeing your comment on #16...

That's a huge insight about the difference between Gen1 and Gen2. Totally makes it seem like there is more value in upgrading (I thought it was mostly a hardware encryption limitation to be HomeKit Siri compatible, but I suppose an added benefit is more processing capability).

I'll explore the json route from 4 above. I'm pretty comfortable with json, but if I get hung up, I have some other apps that might work.

@ebaauw
Copy link
Owner

ebaauw commented Nov 14, 2016

Hi Dean, @deanlyoung

I find homekit sometimes reverts changes I make to scenes or rules when syncing with my other devices over iCloud. I'd double check that the changes to the getting late scene have been propagated to your other devices.

I now log a debug message on ECONNRESET, so you would only see it with homebridge -D. I tested with 16 characteristics in a scene. Some updates actually cause an ECONNRESET on the first retry as well, but eventually work on the second retry.

The CLIP sensors on the hue bridge aren't real sensors, they're more like variables where you can store sensor output. So instead of exposing the data to the web, your Python script would do an http put to the hue bridge to update the CLIP sensor value.

@deanlyoung
Copy link
Author

deanlyoung commented Nov 14, 2016

Ah, gotcha! I'm currently reading through the Hue API documentation. Really good stuff!

edit: @ebaauw Wow! I'm really liking the Hue API. Thanks for the tip on updating sensors locally!

@fedhat
Copy link

fedhat commented Nov 15, 2016

Hi @ebaauw !

Thank you for all the work you've done on this hue homebridge plugin.

I too ran into the ECONNRESET issue and worked on a solution locally. I'm running a Hue 1st Gen bridge and noticed that error appeared when I loaded a scene that required many API calls (turn on, set brightness, set color, etc. on 5-6 lights). When I googled the error, I found information about a limitation in 1st Gen bridges:

http://www.developers.meethue.com/faq-page

How many commands you can send per second?

You can send commands to the lights too fast. If you stay roughly around 10 commands per second to the /lights resource as maximum you should be fine. For /groups commands you should keep to a maximum of 1 per second.

My solution was to write a request cache loop that makes a maximum N requests per T milliseconds, repeating until the cache is empty. When I set it to 10 requests per 1000 milliseconds, I still got errors, though fewer of them. Through trial and error I found that 5 requests per second works reliably on my setup. The only time I still have failures is when a change request is made on a light that is in the off state (ex: requesting set brightness to 30% when light is off). The Hue bridge complains about an invalid API request in those situations.

I didn't realize that you were also working on the problem (though was happy to see an open source project that was so active). The changes I made were to version 0.0.11 and was planning to incorporate them in version 0.0.12 with the 300ms request retry. Would you be interested in reviewing my changes? I could fork your project and submit a PR.

@deanlyoung
Copy link
Author

btw, @ebaauw (and @fedhat) it seems that by hardcoding a group for my three color lights via Hue's API, I was able to fully solve this issue. The transitions are much faster and I haven't seen an issue yet. @ebaauw, I'm not sure if you want to leave this issue open in light of that insight you gave me, but maybe there are some refinements to the module you want to take note of. We discussed a lot... :) Thanks again!

@ebaauw
Copy link
Owner

ebaauw commented Nov 15, 2016

Hi @fedhat,

Yes, I know that FAQ. It refers to the limitations of the Zigbee network, the Hue bridge sending commands to the lights; the ECONNRESET is a limitation on the IP side, homebridge-hue sending requests to the Hue bridge. On that side, a PUT /group is just as resource intensive as a PUT /light; on the Zigbee network, a group command is a broadcast, whereas a single light command is a unicast, hence the difference in number of requests between lights and groups. Also note that the bridge caches the light states, to a GET /lights wouldn't even result in any Zigbee traffic.

The IP problem is not so much the number of requests per second, but the number of parallel requests. As homebridge-hue sends requests to the Hue bridge asynchronously, the bridge receives the next request before it has responded to the previous. The 1st Gen bridge cannot seem to handle more that 6 open requests. Consequently, we need to limit the number of open requests to the bridge (option 2 in my earlier comment), rather than the rate of these requests. As all requests go through HueBridge.request() the implementation would be relatively straightforward. Keep a counter on the number of open requests. Queue a new request when too many are open and check for queued requests when a request is resolved or rejected. Probably combined with a config parameter for the threshold how many requests might be open.

@ebaauw
Copy link
Owner

ebaauw commented Nov 15, 2016

That's good to hear, @deanlyoung. I'll keep the issue open, until I have implemented the limit on number of open requests.

@fedhat
Copy link

fedhat commented Nov 16, 2016

Thank you, @ebaauw. I didn't realize that the FAQ referred to the limitation of the Zigbee network, nor that the actual limitation of the 1st Gen bridge is the number of open requests.

For my own edification, I'll update the request cache system I added to account for open requests, rather than requests per second. My modifications are built around HueBridge.request(), and as you said, it's pretty straightforward.

@deanlyoung thank you as well. I'll keep that in mind.

@fedhat
Copy link

fedhat commented Nov 16, 2016

@ebaauw (and @deanlyoung in case you're curious)
I submitted a pull request that implements the request cache I described earlier, except that I updated the logic to limit concurrent/unresolved requests rather than requests per second. After making the change, updates to scenes and lights happen very quickly and I get 0 ECONNRESET errors.

@jojost1
Copy link
Contributor

jojost1 commented Dec 30, 2016

I get a this at least once a day, seems to resolve itself (after one hour, looking at the logs):

[2016-12-30 15:16:38] [Hue] Bridge Maurice: hue bridge communication error ETIMEDOUT
[2016-12-30 15:16:38] [Hue] undefined
[2016-12-30 15:16:41] [Hue] Bridge Maurice: hue bridge communication error EHOSTDOWN
[2016-12-30 15:16:41] [Hue] undefined

This is the pattern, which is repeating every [heartbeat] seconds.

@ebaauw
Copy link
Owner

ebaauw commented Dec 30, 2016

Hi Joost, @jojost1

Please ignore the undefined messages. This are caused by a cosmetic error in homebridge-hue, which I'll correct in the next version.

The ETIMEDOUT and EHOSTDOWN errors indicate that homebridge-hue cannot connect to the Hue bridge, typically because the Hue bridge or the network connection the Hue bridge is down. It's generated on every communication attempt with the Hue bridge, which is every heartbeat seconds when homebridge-hue polls the Hue bridge.

What type of server do you run homebridge on? Is this server connected to the network by wire or by WiFi? What is your home bridge configuration? What is your Hue bridge config?

@jojost1
Copy link
Contributor

jojost1 commented Dec 30, 2016

Running it on an iMac, connected by wire. Homebridge-hue settings in config are all default.

Unrelated question: There is no way to hide the hue-bridge and/or daylight sensor right? Kind of useless for those to be there ;-)

@ebaauw
Copy link
Owner

ebaauw commented Dec 30, 2016

Could it be related to the iMac sort of sleeping and suspending its network interface?

No, there isn't. I don't think they're useless. The bridge accessory will be used to persist the bridge username, so you wouldn't need to edit config.json to add it manually, once I'll have moved to dynamic platform accessories (see issue #4). Currently, it supports dumping the bridge state on Identify for troubleshooting. The daylight sensor accessory could be used for HomeKit rules in other apps than the iOS Home app. There should probably be a way to suppress it, see issue #7.

Are you running into the 99 bridged accessories limit? Otherwise, just create a dummy room in HomeKit and move all the accessories you don't use there (including the accessory for homebridge itself). Be sure to unmark them as favourite.

@jojost1
Copy link
Contributor

jojost1 commented Dec 30, 2016

Could it be related to the iMac sort of sleeping and suspending its network interface?
Maybe, but don't think so. I'll look into that.

For the other issue: thanks for the info!

Are you running into the 99 bridged accessories limit? Otherwise, just create a dummy room in HomeKit and move all the accessories you don't use there (including the accessory for homebridge itself). Be sure to unmark them as favourite.

Nope don't have +99 accessories, I already unmarked them and made a separate room. Thanks!

@ebaauw
Copy link
Owner

ebaauw commented Jan 1, 2017

The undefined errors should not longer appear as of v0.1.16.

@pponce
Copy link

pponce commented Jan 11, 2017

i'm also receiving this message often on every heart beat currently set to 5 seconds.
Philips hue: hue bridge communication error ETIMEDOUT
Homebridge server is running over wifi.

@ebaauw
Copy link
Owner

ebaauw commented Jan 12, 2017

@pponce You don't get any EHOSTDOWN errors? My best guess is hiccups in your WiFi network. Do you see lost packets when you do a ping -s 1472 on the homebridge server to your Philips Hue bridge?

Do you also get these errors when connecting the homebridge server by wire instead of over WiFi? When only using the homebridge-hue plugin? If so, than please attach a full debug log and your config.json here. See Troubleshooting in the README.

@ebaauw
Copy link
Owner

ebaauw commented Jan 14, 2017

this is what the ha-bridge returned "B827EBFFFE906FD2": "961f615166664b86a5803b75d0db6de0"

Indeed, that's not a Philips Hue bridge ID. It's derived from the Mac address of your Raspberry Pi, courtesy of HA-Bridge, I assume. Where did that username come from? Did you (do the equivalent of) press(ing) the link button on HA-Bridge at homebridge-hue's request?

If you supported multiple host ID's perhaps it would just work?

Version 0.1.10 supports multiple values for host.

@pponce
Copy link

pponce commented Jan 14, 2017

Tested 0.1.10. seems to work. Now just getting the HA-bridge errors.
I never had to press any links. Supposedly it "auto presses" and can accept any username.
my working config for the homebridge-philipshue plugin was
{
"platform": "PhilipsHue",
"name": "HA-Bridgee",
"ip_address": "192.168.1.99",
"username": "my-homebridge"
},

@pponce
Copy link

pponce commented Jan 17, 2017

My issue with ETIMEDOUT errors i believe has resolved.
It was probably because my Philips hub was connected via powerline.
Once i connected straight to router the timeouts went away.

@deanlyoung
Copy link
Author

@pponce that's an interesting point about connecting via powerline. I have my RPi connected to the router via powerline (as opposed to wireless, which is flakey on a 2B with a USB wifi stick), and my bridge connected directly. I wonder if I'm getting the Philips hue: hue bridge communication error ESOCKETTIMEOUT for the same reason (albeit, the other way around). For the most part, my requests go through and things are generally stable, but I see the error pop up about every minute. It does seem to coincide with updates to the ambient light sensor on the Hue motion sensor, but that could be a coincidence instead of causation until I can look into it more.

@pponce
Copy link

pponce commented Feb 20, 2017

I bet it's the powerline. I decided to connect everything (pi3 and hue hub) direct to the router and the timeout went away. I had originally been running wifi on pi3 and powerline to philips hub. I had the timeout issues like crazy. Once i hooked it all up direct i was able to do the 2 second heartbeat with no timeout issues.
I'm cutting back on powerline use as much as i can.

@deanlyoung
Copy link
Author

deanlyoung commented Feb 20, 2017

@pponce I can safely conclude, the powerline ethernet was the an issue. No Most errors not occurring whatsoever now. That's exciting, while at the same time depressing that it was that "simple". (cc: @ebaauw)

edit: after the longest period without an ESOCKETTIMEDOUT (and of course as I was typing that and submitting it), I still got one more... :-(

@pponce
Copy link

pponce commented Feb 20, 2017

I've also had luck improving homekit related networking issues by turning on IGMP proxy and IGMP snooping on my router if you have those options (for each wireless radio and for lan).
Worth a shot. For me it improved things noticeably.

@ebaauw
Copy link
Owner

ebaauw commented Feb 21, 2017

I double checked the documentation and code of request, the npm module homebridge-hue uses to send commands to the Hue bridge:

  • ETIMEDOUT is returned, when request doesn't receive any answer from the Hue bridge within the specified timeout period. This period is specified using the timeout parameter in config.json, 5 seconds by default.
  • ESOCKETTIMEDOUT is returned when part of the answer was received, but the next part takes too long. Not sure, but I think request uses the same timeout setting for this.
  • ECONNRESET is returned when the Hue bridge "hangs up" on homebridge-hue.

For all these errors, home bridge-hue was able to open a network connection to the Hue bridge, but received no timely response. ETIMEDOUT and ESOCKETTIMEDOUT could indicate network hick-ups, but could also indicate that the Hue bridge is too busy with other stuff, like processing a homebridge-hue heartbeat, processing requests from other Hue apps, doing firmware updates, whatever.
I've only seen ECONNRESET when sending too many parallel requests to the Hue bridge (I still need to implement request throttling to prevent that).

You could try and increase timeout to get rid of the ETIMEDOUT and ESOCKETTIMEDOUT errors, but you're fighting the symptom. When the Hue bridge doesn't respond within 5 seconds, your lights will appear unresponsive.

To check for network problems, you could try ping or curl as I described above. Wired ethernet is more reliable than Wifi or powerline. In my experience, powerline is more reliable than Wifi, but your mileage may vary.
To check for a busy bridge, run homebridge -D and observe when the errors occur: how many open requests, during a heartbeat, etc. You could try and reduce the heartrate.

I started out retrieving the full bridge state on each heartbeat (with a single request get /). Then I moved to using separate requests get /sensors, get /lights, etc, but only when homebridge-hue is actually exposing the corresponding resources (issue #3). That caused startup issues (#26) so I moved back to using a single get /.
I should probably do a full state request only on startup and move back to separate (sequential!) requests on the heartbeat. This would result in more requests, but probably on the Hue bridge and less network traffic (especially with a lot of unexposed rules). My next idea is using different heartrates for sensors (react quickly to buttons) vs other resources (react more slowly to light changes to take it easier on the bridge).

ebaauw added a commit that referenced this issue Feb 22, 2017
Heartbeat now (again) does separate requests for sensors, lights,
groups, etc.
@ebaauw
Copy link
Owner

ebaauw commented Feb 22, 2017

v0.3.5 again does separate requests per resource type on heartbeat.

@ebaauw
Copy link
Owner

ebaauw commented Feb 24, 2017

Finally, homebridge-hue v0.3.7 throttles requests to the Hue bridge, see point 2 in #15 (comment). homebridge-hue will issue maximum 3 requests in parallel to a v1 (round) bridge, and maximum 10 parallel requests to a v2 (square) bridge. This can be changed in config.json using the parallelRequests parameter.
The throttling is implemented using deferred.gate(). @fedhat, thanks again for introducing me to that package.
v0.3.5 re-introduced separate requests per resource type on heartbeat (issue #3 and #26).
Since v0.2.4, homebridge-hue collects multiple HomeKit updates to the same light into a single bridge request (issue #59 and point 3 in #15 (comment)).

@pponce, @deanlyoung, this is all I can think off to improve homebridge-hue's performance. Please let me know if you see any improvements.

@pponce
Copy link

pponce commented Feb 25, 2017

My issues were almost completely solved by not using WiFi or power-line, instead hooking up direct via ethernet.
IGMP snooping and proxy settings enabled on the router I believe also helped but that might be a placebo effect not sure.
The last two ESOCKETTIMEDOUT errors i saw were both for /sensor requests (i can now tell because of your recent update to request by resource type).
I have 8 dimmers and 1 tap switch. Plus a daylight sensor and 2 led strips for lights that i'm using the plugin for. The rest of my lights (over 20) are exposed by the hue bridge directly.
I'll monitor to see if my sensor request timedout errors go away with 0.3.7. But even those were not happening often.
Assuming my switches and tap switch count as sensors i do have more of those than lights being exposed by homebridge-hue so that may be why it's the sensor request that is timing out.

@ebaauw
Copy link
Owner

ebaauw commented Feb 25, 2017

Thanks, @pponce.

IGMP snooping and proxy settings enabled on the router I believe also helped but that might be a placebo effect not sure.

IGMP snooping is a technique for a network router/switch to reduce the network traffic to hosts that do not need multicast traffic, used e.g. for IPTV, but also for UPnP. Basically, a multicast server sends messages to a special broadcast address per service (e.g. 239.255.255.250 for UPnP). Without IGMP snooping, hosts (the Hue bridge, your RaspPi running homebridge) filter out the multicast messages they don't need, but that costs some network-related processing. With IGMP snooping, the router/switch forwards the multicast messages only to ports connected to a host that actually wants them, reducing the network traffic on the other ports.
All power-line connected hosts connect to the router/switch using the same port (connected to the power-line adapter). If one of those hosts subscribes to e.g. IPTV, the corresponding multicast stream is forwarded to all power-line connected hosts.

Assuming my switches and tap switch count as sensors i do have more of those than lights being exposed by homebridge-hue so that may be why it's the sensor request that is timing out.

Yes they do. Additionally, the Hue app or some other app for the Hue bridge might have created CLIP sensors working in conjunction with the rules for these switches.
Homebridge-hue does an http get of /sensors and an http get of /lights. The bridge responds with all sensors and all lights respectively, not just the ones exposed by homebridge-hue. You can use the API debugger on your Hue bridge to issue those requests manually and see the corresponding Hue bridge responses.
By my count you have 24 lights (20 exposed by the Hue bridge and 2x2 for the LED strips) and at least 10 sensors (8 dimmers, 1 tap, and the built-in daylight sensor). Depending on the amount of CLIP sensors, the response for the get /lights could very well be larger than the response for get /sensors.
On heartbeat, homebridge-hue sends the get requests sequentially, i.e. it only sends the second request after it has received the response for the first request. This is different from update requests, which are sent in parallel, i.e. the second request is sent before the response of the first request is received. When a get request returns an error, the remaining requests for that heartbeat are skipped (no need to worsen a bad situation). As homebridge-hue issues the get /sensors before the get /lights, you'll more likely see the ETIMEDOUT or ESOCKETTIMEDOUT for that request.

@deanlyoung
Copy link
Author

@ebaauw as always, you da man! I'm installing now and will keep an eye on performance. Really helps that you walked us through the technical aspects of all this, too. I certainly learned some things and can understand how to better optimize my setup because of that., and not just think "I don't know why it works, but it works". The biggest thing was understanding how power-line sends all requests through the same port.

Even before the latest updates, I can still say that performance was drastically improved by connecting the RPi running Homebridge directly to the router improved speed and reliability. Most noticeable with changing multiple lights' hue/saturation/brightness at the same time. Before, I'd have to trigger a Scene 3 times to get the lights to change "all the way" (i.e. usually only one light, or one state would take effect - only color changed, only brightness changed, etc.).

@pponce
Copy link

pponce commented Feb 26, 2017

@ebaauw Yes thanks for all the detail.
Is there a somewhat easy way to clean up the config on the hue-bridge? I see a ton of whitelisted users and would like to delete some of them and any associated rules/configs/sensors associated to those users.

@pponce
Copy link

pponce commented Feb 26, 2017

Never mind. i see how i can do it one by one.
thanks for pointing me to the debug tool and api.

@ebaauw
Copy link
Owner

ebaauw commented Feb 26, 2017

Indeed, one by one is the only way provided by the Hue API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants