App in Homey gets set to Paused status #38
Comments
Thanks for the thorough report! I got your diagnostic reports but they are unfortunately empty. I suspect the logs are cleared when the app gets killed. I also don't get any automatic crash reports from when this happens. It could of course be a resource quota, such as memory, being exceeded. I did not see a way to increase any resource request for the app; if the limit is hard-coded, I guess it can be a bit harder to fix. Of course, optimizations might be possible. If you check the historic memory usage, do you see any spike (in any app) before the pause?
I did some checking but I cannot find any other app using that much memory. For now the workaround of restarting every hour is working for me (I'm only missing 0m45s-1m45s of data in Prometheus every hour).
Hi,
Last time I checked, most memory was spent inside the Homey API. It usually prints warnings about excessive numbers of event listeners when there are a lot of devices. I suspect the Homey API was not really built for subscribing to every single device from the same app. I think it can be given another shot, given that the tooling and API have improved since I last attempted this.
Describe the bug
The Prometheus.IO app is randomly being set to paused status, which is only cleared by restarting the app.
Diagnostics report ID
This was made yesterday but I'm not sure if it is of any use: c423a8f9-8a7a-4f0d-84b2-c665ee3a9b27
And another instance of it happening while I was creating this bug report: bebdbc25-2950-4d7e-a98c-da4aee3438a4
Another instance at just after 17:00 today: a99f6a78-2de3-4e0b-8e89-9b38f125ff8d
Again just after 20:00 even with scraping set to every 60 seconds (instead of 15 seconds): 6941bcf4-fc1e-49f0-a469-8c7182d7d944
And again just after 21:00: 6ff6f11f-d0a3-4c53-8c28-65a172d1546b
Configuration
Hardware revision: "Homey Pro (Early 2019)".
Firmware version: 8.1.4
Additional context
I recently added 17 virtual devices to Homey (13 with just the "measure power" capability and 4 others with both "measure power" and "meter power" capabilities). Before that there were no apparent issues. I also researched a bit and found that Homey automatically sets an app to a paused state when that app uses more than 80 MB of memory.
The Prometheus.IO app is regularly using between 28 and 34 MB of memory (while Athom/Homey mentions apps should not use more than 30 MB), with occasional spikes to around 70 MB, but I could not find any occurrence in Prometheus of the app using 80 MB or more (possibly because the app is by then in paused status and no longer being scraped by the Prometheus server).
I already tried removing some unused apps and also removed some devices from Homey to see whether that reduces the Prometheus.IO memory footprint, but it does not appear to make any perceivable difference. I have also set up a Homey Flow to restart the Prometheus.IO app at 3:33 am, but that does not prevent the issue either: it seems to happen at random, and the memory footprint does not appear to increase or decrease towards the moment the app is stopped by Homey. The Prometheus server is scraping Homey/Prometheus.IO every 15 seconds.
I did find that the Grafana query "sum(rate(homey_exporter_self_time_seconds[$__rate_interval]))" shows a spike of around 30 seconds a few seconds before Prometheus.IO is stopped, while typically this value is no more than 3 seconds.
The bulk of the spike seems to come from the three series sharing the labels {action="updatedevicelist", device="_updatedevicelist", instance="192.168.34.161:9414", job="prometheus", name="_updatedevicelist", zone="_updatedevicelist", zones="_updatedevicelist"}, with type="real", type="system" and type="user" respectively. This happens just after the whole hour mark (at least that is what Grafana shows: 15-45 seconds after the whole hour), and together these series then contribute a total of 26-28 seconds. Normally these spikes contribute about 5 seconds just past the whole hour mark.
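For reference, a Grafana query along these lines (a sketch, reusing only the metric and label names shown above) isolates that contribution per type:

```promql
# self time spent in the device list refresh, split by type (real/system/user)
sum by (type) (
  rate(homey_exporter_self_time_seconds{action="updatedevicelist"}[$__rate_interval])
)
```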
edit 19:14 CEST: I've now set the scraping interval of the Prometheus server to once every 60 seconds instead of once every 15 seconds, on the assumption that the multi-second spikes take so long that multiple scraping requests queue up and increase memory usage until the cause of the spike has resolved.
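A minimal sketch of what that change looks like on the Prometheus server side, assuming a standard prometheus.yml with a job named after the job="prometheus" label and the target 192.168.34.161:9414 seen above (not necessarily my exact config):

```yaml
# prometheus.yml (excerpt) -- sketch of the interval change
scrape_configs:
  - job_name: 'prometheus'            # matches the job="prometheus" label above
    scrape_interval: 60s              # was 15s; reduced so slow scrapes don't queue up
    scrape_timeout: 50s               # give the exporter time to respond during spikes
    static_configs:
      - targets: ['192.168.34.161:9414']
```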
edit 21:17 CEST: just after 20:00 CEST, Prometheus.IO was put to paused again, even though the scraping interval was set to 60 seconds.
edit 21:33 CEST: I have now created a flow that restarts Prometheus.IO every hour (with a delay of 4 minutes), so that at least if it gets paused just after the hour it gets restarted.