
Uptime/Failure Monitoring #517

Open
kramerrs opened this issue Aug 12, 2024 · 3 comments

Comments

@kramerrs

A consequence of pre-initialized containers is that it becomes possible to tell when an app starts to fail. Oftentimes this is data related: some entry in a database falls outside the developer's expectations and causes a visualization to fail. With pre-initialized containers it would be nice to get some sort of warning email when an application is repeatedly failing.

@LEDfan
Member

LEDfan commented Sep 17, 2024

We currently don't have plans to implement the email warning, although I see a few alternatives.
There is already a metric for app failures, but it does not include pre-initialized containers that failed to start. We could therefore create a new metric that exposes the number of failed pre-initialized containers. Using Prometheus and Alertmanager you can then alert on these metrics (https://shinyproxy.io/documentation/usage-statistics/).
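
For reference, once such a metric exists, a minimal Prometheus alerting rule on it might look like the sketch below. The metric name `shinyproxy_failed_preinit_containers_total` is a placeholder for the metric proposed above (ShinyProxy does not expose it today), and the threshold is illustrative:

```yaml
groups:
  - name: shinyproxy-alerts
    rules:
      - alert: PreInitContainersFailing
        # Placeholder metric name: stands in for the proposed
        # failed-pre-initialized-containers counter.
        expr: increase(shinyproxy_failed_preinit_containers_total[15m]) > 3
        labels:
          severity: warning
        annotations:
          summary: "Pre-initialized containers are repeatedly failing to start"
```

Alertmanager could then route this alert to an email receiver, which would cover the warning-email use case without ShinyProxy itself sending mail.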

For the next release, we plan to further integrate the pre-initialization feature into the admin dashboard. We could indicate there whether containers are failing to start. This data would then also be exposed in the admin API (https://shinyproxy.io/downloads/swagger/?urls.primaryName=ShinyProxy%203.1.1#/ShinyProxy/adminData), and again this could be used for reporting.

I'll keep this open as an enhancement request for the email report.

@kramerrs
Author

I was able to get some traction on this. It's possible to spin up a Docker container that monitors ShinyProxy from a lightweight Alpine image and sends email messages. I am thinking about how best to identify a repeated failure, as opposed to a one-off one. For example, I could monitor the logs and use regular expressions to test for the delegate failure. I think I can put the whole thing in the ShinyProxy compose yml.
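
A minimal sketch of what such a sidecar could look like in the compose file. Everything here is an assumption rather than something ShinyProxy ships: the container name, the log pattern, the failure threshold, and the mail setup (msmtp still needs an SMTP configuration mounted into the container):

```yaml
services:
  shinyproxy:
    image: openanalytics/shinyproxy:3.1.1
    container_name: shinyproxy
    # ... existing ShinyProxy configuration ...

  # Hypothetical sidecar: tails the ShinyProxy log and mails on repeated failures.
  log-monitor:
    image: alpine:3.20
    depends_on:
      - shinyproxy
    volumes:
      # read-only socket access so the sidecar can run `docker logs`
      - /var/run/docker.sock:/var/run/docker.sock:ro
    command:
      - sh
      - -c
      # `$$` escapes Compose's own variable interpolation.
      - |
        apk add --no-cache docker-cli grep msmtp
        : > /tmp/failures
        docker logs -f shinyproxy 2>&1 \
          | grep --line-buffered 'Delegate Failed' \
          | while read -r line; do
              echo "$$line" >> /tmp/failures
              # Treat 3 matches as a repeated failure; the threshold is illustrative.
              if [ "$$(wc -l < /tmp/failures)" -ge 3 ]; then
                printf 'Subject: ShinyProxy app failing\n\n%s\n' "$$(cat /tmp/failures)" \
                  | msmtp admin@example.com
                : > /tmp/failures
              fi
            done
```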

@kramerrs
Author

I tried monitoring the logs for "Delegate Failed" messages. This works, but it isn't a reliable metric for monitoring the app. I set up an app that fails during startup, and it didn't produce this message; ShinyProxy tried to connect and then returned a 410 response. However, when I tried to connect with a browser, it did generate the "Delegate Failed" message. I seem to have seen the "Delegate Failed" response at other times as well. Is there any way to reliably detect when an app fails during load? Is this a Docker service thing? Should I look in the Docker logs to see when the service needs to be relaunched?
