
Uptime/Failure Monitoring #517

Open
kramerrs opened this issue Aug 12, 2024 · 3 comments

Comments

@kramerrs

A consequence of pre-initialized containers is that it becomes possible to tell when an app starts to fail. Oftentimes this is data related: some entry in a database falls outside the developer's expectations and causes a visualization to fail. With pre-initialized containers it would be nice to get some sort of warning email when an application is repeatedly failing.

@LEDfan
Member

LEDfan commented Sep 17, 2024

We currently don't have plans to implement the email warning, although I see a few alternatives.
There is already a metric for app failures, but it does not include pre-initialized containers that failed to start. We could therefore create a new metric that exposes the number of failed pre-initialized containers. Using Prometheus and Alertmanager you can then alert on these metrics (https://shinyproxy.io/documentation/usage-statistics/).
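
For reference, once such a metric exists, a minimal Prometheus alerting rule on it might look like the sketch below. The metric name `shinyproxy_failed_preinit_containers_total` is a placeholder for the metric proposed above (ShinyProxy does not expose it today), and the threshold is illustrative:

```yaml
groups:
  - name: shinyproxy-alerts
    rules:
      - alert: PreInitContainersFailing
        # Placeholder metric name: stands in for the proposed
        # failed-pre-initialized-containers counter.
        expr: increase(shinyproxy_failed_preinit_containers_total[15m]) > 3
        labels:
          severity: warning
        annotations:
          summary: "Pre-initialized containers are repeatedly failing to start"
```

Alertmanager could then route this alert to an email receiver, which would cover the warning-email use case without ShinyProxy itself sending mail.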

For the next release, we plan to further integrate the pre-initialization feature into the admin dashboard. We could indicate there whether containers are failing to start. This data would then also be exposed in the admin API (https://shinyproxy.io/downloads/swagger/?urls.primaryName=ShinyProxy%203.1.1#/ShinyProxy/adminData), and again this could be used for reporting.

I'll keep this open as an enhancement request for the email report.

@kramerrs
Author

I was able to get some traction on this. It's possible to spin up a Docker container that monitors ShinyProxy from a lightweight Alpine image and sends email messages. I am thinking about how best to identify a repeated failure, as opposed to a one-off one. For example, I could monitor the logs and use regular expressions to test for the delegate failure. I think I can put the whole thing in the ShinyProxy compose yml.
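
A minimal sketch of what such a sidecar could look like in the compose file. Everything here is an assumption rather than something ShinyProxy ships: the container name, the log pattern, the failure threshold, and the mail setup (msmtp still needs an SMTP configuration mounted into the container):

```yaml
services:
  shinyproxy:
    image: openanalytics/shinyproxy:3.1.1
    container_name: shinyproxy
    # ... existing ShinyProxy configuration ...

  # Hypothetical sidecar: tails the ShinyProxy log and mails on repeated failures.
  log-monitor:
    image: alpine:3.20
    depends_on:
      - shinyproxy
    volumes:
      # read-only socket access so the sidecar can run `docker logs`
      - /var/run/docker.sock:/var/run/docker.sock:ro
    command:
      - sh
      - -c
      # `$$` escapes Compose's own variable interpolation.
      - |
        apk add --no-cache docker-cli grep msmtp
        : > /tmp/failures
        docker logs -f shinyproxy 2>&1 \
          | grep --line-buffered 'Delegate Failed' \
          | while read -r line; do
              echo "$$line" >> /tmp/failures
              # Treat 3 matches as a repeated failure; the threshold is illustrative.
              if [ "$$(wc -l < /tmp/failures)" -ge 3 ]; then
                printf 'Subject: ShinyProxy app failing\n\n%s\n' "$$(cat /tmp/failures)" \
                  | msmtp admin@example.com
                : > /tmp/failures
              fi
            done
```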

@kramerrs
Author

I tried monitoring the logs for "Delegate Failed" messages. This works, but it isn't a reliable metric for monitoring the app. I set up an app that fails during startup, and it didn't produce this message; ShinyProxy tried to connect and then returned a 410 response. However, when I tried to connect with a browser, it did generate the "Delegate Failed" message. I seem to have seen the "Delegate Failed" response at other times as well. Is there any way to reliably detect when an app fails during load? Is this a Docker service thing? Should I look in the Docker logs to see when the service needs to be relaunched?
