Health check not working on 12.3.5172 #243

Open
danavilacon opened this issue Sep 19, 2017 · 3 comments

danavilacon commented Sep 19, 2017

I'm running the Datadog agent on DC/OS 1.8. With the image datadog/docker-dd-agent:12.3.5172, the health check is no longer working:

# supervisorctl status 
http://localhost:9001 refused connection 
# supervisord -v 
3.3.3 
# /opt/datadog-agent/embedded/bin/python /opt/datadog-agent/bin/supervisorctl status 
http://localhost:9001 refused connection
# service datadog-agent status
unix:///opt/datadog-agent/run/datadog-supervisor.sock refused connection
Datadog Agent (supervisor) is NOT running all child processes

Maybe the supervisor conf is no longer compatible with version 3.3.3:
https://github.com/DataDog/dd-agent/blob/master/packaging/supervisor.conf

Just check this:
https://askubuntu.com/questions/911994/supervisorctl-3-3-1-http-localhost9001-refused-connection
http://supervisord.org/configuration.html#inet-http-server-section-settings

Contributor

xvello commented Sep 25, 2017

Hi @danavilacon

12.3.5172 removed the supervisord port and switched to a shell script, /probe.sh, for the health check, because exposing supervisord had security implications.
You'll need to update your service definition to use it. I'm currently working on a Universe update to do that.

@xvello xvello self-assigned this Sep 25, 2017
@xvello xvello closed this as completed Oct 20, 2017
@danavilacon
Author

@xvello But the probe script also uses supervisord, and it is not working:
https://github.com/DataDog/docker-dd-agent/blob/master/probe.sh

# /opt/datadog-agent/embedded/bin/python /opt/datadog-agent/bin/supervisorctl -c /etc/dd-agent/supervisor.conf status
unix:///opt/datadog-agent/run/datadog-supervisor.sock refused connection
# sh probe.sh
# echo $?
1
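
When both supervisorctl and probe.sh fail with "refused connection", it can help to check whether the socket file exists at all and whether a supervisord process is running. The following is a minimal sketch, not something shipped in the image: the function name `diagnose_supervisor` is hypothetical, the default socket path is the one from the error message above, and it assumes `pgrep` is available in the container.

```shell
#!/bin/sh
# Hypothetical helper: report why a supervisord unix socket might refuse connections.
# Usage: diagnose_supervisor [socket_path]
diagnose_supervisor() {
    # Default path taken from the error message in this thread.
    sock="${1:-/opt/datadog-agent/run/datadog-supervisor.sock}"

    # -S is true only if the path exists and is a socket.
    if [ ! -S "$sock" ]; then
        echo "socket missing: $sock"
        return 2
    fi

    # Assumption: pgrep (procps) is installed in the container.
    if ! pgrep -f supervisord >/dev/null 2>&1; then
        echo "socket present but no supervisord process found"
        return 3
    fi

    echo "socket present and supervisord running"
    return 0
}
```

Running `diagnose_supervisor; echo $?` inside the container would distinguish a missing socket (2) from a stale socket with no supervisord behind it (3).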

@danavilacon
Author

@xvello Finally, this is the health check that is working for us:

"healthChecks": [
  {
    "protocol": "COMMAND",
    "command": { "value": "/etc/init.d/datadog-agent status" },
    "gracePeriodSeconds": 300,
    "intervalSeconds": 60,
    "timeoutSeconds": 20,
    "maxConsecutiveFailures": 3
  }
],
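
A COMMAND health check is judged by the command's exit status, with zero meaning healthy, so the same command can be tried by hand inside the container before deploying. The sketch below does that under a time limit comparable to `timeoutSeconds`. It assumes GNU coreutils `timeout` is available, and the function name `run_health_check` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run a health check command under a time limit and
# report the result based on its exit status (0 = healthy).
# Usage: run_health_check <timeout_seconds> <command> [args...]
run_health_check() {
    limit="$1"
    shift

    # Assumption: GNU coreutils `timeout`, which exits with 124 when the
    # command exceeds the limit. Any non-zero status counts as unhealthy.
    if timeout "$limit" "$@" >/dev/null 2>&1; then
        echo "healthy"
        return 0
    fi
    echo "unhealthy"
    return 1
}
```

For the definition above, that would be `run_health_check 20 /etc/init.d/datadog-agent status`.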

@xvello xvello reopened this Nov 17, 2017
@xvello xvello removed their assignment Jul 8, 2022