-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add prometheus exporter for nova-bigvm #388
base: stable/xena-m3
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is part 1 of the review. I don't think I can check the logic stuff today.
nova/conf/base.py
Outdated
'bigvm_exporter_listen_port', | ||
default=9847, | ||
help=""" | ||
Port where the BigVM prometheus exporter to listen for HTTP requests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo "to listen" doesn't fit with "where". needs to be something like "exporter listens"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
nova/bigvm/exporter.py
Outdated
|
||
|
||
def start_bigvm_exporter(): | ||
port = int(CONF.bigvm_exporter_listen_port or 9847) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this int
necessary? The option is defined as IntOpt
, so I would assume we don't need it here.
|
||
def start_bigvm_exporter(): | ||
port = int(CONF.bigvm_exporter_listen_port or 9847) | ||
start_http_server(port, registry=REGISTRY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this start a new process or a new thread?
If it starts a thread, does it start an eventlet greenthread (because eventlet patched the threading
module) or does it spawn a native thread?
If it spawns a greenthread, does the prometheus_client
library do anything blocking that could hinder the manager to run properly?
If it spawns a native thread, we cannot use logging
(or anything else take takes a threading.Lock
) anywhere inside that native thread or we risk a hanging service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything behind is based on daemonic threading.Thread
, which to my understanding is patched by eventlet
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You missed to answer one question: Does prometheus_client
use anything that would block the process. eventlet
greenthreads are not preempted, but give up the CPU when they would do blocking operations. This needs library support (or usage of one of the eventlet-patched functions). If the greenthread doesn't give up the CPU on blocking operations, no other greenthread will run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code adds one more greenthread to the pgt()
output. Here is it:
2 <greenlet.greenlet object at 0x7f802a44d510 (otid=0x7f8031506a00) suspended active started>
File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/green/thread.py", line 42, in __thread_body
func(*args, **kwargs)
File "/usr/lib/python3.8/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/green/thread.py", line 63, in wrap_bootstrap_inner
bootstrap_inner()
File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/usr/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.8/socketserver.py", line 232, in serve_forever
ready = selector.select(poll_interval)
File "/usr/lib/python3.8/selectors.py", line 323, in select
r, w, _ = self._select(self._readers, self._writers, [], timeout)
File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/green/select.py", line 80, in select
return hub.switch()
File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/hubs/hub.py", line 313, in switch
return self.greenlet.switch()
Looking at the stack trace there is the socketserver.py
which is the module that's being used by the prometheus library to expose the endpoint.
So this runs in a greenthread.
nova/bigvm/exporter.py
Outdated
def __init__(self, registry=REGISTRY): | ||
self.host_errors_counter = \ | ||
Counter('nova_bigvm_host_errors', | ||
'Nova BigVM errors counter.', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to be more descriptive as I could not tell you, what this means without looking into the code that increases the counter.
nova/bigvm/exporter.py
Outdated
|
||
self.free_hosts_count_gauge = \ | ||
Gauge('nova_bigvm_free_hosts_count', | ||
'Nova BigVM hosts available.', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you expand on the description a little more, please? hosts available for what? Is this the count of the resource-providers that are ready for a BigVM deployment?
nova/bigvm/exporter.py
Outdated
self.host_freeing_up_gauge.remove( | ||
rp['vc'], rp['host'], rp['rp']['name']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if that provider is not in here, because we just restarted and don't have any data about this provider anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/prometheus/client_python#disabling-default-collector-metrics
It sounds like we have to disable these automatic metrics? Or do we want them? I don't think we need GC and process stats.
I haven't seen any other metric apart from the ones defined by us, while testing the exporter endpoint. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but still 2 questions remaining.
|
||
def start_bigvm_exporter(): | ||
port = int(CONF.bigvm_exporter_listen_port or 9847) | ||
start_http_server(port, registry=REGISTRY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You missed to answer one question: Does prometheus_client
use anything that would block the process. eventlet
greenthreads are not preempted, but give up the CPU when they would do blocking operations. This needs library support (or usage of one of the eventlet-patched functions). If the greenthread doesn't give up the CPU on blocking operations, no other greenthread will run.
nova/bigvm/manager.py
Outdated
else: | ||
free_hosts += 1 | ||
|
||
bigvm_metrics.set_free_hosts_count(free_hosts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to show the free hosts per hv_size instead of the overall number, because it depends on the size if we can spawn certain instances?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added hv_size label.
f3ffa01
to
60b8dd1
Compare
Exposing the following metrics: Counter nova_bigvm_host_errors{error, vc, host, rp} Counter nova_bigvm_no_candidate_error{hv_size} Gauge nova_bigvm_host_freeing_up{vc, host, rp} Gauge nova_bigvm_free_hosts_count{hv_size} Change-Id: I050eeb1036910c03428eaa8aad7e992f241f6f51
60b8dd1
to
be76d9d
Compare
Currently exposing the following metrics:
Counter nova_bigvm_host_errors{error, vc, host, rp}
Counter nova_bigvm_no_candidate_error{hv_size}
Gauge nova_bigvm_host_freeing_up{vc, host, rp}
Gauge nova_bigvm_free_hosts_count{}