-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for tagging profiles with opentelemetry trace identifiers #1568
Conversation
I've been working on integrating Datadog's continuous profiler with opentelemetry traces (see DataDog/dd-trace-rb#1568). The profiler runs on a background thread, and needs to be able to access the current context for a thread, to be able to get the current span's trace id and span ids (if any active). To do so, I was originally using ```ruby thread = ... ::OpenTelemetry::Trace.current_span(thread[::OpenTelemetry::Context::KEY]) ``` Unfortunately, after open-telemetry#807, this interface was changed, and more importantly, `Context::KEY` was removed and replaced with `Context::STACK_KEY`. `STACK_KEY` was marked as `private_constant`. With 1.0.0.rc2, the only way of getting this information is by relying on private implementation details, which isn't great. Thus, I would like to ask if it'd be reasonable to add an optional `thread` parameter to `Context.current`. This would make it easy to access the needed information, and it would even be more future-proof as the outside code doesn't need to care anymore where and how the context is stored.
I've been working on integrating Datadog's continuous profiler with opentelemetry traces (see DataDog/dd-trace-rb#1568). The profiler runs on a background thread, and needs to be able to access the current context for a thread, to be able to get the current span's trace id and span ids (if any active). To do so, I was originally using ```ruby thread = ... ::OpenTelemetry::Trace.current_span(thread[::OpenTelemetry::Context::KEY]) ``` Unfortunately, after open-telemetry#807, this interface was changed, and more importantly, `Context::KEY` was removed and replaced with `Context::STACK_KEY`. `STACK_KEY` was marked as `private_constant`. With 1.0.0.rc2, the only way of getting this information is by relying on private implementation details, which isn't great. Thus, I would like to ask if it'd be reasonable to add an optional `thread` parameter to `Context.current`. This would make it easy to access the needed information, and it would even be more future-proof as the outside code doesn't need to care anymore where and how the context is stored.
tracer = Datadog.tracer | ||
return unless tracer.respond_to?(:active_correlation) | ||
|
||
@tracer = tracer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Storing the tracer can possibly hold the wrong instance of Datadog.tracer
when Datadog.configure
is called, as we rebuild components at that time, creating a brand new tracer instance at Datadog.tracer
.
If this class is recreated alongside the profiler on Datadog.configure
calls, then all is likely well if #trace_identifiers_for
cannot be invoked before the new tracer is created during Datadog.configure
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. This class is indeed re-created with the profiler (in helper.rb), but with the whole thing about the components lock doesn't guarantee that the new tracer will be available in Datadog.tracer
before the profiler starts sampling. I'll tweak this a bit to avoid this issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in f6711e2 . Let me know your thoughts, as I know this is usually not the way things are done in ddtrace ;)
This refactor will enable us to support getting the trace identifiers from more than just datadog traces without adding yet more logic to the stack collector.
We are now able to gather the trace identifiers from ongoing traces being done with the opentelemetry gems. Priority is still given to ddtrace traces, and we also take care to impact users that do not use opentelemetry. NOTE: I'm somewhat unsure about adding `opentelemetry-api` to the `Gemfile` vs adding it as an appraisal. All our current appraisals seem to be for tracing integrations, which is not the case, and we do already have an "opentelemetry" rake task that existed so it may be confusing to have it changed into an appraisal (or to have two) so I took the simplest route. Suggestions welcome!
This avoids surprises if customers try to use an older version.
In particular open-telemetry/opentelemetry-ruby#807 changed some of the APIs we were using to get the current span for a thread. To test both 0.17.0 and this new version I moved opentelemetry to appraisals.
As discussed during PR review of #1568, it's problematic to just cache whatever's in `Datadog.tracer` because we can observe the old value during component initialization. To break the loop, let's instead directly initialize the `TraceIdentifiers::Ddtrace` class with the correct tracer during component initialization. IMHO this has a further advantage: it makes it really explict where there is a tracer-to-profiler dependency whereas previously there was just a call to `Datadog.tracer` deep in the bowels of the profiler that could be called at any point.
d185617
to
f6711e2
Compare
P.s.: I did a rebase/force push to get the effects of #1569 in this PR (hiding generated gemfiles). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's great to have a concrete tracer
instance passes around, instead of referencing Datadog.tracer
when possible, and given the profiler+tracer are both initialized by us, it makes a lot of sense to have this direct reference.
I've marked this as draft, as there's some ongoing discussion on open-telemetry/opentelemetry-ruby#842 that may end up factoring in how we support opentelemetry, so I'd like to hold on merging this until there's a clearer path forward. |
As discussed during PR review of #1568, it's problematic to just cache whatever's in `Datadog.tracer` because we can observe the old value during component initialization. To break the loop, let's instead directly initialize the `TraceIdentifiers::Ddtrace` class with the correct tracer during component initialization. IMHO this has a further advantage: it makes it really explict where there is a tracer-to-profiler dependency whereas previously there was just a call to `Datadog.tracer` deep in the bowels of the profiler that could be called at any point.
TODO: This needs to be rebased on top of #1591 (which is now on master). |
Unfortunately there hasn't been a lot of movement by upstream opentelemetry, so the team decided to close this. Since we merged #1591 anyway (which enables pluggable trace identifiers) we should be in a good shape to revisit this at any point in the future. |
See also #3510 |
This PR adds support for gathering the trace identifiers from ongoing traces being done with the opentelemetry gems.
Priority is still given to ddtrace traces, if available. I've refactored this code out of the stack collector, so now it's easier to control the dependency between profiler and tracer, as well as to have the additional opentelemetry support.
As I was about to open this PR, the interface used to access the current trace was refactored and part of it was marked private in the opentelemetry-api 1.0.0.rc2 release. I plan to follow-up with them and ask if it's possible to make this public again (open-telemetry/opentelemetry-ruby#842). For now, we still work on that version, but use the private interface.