-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add span monitor process to optionally monitor the spans in a process #84
Conversation
This is partly wrong. I realized right before opening the PR that this shouldn't be done for the current process when Not sure yet how I want to fix this. |
Two questions I have are:
|
A span can optionally set `monitor` to true at the time it is created. This results in a monitor on the process and if that process exits for any reason all spans started in that process will be ended.
It makes sense that you'd want to close all spans in the monitored process since the process dying is fatal to all of them. |
@bryannaegele my thought too. But people do/want weird things. I should probably change it to be a function called to monitor all spans in a process instead of adding it to the span start opts then... |
I think I'd want to add this to my tracer config to cover every span it starts. I'd also want a way to turn the exit details into extra span attributes, though the latter'd also apply to any |
@tsloughter is there any remaining work left on this PR? I'm seeing this manifest when using One of the tasks times out with no child span reported |
@davydog187 oh, hm, I think I just wanted to offer a function to start monitoring in a process (instead of just being an option to Maybe a config value so it could be turned on globally but we probably will want a pool of monitoring processes before supporting that so it'll come in a separate PR. |
So I want to finish this up and can't decide what the best interface is. An issue is how to handle inactive spans. I think this means that when attaching a context the active span in that context has to have its pid updated in the ets table... |
I implemented (by peeking at this PR) this in a project I'm working on. In the end I went with the following interface: OpenTelemetry.Tracer.with_span "my_span" do
OpenTelemetry.Span.monitor(OpenTelemetry.Tracer.current_span_ctx())
end Here's the code I ended up with, for posterity: defmodule OT.Monitor do
use GenServer
def start_link(_arg) do
GenServer.start_link(__MODULE__, nil, name: __MODULE__)
end
def init(nil) do
_table_id = :ets.new(__MODULE__, [:bag, :public, {:write_concurrency, true}, :named_table])
{:ok, nil}
end
def handle_call({:monitor, pid}, _from, state) do
Process.monitor(pid)
{:reply, :ok, state}
end
def handle_info({:DOWN, _ref, :process, pid, :normal}, state) do
:ets.take(__MODULE__, pid)
|> Enum.each(fn {_pid, ctx} ->
_span_ctx = OpenTelemetry.Tracer.set_current_span(ctx)
_ = OpenTelemetry.Tracer.end_span()
end)
{:noreply, state}
end
def handle_info({:DOWN, _ref, :process, pid, {:shutdown, _}}, state) do
:ets.take(__MODULE__, pid)
|> Enum.each(fn {_pid, ctx} ->
_span_ctx = OpenTelemetry.Tracer.set_current_span(ctx)
_ = OpenTelemetry.Tracer.end_span()
end)
{:noreply, state}
end
def handle_info({:DOWN, _ref, :process, pid, reason}, state) do
:ets.take(__MODULE__, pid)
|> Enum.each(fn {_pid, ctx} ->
_span_ctx = OpenTelemetry.Tracer.set_current_span(ctx)
_ = OpenTelemetry.Tracer.add_event("Process died", [{"reason", inspect(reason)}])
_ = OpenTelemetry.Tracer.end_span()
end)
{:noreply, state}
end
def monitor(span_ctx) do
if Application.fetch_env!(:opentelemetry, :enabled) do
# monitor first, because the monitor is necessary to clean the ets table.
:ok = GenServer.call(__MODULE__, {:monitor, self()})
true = :ets.insert(__MODULE__, {self(), span_ctx})
end
end
end |
I think this functionality is pretty critical. Personally I would want to have this for as many spans as possible, and definitely for active spans. So that means I would prefer not to set it as an option for every span, but rather as a config value, but I'll take any version of this feature I can get 🙂 |
If we either make span processors not a pain or extend them to allow additional functionality attached to them then I think we could easily allow it to be configured for all spans instead of an option per-span. |
@tsloughter i was thinking if spans actually shoudn't live in a external process, that monitor the process starting the spans, in case of the process dying or raises, it closes all remaining open spans with error status. |
@fcevado that could work. There are other tracing libs that do just that, trying to remember which one... Ah, it is Tapper. And I think New Relic or AppSignal actually use (or used) a separate process for each trace -- which I wouldn't think would scale well, at least compared to Tapper or our use of ETS. What I'd hope to be able to experiment with is alternate SDKs. Technically it should be possible for an SDK to be created that does make all span operations messages to another process. No user code should have to change, just which SDK they include in their project. |
Closing for #602 |
A span can optionally set
monitor
to true at the time it is created.This results in a monitor on the process and if that process exits
for any reason all spans started in that process will be ended.