-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add otel_monitor #461
base: main
Are you sure you want to change the base?
Add otel_monitor #461
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
%%%------------------------------------------------------------------------ | ||
%% Copyright 2020, OpenTelemetry Authors | ||
%% Licensed under the Apache License, Version 2.0 (the "License"); | ||
%% you may not use this file except in compliance with the License. | ||
%% You may obtain a copy of the License at | ||
%% | ||
%% http://www.apache.org/licenses/LICENSE-2.0 | ||
%% | ||
%% Unless required by applicable law or agreed to in writing, software | ||
%% distributed under the License is distributed on an "AS IS" BASIS, | ||
%% WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
%% See the License for the specific language governing permissions and | ||
%% limitations under the License. | ||
%% | ||
%% @doc | ||
%% Process that can optionally monitor the process a span is in and end the | ||
%% span if the process stops for any reason with the span still unfinished. | ||
%% @end | ||
%%%------------------------------------------------------------------------- | ||
-module(otel_monitor). | ||
|
||
-behaviour(gen_server). | ||
|
||
-export([start_link/0, | ||
monitor/1]). | ||
|
||
-export([init/1, | ||
handle_call/3, | ||
handle_cast/2, | ||
handle_info/2]). | ||
|
||
-include("otel_tracer.hrl"). | ||
|
||
-define(SERVER, ?MODULE). | ||
-define(TABLE, ?MODULE). | ||
|
||
start_link() -> | ||
gen_server:start_link({local, ?SERVER}, ?MODULE, nil, [{name, ?SERVER}]). | ||
|
||
init(nil) -> | ||
_TableId = ets:new(?TABLE, [set, named_table]), | ||
{ok, nil}. | ||
|
||
handle_call({monitor, SpanCtx, Pid}, _From, State) -> | ||
Ref = erlang:monitor(process, Pid), | ||
true = ets:insert(?TABLE, {Ref, SpanCtx}), | ||
{reply, ok, State}. | ||
|
||
handle_info({'DOWN', Ref, process, _Pid, normal}, State) -> | ||
case ets:take(?TABLE, Ref) of | ||
[] -> nil; | ||
[{_Ref, SpanCtx}] -> otel_span:end_span(SpanCtx) | ||
end, | ||
{noreply, State}; | ||
|
||
handle_info({'DOWN', Ref, process, _Pid, {shutdown, _}}, State) -> | ||
case ets:take(?TABLE, Ref) of | ||
[] -> nil; | ||
[{_Ref, SpanCtx}] -> otel_span:end_span(SpanCtx) | ||
end, | ||
{noreply, State}; | ||
|
||
handle_info({'DOWN', Ref, process, _Pid, Reason}, State) -> | ||
case ets:take(?TABLE, Ref) of | ||
[] -> nil; | ||
[{_Ref, SpanCtx}] -> | ||
otel_span:add_event(SpanCtx, 'Process died', #{<<"reason">> => iolist_to_binary(io_lib:format("~p", [Reason]))}), | ||
otel_span:end_span(SpanCtx) | ||
end, | ||
{noreply, State}. | ||
|
||
handle_cast(_Msg, State) -> | ||
{noreply, State}. | ||
|
||
monitor(SpanCtx) -> | ||
ok = gen_server:call(?SERVER, {monitor, SpanCtx, self()}), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. User probably doesn't want to crash if the monitor fails. I'd say probably just want a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I like that idea, returning true / false based for success / failure. |
||
true. | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be more than 1 span being monitored in the same process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but each span will get its own ref, no? I guess this is what I will find out when I write tests...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh, my mistake. I was thinking it was like my implementation which monitors all spans in a process, so it looks up all spans for a pid, not by ref.
Which is a separate question I have, is it really worth doing it per-span, are there spans that a user wouldn't want ended when a process crashes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are asking why I didn't write a PR that monitors every span's process, it's because bigger changes like that are less likely to be accepted. I am aiming with this PR for the incremental approach: first offer the api optionally, if it turns out that everyone uses it all the time, then contemplate making it mandatory.
But if you are comfortable going directly to monitoring every span's process, then I am happy coding that up.
I'm not sure whether it's cheaper to have a single monitor per pid and juggling spans associated with that pid in an ets table, or just creating a new monitor and ets row per span.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I'm asking why not monitor every span in a single process the user requests.
If the user requests to monitor Span X in process P do they really not want Span Y that is also active in process P to be ended when the process dies? If they do they have to call
monitor(X)
andmonitor(Y)
. The API I had in mind was justmonitor()
, and then every span inself()
is monitored. I suppose also amonitor(Pid)
makes sense to include if we go that route.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, if going the route of just monitoring the Pid and all its spans we can add a
pid
to the span itself. In my PR for the monitor I add a pid to span so that the existing table can just be used to look up the spans to end.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm asking in slack. It is hard to know what is best without people's input on how they'd use it, or if they are already using their own solution, like you have been.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 glad we are having this discussion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would want to have this applied to every span.