Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make trace_id and span_id types binaries instead of integers #590

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

tsloughter
Copy link
Member

I think this is cleaner. I'd be 100% on this change if it wasn't that technically users may be using the integers in tests and thus this will break their tests.

decide(TraceId, IdUpperBound) ->
Lower64Bits = TraceId band ?MAX_VALUE,
case erlang:abs(Lower64Bits) < IdUpperBound of
decide(<<_:65, LowerBits:63>>, IdUpperBound) ->
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I uh.. don't know why this is only 63 bits.. I just know that is what equaled the value I'd get with TraceId band ?MAX_VALUE which I copied from somewhere else..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is there any kind of spec explaining this specific behavior? I guess this is just because ?MAX_VALUE is the max 64 bits signed integer value (2^63-1) so that's unsurprising you'd look at the 63 lower bits, but I have no idea where the MAX_VALUE requirement comes from.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the sign!

Trying to find now where it comes from.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may need to change (not in this PR) because the w3c spec now specifies 56 bits to be random if the new random trace id flag is set: https://w3c.github.io/trace-context/#random-trace-id-flag

@codecov
Copy link

codecov bot commented May 18, 2023

Codecov Report

Attention: Patch coverage is 96.77419% with 1 line in your changes missing coverage. Please review.

Project coverage is 38.41%. Comparing base (34ca33a) to head (1d0d7c5).
Report is 365 commits behind head on main.

Files Patch % Lines
apps/opentelemetry/src/otel_resource_detector.erl 50.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #590      +/-   ##
==========================================
+ Coverage   38.16%   38.41%   +0.24%     
==========================================
  Files          61       56       -5     
  Lines        3595     3512      -83     
==========================================
- Hits         1372     1349      -23     
+ Misses       2223     2163      -60     
Flag Coverage Δ
api 73.72% <100.00%> (+5.55%) ⬆️
elixir ?
erlang 38.41% <96.77%> (+0.26%) ⬆️
exporter 8.11% <100.00%> (ø)
sdk 78.41% <80.00%> (+0.19%) ⬆️
zipkin 52.85% <ø> (-1.31%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@tsloughter tsloughter force-pushed the binary-trace-span branch 3 times, most recently from 087f639 to faac95a Compare May 18, 2023 16:40
Copy link
Member

@ferd ferd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like the kind of stuff that's gonna be clearer for anyone working around the spec (even if it were to break some tests), and I figure in terms of interop leaving the confines of the VM it gets serialized as safely anyway.

It's likely to make some comparisons safer around zero signedness in values anyway.

decide(TraceId, IdUpperBound) ->
Lower64Bits = TraceId band ?MAX_VALUE,
case erlang:abs(Lower64Bits) < IdUpperBound of
decide(<<_:65, LowerBits:63>>, IdUpperBound) ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is there any kind of spec explaining this specific behavior? I guess this is just because ?MAX_VALUE is the max 64 bits signed integer value (2^63-1) so that's unsurprising you'd look at the 63 lower bits, but I have no idea where the MAX_VALUE requirement comes from.

@DianaOlympos
Copy link

Break the tests. This is technically a major breaking change but it is minimal.

Copy link
Contributor

@yordis yordis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀 💜

@tsloughter
Copy link
Member Author

@DianaOlympos it broke tests you know about or do you just mean bc it broke tests in this repo?

@DianaOlympos
Copy link

No i mean literally do it. Break the users :)

@marcdel
Copy link
Contributor

marcdel commented May 19, 2023

IMHO any tests this would break will be straightforward to fix. I doubt there would be very many since I assume most folks are doing something like span_id: ^parent_span_id 99% of the time anyway.

@tsloughter
Copy link
Member Author

@marcdel that is what I figured as well. Maybe some check that it is 0 when they are testing that a no-op is used, but that is all I can think of.

@tsloughter
Copy link
Member Author

So @bryannaegele has a good point, using a binary is more memory than an integer. The integer is 3 words for trace id and 2 words for a span id, while the binary in both cases is 6 words.

A small amount until you have a massive amount of spans obviously.

@tsloughter
Copy link
Member Author

Thought I'd do a quick microbenchmark on my laptop:

$ ./erlperf 'rand:uniform(2 bsl 127 - 1).'                                    
Code                                 ||        QPS       Time
rand:uniform(2 bsl 127 - 1).          1    2327 Ki     429 ns
$ ./erlperf 'rand:bytes(8).'                                                 
Code                   ||        QPS       Time
rand:bytes(8).          1    4995 Ki     200 ns

I don't know if it makes up for the extra memory usage.

@tsloughter
Copy link
Member Author

I suppose we can at least switch to <<X:64>> = rand:bytes(8) to get the integer.

@DianaOlympos
Copy link

note: try to pull out the integer 2 bsl 127 - 1 into a constant and inject it instead of recomputing it for every iteration

@tsloughter
Copy link
Member Author

Barely budges when I change to use a constant:

./erlperf 'X = rand:uniform(340282366920938463463374607431768211455), X.'   ─╯
Code                                                       ||        QPS       Time
X = rand:uniform(3402823669209384634633746074317682         1    2279 Ki     438 ns

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants