-
-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: eap EndpointTraceItemTable spare aggregate attribute transformer #6817
Conversation
@@ -2101,6 +2136,94 @@ def test_bad_aggregation_filter(self, setup_teardown: Any) -> None: | |||
with pytest.raises(BadSnubaRPCRequestException): | |||
EndpointTraceItemTable().execute(message) | |||
|
|||
def test_sparse_aggregate(self, setup_teardown: Any) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test ensures desired behavior as described in this ticket https://github.com/getsentry/eap-planning/issues/158
that there arent the 0 rows
@@ -131,6 +131,41 @@ def gen_message( | |||
} | |||
|
|||
|
|||
def write_eap_span( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function is unrelated to the PR, I added it to make gen_message
easier to use in my tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are unit tests for the transformer
❌ 2 Tests Failed:
View the top 2 failed tests by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
tests/web/rpc/v1/test_endpoint_trace_item_table/test_endpoint_trace_item_table.py
Outdated
Show resolved
Hide resolved
tests/web/rpc/v1/test_endpoint_trace_item_table/test_endpoint_trace_item_table.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have the following suggestion:
When we do aggregations we already transform them into aggregate_functionIf
which has the needed conditions. The problem is that the If
combinator in clickhouse returns the default value (0) if no rows match the condition. One way to address this is to add the orNull
combinator to the aggregation so that it returns Null when no rows match the condition. We could then decide what to do with the null values depending on the behaviour we want. This would also be the first part of addressing https://github.com/getsentry/eap-planning/issues/169.
This would also ensure we don't add additional overhead from copying the request and adding conditions that we already add to the aggregations.
@davidtsuk I don't think I follow it. https://github.com/getsentry/eap-planning/issues/169 requires us to distinguish between empty and null, whereas you seem to suggest that we should return null when no rows match the condition (i.e. empty result). Isn't that contradictory to what's expected in the ticket? |
@onkar Our protobuf definition currently doesn't allow us to distinguish between a null result and a default result (i.e. empty). The point I was trying to make is that we should distinguish between them because a default value (e.g. 0) can actually represent a result (e.g. if you add -2 + 2) so we cannot use empty/default values to represent the case where no rows were aggregated. |
bc8f702
to
ab0fa3c
Compare
8b6d348
to
b26b993
Compare
if you have the following filter in your `TraceItemTableRequest` `exists(sentry.duration_ms)` it will get translated to `isNotNull(sentry.duration_ms)` which causes a clickhouse error bc that column doesnt actually exists, everywhere else is is translated to `duration_ms AS sentry.duration_ms` so this PR fixes that bug by translating these special columns properly by using the `attribute_key_to_expression` function that is used everywhere else. honestly I dont know many internal details of this function so im slightly worried that there will be edge cases that could break things. but based on the name of the function and how i see it used elsewhere in the codebase, I will assume that if the tests pass with this change then its fine. this bug is blocking #6817
assert measurement_reliabilities == [ | ||
Reliability.RELIABILITY_LOW, | ||
Reliability.RELIABILITY_UNSPECIFIED, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why have these changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you look to the top of the test, the messages with bar
have no custom_measurement
https://github.com/getsentry/snuba/pull/6817/files/cb324ebce1e4392d60bc65138a8ae15d0af652b9#diff-440653fbc080c88c781f59384bcd3722c93ec37540b50595da42233e67449c1aR478-R481
it used to be the case that those rows would still be included in the results, but due to the new query processor that adds the hasAttribute(custom_measurement)
filter, they are no longer part of the result
this pr addresses this ticket for EndpointTraceItemTable. https://github.com/getsentry/eap-planning/issues/158
It makes it so when we have a query like
it will add
Major change
The major change in this PR is the addition of
SparseAggregateAttributeTransformer
this is a class responsible for transforming the originalTraceItemTableRequest
into the new one with the additional filter. This query transformation step is then added as a step in the resolver that all requests go through.I think it should be sufficient to only do EndpointTraceItemTable and not EndpointTimeSeries since timeseries wont graph the 0 data but ill ask tony to make sure. If timeseries is needed it should only be 1-2 days of extra work.