feat(aci milestone 3): call process_data_sources in subscription processor #81788

Closed
wants to merge 11 commits

Conversation

Contributor

@mifu67 mifu67 commented Dec 5, 2024

Wrap the subscription update in a DataPacket and call process_data_sources, then feed the results into process_detectors and log the result (process_detectors calls a dummy evaluate function for now). Step one of the pipeline.
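A rough sketch of the flow this description refers to; the import paths, the process_update placement, and the log call at the end are illustrative assumptions, not part of the change itself:

import logging

from sentry.workflow_engine.models import DataPacket  # assumed import path
from sentry.workflow_engine.processors import (  # assumed import path
    process_data_sources,
    process_detectors,
)

logger = logging.getLogger(__name__)


def process_update(self, subscription_update):
    # Inside SubscriptionProcessor: wrap the raw snuba subscription update
    # in a DataPacket...
    data_packet = DataPacket(
        query_id=self.subscription.snuba_query.id, packet=subscription_update
    )
    # ...look up the detectors connected to this data source...
    for packet, detectors in process_data_sources(
        [data_packet], query_type=self.subscription.type
    ):
        # ...and run them; evaluate() is still a dummy, so for now the
        # results are just logged.
        results = process_detectors(packet, detectors)
        logger.info("process_detectors.results", extra={"results": results})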

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Dec 5, 2024

codecov bot commented Dec 6, 2024

Codecov Report

Attention: Patch coverage is 78.78788% with 7 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines            Patch %   Lines
src/sentry/incidents/grouptype.py   65.00%    7 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           master   #81788    +/-   ##
========================================
  Coverage   80.35%   80.36%            
========================================
  Files        7252     7257     +5     
  Lines      320551   320812   +261     
  Branches    20859    20859            
========================================
+ Hits       257577   257812   +235     
- Misses      62579    62605    +26     
  Partials      395      395            

src/sentry/incidents/subscription_processor.py Outdated
data_packet = DataPacket(
    query_id=self.subscription.snuba_query.id, packet=subscription_update
)
detectors = process_data_sources([data_packet], query_type=self.subscription.type)
Contributor

🤔 - I was intending query_type to be more of a DataSource identifier, so this would be something like snuba_subscription or along those lines. Maybe we should export those from data_sources as an enum and apply it to the type field.

Contributor Author

I think Colleen is writing an enum for this purpose!

Contributor Author

TODO: change query_type to be DataSourceType.SNUBA_QUERY_SUBSCRIPTION once Colleen's PR lands
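For reference, a hypothetical sketch of the enum under discussion; the real definition lands in Colleen's PR, so the member name and value here are assumptions:

from enum import Enum


class DataSourceType(Enum):
    # Hypothetical member; the actual enum is defined in a separate PR.
    SNUBA_QUERY_SUBSCRIPTION = "snuba_query_subscription"


# The call site in this PR would then become:
# process_data_sources([data_packet], query_type=DataSourceType.SNUBA_QUERY_SUBSCRIPTION)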

src/sentry/incidents/subscription_processor.py Outdated
@github-actions github-actions bot added the Scope: Frontend Automatically applied to PRs that change frontend components label Dec 10, 2024
@mifu67 mifu67 force-pushed the mifu67/aci/process-data-sources branch from 60aa52f to f0768e7 Compare December 10, 2024 18:27
@getsentry getsentry deleted a comment from github-actions bot Dec 10, 2024
@mifu67 mifu67 removed the Scope: Frontend Automatically applied to PRs that change frontend components label Dec 10, 2024
@@ -2956,6 +2958,38 @@ def test_resolved_alert_updates_metric_issue(self, mock_produce_occurrence_to_ka
assert status_change.new_status == GroupStatus.RESOLVED
assert occurrence.fingerprint == status_change.fingerprint

@with_feature("organizations:workflow-engine-m3-process")
Contributor Author

This is pretty messy. Any suggestions for how we can better log the data flowing through the pipeline?

Contributor

I'm adding centralized logging to the process_* methods, so we shouldn't need to add much in the way of metrics / logging here. It will all be based on the types set in each of those flows, so here we should only have to worry about anything that's specific to the metric alerts implementation.

If we make the change to add process_data_packets as I was talking about before, you would just have to verify that process_data_packets is invoked correctly in the different states, and then you can assume the platform handles the rest. (yay #platforms)

If you wanted to test this without that method, the tests would probably feed different snuba query results to a configured detector and check how each one invokes process_data_sources / process_detectors.
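A sketch of that first testing approach, asserting only that the wrapper is invoked and trusting the platform for the rest; process_data_packets is the not-yet-written method from the earlier suggestion, and send_update / self.sub / self.trigger are assumed test helpers:

from unittest import mock

@with_feature("organizations:workflow-engine-m3-process")
def test_invokes_process_data_packets(self):
    with mock.patch(
        "sentry.incidents.subscription_processor.process_data_packets"
    ) as mock_process_data_packets:
        # send_update is an assumed helper that pushes a subscription update
        # through the SubscriptionProcessor.
        self.send_update(self.rule, self.trigger.alert_threshold + 1)
    assert mock_process_data_packets.call_count == 1
    (data_packets,), _ = mock_process_data_packets.call_args
    assert data_packets[0].query_id == self.sub.snuba_query.id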

@mifu67 mifu67 marked this pull request as ready for review December 10, 2024 21:51
@mifu67 mifu67 requested a review from a team as a code owner December 10, 2024 21:51
@mifu67 mifu67 requested a review from a team December 11, 2024 01:06
Comment on lines +26 to +27
result=None,
event_data=None,
Contributor

Might be nice for this to be a little more complete for tests by adding the occurrence and event data,

occurrence, event_data = self.build_occurrence_and_event_data("dummy_group", 0, PriorityLevel.HIGH)

and then update the result / event_data,

Suggested change
result=None,
event_data=None,
result=occurrence,
event_data=event_data,



class MetricAlertDetectorHandler(StatefulDetectorHandler[QuerySubscriptionUpdate]):
    pass
    def evaluate(
Contributor

Should we be overriding the evaluate method from a stateful detector? Just looking a little closer at the base class, it looks like it would call through to check the detector's conditions, and that should all work from the base once the models are constructed. (Let me double check with Dan on his intentions for the code in this area, though.)
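Put differently, the suggestion is to keep the handler roughly as it was and let the stateful base class drive evaluation; a minimal sketch of that direction (whether the base evaluate() actually covers this case is exactly what's being double-checked):

class MetricAlertDetectorHandler(StatefulDetectorHandler[QuerySubscriptionUpdate]):
    # No evaluate() override: the stateful base class is expected to walk the
    # detector's configured conditions once the workflow-engine models exist.
    pass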

src/sentry/incidents/subscription_processor.py Outdated
src/sentry/incidents/subscription_processor.py Outdated
Comment on lines 372 to 376
detectors = process_data_sources([data_packet], query_type=self.subscription.type)
results = []
for data_packet, detectors in detectors:
    results.append(process_detectors(data_packet, detectors))
# NOTE: this is temporary, to verify in the tests that the right information is flowing through the pipeline.
Contributor

I was thinking of adding another process method that wraps process_data_sources and process_detectors and then returns a list of the results.

That would turn this code into something along these lines:

Suggested change
detectors = process_data_sources([data_packet], query_type=self.subscription.type)
results = []
for data_packet, detectors in detectors:
    results.append(process_detectors(data_packet, detectors))
# NOTE: this is temporary, to verify in the tests that the right information is flowing through the pipeline.
results = process_data_packets([data_packet], query_type=self.subscription.type)

Inside of process_data_packets, I was thinking we could centralize the metrics / logging. Whatcha think?
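A rough sketch of what that wrapper might look like; process_data_packets doesn't exist in this PR, and the log key is a placeholder:

import logging

logger = logging.getLogger(__name__)


def process_data_packets(data_packets, query_type):
    # process_data_sources / process_detectors assumed imported from the
    # workflow engine processors. The wrapper exists so every data-source type
    # gets the same centralized metrics / logging, rather than each caller
    # (e.g. the subscription processor) adding its own.
    results = []
    for data_packet, detectors in process_data_sources(data_packets, query_type):
        detector_results = process_detectors(data_packet, detectors)
        results.append(detector_results)
        logger.info(
            "workflow_engine.process_data_packets",
            extra={"query_type": query_type, "num_results": len(detector_results)},
        )
    return results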

Contributor Author

I think this is an excellent idea.


src/sentry/incidents/grouptype.py Outdated
Contributor Author

mifu67 commented Dec 16, 2024

Closing because the rebase is too complicated and there aren't that many lines anyway—I'll start a new PR instead.

@mifu67 mifu67 closed this Dec 16, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Jan 1, 2025