Monitor celery tasks in cloudwatch (PP-1150) #1813
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1813      +/-   ##
==========================================
+ Coverage   90.01%   90.04%   +0.02%
==========================================
  Files         299      300       +1
  Lines       39643    39742      +99
  Branches     8596     8615      +19
==========================================
+ Hits        35686    35784      +98
- Misses       2626     2627       +1
  Partials     1331     1331
Flags with carried forward coverage won't be shown.
bdec0af to 679401c
Looks good to me, assuming you fix the mypy issues first.
@jonathangreen: would it also be helpful to aggregate stats by task type? It wasn't completely clear to me whether the task type is being captured. If it is not, it would be helpful to get a sense of task run time by task type, so that we could monitor performance in a more granular way if necessary. For example, if a change to a query within a job started to cause a bottleneck, we would be in a better position to troubleshoot it.
@dbernstein The stats we push have a dimension for the task name, so per task name we get the number of tasks that ran, the number that failed, and the task runtime. I think this is what you are asking for. If not, what would you use to group tasks into a "type"?
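To illustrate what grouping by a task-name dimension gives you, here is a minimal sketch in plain Python. It is not the code from this PR: the `TaskStats` structure, the `aggregate` helper, the event tuples, and the task names are all hypothetical, and it assumes a runtime is recorded for failed tasks as well as successful ones.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskStats:
    """Counters for a single task name (hypothetical structure, not from the PR)."""

    succeeded: int = 0
    failed: int = 0
    runtimes: List[float] = field(default_factory=list)


def aggregate(events):
    """Group (task_name, ok, runtime_seconds) tuples by task name."""
    stats = defaultdict(TaskStats)
    for task_name, ok, runtime in events:
        entry = stats[task_name]
        if ok:
            entry.succeeded += 1
        else:
            entry.failed += 1
        # Assumption: runtime is tracked for every finished task, failed or not.
        entry.runtimes.append(runtime)
    return dict(stats)


# Made-up task names and timings, for illustration only.
events = [
    ("example_import_task", True, 2.0),
    ("example_import_task", False, 4.0),
    ("example_email_task", True, 0.5),
]
stats = aggregate(events)
```

Because every counter is keyed by task name, a slowdown in one job shows up in that job's runtimes without being averaged away by the others.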
I'm going to merge this one since the tests are passing, so I can test it out on Minotaur tomorrow. @dbernstein, if there are additional stats you want captured or dimensions added to the stats, let's discuss on PP-1150. We can add them as a follow-up on that ticket.
Description
Add new custom metrics for Celery tasks and queues in CloudWatch:
Tasks:
  TaskFailed
  TaskSucceeded
  TaskRuntime
Queues:
  QueueWaiting
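As a rough sketch of how metrics with these names could be shaped for CloudWatch, the snippet below builds a `MetricData` payload in the format accepted by boto3's `put_metric_data`. This is not the PR's implementation: the dimension names `TaskName` and `QueueName`, the `Example/Celery` namespace, the units, and the helper functions are assumptions made for illustration.

```python
from datetime import datetime, timezone

# Assumed dimension names; the PR discussion only says task metrics have a task-name dimension.
TASK_DIMENSION = "TaskName"
QUEUE_DIMENSION = "QueueName"


def task_metrics(task_name, succeeded, failed, runtimes, timestamp):
    """Build CloudWatch MetricData entries for one task name (hypothetical helper)."""
    base = {
        "Dimensions": [{"Name": TASK_DIMENSION, "Value": task_name}],
        "Timestamp": timestamp,
    }
    data = [
        {**base, "MetricName": "TaskSucceeded", "Value": succeeded, "Unit": "Count"},
        {**base, "MetricName": "TaskFailed", "Value": failed, "Unit": "Count"},
    ]
    if runtimes:
        # "Values" lets a single datum carry several observations.
        data.append(
            {**base, "MetricName": "TaskRuntime", "Values": list(runtimes), "Unit": "Seconds"}
        )
    return data


def queue_metric(queue_name, waiting, timestamp):
    """Build the QueueWaiting entry for one queue (hypothetical helper)."""
    return {
        "MetricName": "QueueWaiting",
        "Dimensions": [{"Name": QUEUE_DIMENSION, "Value": queue_name}],
        "Timestamp": timestamp,
        "Value": waiting,
        "Unit": "Count",
    }


now = datetime.now(timezone.utc)
metric_data = task_metrics("example_task", 3, 1, [0.4, 1.2, 0.9, 2.5], now)
metric_data.append(queue_metric("default", 7, now))

# With boto3 installed and AWS credentials configured, the payload could be sent with:
#   boto3.client("cloudwatch").put_metric_data(
#       Namespace="Example/Celery", MetricData=metric_data
#   )
```

Building the payload separately from the network call keeps this part testable locally, which helps given how hard the full path is to exercise outside AWS.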
Motivation and Context
Allow us to monitor our Celery queues via CloudWatch.
How Has This Been Tested?
This one is tough to test fully locally, so it may need some iteration after it is actually deployed to AWS and pushing metrics into CloudWatch.