Dashboard reporting incorrect notification counts #1369
Navigate to an affected service and click on the "Emails sent in the last week" link and view the sent notifications. You can verify, without using the database or Redis keys, that the counts are incorrect by:
This implies we have the information we need to display the correct counts somewhere in the data we pass to the dashboard. We can use that data temporarily as a stop-gap measure while we determine where things are going wrong in our current process for fetching and displaying the notification counts.
Possibly related: #1378
Hey team! Please add your planning poker estimate with Zenhub @andrewleith @jzbahrai @whabanks
Started work on a Jupyter notebook that compares Redis notification counts to notification counts in the DB, to identify services with discrepancies and make investigation simpler.
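A minimal sketch of the kind of comparison such a notebook could run. This is not the notebook's actual code: the function name `find_discrepancies`, the `Discrepancy` type, and the idea of loading both sets of counts into plain dicts keyed by service ID are all assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple


class Discrepancy(NamedTuple):
    # Hypothetical record type, not taken from the Notify codebase.
    service_id: str
    redis_count: int
    db_count: int


def find_discrepancies(
    redis_counts: Dict[str, int], db_counts: Dict[str, int]
) -> List[Discrepancy]:
    """Return services whose Redis count differs from their DB count.

    Both inputs are assumed to be already loaded into memory as
    {service_id: count}. A service missing from one side counts as 0.
    """
    service_ids = sorted(set(redis_counts) | set(db_counts))
    results = []
    for service_id in service_ids:
        redis_count = redis_counts.get(service_id, 0)
        db_count = db_counts.get(service_id, 0)
        if redis_count != db_count:
            results.append(Discrepancy(service_id, redis_count, db_count))
    return results


# Made-up sample data: service "b" disagrees, service "c" is absent from Redis.
sample = find_discrepancies({"a": 5, "b": 3}, {"a": 5, "b": 4, "c": 1})
for row in sample:
    print(row.service_id, row.redis_count, row.db_count)
```

Sorting the service IDs keeps the output stable between runs, which makes it easier to diff results while the investigation is ongoing.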
Uncovered some leads from investigating affected services, identified via the Jupyter notebook mentioned above.
Next steps / plan moving forward:
After further investigation with both @andrewleith and @jzbahrai we've narrowed the cause of the discrepancies down to a couple of issues, when fetching notifications from the
For provincial services with a retention period of 3 days, this means we are not fetching notifications for the entire previous week.
This leaves a 5-hour window between 00:30 and 05:30 where not all notifications for the week are being collected.
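To make the retention point concrete, here is a small arithmetic sketch. It assumes, for illustration only, that the weekly count reads from a single source that only holds notifications for the service's retention period. The function name `missing_days` and the 7-day window default are hypothetical.

```python
def missing_days(retention_days: int, window_days: int = 7) -> int:
    """Days of the reporting window that a retention-limited source cannot cover.

    Assumption: the source holds exactly `retention_days` days of data,
    and the dashboard wants `window_days` days.
    """
    return max(window_days - retention_days, 0)


# A 3-day retention period leaves 4 of the 7 days uncovered.
print(missing_days(3))
# A retention period equal to or longer than the window leaves no gap.
print(missing_days(7))
print(missing_days(10))
```

Under that assumption, any service with a retention period shorter than the reporting window needs its older notifications fetched from somewhere else to get a full weekly count.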
Further investigation/analysis confirms the issue with UTC times affecting counts.
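A short sketch of how a UTC day boundary and a local day boundary can put the same notification into different days. The fixed UTC-5 offset, the sample timestamp, and the helper names `utc_day` and `local_day` are assumptions for illustration, not values taken from the Notify code.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-5 offset stands in for the local timezone.
# A real implementation would need a proper timezone database to handle DST.
LOCAL_TZ = timezone(timedelta(hours=-5))


def utc_day(ts: datetime) -> str:
    """Calendar day of an aware timestamp when bucketed in UTC."""
    return ts.astimezone(timezone.utc).date().isoformat()


def local_day(ts: datetime) -> str:
    """Calendar day of the same timestamp when bucketed in local time."""
    return ts.astimezone(LOCAL_TZ).date().isoformat()


# Hypothetical notification sent at 22:00 local time.
sent_at = datetime(2023, 10, 17, 22, 0, tzinfo=LOCAL_TZ)

print(local_day(sent_at))  # day as the user experiences it
print(utc_day(sent_at))    # day as a UTC-based query would bucket it
```

If one part of the pipeline buckets by UTC day and another by local day, notifications sent near the boundary are attributed to different days, so weekly totals built from the two can disagree.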
PR was reviewed, some small refactors to improve testability to come before merging.
Good to go for QA once code freeze lifts
Code has been merged and is ready for QA in staging.
Current state:
Notes for when this is picked up later
Made some headway on this. Looks like there are 2 areas causing issues:
@jzbahrai to review today.
@jzbahrai QA'd and will move this back to product backlog for now
This involves the notification table and the notification history table. When we store into the facts table, it is for a certain timeframe, and our timeframes are not actually adding up. When we download the report with aggregate data, the totals do not add up either. We teamed up with Core and found that one of the Celery tasks doing the aggregation was using a different time. There are many different parts involved.
Revisit priority in Q2 (early July)
This is now complete. Stats are matching on the dashboard, the notifications report page, as well as the notification reports download. 🎉
Describe the bug
Notification counts, under the "Sent in the last week" section of a user's dashboard, are being reported incorrectly. The counts are much larger than the actual count of sent notifications found in the DB and in Redis under the total_notifications key for that service.
Bug Severity
See examples in the documentation
SEV-2 Major
To Reproduce
Steps to reproduce are currently unknown
Expected behavior
Notification counts under the "Sent in the last week" section should be reported accurately.
Impact
Users are unable to accurately evaluate their sent notification count against their failure and bounce rates, making it very confusing to understand the current state of their service. Users may feel deterred from addressing their problem email addresses due to this confusion, which could have an effect on our overall bounce rate in AWS. Users who send frequently in the morning are more affected.
Impact on Notify users:
Confusion around how many notifications a user's service has sent. This makes it difficult for a user to understand their send limits, failure rates, and bounce rates. There is also concern, affecting trust in their service, that recipients may have received duplicate emails.
Impact on Recipients:
None that we are aware of at this time. There is no evidence to suggest that any notification sends have been duplicated.
Impact on Notify team:
Increased load on the support team needing to answer tickets to clarify to users that the counts are incorrect.
Screenshots
This user's dashboard is reporting 28,712 sent notifications.
[Private Zenhub Image](https://api.zenhub.com/attachedFiles/eyJfcmFpbHMiOnsibWVzc2FnZSI6IkJBaHBBMVpNQVE9PSIsImV4cCI6bnVsbCwicHVyIjoiYmxvYl9pZCJ9fQ==--89aed9ef8ef19b7dc7778fd2145ab85009a7b16a/image.png)
However in actuality over the last week, the user has only sent a total of
Additional context
This issue was brought to our attention by two support tickets asking about the discrepancy:
https://cds-snc.freshdesk.com/a/tickets/15832
https://cds-snc.freshdesk.com/a/tickets/15825
Discussion threads:
https://gcdigital.slack.com/archives/C03FA4DJCCU/p1697628973581769
https://gcdigital.slack.com/archives/C03FA4DJCCU/p1697554077616559