API gateway timing out on spreadsheet upload #308
Comments
Looking at the uploaded files, they actually had an extra named column that wasn't in the template. I verified with a small send that this all works (i.e. the proper columns are used to construct the notification). In this case, they have a
Note that the original files were |
Recreated the template on staging and sent a 50K-row file. The upload was fine. After clicking the "Send all now" button, I eventually got an error. When I went back to the dashboard, the job was processing normally. |
Add some logging to help figure out in staging where the time is being spent: |
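A minimal sketch of what this kind of timing log could look like, assuming a Python code base. The logger name `upload_timing` and the helper `log_duration` are hypothetical names used for illustration; they are not taken from the actual change.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name, chosen for this example only.
logger = logging.getLogger("upload_timing")


@contextmanager
def log_duration(step, log=logger):
    """Log how long the wrapped block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        log.info("%s took %.3f seconds", step, elapsed)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Wrap each suspect step separately so the slow one stands out in the logs.
    with log_duration("check service limit"):
        time.sleep(0.1)  # stand-in for the real work
```

Wrapping each suspect step in its own `log_duration` block gives one log line per step, which is easy to filter for in a log query.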
In Athena with the query
We can see the IRCC requests timing out after 29 seconds, but all other recent jobs have taken less than 19 seconds. Don't know how big those jobs were though. |
Logging is in staging! Running the big send again we see from this query: So checking that the service will not go over its limit is taking about a minute! Since we are sending emails, this is the code that is taking a minute to run:
There are three things here that could be the issue:
|
With a bit more logging we get So it's taking half the time now. I believe what is happening is that |
We can get the notification count from the S3 object's metadata instead. Preparing a PR... |
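As a rough sketch of the idea (not the actual PR), reading the count from object metadata could look like the following. It assumes a boto3 S3 client is passed in and that the count was stored at upload time under a user metadata key; the key name `notification_count`, the function name, and the bucket and object names in the usage comment are all hypothetical.

```python
def get_notification_count(s3_client, bucket_name, object_key,
                           metadata_key="notification_count"):
    """Return the notification count stored in the object's user metadata.

    Uses a HEAD request, so the spreadsheet body is never downloaded or parsed.
    Returns None when the metadata key is absent.
    `metadata_key` is an assumed name, not one confirmed by the issue.
    """
    response = s3_client.head_object(Bucket=bucket_name, Key=object_key)
    metadata = response.get("Metadata", {})
    if metadata_key not in metadata:
        return None
    # S3 user metadata values are strings, so convert explicitly.
    return int(metadata[metadata_key])


# Usage with a real client (requires boto3 and AWS credentials):
#   import boto3
#   count = get_notification_count(boto3.client("s3"), "my-bucket", "my-job.csv")
```

The point of the design is that a HEAD request costs about the same regardless of file size, whereas counting rows means fetching and parsing the whole spreadsheet.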
This should fix it! |
Testing on staging with the big IRCC file now takes about one second, with no timeouts. |
I will QA this morning. |
Jimmy will QA for sure this time! |
Tested and working with the same template that IRCC used and with a 50K-entry spreadsheet. The upload took a few seconds (~5s) but there were no timeouts. The step after clicking "Send all" was almost instantaneous. This can be considered fixed and working. |
Describe the bug
The upload of a spreadsheet by IRCC triggered a 502 error on the API gateway, so it seems that the API is taking too long to process the request.
Bug Severity
See examples in the documentation
SEV-2 Major - This impedes the user experience, suggests a failure on spreadsheet upload, and leads to confusion. The notifications are still sent successfully, hence this is not a critical issue.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
No errors
Impact
We get repeated 502 errors on these spreadsheet uploads. This can require the support team to respond to warning or critical alerts.
QA
Screenshots
No screenshot available for now. The whole page said to retry the operation later, so the error is not even properly surfaced, since the page goes blank.
Additional context
This error was seen repeatedly during the mass send of 1M emails over 2 days by one of our users.
The service and template involved.