We were using waitUntilFunctionUpdated to wait for a function to stabilize after updating it. We have since moved to waitUntilFunctionUpdatedV2; I don't know whether this matters.
Since then, we have gotten reports from the Amplify team that tests were failing with the error:
TimeoutError: Resource is not in the expected state due to waiter status: TIMEOUT. Waiter has timed out.
We spent days looking over timings, comparing code between the v2 and v3 waiters, and theorizing about what might be causing the waiter to hit the timeout. We ultimately had to give up because we couldn't think of anything.
Then, 2 weeks later, at the least opportune time (blocked days for re:Invent), we figured out what the issue was: the policy they were using had permissions for lambda:GetFunction, but it needed permissions for lambda:GetFunctionConfiguration.
We could have known this immediately, but the waiter swallowed the permissions error and instead reported an "oh well the service doesn't seem to stabilize ¯_(ツ)_/¯" error.
This is extremely unexpected behavior, and this poor error reporting has cost us many days, plus sweat and stress.
I understand you're probably doing this to proceed in the face of transient errors, but I would ask for one of the following behaviors, in order of preference:
1. If you catch a non-retryable error, throw it immediately instead of continuing to wait.
2. Upon throwing the TimedOut error, if you notice that you've been consistently getting the same error again and again over the period of the wait (not a single success), just throw that error instead.
3. Upon throwing the TimedOut error, include the (unique?) error messages of the errors you've seen in the error description.
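To make the first and third options concrete, here is a minimal sketch of a polling loop that rethrows non-retryable errors immediately and, on timeout, lists the unique errors it saw. This is not the SDK's waiter implementation: `pollUntil`, `PollOptions`, `PollTimeoutError`, and `isRetryable` are hypothetical names made up for illustration.

```typescript
// Hypothetical sketch; none of these names are AWS SDK APIs.
interface PollOptions {
  maxWaitMs: number; // total time budget for the wait
  delayMs: number; // pause between attempts
  // Caller-supplied classifier: errors it rejects are rethrown at once.
  isRetryable: (err: unknown) => boolean;
}

class PollTimeoutError extends Error {
  seenErrors: string[];

  constructor(seenErrors: string[]) {
    super(
      seenErrors.length > 0
        ? "Waiter timed out. Errors seen while polling: " + seenErrors.join("; ")
        : "Waiter timed out."
    );
    this.name = "PollTimeoutError";
    this.seenErrors = seenErrors;
  }
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function pollUntil(
  check: () => Promise<boolean>,
  options: PollOptions
): Promise<void> {
  const deadline = Date.now() + options.maxWaitMs;
  const seen = new Set<string>(); // unique error descriptions, in first-seen order

  while (Date.now() < deadline) {
    try {
      if (await check()) {
        return; // resource reached the expected state
      }
    } catch (err) {
      // Option 1: a non-retryable error (e.g. access denied) surfaces immediately.
      if (!options.isRetryable(err)) {
        throw err;
      }
      // Option 3: remember retryable errors so the timeout can report them.
      seen.add(err instanceof Error ? err.name + ": " + err.message : String(err));
    }
    await sleep(options.delayMs);
  }

  throw new PollTimeoutError(Array.from(seen));
}
```

With a loop like this, a permissions error would reach the caller on the first poll instead of being reported as a generic timeout after the full wait. Which errors count as retryable is left to the caller here; a real waiter would need its own classification.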
kuhe added the p2 and queued labels and removed the needs-triage label on Dec 2, 2024.
SDK version number
@aws-sdk/[email protected]
Which JavaScript Runtime is this issue in?
Node.js
Details of the browser/Node.js/ReactNative version
v22.11.0
Reproduction Steps
Create and assume a role that has the statement:
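The exact statement is not shown here. As a hypothetical stand-in based on the description of the bug (the role allowed lambda:GetFunction but lacked lambda:GetFunctionConfiguration), it might look like the object below. The `Resource` value and the `allowsAction` helper are assumptions for illustration, and the helper does naive exact matching rather than real IAM policy evaluation.

```typescript
// Hypothetical reconstruction of the role's statement, not the original.
interface PolicyStatement {
  Effect: "Allow" | "Deny";
  Action: string[];
  Resource: string;
}

const statement: PolicyStatement = {
  Effect: "Allow",
  Action: ["lambda:GetFunction"], // lambda:GetFunctionConfiguration is deliberately absent
  Resource: "*", // assumption: the original resource scope is unknown
};

// Naive exact-match check that ignores wildcards, case, and Deny precedence.
function allowsAction(s: PolicyStatement, action: string): boolean {
  return s.Effect === "Allow" && s.Action.indexOf(action) !== -1;
}

const canGetFunction = allowsAction(statement, "lambda:GetFunction");
const canGetConfiguration = allowsAction(statement, "lambda:GetFunctionConfiguration");
```

Under these assumptions `canGetFunction` is true and `canGetConfiguration` is false, which is the mismatch the waiter hid behind a timeout.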
Run the following code:
Observed Behavior
After 10 seconds, see:
Expected Behavior
Immediately, see:
Possible Solution
No response
Additional Information/Context
No response