AWS Amplify Auth v1 to v2 migration fails 5-10% of the time, logs user out #2929

Open
1 task done
camhart opened this issue Sep 24, 2024 · 37 comments
Labels
auth Related to the Auth category/plugins bug Something isn't working

Comments

@camhart

camhart commented Sep 24, 2024

Before opening, please confirm:

Language and Async Model

Java

Amplify Categories

Authentication

Gradle script dependencies

// Put output below this line

implementation 'com.amplifyframework:aws-auth-cognito:2.21.0'

Environment information

# Put output below this line
C:\Users\Cam\projects\project-android>gradlew --version

------------------------------------------------------------
Gradle 8.7
------------------------------------------------------------

Build time:   2024-03-22 15:52:46 UTC
Revision:     650af14d7653aa949fce5e886e685efc9cf97c10

Kotlin:       1.9.22
Groovy:       3.0.17
Ant:          Apache Ant(TM) version 1.10.13 compiled on January 4 2023
JVM:          20.0.2 (Oracle Corporation 20.0.2+9-78)
OS:           Windows 10 10.0 amd64

Please include any relevant guides or documentation you're referencing

No response

Describe the bug

I've updated my Android app to use AWS Amplify V2. I deployed it to beta users, and ~5-10% of them had issues with the data migration. Essentially, they ended up logged out of the app after it updated and migrated from v1 to v2. This shouldn't happen. If I have those customers uninstall and reinstall the Android app and log in again, everything works from then on, but this isn't an acceptable solution.

I created a ticket with AWS support and they told me to create a GitHub issue. See case 172444220700816.

Here's an example log output when the app attempts to make API calls but is unable to due to being logged out.

D/ 09-23 15:31:15.551 BackendCallTask( 5715): AUTH fetchAuthSessionRequest
D/ 09-23 15:31:16.729 BackendCallTask( 5715): AUTH fetchAuthSessionRequest result, isSignedIn=true
D/ 09-23 15:31:16.729 BackendCallTask( 5715): AUTH exception: SessionExpiredException{message=Your session has expired., cause=NotAuthorizedException(message=Invalid Refresh Token.), recoverySuggestion=Please sign in and reattempt the operation.}
W/ 09-23 15:31:16.732 System.err( 5715): SessionExpiredException{message=Your session has expired., cause=NotAuthorizedException(message=Invalid Refresh Token.), recoverySuggestion=Please sign in and reattempt the operation.}
W/ 09-23 15:31:16.732 System.err( 5715):  at com.amplifyframework.auth.cognito.actions.FetchAuthSessionCognitoActions$refreshUserPoolTokensAction$$inlined$invoke$1.execute(SourceFile:48)
W/ 09-23 15:31:16.732 System.err( 5715):  at com.amplifyframework.auth.cognito.actions.FetchAuthSessionCognitoActions$refreshUserPoolTokensAction$$inlined$invoke$1$1.invokeSuspend(Unknown Source:12)
W/ 09-23 15:31:16.733 System.err( 5715): Caused by: NotAuthorizedException(message=Invalid Refresh Token.)
W/ 09-23 15:31:16.733 System.err( 5715):  at aws.sdk.kotlin.services.cognitoidentityprovider.model.NotAuthorizedException$Builder.a(SourceFile:4)
W/ 09-23 15:31:16.733 System.err( 5715):  at aws.sdk.kotlin.services.cognitoidentityprovider.serde.NotAuthorizedExceptionDeserializer.c(SourceFile:27)
W/ 09-23 15:31:16.733 System.err( 5715):  at aws.sdk.kotlin.services.cognitoidentityprovider.serde.InitiateAuthOperationDeserializerKt.d(SourceFile:344)
W/ 09-23 15:31:16.733 System.err( 5715):  at aws.sdk.kotlin.services.cognitoidentityprovider.serde.InitiateAuthOperationDeserializerKt.b(SourceFile:1)
W/ 09-23 15:31:16.733 System.err( 5715):  at aws.sdk.kotlin.services.cognitoidentityprovider.serde.InitiateAuthOperationDeserializer.c(SourceFile:43)
W/ 09-23 15:31:16.733 System.err( 5715):  at aws.sdk.kotlin.services.cognitoidentityprovider.serde.InitiateAuthOperationDeserializer.b(SourceFile:1)
D/ 09-23 15:31:28.709 BackendCallTask( 5715): AUTH fetchAuthSessionRequest
D/ 09-23 15:31:28.963 BackendCallTask( 5715): AUTH fetchAuthSessionRequest result, isSignedIn=true
D/ 09-23 15:31:28.963 BackendCallTask( 5715): AUTH exception: SessionExpiredException{message=Your session has expired., cause=NotAuthorizedException(message=Invalid Refresh Token.), recoverySuggestion=Please sign in and reattempt the operation.}
W/ 09-23 15:31:28.963 System.err( 5715): SessionExpiredException{message=Your session has expired., cause=NotAuthorizedException(message=Invalid Refresh Token.), recoverySuggestion=Please sign in and reattempt the operation.}
W/ 09-23 15:31:28.963 System.err( 5715):  at com.amplifyframework.auth.cognito.actions.FetchAuthSessionCognitoActions$refreshUserPoolTokensAction$$inlined$invoke$1.execute(SourceFile:48)
W/ 09-23 15:31:28.963 System.err( 5715):  at com.amplifyframework.auth.cognito.actions.FetchAuthSessionCognitoActions$refreshUserPoolTokensAction$$inlined$invoke$1$1.invokeSuspend(Unknown Source:12)
W/ 09-23 15:31:28.963 System.err( 5715): Caused by: NotAuthorizedException(message=Invalid Refresh Token.)
W/ 09-23 15:31:28.963 System.err( 5715):  at aws.sdk.kotlin.services.cognitoidentityprovider.model.NotAuthorizedException$Builder.a(SourceFile:4)
W/ 09-23 15:31:28.963 System.err( 5715):  at aws.sdk.kotlin.services.cognitoidentityprovider.serde.NotAuthorizedExceptionDeserializer.c(SourceFile:27)
W/ 09-23 15:31:28.963 System.err( 5715):  at aws.sdk.kotlin.services.cognitoidentityprovider.serde.InitiateAuthOperationDeserializerKt.d(SourceFile:344)
W/ 09-23 15:31:28.963 System.err( 5715):  at aws.sdk.kotlin.services.cognitoidentityprovider.serde.InitiateAuthOperationDeserializerKt.b(SourceFile:1)
W/ 09-23 15:31:28.963 System.err( 5715):  at aws.sdk.kotlin.services.cognitoidentityprovider.serde.InitiateAuthOperationDeserializer.c(SourceFile:43)
W/ 09-23 15:31:28.963 System.err( 5715):  at aws.sdk.kotlin.services.cognitoidentityprovider.serde.InitiateAuthOperationDeserializer.b(SourceFile:1)
D/ 09-23 15:31:3

I'd like to request a feature addition to this library, where the migration creates persistent migration logs that the app developer can request to help troubleshoot issues like this. Also, it'd be good to be able to retry the migration. Right now it seems to destroy all the old v1 data and just assume everything worked, even when it didn't. The migration fails sporadically and I have no clue why, with no recourse for troubleshooting. I have to wait for a customer support ticket complaining about the problem in order to get logs, but they aren't very helpful, as they just show the user was signed out for some reason. I've been using AWS Amplify Auth v1 for several years without any issue keeping users logged in.
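
As a rough illustration of the kind of persistent record being requested, here is a minimal app-side sketch in Java. It is not an Amplify API: `MigrationJournal`, its method names, and the tab-separated line format are all hypothetical. The idea is only that the app appends one line per auth-related event after an update, so the file can later be attached to a support ticket.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.Instant;
import java.util.List;

/** Hypothetical app-side journal; nothing here is provided by Amplify. */
final class MigrationJournal {
    private final Path file;

    MigrationJournal(Path file) {
        this.file = file;
    }

    /** Appends one line: timestamp, app version, step name, outcome. */
    void record(String appVersion, String step, String outcome) {
        String line = Instant.now() + "\t" + appVersion + "\t" + step + "\t" + outcome
                + System.lineSeparator();
        try {
            // CREATE + APPEND keeps earlier entries across app restarts.
            Files.write(file, line.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns every recorded line, oldest first; empty if nothing was written yet. */
    List<String> entries() {
        try {
            return Files.exists(file)
                    ? Files.readAllLines(file, StandardCharsets.UTF_8)
                    : List.of();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An app could call `record(...)` around its first session fetch after an update, for example with the step name `"firstFetchAuthSession"` and the exception class name as the outcome. The step names are up to the app; the sketch does not observe the migration itself.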

Reproduction steps (if applicable)

I've been unable to reproduce the issue myself.

Code Snippet

// Put your code below this line.

Log output

// Put your logs below this line


amplifyconfiguration.json

{
  "auth": {
    "plugins": {
      "awsCognitoAuthPlugin": {
        "IdentityManager": {
          "Default": {}
        },
        "CredentialsProvider": {
          "CognitoIdentity": {
            "Default": {
              "PoolId": "us-west-2:xxxxxxxxxxxx",
              "Region": "us-west-2"
            }
          }
        },
        "CognitoUserPool": {
          "Default": {
            "PoolId": "us-west-2_xxxxxxxxx",
            "AppClientId": "xxxxxxxxx",
            "AppClientSecret": "xxxxxxxxx",
            "Region": "us-west-2"
          }
        },
        "Auth": {
          "Default": {
            "OAuth": {
              "WebDomain": "cognitoauth.xxxxxxxxx.io",
              "AppClientId": "xxxxxxxx",
              "AppClientSecret": "xxxxxxxxx",
              "SignInRedirectURI": "xxxxxxxx://callback/",
              "SignOutRedirectURI": "xxxxxxxx://signout/",
              "Scopes": [
                "email",
                "openid",
                "profile",
                "aws.cognito.signin.user.admin"
              ]
            },
            "authenticationFlowType": "USER_SRP_AUTH"
          }
        }
      }
    }
  }
}

GraphQL Schema

// Put your schema below this line

Additional information and screenshots

One more detail. V1 of the Amplify Auth library contains code that Google Play flags with big warnings, and Google claims it'll stop accepting app updates that use it. Fixing this issue with the v1 -> v2 migration should be a top priority, as continuing to use v1 in the interim isn't an option. I essentially can't update my app unless it's using Amplify v2.

@github-actions github-actions bot added pending-triage Issue is pending triage pending-maintainer-response Issue is pending response from an Amplify team member labels Sep 24, 2024
@mattcreaser
Member

Sorry to hear you're having issues @camhart. Can you please confirm that you updated directly to 2.21.1 and did not first try to use an older version of v2? There was a known issue in the migration code that was fixed in version 2.16.1.

Is reinstalling the app the only solution? What about calling Amplify.Auth.fetchAuthSession with options specifying forceRefresh = true?

Are there any obvious similarities between the affected users?

@mattcreaser mattcreaser added bug Something isn't working auth Related to the Auth category/plugins labels Sep 24, 2024
@github-actions github-actions bot removed pending-maintainer-response Issue is pending response from an Amplify team member pending-triage Issue is pending triage labels Sep 24, 2024
@ruisebas ruisebas added the pending-maintainer-response Issue is pending response from an Amplify team member label Sep 25, 2024
@camhart
Author

camhart commented Sep 25, 2024

Can you please confirm that you updated directly to 2.21.1 and did not first try to use an older version of v2? There was a known issue in the migration code that was fixed in version 2.16.1.

Yes, we went direct from v1 to v2.21.1.

Is reinstalling the app the only solution? What about calling Amplify.Auth.fetchAuthSession with options specifying forceRefresh = true?

I haven't tried this, but didn't think it would be needed. The SDK is supposed to detect when credentials are expired and handle refreshing them automatically, isn't it?

@mattcreaser
Member

That's correct, it should - I only suggested trying to force refresh the tokens as a way to gather more information about what is going wrong. Another thought is to try catching the exception and invoking signOut.

We will need to investigate this issue to see what's going on - unfortunately it sounds like it will be difficult to reproduce. Any additional details about the affected users would be beneficial.

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending response from an Amplify team member label Oct 4, 2024
@camhart
Author

camhart commented Oct 10, 2024

unfortunately it sounds like it will be difficult to reproduce

Ideally you can add more tools to the library so I can better troubleshoot the issue and provide more info. I'm confident that if I release the app to another 1% of my customers, I'll get a few emails about it. But I don't want to do that until there's some ability to troubleshoot. We need some sort of migration record that indicates what happened during the migration and why it failed. I'm not asking for you to solve it immediately. But adding some support for better troubleshooting of migration issues seems like low-hanging fruit that moves the needle.

@github-actions github-actions bot added the pending-maintainer-response Issue is pending response from an Amplify team member label Oct 10, 2024
@harsh62
Member

harsh62 commented Oct 21, 2024

@camhart Can you please share code snippets so that we can try to reproduce the issue in a local environment? Snippets of how the Auth category is being used in both V1 and V2 would be really helpful in narrowing down how we investigate the issue.
Please share any other details you think will help us isolate the issue.

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending response from an Amplify team member label Oct 21, 2024
@camhart
Author

camhart commented Oct 21, 2024

@harsh62 I don't have code snippets to share that can reproduce the issue. I've tried multiple times with my entire app to replicate the problem and can't replicate it locally, but it is happening. This is why I'm arguing for better tools to investigate/troubleshoot problems relating to the migration.

Here are all the Amplify method calls I use:

  • Amplify.Auth.signIn
  • Amplify.Auth.fetchAuthSession
  • Amplify.Auth.signUp
  • Amplify.Auth.signInWithWebUI
  • Amplify.Auth.fetchUserAttributes
  • Amplify.Auth.confirmSignUp
  • Amplify.Auth.signOut

V1 used the same method calls but adjusted for the API changes between the two. I don't use Amplify for anything else--only Auth.

Please share any other details you think will help us isolate the issue.

My app is a long running background app that stays running 24/7 in the background on the device (it's a parental control app). It automatically launches itself after an app update has occurred.

@github-actions github-actions bot added the pending-maintainer-response Issue is pending response from an Amplify team member label Oct 21, 2024
@harsh62
Member

harsh62 commented Oct 21, 2024

Are you able to isolate if the issue is happening with customers using Amplify.Auth.signInWithWebUI compared to Amplify.Auth.signIn?
Another follow up to that would be, if your customers are able to use Amplify.Auth.signIn and Amplify.Auth.signInWithWebUI interchangeably? i.e. customer could be using Amplify.Auth.signInWithWebUI in Amplify V1 and decided to use Amplify.Auth.signIn in Amplify V2.

If you could answer this, it would greatly narrow down our reproduction codepath.

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending response from an Amplify team member label Oct 21, 2024
@camhart
Author

camhart commented Oct 21, 2024

Are you able to isolate if the issue is happening with customers using Amplify.Auth.signInWithWebUI compared to Amplify.Auth.signIn?

Not easily. If the problem is happening to customers logged in via one of those calls, it's not happening 100% of the time. I can release the app to another 1% of customers and wait for the support tickets to come in, but I'm really hoping to avoid doing that without having better tools in place to troubleshoot the migration.

Another follow up to that would be, if your customers are able to use Amplify.Auth.signIn and Amplify.Auth.signInWithWebUI interchangeably?

They can use one or the other, but not both. Once logged in one way, we don't give them the option to log in again without signing out first.

i.e. customer could be using Amplify.Auth.signInWithWebUI in Amplify V1 and decided to use Amplify.Auth.signIn in Amplify V2.

We don't give customers the ability to log out once the device is set up (there are additional steps they have to take after logging in to set the device up with my app). There's only a very brief window in which they can log out: after the customer has logged in but before the device is set up. Once the device is set up, if they want to log out they need to uninstall and reinstall the app. The customers who've reported the issue to me have all had their devices fully set up, so there is no longer an option for them to log out at that point. So, long story short, it's not possible for them to use Amplify.Auth.signIn and then use Amplify.Auth.signInWithWebUI (or vice versa). Does that make sense?

@github-actions github-actions bot added the pending-maintainer-response Issue is pending response from an Amplify team member label Oct 21, 2024
@harsh62
Member

harsh62 commented Oct 21, 2024

@camhart This is good information. Another question: has your amplifyconfiguration.json file changed in any way from Amplify V1 to V2?

From the issues reported, are you able to see anything in common among the affected users, such as device types, OS versions, manufacturers, or anything else?

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending response from an Amplify team member label Oct 21, 2024
@camhart
Author

camhart commented Oct 21, 2024

Another question: has your amplifyconfiguration.json file changed in any way from Amplify V1 to V2?

No it hasn't changed.

From the issues reported, are you able to see anything in common among the affected users, such as device types, OS versions, manufacturers, or anything else?

I haven't kept track of this. However, I do recall that one of the devices was a Samsung running OS version 13. I have multiple Samsung test devices, though, and I haven't been able to replicate the issue on any of them. When I release the app update to more customers, we get reports of customers having issues, but I can guarantee many have the issue and never report it. They'll just cancel their subscription with us or try to resolve it on their own.

@github-actions github-actions bot added the pending-maintainer-response Issue is pending response from an Amplify team member label Oct 21, 2024
@harsh62
Member

harsh62 commented Oct 21, 2024

Thanks for providing all the information. One of our engineers will try to reproduce this issue locally by trying out different codepaths. We will get back to you when we have more updates.

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending response from an Amplify team member label Oct 21, 2024
@tylerjroach
Member

@camhart One more question that would help in our research. Can you post all of the AWS dependencies you are using in Gradle? Ex Amplify as well as any other AWS SDKs.

@camhart
Author

camhart commented Nov 1, 2024

    implementation 'com.amplifyframework:aws-auth-cognito:2.21.0'
    coreLibraryDesugaring 'com.android.tools:desugar_jdk_libs:2.0.3'

    implementation 'com.amazonaws:aws-android-sdk-apigateway-core:2.16.1'

Those are the only dependencies being used. Let me know if you need anything else!

@github-actions github-actions bot added the pending-maintainer-response Issue is pending response from an Amplify team member label Nov 1, 2024
@github-actions github-actions bot added the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 6, 2024
@tylerjroach
Member

What is the purpose of the cognitoCredentialsProvider variable? Is it being used elsewhere?

Do you have device tracking enabled on Cognito side? I'm just trying to think of any additional areas we would need to look into.

With this seemingly being an edge case type of scenario you are running into, please keep us updated if you notice any patterns (ex: sign ins over a year, etc).

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 6, 2024
@camhart
Author

camhart commented Dec 6, 2024

What is the purpose of the cognitoCredentialsProvider variable? Is it being used elsewhere?

No it's not. It's left over from the refactor I made previously and can be removed without impacting anything.

Do you have device tracking enabled on Cognito side? I'm just trying to think of any additional areas we would need to look into.

It's likely set to the default. How do I check? I don't believe I intentionally use it in any way (outside what the SDK would do by default).

With this seemingly being an edge case type of scenario you are running into, please keep us updated if you notice any patterns (ex: sign ins over a year, etc).

So far today the trend has continued to hold true--all impacted devices have been for long term customers of ours created at least a year or more ago. I'm assuming these are all long term logins as a result of that. I don't keep track of the logins though, so I can't tell you for certain. I'll be sure to let you know if I find anything contrary to this trend, but based on the number of customers I've already worked with, I'm 95% confident the trend is going to hold. We have lots of new customers, so it'd be really strange at this point if it doesn't hold.

@github-actions github-actions bot added the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 6, 2024
@tylerjroach
Member

@camhart What version of Amplify v1 were you using before bumping to v2? I'm slightly concerned about the API Gateway version of 2.16.1. com.amazonaws:aws-android-sdk-apigateway-core:2.16.1@jar transitively pulls in com.amazonaws:aws-android-sdk-core:2.16.1@jar.

v2.16.1 was released in October 2019. The last Amplify v1 release was using v2.73.0 of the AWS SDK, released in 2023. This means com.amazonaws:aws-android-sdk-core:2.73.0@jar would have been transitively pulled into your code. It would have been best for API Gateway to match this version. By dropping the Amplify v1 dependency, it is now actually downgrading API Gateway to this 2019 version.

Earlier in this thread you stated

I've replaced the CognitoCredentialsProvider with the AmplifyCredentialsProvider indicated in the article mentioned.

With this type of downgrade, this could have also been a major problem, as it looks like there were some keystore changes between 2019 and 2023. A CognitoCredentialsProvider from 2019 may fail to read (and possibly corrupt) a keystore from 2023.


With all this said, Amplify v2 attempts to open the old Amplify v1 / AWS SDK credentials, and migrate to the v2 format, without any dependency on the AWS Android SDK to do so. Since you are no longer using CognitoCredentialsProvider, I don't believe AWS Core and AWS Gateway have any codepaths that would attempt to write credentials in the old format (which would interfere with Amplify v2).

Are you sure that these log out reports are from users that were actively using a version with Amplify v1 and, upon recently being added to the rollout, began having refresh token issues with Amplify v2? Is it possible that any of these reports are delayed? We know that the old implementation with CognitoCredentialsProvider (and MobileClient if it was present) would have corrupted the credential storage. Is it possible some of these customers are just now noticing? I know you said this issue was sporadic originally, but given what we know, I would have expected the initial implementation to fail 100% of the time.

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 9, 2024
@camhart
Author

camhart commented Dec 9, 2024

I'm eating my words here. We just had a customer report the issue whose account was created on October 15th, 2024. I apologize for the misdirection--age of tokens may not be a factor. I did say 95% confident and not 100% :D.

What version of Amplify v1 were you using before bumping to v2?

implementation 'com.amplifyframework:aws-auth-cognito:1.37.4'

For some reason, Android Studio doesn't indicate updates are available for apigateway core (notice the missing highlighting). This is most likely why I haven't been upgrading it.
image

Are you sure that these log out reports are from users that were actively using a version with Amplify v1 and, upon recently being added to the rollout, began having refresh token issues with Amplify v2?

Yes, I'm 100% positive this is the case. I release the update and then emails come in within a few days indicating the app just stopped working. I track the last recorded app version for each of these devices, and they were all working on the prior major release (the one that still used amplify v1). Customers send in screenshots of the page the app gets stuck on, which shows the current app version and it's the one where the amplify v2 upgrade occurred.

Is it possible that any of these reports are delayed?

Not like what I think you're indicating. Delayed 1 or 2 days yes. Delayed for months, no. I go from 0 reports of issues ever to a handful of issues within a few days (it takes time for their device to update the app) once I increase the rollout % via Google Play.

Is it possible some of these customers are just now noticing? I know you said this issue was sporadic originally, but given what we know, I would have expected the initial implementation to fail 100% of the time.

I've tested the implementation many many times myself and it works fine. We have tons of customers using it without issue as is. It's just an estimated 5-10% that run into the problem.

Should I simply upgrade com.amazonaws:aws-android-sdk-apigateway-core to the latest version (2.77.1)? Would that potentially stop transitively calling code and wiping out credentials? How confident are you that it'll fix the problem I'm facing?

I'd still prefer some sort of migration logging and migration retry ability be added to the SDK. I can't help but fear that even after upgrading aws-android-sdk-apigateway-core to the latest version, the underlying issue still won't be discovered. Beggars can't be choosers but this is painful. It's damaging our company's relationship with customers and our brand (we're known for being extremely reliable, so this is a direct hit against that). I've proactively granted credits to the customers impacted by it, which will impact revenue. Thankfully it's only 5-10% of the 1% impacted, so it's not the end of the world. But even if I'd handled the rollout perfectly, having to pay the price of rolling out to 1% of customers to try each new iteration of a fix isn't great.

@github-actions github-actions bot added the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 9, 2024
@camhart
Author

camhart commented Dec 9, 2024

I do want to add--thank you for your help. I'm not trying to be a complainer here, but I do want to pass the pain that I'm feeling along so you have an appropriate understanding of the impact this troubleshooting experience has had.

@tylerjroach
Member

Hi @camhart, I understand your frustrations. Thank you for quickly answering all of the questions sent your way. I know it has been a lot, but these types of edge cases are always difficult to figure out with lack of logs that highlight the problem. It's especially hard considering our team members, and yourself, have been unable to replicate the failure.

There could be something unique about these 5-10% of users that we haven't yet tracked down (ex: sign in method, device type, device OS, etc). We are continuing to look at any failure paths on our end.


Should I simply upgrade com.amazonaws:aws-android-sdk-apigateway-core to the latest version (2.77.1)? Would that potentially stop transitively calling code and wiping out credentials? How confident are you that it'll fix the problem I'm facing?

I don't believe this would directly fix the problem, but it is always best to try and keep up to date with our latest versions. You are using a version of API Gateway that is 5 years old, which means it is missing 5 years of bug fixes that may have been added along the way. Given that you are confident the issues are happening with each rollout, and CognitoCachingCredentials provider is no longer being used, I do not expect this to cause the invalid refresh token error you are seeing.

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 9, 2024
@camhart
Author

camhart commented Dec 9, 2024

Sounds good, I'll wait to hear further instruction from you then before trying anything. Getting this fixed is top priority on my end, so I'll respond quickly and as clearly as possible.

@tylerjroach
Member

@camhart If you wouldn't mind, join our Discord channel at https://discord.com/invite/amplify, where you can reach out to me @tylerjroach. We can DM and set up a screenshare call.

@github-actions github-actions bot added the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 9, 2024
@camhart
Author

camhart commented Dec 9, 2024

Just sent you a DM.

@tylerjroach
Member

Thanks, we can continue discussion there!

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 9, 2024
@tylerjroach
Member

tylerjroach commented Dec 10, 2024

I have identified an issue with migrating logins that have Device Tracking enabled. I am recognizing this ticket as a bug and we are actively working on a fix.

This is not an issue with your hosted ui (web) sign ins, as they do not use device tracking. This will be an issue with any SRP sign ins that use device tracking.

@tylerjroach tylerjroach added bug Something isn't working and removed question General question labels Dec 10, 2024
@tylerjroach
Member

@camhart I have discovered the root cause and am working on a fix here: #2963

I believe we should be able to migrate the missing device metadata to our new credential store, which would result in token refreshes immediately working without requiring another sign in.

The cause is aliased userIds. When email is used for signIn, the user's actual userId is a UUID. During the migration process, Amplify v2 attempts to migrate based on the email address, when it should be looking at the UUID userId instead.
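
To make the alias mismatch concrete, here is a minimal, self-contained Java sketch. It does not reproduce Amplify's migration code or the real storage layout: the `legacyStore` map, the `deviceKeyEntry` key format, and the sample email and UUID are all hypothetical stand-ins. It only shows why a lookup keyed by the sign-in alias misses data that was stored under the UUID userId.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Illustrative model only; key names and layout are hypothetical, not Amplify's. */
final class AliasLookupSketch {

    /** Builds the (hypothetical) legacy storage key for a user's device key. */
    static String deviceKeyEntry(String userId) {
        return "device." + userId + ".DeviceKey";
    }

    /** Looks up device metadata for the given userId in the legacy store. */
    static Optional<String> findDeviceKey(Map<String, String> legacyStore, String userId) {
        return Optional.ofNullable(legacyStore.get(deviceKeyEntry(userId)));
    }

    public static void main(String[] args) {
        String emailAlias = "parent@example.com";                 // what the user typed at sign-in
        String uuidUserId = "1b9d6bcd-bbfd-4b2d-9b5d-ab8dfbbd4bed"; // the actual userId

        // In this model, the old library stored device metadata under the UUID userId.
        Map<String, String> legacyStore = new HashMap<>();
        legacyStore.put(deviceKeyEntry(uuidUserId), "example-device-key");

        // Looking up by the email alias finds nothing, so the device metadata is not migrated.
        System.out.println(findDeviceKey(legacyStore, emailAlias).isPresent()); // prints "false"

        // Looking up by the UUID userId finds the entry.
        System.out.println(findDeviceKey(legacyStore, uuidUserId).isPresent()); // prints "true"
    }
}
```

In this model, users who signed in with the UUID directly, or whose sign-ins do not use device tracking, would not hit the miss. That would fit a failure that affects only some users.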


In my testing, I also identified a workaround. If you are not actually using Device Tracking (primarily used to prevent repeated MFA validations on sign in), I believe the issue can immediately be resolved by changing the "Remember User Devices" setting to "Don't Remember" in the Cognito console. This turns off the device tracking verification on token refreshes. The refresh calls that were failing would now succeed, because Cognito no longer checks the device metadata upon refresh. If you were to re-enable this setting, the refreshes would begin failing again until our official fix is released.


TLDR: We are working on a fix, but if you don't actually need Device Tracking enabled for your use case, token refreshes will begin working again if you toggle "Remember User Devices" to "Don't Remember".

@camhart
Author

camhart commented Dec 11, 2024

Thank you for the update. Great news if we can migrate without causing people to have to sign in again.

Is there any risk that changing the "Remember User Devices" setting could have an adverse effect that couldn't easily be undone by changing it back?

The plan was to eventually offer MFA support. That's still the plan.

We are working on a fix

How long does a fix like this typically take to get released? A week? Three months?

Thanks again! Really happy to finally get this figured out.

@github-actions github-actions bot added the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 11, 2024
@tylerjroach
Member

I do not believe there is any risk in the change.

  • Tokens that were generated while device tracking was turned on (and currently failing to refresh) would begin successfully refreshing with device tracking turned off. If device tracking were turned back on, they would begin failing again until this PR fix is ready.
  • Tokens that were generated while device tracking was turned off would continue to work even if device tracking were turned on again.
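
The two cases above can be summarized as a small truth-table sketch in Java. This is only a model of the behaviour described in this thread, not Cognito's actual validation logic; `refreshSucceeds` and its parameter names are hypothetical.

```java
/** Models the refresh outcomes described in this thread; not Cognito's real logic. */
final class RefreshOutcomeSketch {

    /**
     * @param issuedWithDeviceTracking whether the tokens were generated while device tracking was on
     * @param deviceTrackingEnabledNow whether "Remember User Devices" is currently enabled
     * @param deviceMetadataAvailable  whether the client still has the device metadata
     *                                 (false after a migration that failed to carry it over)
     */
    static boolean refreshSucceeds(boolean issuedWithDeviceTracking,
                                   boolean deviceTrackingEnabledNow,
                                   boolean deviceMetadataAvailable) {
        // The device check only applies to tokens issued with tracking on, while tracking is still on.
        boolean deviceCheckApplies = issuedWithDeviceTracking && deviceTrackingEnabledNow;
        return !deviceCheckApplies || deviceMetadataAvailable;
    }

    public static void main(String[] args) {
        // Migrated login, tracking still on, metadata lost: refresh fails.
        System.out.println(refreshSucceeds(true, true, false));  // prints "false"
        // Same login after switching to "Don't Remember": refresh succeeds.
        System.out.println(refreshSucceeds(true, false, false)); // prints "true"
        // Tokens issued with tracking off keep working if tracking is turned on later.
        System.out.println(refreshSucceeds(false, true, false)); // prints "true"
        // Metadata migrated correctly: refresh succeeds with tracking on.
        System.out.println(refreshSucceeds(true, true, true));   // prints "true"
    }
}
```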

I don't see any adverse side effects in your case. MFA could still be enabled. Device Tracking is used as a way to bypass subsequent MFA requirements on future sign ins. Considering your app doesn't have signOut functionality, this really wouldn't matter in your case.

Once a fix is merged and ready, it will typically go in the next release. We try and release weekly if there are commits ready to go live.

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 12, 2024
@camhart
Author

camhart commented Dec 12, 2024

I can confirm that disabling device tracking fixed the issue for one of our customers (hopefully all of them--time will tell). I realized I had it disabled already in my dev environment--that's why my testing didn't catch the issue when I did my own 24 hour tests. Thank you for all the help! Very much appreciated.

@github-actions github-actions bot added the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 12, 2024
@thisisabhash thisisabhash removed the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 13, 2024

6 participants