-
Notifications
You must be signed in to change notification settings - Fork 11.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat!: Implement Cloud communication reliability #32856
Conversation
Looks like this PR is ready to merge! 🎉 |
🦋 Changeset detectedLatest commit: 15f9f51 The changes in this PR will be included in the next version bump. This PR includes changesets to release 35 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #32856 +/- ##
========================================
Coverage 74.85% 74.85%
========================================
Files 470 470
Lines 20743 20743
Branches 5294 5294
========================================
Hits 15528 15528
Misses 4595 4595
Partials 620 620
Flags with carried forward coverage won't be shown. Click here to find out more. |
ce44882
to
d94b784
Compare
30f6f98
to
8c447cf
Compare
8deb1a0
to
2252e0a
Compare
f7ac222
to
73b17e6
Compare
c6874ac
to
fff2293
Compare
9ba3bd2
to
fcd8105
Compare
e9bbdbd
to
cb512e1
Compare
we were passing down a weird string to inform when to remove credentials which was causing a type error
we were assuming a changeset when one was not available
ee560a5
to
f464772
Compare
Proposed changes (including videos or screenshots)
When a Workspace Registers with Cloud it gets a set of credentials it uses to generate access tokens and communicate with Cloud. Can read more about how this works in ADR3.
When a workspace owner needs to scale and deploys multiple Rocket.Chat instances to make up their deployment. Each workspace establishes a change stream request to mongo watching settings. Then updates its collection cache locally.
When a workspace goes to use an access token to talk to a cloud service it first evaluates if the token needs renewed. If it does it request a new one. Then updates a setting containing the access token. Then also another setting containing the expire.
The problem occurs when latency enters the picture. If due to either longer mongo query or network condition the other workspaces don't see both of these changes happen and update its cache. An instance can be using an expired token instead of renewing because the setting cache hasn't been updated.
We've seen many cases of this happen and typically restarting workspace can resolve.
This PR moves all credential related data to the 'workspace_credentials' collection, and stop using settings.get to fetch the configs that can be changed on multiple server instances.
This is a new PR that was created from both #32122 & #31986 in those PRs the migration was separated from the code itself, but that would end up being really hard to get merged properly, since the migration required the code, and the code assumed the migration was already ran, I'd guess we could avoid it by doing some weird PR splitting, but I am not really sure we would benefit a lot from it.
Issue(s)
Steps to test or reproduce
Create a new workspace, on registering, it should create a new record inside the rocketchat_workspace_credentials collection, containing an empty date as expirationDate, an empty string as accessToken and an empty string as scopes.
To test everything else, you must use the cloud functions, not sure how to call them from RC, but you could enter the Marketplace screen, go to paid apps, open one of them, and check if there is a new token in the database.
I would love some suggestions on how we can implement API tests for this feature, I am not sure at all if it is feasible (or if we should add later)
Further comments
This implements: https://adr.rocket.chat/0061
CONN-5