-
-
Notifications
You must be signed in to change notification settings - Fork 354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize sql query ws acl (version 2) #824
Optimize sql query ws acl (version 2) #824
Conversation
Without this optimization, we fetched loads of entries from the database, which led to database and nodejs overloads. We could go further, this is a modest patch towards better performance.
app/gen-server/lib/HomeDBManager.ts
Outdated
const orgQuery = this.org(scope, org, { | ||
markPermissions: Permissions.VIEW, | ||
needRealOrg: true, | ||
allowSpecialPermit: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still have doubts on this part: we not only control the permissions for the groupes associated with the workspaces, but also now the permissions for the orgs.
Otherwise, I may remove the call to verifyIsPermitted
and these options when we call that method for the workspaces, if that makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds right. Makes sense to me to exclude them when called from that method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up renaming verifyIsPermitted
to verifyEntity
to allow discarding the permission check.
I felt like we may have wanted dealing with monad in that function (like Result
in Rust), or throw exceptions that would be handled in the express route that would deduce the status code to return.
We use two queries: one fetches the workspaces, the second the organization that the workspace belongs to.
9bbe80d
to
b7d0c64
Compare
@paulfitz it seems that tests are passing here, do we need to test some more (e.g. in staging) or can we merge ? This would be a great relief to us if this was fixed :) |
Hi @vviers, sorry for the delay. Previous version of this PR did pass all our own internal tests. Will get going on reviewing new version as soon as someone's available, I understand it is a priority for you. |
app/gen-server/lib/HomeDBManager.ts
Outdated
const orgQuery = this.org(scope, org, { | ||
markPermissions: Permissions.VIEW, | ||
needRealOrg: true, | ||
allowSpecialPermit: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds right. Makes sense to me to exclude them when called from that method.
Thanks @fflorent. Latest changes are passing our internal test suites. Just waiting for a second reviewer now - I left a nudge in our Slack. |
app/gen-server/lib/HomeDBManager.ts
Outdated
|
||
// Also fetch the organization ACLs so we can determine inherited rights. | ||
const orgQuery = this._buildOrgWithACLRulesQuery(scope, workspace.org.id); | ||
const orgQueryResult = await verifyEntity(orgQuery, { discardPermissionsCheck: true }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any concerns about changes to the database that may have happened between this query and the previous one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be sure, by "the previous one", do you mean:
- previous in the code? (which would refer to the workspace ACL selection)
- previous in the commits? (which would refer to the org ACL selection with permission check)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previous in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clearer, I'm asking about this sequence:
- Query 1 happens
- Orgs/workspaces/documents/users... are created/updated/deleted
- Query 2 happens
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. I am no DB expert, my guess is that the drawback of this change is the lack of isolation.
To limit that drawback, I guess we might otherwise gather these two queries into one using UNION
, though I don't know how much work it would represent.
I am scratching a bit my head tomorrow, I can challenge a bit more myself about it, or ask the opinions of colleagues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still a bit divided, I would say that if there are some resources modified in between, as the function is only used for the API (and client-side by the ACL popup), the user might be slightly misled but not much (they may see outdated rights in workspaces). If the organization is deleted in between, they should see a 404 error.
The important thing is consistency. Without the transaction, it is hard to reason about what the call could return, because a lot could happen in between the queries. With the transaction, it is easier to think about what it will do. Whether the result is more or less stale is not that important, to my mind, because no matter what you do, by the time the caller receives the reply there's no longer any guarantee it is still valid - anything could have happened in the meantime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your feedback. I am sorry, I am not sure to understand your remark, is it OK what I did or should I bring some change to my PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! 690fdce
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @fflorent !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Appreciate your work and patience on this @fflorent
Context
In our self-hosted instance, when we fetch the ACLs rules of a workspace, we get a gateway timeout error. This is due to the fact that with this workspace, the SQL request made in getWorkspaceAccess returns ~900 000 results, which not only is heavy for the database but also takes time for TypeORM to parse.
@vviers made an analysis here in this document: https://outline.incubateur.anct.gouv.fr/doc/grist-list-workspace-access-issue-rxer0X7e44
Proposed solution
Here is another proposal than the one in #818
@vviers also suggested to rework the function so it does two separate queries.
I changed my mind as it seemed like:
I hope this goes in the right way.