backport libraries home performance fix #34950
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
Original PR: #34523
This is a fix for a performance issue on the libraries home page. It has some drawbacks so please read through the description.
Certain course instructor accounts that had hundreds of libraries and courses encountered very long loading times for the libraries home page, that is when you go to studio and click the libraries tab. Loading times for these users were 45+ seconds and some users could never open the page at all because timeout errors occurred. This never happens for users with few courses/libraries or for global staff users.
The reason for this was imperformant Python code. Python lookup time for course instructors for libraries home had a complexity of O(n*m), where n is the number of all libraries that exist on the openedx instance and m is the number of all assigned roles which is at least the number of assigned courses + assigned libraries. This code did not apply to global staff users.
We changed the lookup time to O(n).
The drawback here is that some other places are now less performant. Certain requests that include editing libraries blocks and loading the library edit modal, but also presumably other places where a user's roles need to be checked, initialize the RoleCache for a user once, which calculates all the users assigned roles. This one-time calculation is now less performant and has a runtime of O(m) where m is the number of assigned roles.
There are further optimization opportunities but the feedback we got and our evaluation of this is that the libraries_home performance problem was a huge problem for some users and is now completely solved. The loading time assuming good network speed appears to be around 3-5 seconds on average for libraries home for every user. The timeout errors have been entirely removed. On the other hand we got feedback that the library edit modal now takes "a little longer to load" but that the overall improvement is worth it.brary, which results in a complexity of O(l*r), l=libraries, r=roles. This PR improves the complexity to O(l).
The BulkRoleCache and RoleCache classes were using a python set to store all roles for a particular user. A user can have a large number of roles, and lookup speed of iterating through a set is slow (O(n)). Most roles don't have the same course id, however. So if you have the course id of the role you're looking for, we can use a dict of course ids that contain related roles. The number of roles per course id is negligible, so we arrive at a lookup speed of O(1) when looking up a user's roles that belong to a specific course id.
The BulkRoleCache now caches and stores user roles in a data structure like this:
{
user_id_1: {
course_id_1: {role1, role2, role3}, # Set of roles associated with course_id_1
course_id_2: {role4, role5, role6}, # Set of roles associated with course_id_2
[ROLE_CACHE_UNGROUPED_ROLES_KEY]: {role7, role8} # Set of roles not tied to any specific course or library. For example, Global Staff roles.
},
user_id_2: { ... } # Similar structure for another user
}
While this changes the data structure used to store roles under the hood and adds the new property
roles_by_course_id
to the RoleCache,when initializing the RoleCache will store roles additionally in the previous data structure - as a flat set - in the
_roles
property accessible viaall_roles_set
. This establishesbackwards compatibility.
We are now storing roles twice in the RoleCache (in each of the two data structures), which means this takes twice as much memory, but only in the scope of a request.
Testing Instructions
Already tested everything very extensively before merging into master. I'll leave it up to you how testing for this should look.
You can't test this too well - performance will just not be significant locally. You would need a user who has at least several courses and libraries.
I guess if you really really wanted to you could go to the code place that calls
accessible_libraries_iter
and add a loop that calls that function 10k times just for testing, and then compare before and after this fix to see whether it has improved.Also this all only applies for course instructors, so there will be no change if you test while being a global staff user.
Mainly we want to test now that nothing is broken.