Unnecessary Redis Session Locking On All HTTP GET Requests - Affecting PWA Studio Concurrent GraphQL Requests #34758
Comments
@mttjohnson,
I confirmed the issue. I think it is valid to address this at least in GraphQL, where we don't really need a session.
@Yonn-Trimoreau I feel you. Thank you for taking the time to respond, it's appreciated. |
Seems 2.4.5 and above have this flag which may help provided all the graphql endpoints are coded correctly.
Monitoring we have suggests this same issue occurs within the admin wysiwyg editor in Magento 2.4.7. When the editor is loaded it will attempt to load the thumbnails with For now i will implement @colinmollenhour's suggestion #34758 (comment) |
Hello, We have implemented @colinmollenhour's suggestion #34758 (comment) and are testing in pre-production environments. We are primarily concerned about improving performance (i.e. eliminating unnecessary Redis locking) around the checkout flow. Wondering if others could share the request URIs they are targeting. So far we have identified these:
Logging the requests in this flow looks like this (
@denniskopitz Looks like a great step forward, thanks for sharing. Why does |
I've mainly been focusing on the backend, since that caused the most problems for our clients. For that I had:
I don't think it was Magento framework code but a plugin, but I can definitely remember finding some messages being set on the session in some REST endpoints (after wondering why the hell our headless frontend wasn't showing these messages).
Apologies for the delayed response @colinmollenhour. When testing, we found that setting the |
@indykoning Did your effort improve performance on the admin wysiwyg editor? We have also experienced that issue. |
We have the same issue with Cm\RedisSession\Handler::read 30sec Here is a Redis session config, I've also set debug log level to catch the issue, maybe this can shed light
I'm not familiar with Magento 2 checkout - does it send one request per quote item or something like that? Also if any other requests had a fatal error and the session lock was not released that could explain it. E.g. perhaps there is an out of memory error on the request just before the totals-information one. |
The vanilla m2 one step checkout sends everything via ajax to update quote as you enter information, then shipping options, then payment accepted redirects to success page. We utilize Bolt payment which has to communicate to Bolt servers then updates quote back in Magento via API which is why I'm reading this issue. @colinmollenhour can you clarify, if there is a fatal error anywhere in Magento 2 process then it does not release the lock for session? |
@AndresInSpace there is a "catch all" exception management in place. If any exception occurs, the session is closed cleanly. So the lock is released, don't worry about it. @colinmollenhour some situations can lead to having at least 3 Ajax requests sent in parallel in the normal checkout (not one per item, but each request can be time-consuming, especially if the cart contains a lot of items) And we've already discussed this, the 3 definitive solutions are:
@AndresInSpace you should look into using the patches I have provided above in this discussion. They are presently in place on 5 sites and they fix exactly the kind of issue you are describing. They are however not adapted to the latest version of colinmollenhour's library so I can send you an updated version if you want (which a colleague of mine has made compatible with the latest version). @colinmollenhour just to clarify things up: I understand why it would be problematic to do this in your library, as this would be a major breaking change for all your non-Magento users, and I don't think you should change your mind because of Magento's current issue. |
Edit: Please ignore my misunderstanding here :)
@AndresInSpace I think you are going out of road with this reasoning. |
My apologies everyone I had a misunderstanding, did not mean to branch the subject. @vadim4err IIRC the |
Coming back to this. The case was, being spammed by the same notifications. Reason why? And the following endpoint reads notifications from your session, Removes them from your session, and returns the messages. If it cannot remove them from your session, it will simply repeat the cycle. |
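The repeating-notification cycle described above can be illustrated with a minimal Python sketch. It is only a model of the behavior, not Magento code: `pop_messages` and the `writable` flag are hypothetical names standing in for an endpoint that reads messages from the session and for whether the session was opened in a mode that persists writes.

```python
def pop_messages(session, writable):
    """Return pending messages and clear them only if the session can be written.

    Hypothetical model: `session` is a plain dict standing in for session storage,
    and `writable` stands in for whether the session was opened for writing.
    """
    messages = list(session.get("messages", []))
    if writable:
        # Only a writable session persists the removal.
        session["messages"] = []
    return messages


read_only_session = {"messages": ["Item added to cart"]}
first = pop_messages(read_only_session, writable=False)
second = pop_messages(read_only_session, writable=False)
print(first == second)  # the same message is returned again on every call

writable_session = {"messages": ["Item added to cart"]}
pop_messages(writable_session, writable=True)
print(pop_messages(writable_session, writable=True))  # nothing left to show
```

This is why an endpoint that consumes messages has to be excluded from any read-only treatment, even if it is reached with a GET request.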
This is correct. I can confirm the same issue. |
@Yonn-Trimoreau thank you for the patch, we are going to try it in production. We are also currently using: https://github.com/integer-net/magento2-session-unblocker/ I'm curious on your opinion, do you think there's any extra benefit on top of your patch to use that module? |
@webtekindo this module seems interesting. |
Here are the updated versions of the patches if you need them: implement-write-lock-and-diff.txt Don't forget to flush the redis session storage after applying it in production. For reference, the initial comment: #34758 (comment) |
@Yonn-Trimoreau ok perfect we will try without the module and thanks for the updated patches. |
@Yonn-Trimoreau we are already running the patch in production and I have to say it works perfectly, nice job! We use the old patch that you shared here #34758 (comment) and we had an issue on the checkout because of the function _arrayRemoveRecursiv; after replacing the function with the one in the new patch the problem was solved. We cannot apply the latest version of your patch directly, probably because we still use an older Magento version (2.4.2). Is there any other part of the new patch (except _arrayRemoveRecursiv) that is important to adjust? And do you think it should be possible to make a Magento module out of it instead of patching Cm/RedisSession? I gave it a try but still have some issues (with session locking). Here is the Magento preference that we tried: I would be curious whether you think it may work that way (notice the read and write functions)?
Thanks.
I started on a proof of concept that is similar to @Yonn-Trimoreau 's approach. It uses MongoDb with zero locking but uses It may also be possible to do something similar using Lua with Redis, although that would just put more strain on Redis which is single-threaded so may not scale well. MongoDb should be perfect for this, although you would need to call the |
@webtekindo yes this probably would be a preference on \Magento\Framework\Session\SaveHandlerInterface, ultimately that's the way to go. |
@colinmollenhour great! MongoDB would be a right fit since it supports entry-level locking and can read/write at an impressive speed (but not as fast as Postgres! and it also supports entry-level locking). |
@Yonn-Trimoreau Agreed that Postgres (even MySQL) should also be capable of handling this design, although the syntax for setting and deleting multiple keys deeply nested in a single query with SQL looks pretty ugly which is why I chose MongoDb for the proof of concept. One possibility would be in MySQL 9 to use a Javascript stored program to do the updates, but I have no idea how that performs compared to Redis and MongoDb. But yes, it could be done with any database that has decent JSON manipulation. The performance of either Postgres or Mongo would probably be very good in terms of high concurrency, possibly even better than Redis given the removal of multiple round trips required for locking. Also agreed that I hate to add yet another database server to the stack, but I'd argue that the Redis instance should not be shared with the cache anyway. In my opinion they should be separate instances due to needing different eviction policies, and Redis being a single-threaded performance bottleneck - since cache and sessions are separate data types it is an obvious choice for horizontal scaling without going to a full cluster implementation. |
Preconditions (*)
Helpful Insights
Steps to reproduce (*)
Configure Magento
Load the PWA storefront home page
domain:syseng-seldon.cldev.io -media -static graphql
Simulate the PWA storefront requests on ANY Magento site
Replace the domain variable with the domain of the magento site you're testing
Load the HTML page containing js that will fetch multiple graphql requests concurrently
You can open Chrome Developer Tools on the Network tab and where you see one of the GraphQL calls, you can right click on the request line and "Copy as fetch" to get the javascript fetch statement for making that same request in an html page js inline script.
You can do this to re-create all the requests for a specific page by adding each fetch() call to the html file, or duplicate the exact same fetch() graphql request (40x) to reproduce the session locking behavior being seen here.
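As an alternative to the HTML page, the same reproduction can be sketched as a small Python script that fires identical GET requests concurrently with one session cookie and records when each starts and ends. This is a sketch under assumptions: `http_get` and `run_concurrently` are hypothetical helper names, and the URL, query, and cookie value must be replaced with ones copied from your own Network tab.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen


def http_get(url, cookie):
    """Send one GET request carrying the session cookie and return the HTTP status."""
    request = Request(url, headers={"Cookie": cookie})
    with urlopen(request, timeout=60) as response:
        response.read()
        return response.status


def run_concurrently(url, cookie, count=40, fetch=http_get):
    """Fire `count` identical requests at once.

    Returns a list of (start, end) offsets in seconds, measured from the
    moment the batch was launched.
    """
    origin = time.monotonic()

    def one(_):
        started = time.monotonic() - origin
        fetch(url, cookie)
        return started, time.monotonic() - origin

    # One worker per request so that nothing queues on the client side.
    with ThreadPoolExecutor(max_workers=count) as pool:
        return list(pool.map(one, range(count)))


# Example call (placeholder values, not taken from the issue):
# timings = run_concurrently(
#     "https://your-domain.example/graphql?query=%7BstoreConfig%7Bstore_name%7D%7D",
#     "PHPSESSID=replace-with-a-real-session-id",
# )
# for started, ended in timings:
#     print(f"start={started:.3f}s end={ended:.3f}s")
```

If the session is being locked per request, the start offsets should be nearly identical while the end offsets climb step by step; with locking disabled, the end offsets should cluster together.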
Simulate SOME of the expected behavior by globally disabling redis session locking on ALL requests
Expected result (*)
When concurrent GraphQL GET requests are made from a visitor, requests should be able to complete in parallel to keep page load time minimal, while still allowing requests that require session locking (where important session data is being written) in order to prevent some other request from overwriting data in the session. Important data getting overwritten in a session can negatively affect critical application behavior.
In simulating some of the correct behavior with Redis session locking disabled, I was able to load the home page and all 15 graphql requests within a window of 600 ms. There are other pages that may contain many more GraphQL requests where this can be even more important to have concurrent requests complete in parallel.
Looking at a waterfall of how the concurrent GraphQL requests complete reveals that multiple requests are completing at or close to the same time.
Actual result (*)
With Redis session locking enabled, which is the default and recommended safe behavior for Redis session configs, the concurrent requests queue up, each waiting in sequence for the Redis session lock to clear before the next request is able to complete.
This makes it look like several of the graphql requests are taking an excessive amount of time to complete while others complete quickly. In fact the requests started at close to the same time, but most of them spent much of their time waiting for the session lock to clear, so they completed in sequence rather than in parallel.
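The queueing effect can be approximated with a simple arithmetic model. The sketch below assumes every request starts at t=0 and, when locked, holds the session lock for its whole processing time; it ignores lock polling and retry delays, so real numbers would be somewhat worse. The durations are hypothetical, not measurements from this issue.

```python
def completion_times(durations_ms, session_locked):
    """Return when each request finishes (ms), assuming all start at t=0.

    With a per-session lock, each request waits for every earlier one;
    without it, each request finishes after its own processing time.
    """
    if not session_locked:
        return list(durations_ms)
    finished, clock = [], 0
    for duration in durations_ms:
        clock += duration
        finished.append(clock)
    return finished


# Hypothetical workload: 15 requests that each need 100 ms of processing.
durations = [100] * 15
print(max(completion_times(durations, session_locked=False)))  # 100
print(max(completion_times(durations, session_locked=True)))   # 1500
```

In this model the last request in the queue appears to take 1500 ms even though it needs only 100 ms of actual work, which matches the pattern seen in the waterfall.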
Cause of Behavior
With the PWA Studio (Client Side React App) running as the frontend "storefront" of Magento and sending GraphQL calls to the backend of Magento there are different behavioral patterns in how requests are sent to and processed by the web server from how we have been seeing interactions when using the Magento "theme" as the frontend.
The PWA sends multiple concurrent AJAX calls to /graphql endpoints on the web server, and these requests are all processed by the Magento backend PHP application. It turns out that Magento's architecture creates a session and locks that session regardless of the type of request or response being issued.
Some /graphql requests can be cached by Varnish, and because requests served from the Varnish cache do not execute any Magento code, they do not interact with Redis sessions. This can mask the fact that each request that hits the backend will lock the session for the duration of the request and cause requests to be completed in sequence. Many of the graphql requests are not able to be cached in Varnish at this time.
We have seen this behavior in the Magento "theme" frontend also, but there are typically only a few requests that happen concurrently, and we see these results show up a lot in New Relic transaction traces for AJAX calls that are likely to be running concurrently with other requests with the same session.
The problem is not isolated to AJAX calls like customer/section/load on the Magento "theme" frontend; it also tends to occasionally interfere with other requests on product pages, or AJAX calls in the checkout. In the example below the unlucky visitor ended up waiting 8.8s to get an initial response for loading the page instead of what should have taken 300ms because they had some other request that was locking their session. It doesn't happen often, but it's not pleasant when it does.
The big behavioral difference is that most of the requests in the Magento "theme" frontend do NOT happen concurrently, so session locking on a couple of concurrent requests doesn't affect things much overall. Delays are the exception: on average, most requests (even customer/section/load AJAX requests) are not held up waiting for session locks to release. This known deficiency therefore causes only mild problems with the Magento "theme" frontend, and limiting concurrent AJAX calls is a way of working around it.
This session locking causes lots of problems with a PWA by delaying concurrent AJAX calls, causing them to wait in line and finish sequentially as the locks they are waiting on clear. This ends up making requests randomly appear like they are taking a really long time to complete while under the hood they are primarily just waiting in line for the session to be available so each can lock the session and complete its request. This makes it very hard to identify from the client side where the problem is, because it only appears when concurrent requests share the same session; running the same requests independently results in a very fast response.
While it's possible to disable session locking, doing so can cause serious issues when requests that change session data overwrite each other. On checkout this can result in payments being captured but orders failing to be saved in Magento, sometimes leading a customer to re-submit the order and be charged multiple times if they eventually get the order to complete successfully.
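The overwrite risk is a classic lost update. A minimal Python sketch, assuming the whole session is read and written back as one blob and no lock is held between the read and the write; the `store` dict, the helper functions, and the field names are hypothetical stand-ins, not Magento's session layout.

```python
import copy

# Hypothetical session storage: one blob per session ID.
store = {"sess": {"cart_id": 7, "payment": "pending"}}


def read_session(session_id):
    """Return a private copy of the stored session, as a request would see it."""
    return copy.deepcopy(store[session_id])


def write_session(session_id, data):
    """Overwrite the stored session with the request's copy."""
    store[session_id] = copy.deepcopy(data)


# Two unlocked requests read the same snapshot before either one writes.
request_a = read_session("sess")
request_b = read_session("sess")

request_a["payment"] = "captured"   # important change made by request A
request_b["last_seen"] = "12:00"    # trivial change made by request B

write_session("sess", request_a)
write_session("sess", request_b)    # B writes last and silently discards A's change

print(store["sess"])  # {'cart_id': 7, 'payment': 'pending', 'last_seen': '12:00'}
```

Request B never touched the payment field, yet its stale copy wins, which is the failure mode that session locking exists to prevent.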
The library used by Magento (v2.3.4) is the latest available (colinmollenhour/php-redis-session-abstract v1.4.4) and it does support the ability to process read only requests without locking a session when the global config is set to utilize session locking... it just so happens that Magento does not utilize this functionality and opens all session connections without specifying if it needs to write to the session or not, thus defaulting to write mode and session locking.
Possible Solution
Requests that come into Magento as GET requests are typically expected to return a generic, publicly identical response that can be cached by Varnish, while POST requests are explicitly not allowed to be cached by Varnish as they are expected to contain visitor/session/customer private data in the responses. Important write operations should typically happen in POST requests, and those types of requests would be expected to need and utilize session locking, while GET requests would generally be for returning generic data that is not visitor/session/customer specific; the same response would be returned to all requests and would not involve any kind of write to the session. If any session write activity were to occur on GET requests, it would likely be to update the timestamp of the most recent request to indicate the session is still active and has not expired yet (this is an assumption that should be verified).
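The proposed rule can be expressed as a small decision function. This is a Python sketch of the idea only, not Magento or php-redis-session-abstract code: `session_mode` and the `get_paths_that_write` exception list are hypothetical, the latter added because some GET endpoints may still write to the session, which the assumption above says needs verifying.

```python
READ_ONLY_METHODS = {"GET", "HEAD"}


def session_mode(method, path, get_paths_that_write=frozenset()):
    """Decide how a request should open the session.

    Returns "read_only" (no lock needed) for GET/HEAD requests, unless the
    path is listed as a known exception that writes to the session.
    Everything else gets "read_write" and therefore takes the session lock.
    """
    if method.upper() in READ_ONLY_METHODS and path not in get_paths_that_write:
        return "read_only"
    return "read_write"


print(session_mode("GET", "/graphql"))   # read_only
print(session_mode("POST", "/graphql"))  # read_write
```

Defaulting unknown methods to "read_write" keeps the safe behavior for anything the rule does not explicitly recognize.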