Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unnecessary Redis Session Locking On All HTTP GET Requests - Affecting PWA Studio Concurrent GraphQL Requests #34758

Open
1 task
mttjohnson opened this issue Dec 3, 2021 · 53 comments
Labels
Area: APIs Component: Framework/Session Component: GraphQL GraphQL Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed Priority: P1 Once P0 defects have been fixed, a defect having this priority is the next candidate for fixing. Progress: ready for dev Reproduced on 2.4.x The issue has been reproduced on latest 2.4-develop branch Severity: S0 A problem that is blocking the ability to work. An immediate fix is needed.

Comments

@mttjohnson
Copy link

Preconditions (*)

  1. Magento 2.3.4-p1
  2. PWA Studio 12.1.0 (potentially optional)
  3. Setup Magento with Redis Sessions

Helpful Insights

  • New Relic APM Transaction Traces
  • Tested tech stack version (though this may not be relevant)
    • PHP 7.4
    • MySQL 8.0
    • Varnish 6.4
    • Redis 5
    • Elasticsearch 7.12

Steps to reproduce (*)

Configure Magento

  1. Ensure that default configuration of Redis session locking is enabled by setting disable_locking to 0:
cat app/etc/env.php | grep disable_locking
bin/magento -n setup:config:set --session-save-redis-disable-locking=0

Load the PWA storefront home page

  1. Load home page (example: https://syseng-seldon.cldev.io/) which will make multiple graphql calls to the MAGENTO_BACKED_URL
  2. In Chrome Developer Tools on the Network tab you can use a filter to only display requests for the specific domain that are graphql calls domain:syseng-seldon.cldev.io -media -static graphql
  3. Multiple (15) /graphql requests are all sent concurrently to the configured MAGENTO_BACKEND_URL of the PWA app around the same time

Simulate the PWA storefront requests on ANY Magento site

  1. Create a static html file (example: graphqlrepeat.html) in /pub/ of the magento root containing fetch
<!DOCTYPE html>
<html lang="en">
<head><title>JavaScript HTTP Requests</title></head>
<body><script>
 
var domain = "syseng-seldon.cldev.io";
 
fetch("https://" + domain + "/graphql?query=query+GetStoreConfigForCarouselEE%7BstoreConfig%7Bid+product_url_suffix+magento_wishlist_general_is_enabled+enable_multiple_wishlists+__typename%7D%7D&operationName=GetStoreConfigForCarouselEE&variables=%7B%7D", {
  "headers": {
    "accept": "*/*",
    "accept-language": "en-US,en;q=0.9",
    "authorization": "",
    "cache-control": "no-cache",
    "content-type": "application/json",
    "pragma": "no-cache",
    "sec-ch-ua": "\" Not A;Brand\";v=\"99\", \"Chromium\";v=\"96\", \"Google Chrome\";v=\"96\"",
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "\"macOS\"",
    "sec-fetch-dest": "empty",
    "sec-fetch-mode": "cors",
    "sec-fetch-site": "same-origin",
    "store": "default",
    "x-magento-cache-id": "null"
  },
  "referrer": "https://" + domain + "/",
  "referrerPolicy": "strict-origin-when-cross-origin",
  "body": null,
  "method": "GET",
  "mode": "cors",
  "credentials": "include"
});
 
</script></body>
</html>
  1. Replace the domain variable with the domain of the magento site you're testing

  2. Load the HTML page containing js that will fetch multiple graphql requests concurrently

  3. You can open Chrome Developer Tools on the Network tab and where you see one of the GraphQL calls, you can right click on the request line and "Copy as fetch" to get the javascript fetch statement for making that same request in an html page js inline script.
    image

  4. You can do this to re-create all the requests for a specific page by adding each fetch() call to the html file, or duplicate the exact same fetch() graphql request (40x) to reproduce the session locking behavior being seen here.

Simulate SOME of the expected behavior by globally disabling redis session locking on ALL requests

  1. Simulate SOME of the expected behavior by globally disabling redis session locking on ALL requests
cat app/etc/env.php | grep disable_locking
bin/magento -n setup:config:set --session-save-redis-disable-locking=1

Expected result (*)

When concurrent GraphQL GET requests are made from a visitor, requests should be able to complete in parallel to keep page load time minimal, while still allowing requests that require session locking (where important session data is being written) in order to prevent some other request from overwriting data in the session. Important data getting overwritten in a session can negatively affect critical application behavior.

image
image
image

In simulating some of the correct behavior with Redis session locking disabled, I was able to load the home page and all 15 graphql requests within a window of 600 ms. There are other pages that may contain many more GraphQL requests where this can be even more important to have concurrent requests complete in parallel.

Home Page GraphQL Measurements - with Session Locking Disabled
  First started at: 393ms
  Last started at: 586ms
  Last duration was: 401ms
  Last ended at: 586ms + 401ms = 987ms
  Total window of execution: 586ms + 401ms - 393ms = 594ms
  GraphQL Requests Total: 15
  Page Finished 1020ms

image
image

Looking at a waterfall of how the concurrent GraphQL requests complete, can reveal that multiple requests are completing at or within close to the same time.

image

Actual result (*)

With Redis session locking enabled which is the default and recommended safe behavior for redis session configs, the concurrent requests queue up, each waiting in sequence for a redis session lock to clear before the next request is able to complete.

image
image
image

Home Page GraphQL Measurements - with Session Locking Disabled
  First started at: 313ms
  Last started at: 323ms
  Last duration was: 2800ms
  Last ended at: 323ms + 2800ms = 3123ms
  Total window of execution: 323ms + 2800ms - 313ms = 2810ms
  GraphQL Requests Total: 15
  Page Finished 3120ms

image
image
image

This makes it look like several of the graphql requests are taking an excessive amount of time to complete, while others completed in less time, but while these requests started close to the same time, they spent a lot of time waiting for session locking to clear, resulting in requests being completed in sequence rather than in parallel.

image
image

Cause of Behavior

With the PWA Studio (Client Side React App) running as the frontend "storefront" of Magento and sending GraphQL calls to the backend of Magento there are different behavioral patterns in how requests are sent to and processed by the web server from how we have been seeing interactions when using the Magento "theme" as the frontend.

The PWA sends multiple concurrent AJAX calls to /graphql end points on the web server, and these requests are all processed by the Magento backend PHP application. It turns out that Magento architecture has it creating a session and locking that session regardless of the type of request or response being issued.

Some /graphql requests are able to be cached by Varnish and because cached requests in Varnish do not execute code and therefore do not interact with redis sessions. This can mask the fact that each request that hits the backend will lock the session for the request and cause requests to be completed in sequence. Many of the graphql requests are not able to be cached in Varnish at this time.

We have seen this behavior in the Magento "theme" frontend also, but there are typically only a few requests that typically happen concurrently, and we see these results show up in New Relic on transaction traces a lot for AJAX calls that are likely to be running concurrently with other requests with the same session.

image

The problem is not just isolated to AJAX calls like customer/section/load on the Magento "theme" frontend, it also tends to occasionally interfere with other requests on product pages, or AJAX calls in the checkout. In the example below the unlucky visitor ended up waiting 8.8s to get an initial response for loading the page instead of what should have taken 300ms because they had some other request that was locking their session. It doesn't happen often, but it's not pleasant when it does.

The big behavioral difference is that most of the requests in the Magento "theme" frontend do NOT happen concurrently, so session locking on a couple concurrent requests doesn't affect things overall that much, and it tends to be more of an exception and most requests (even customer/section/load AJAX requests) are not delayed waiting for session locks to release on average, so this known deficiency is only causing mild problems with the Magento "theme" frontend overall, and limiting concurrent AJAX calls is a way of working around it.

image

This session locking causes lots of problems with a PWA by delaying concurrent AJAX calls causing them to wait in line and finish sequentially as the locks they are waiting on clear. This ends up making requests randomly appear like they are taking a really long time to complete while under the hood they are primarily just waiting in line for the session to be available so it can lock the session and complete it's request. This makes it very hard to identify from the client side where the problem is as it relates to concurrent requests for the same session and running the requests independently results in a very fast response.

While it's possible to disable session locking, this can cause serious issues with requests that make changes to session data and overwrite each other, and that can result in serious problems on checkout where payments are captured but orders fail to be saved in Magento resulting sometimes in a customer re-submitting the order and getting charged multiple times if they are able to get the order to complete successfully.

The library used by Magento (v2.3.4) is the latest available (colinmollenhour/php-redis-session-abstract v1.4.4) and it does support the ability to process read only requests without locking a session when the global config is set to utilize session locking... it just so happens that Magento does not utilize this functionality and opens all session connections without specifying if it needs to write to the session or not, thus defaulting to write mode and session locking.

  1. https://github.com/magento/magento2/blob/2.4.3-p1/composer.lock#L321
  2. https://github.com/colinmollenhour/php-redis-session-abstract/releases
  3. https://github.com/colinmollenhour/php-redis-session-abstract/blob/v1.4.4/src/Cm/RedisSession/Handler.php#L434
  4. https://github.com/magento/magento2/blob/2.4.3-p1/lib/internal/Magento/Framework/Session/SaveHandler/Redis.php#L63
  5. https://github.com/colinmollenhour/php-redis-session-abstract/blob/v1.4.4/src/Cm/RedisSession/Handler.php#L263

Possible Solution

Requests that come into Magento as GET requests are typically expected to return generic publicly identical response that can be cached by Varnish, while POST requests are explicitly not allowed to be cached by Varnish as they are expected to contain visitor/session/customer private data in the responses. Important write operations should typically happen in POST requests, and those types of requests would be expected to need and utilize session locking, while GET requests would generally be for returning generic data that is not visitor/session/customer specific and the same response would be returned to all requests and not involve any kind of write to the session. If any session write activity were to occur on GET requests, it would likely be to update the timestamp of the most recent request to indicate the session is still active and has not expired yet (this is an assumption that should be verified).


  • Severity: S2 - Affects non-critical data or functionality and forces users to employ a workaround.
@m2-assistant
Copy link

m2-assistant bot commented Dec 3, 2021

Hi @mttjohnson. Thank you for your report.
To speed up processing of this issue, make sure that you provided the following information:

  • Summary of the issue
  • Information on your environment
  • Steps to reproduce
  • Expected and actual results

Make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, Add a comment to the issue:

@magento give me 2.4-develop instance - upcoming 2.4.x release

For more details, review the Magento Contributor Assistant documentation.

Add a comment to assign the issue: @magento I am working on this

To learn more about issue processing workflow, refer to the Code Contributions.


⚠️ According to the Magento Contribution requirements, all issues must go through the Community Contributions Triage process. Community Contributions Triage is a public meeting.

🕙 You can find the schedule on the Magento Community Calendar page.

📞 The triage of issues happens in the queue order. If you want to speed up the delivery of your contribution, join the Community Contributions Triage session to discuss the appropriate ticket.

🎥 You can find the recording of the previous Community Contributions Triage on the Magento Youtube Channel

✏️ Feel free to post questions/proposals/feedback related to the Community Contributions Triage process to the corresponding Slack Channel

@ihor-sviziev
Copy link
Contributor

@mttjohnson,
I can say that precisely the same issue we have on the luma-based frontend.
Actually, because of too frequent requests to customer/section/load, we had even issues with losing customer sessions. The solution for us was the following:

  1. Reduce not needed session starts 2.3...ihor-sviziev:fix-not-needed-session-start
  2. Installed the following extension: https://github.com/integer-net/magento2-session-unblocker/

@m2-community-project m2-community-project bot added Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed and removed Issue: ready for confirmation labels Feb 10, 2022
@cpartica
Copy link
Contributor

I confirmed the issue. I think is valid to address this at least in GraphQL where we don't really need session

@github-jira-sync-bot github-jira-sync-bot removed the Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed label Feb 10, 2022
@cpartica cpartica added Severity: S0 A problem that is blocking the ability to work. An immediate fix is needed. Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed labels Feb 10, 2022
@cpartica cpartica added the Priority: P0 This generally occurs in cases when the entire functionality is blocked. label Feb 10, 2022
@github-jira-sync-bot github-jira-sync-bot removed the Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed label Feb 10, 2022
@cpartica cpartica added Priority: P1 Once P0 defects have been fixed, a defect having this priority is the next candidate for fixing. Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed and removed Priority: P0 This generally occurs in cases when the entire functionality is blocked. labels Feb 10, 2022
@github-jira-sync-bot github-jira-sync-bot removed the Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed label Feb 10, 2022
@magento magento deleted a comment from github-jira-sync-bot Mar 1, 2022
@magento magento deleted a comment from github-jira-sync-bot Mar 1, 2022
@magento magento deleted a comment from github-jira-sync-bot Mar 1, 2022
@Swahjak
Copy link
Contributor

Swahjak commented Mar 20, 2023

@Yonn-Trimoreau I feel you. Thank you for taking the time to respond, it's appreciated.

@convenient
Copy link
Contributor

Seems 2.4.5 and above have this flag which may help provided all the graphql endpoints are coded correctly.

private const XML_PATH_GRAPHQL_DISABLE_SESSION = 'graphql/session/disable';

<plugin name="graphql_session_disable" type="Magento\GraphQl\Plugin\DisableSession"/>

@indykoning
Copy link

Monitoring we have suggests this same issue occurs within the admin wysiwyg editor in Magento 2.4.7.

When the editor is loaded it will attempt to load the thumbnails with https://example.com/admin/cms/wysiwyg_images/thumbnail/file/XXXXXXXX/key/XXXXXXXX/
Considering all these thumbnails are requested at the same time and they all implement read locking.

For now i will implement @colinmollenhour's suggestion #34758 (comment)

@denniskopitz
Copy link

denniskopitz commented Jun 12, 2024

Hello,

We have implemented @colinmollenhour's suggestion #34758 (comment) and are testing in pre-production environments. We are primarily concerned about improving performance (ie eliminating unnecessary redis locking) around the checkout flow.

Wondering if other's could share the request URIs they are targeting. So far we have identified these:

'/(amasty_promo|estimate-shipping-methods|amazon_pay|group-prices|customer\/section\/load)/'

Logging the requests in this flow looks like this (SESSION READ indicates where we have modified the redis connection to readOnly):

GUEST CHECKOUT

[2024-06-11T17:48:31.373743+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /mens/watches.html [] []
[2024-06-11T17:48:38.352651+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=messages&force_new_section_timestamp=false&_=1718128116420 [] []
[2024-06-11T17:49:19.959767+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /checkout/cart/add/uenc/aHR0cHM6Ly9sb2NhbC5zaGlub2xhLmNvbS9tZW5zL3dhdGNoZXMuaHRtbA%2C%2C/product/1297/ [] []
[2024-06-11T17:49:23.301546+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=cart%2Cdirectory-data%2Cammessages%2Csignifyd-fingerprint%2Cmessages%2Clocation&force_new_section_timestamp=true&_=1718128116421 [] []
[2024-06-11T17:49:32.360905+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /mens/watches/therunwell41-leather-watch-s0126.html [] []
[2024-06-11T17:49:42.726151+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /checkout/cart/add/uenc/aHR0cHM6Ly9sb2NhbC5zaGlub2xhLmNvbS9tZW5zL3dhdGNoZXMvdGhlcnVud2VsbDQxLWxlYXRoZXItd2F0Y2gtczAxMjYuaHRtbA%2C%2C/product/1293/ [] []
[2024-06-11T17:49:45.351061+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=cart%2Cdirectory-data%2Cammessages%2Csignifyd-fingerprint%2Cmessages%2Clocation&force_new_section_timestamp=true&_=1718128176496 [] []
[2024-06-11T17:49:58.398474+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /checkout/cart/ [] []
[2024-06-11T17:50:05.137050+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /amazon_pay/checkout/config?_=1718128203130 [] []
[2024-06-11T17:50:05.137051+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /reclaim/checkout/reload [] []
[2024-06-11T17:50:05.503370+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/estimate-shipping-methods [] []
[2024-06-11T17:50:07.096253+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=cart%2Cmessages%2Clocation&force_new_section_timestamp=true&_=1718128203131 [] []
[2024-06-11T17:50:07.513691+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:07.517021+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:07.568005+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:11.820151+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /amasty_promo/popup/reload/?uenc=aHR0cHM6Ly9sb2NhbC5zaGlub2xhLmNvbS9jaGVja291dC9jYXJ0Lw%2C%2C&_=1718128203132 [] []
[2024-06-11T17:50:12.706038+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /checkout/ [] []
[2024-06-11T17:50:17.680662+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=idme-verify&force_new_section_timestamp=false&_=1718128215558 [] []
[2024-06-11T17:50:17.922531+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/internal/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/retrieve-adyen-payment-methods [] []
[2024-06-11T17:50:17.979711+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/estimate-shipping-methods [] []
[2024-06-11T17:50:19.993062+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=messages%2Clocation&force_new_section_timestamp=true&_=1718128215559 [] []
[2024-06-11T17:50:20.151170+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:20.151197+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:22.767240+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /amasty_promo/popup/reload/?uenc=aHR0cHM6Ly9sb2NhbC5zaGlub2xhLmNvbS9jaGVja291dC8%2C&_=1718128215560 [] []
[2024-06-11T17:50:24.932934+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /reclaim/checkout/email [] []
[2024-06-11T17:50:26.557505+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=messages%2Clocation&force_new_section_timestamp=true&_=1718128215561 [] []
[2024-06-11T17:50:26.723169+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/customers/isEmailAvailable [] []
[2024-06-11T17:50:35.854990+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/estimate-shipping-methods [] []
[2024-06-11T17:50:35.854846+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:38.882791+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /amasty_promo/popup/reload/?uenc=aHR0cHM6Ly9sb2NhbC5zaGlub2xhLmNvbS9jaGVja291dC8%2C&_=1718128215562 [] []
[2024-06-11T17:50:39.081188+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:42.093900+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:42.093969+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:44.664860+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:44.664870+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/internal/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/retrieve-adyen-payment-methods [] []
[2024-06-11T17:50:44.664867+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:50:44.665751+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/shipping-information [] []
[2024-06-11T17:50:48.366334+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /amasty_promo/popup/reload/?uenc=aHR0cHM6Ly9sb2NhbC5zaGlub2xhLmNvbS9jaGVja291dC8%2C&_=1718128215563 [] []
[2024-06-11T17:50:50.308453+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=messages%2Clocation&force_new_section_timestamp=true&_=1718128215564 [] []
[2024-06-11T17:50:53.869133+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /amazon_pay/checkout/config?_=1718128215566 [] []
[2024-06-11T17:50:53.869097+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=messages%2Clocation&force_new_section_timestamp=true&_=1718128215565 [] []
[2024-06-11T17:50:54.177965+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/affirm/checkout/verify [] []
[2024-06-11T17:50:55.737571+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=messages%2Clocation&force_new_section_timestamp=true&_=1718128215567 [] []
[2024-06-11T17:51:06.361389+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/set-payment-information [] []
[2024-06-11T17:51:06.374592+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information [] []
[2024-06-11T17:51:08.573159+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /captcha/refresh/ [] []
[2024-06-11T17:51:08.573166+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=messages%2Clocation&force_new_section_timestamp=true&_=1718128215568 [] []
[2024-06-11T17:51:08.886128+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals?_=1718128215569 [] []
[2024-06-11T17:51:08.915024+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /amasty_promo/popup/reload/?uenc=aHR0cHM6Ly9sb2NhbC5zaGlub2xhLmNvbS9jaGVja291dC8%2C&_=1718128215570 [] []
[2024-06-11T17:51:13.269034+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/payment-information [] []
[2024-06-11T17:51:18.561105+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /captcha/refresh/ [] []
[2024-06-11T17:51:18.561144+00:00] report.INFO: group:NONE: SESSION READ  Request URI: /customer/section/load/?sections=cart%2Ccaptcha%2Cammessages%2Cmessages%2Clocation&force_new_section_timestamp=true&_=1718128215571 [] []
[2024-06-11T17:51:18.561269+00:00] report.INFO: group:NONE: SESSION WRITE Request URI: /checkout/onepage/success/ [] []

@colinmollenhour
Copy link

@denniskopitz Looks like a great step forward, thanks for sharing. Why does /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information need to lock the session? Surely REST API endpoints don't write messages to the session and just return them in the response body instead? Asking because I genuinely don't know..

@engcom-Hotel engcom-Hotel moved this to Ready for Development in High Priority Backlog Aug 19, 2024
@indykoning
Copy link

Wondering if other's could share the request URIs they are targeting. So far we have identified these:

'/(amasty_promo|estimate-shipping-methods|amazon_pay|group-prices|customer\/section\/load)/'

I've mainly been focussing on the backend since that caused the most problems for our clients for that i had:
#^/[^/]+/(catalog/category/categoriesJson|cms/wysiwyg_images/thumbnail|admin/widget_instance|\.thumbs[a-zA-Z]+)/#

Surely REST API endpoints don't write messages to the session and just return them in the response body instead? Asking because I genuinely don't know..

I don't think it was Magento framework code but a plugin, but i can definitely remember finding some messages being set on the session in some rest endpoints (after wondering why the hell our headless frontend wasn't showing these messages)

@denniskopitz
Copy link

@denniskopitz Looks like a great step forward, thanks for sharing. Why does /rest/shinola_us/V1/guest-carts/rF2YIRPXMEVwGI0gdhKVRqFzSvILoWA3/totals-information need to lock the session? Surely REST API endpoints don't write messages to the session and just return them in the response body instead? Asking because I genuinely don't know..

Apologies for the delayed response @colinmollenhour. When testing, we found that setting the totals-information to the readOnly list caused the checkout success page to fail. Rather than spend the time to figure out root cause of that, we just left it out.

@denniskopitz
Copy link

@indykoning Did your effort improve performance on the admin wysiwyg editor? We have also experienced that issue.

@vadim4err
Copy link

vadim4err commented Sep 9, 2024

We have the same issue with Cm\RedisSession\Handler::read 30sec
API request to /V1/carts/mine/totals-information
I've found that the customer's quote has product qty 871 with 251 items.
But, unfortunately, I can't reproduce the issue.

Here is a Redis session config, I've also set debug log level to catch the issue, maybe this can shed light

        'timeout' => '2.5',
        'persistent_identifier' => '',
        'compression_threshold' => '2048',
        'compression_library' => 'gzip',
        'log_level' => '7',
        'max_concurrency' => '25',
        'break_after_frontend' => '5',
        'break_after_adminhtml' => '30',
        'first_lifetime' => '600',
        'bot_first_lifetime' => '60',
        'bot_lifetime' => '7200',
        'disable_locking' => '0',
        'min_lifetime' => '60',
        'max_lifetime' => '2592000'

@colinmollenhour
Copy link

I'm not familiar with Magento 2 checkout - does it send one request per quote item or something like that? Also if any other requests had a fatal error and the session lock was not released that could explain it. E.g. perhaps there is an out of memory error on the request just before the totals-information one.

@AndresInSpace
Copy link

AndresInSpace commented Sep 9, 2024

I'm not familiar with Magento 2 checkout - does it send one request per quote item or something like that? Also if any other requests had a fatal error and the session lock was not released that could explain it. E.g. perhaps there is an out of memory error on the request just before the totals-information one.

The vanilla m2 one step checkout sends everything via ajax to update quote as you enter information, then shipping options, then payment accepted redirects to success page.

We utilize Bolt payment which has to communicate to Bolt servers then updates quote back in Magento via API which is why I'm reading this issue.

@colinmollenhour can you clarify, if there is a fatal error anywhere in Magento 2 process then it does not release the lock for session?

@Yonn-Trimoreau
Copy link

@AndresInSpace there is a "catch all" exception management in place. If any exception occurs, the session is closed cleanly. So the lock is released, don't worry about it.

@colinmollenhour some situations can lead to having at least 3 Ajax requests sent in parallel in the normal checkout (not one per item, but each request can be time-consuming, especially if the cart contains a lot of items)

And we've already discussed this, the 3 definitive solutions are:

  • Magento stops using session in the checkout APIs (which is somewhat difficult, it would imply implementing some kind of JWT token or similar in the front-office instead of relying on the session auth, and some ACLs on checkout-related entities)
  • Colinmollenhour's library implements a real entry-level write lock mechanism (instead of the actual read lock logic)
  • Magento implements a real write lock mechanism on top of colinmollenhour's library

@AndresInSpace you should look into using the patches I have provided above in this discussion. They are presently in place on 5 sites and they fix exactly the kind of issue you are describing. They are however not adapted to the latest version of colinmollenhour's library so I can send you an updated version if you want (which a colleague of mine has made compatible with the latest version).

@colinmollenhour just to clarify things up: I understand why it would be problematic to do this in your library, as this would be a major breaking change for all your non-Magento users, and I don't think you should change your mind because of Magento's current issue.
But still, this would be the cleanest and most elegant solution.

@AndresInSpace
Copy link

AndresInSpace commented Sep 12, 2024

Edit: Please ignore my misunderstanding here :)

What I was trying to say (and figure out) is that it seems this is not specific to just using redis for sessions caching(eg magento session), but also when utilized for default caches or full page cache, as this is an issue with redis and the process that opened the request (eg the 'redis session') right? I realize the module is specifically the php redis session, but I am wondering if the other redis module/configurations exacerbate this issue since despite having multiple instances for the default/fpc cache configuration those caches are always being hammered. Would that have a byproduct effect on the redis session cache(due to the single process nature of redis and reaching memory limits in environment and having 50 concurrent users sending requests.., etc) and possibly cause forced breaks?..

What I would like to figure out is if @mttjohnson or anyone in the thread (who all have done amazing debugging work, apologies for all my edits here and the confusion) can confirm they ran their tests on a storefront with redis configured for all (frontend default caches, fpc, and mage sessions) or was the storefront only having mage session configured to redis? Would be interested to see if redis for all three affects the queue timing differently, vs say redis session and varnish for fpc.

@Yonn-Trimoreau
Copy link

@AndresInSpace I think you are going out of road with this reasoning.
We are talking about Redis session locks. Session locks are specific to sessions and are not used in fpc or other cache types.
Furthermore, you are supposed to use a different Redis database for each cache type. No locking conflict can happen.
And Redis is a database that is built to withstand a huge number of requests in parallel, if you were to have a response time in the order of seconds, your Redis instance would be already completely down I think? Maybe I'm quite wrong on this, but the performance of Redis is not to be discussed and it does not respond slowly because of being hammered down in a Magento context.

@AndresInSpace
Copy link

AndresInSpace commented Sep 13, 2024

My apologies everyone I had a misunderstanding, did not mean to branch the subject.
Thank you for the clarification @Yonn-Trimoreau. I will look into the suggested patches/fixes for our project with this issue.

@vadim4err IIRC the totals-information triggers a calculate method to collect totals and is expensive since that is where tax loops through all items to calculate accordingly based on row eg line item (I believe this was context @colinmollenhour was asking earlier)
https://github.com/magento/magento2/blob/2.4-develop/app/code/Magento/Checkout/Model/TotalsInformationManagement.php

@indykoning
Copy link

Surely REST API endpoints don't write messages to the session and just return them in the response body instead? Asking because I genuinely don't know..

I don't think it was Magento framework code but a plugin, but i can definitely remember finding some messages being set on the session in some rest endpoints (after wondering why the hell our headless frontend wasn't showing these messages)

Coming back to this.
I've ran into the issue and it it actually Magento code!

The case was, being spammed by the same notifications. Reason why?
i had made /customer/section/load use Read Only sessions

And the following endpoint reads notifications from your session, Removes them from your session, and returns the messages.
/customer/section/load/?sections=messages

If it cannot remove them from your session, it will simply repeat the cycle.

@badger-adam
Copy link

And the following endpoint reads notifications from your session, Removes them from your session, and returns the messages. /customer/section/load/?sections=messages

If it cannot remove them from your session, it will simply repeat the cycle.

This is correct. I can confirm the same issue.

@webtekindo
Copy link

@Yonn-Trimoreau thank you for the patch, we are going to try it in production.

We are also currently using: https://github.com/integer-net/magento2-session-unblocker/

I'm curious on your opinion, do you think there's any extra benefit on top of your patch to use that module?

@Yonn-Trimoreau
Copy link

@webtekindo this module seems interesting.
But I'm pretty sure it will have no effect in the checkout (since it does not list the checkout session in the di.xml), and I'm wondering if it couldn't cause the same race condition issues as the disable_locking option, depending on the logic involved (since it closes the session once it is read, and not once it is written).
My patch and this module won't be technically incompatible, but I think this module is not safe to use in some scenarios (which could be identified by adding some logs and testing thoroughly)

@Yonn-Trimoreau
Copy link

Yonn-Trimoreau commented Nov 10, 2024

Here are the updated versions of the patches if you need them:

implement-write-lock-and-diff.txt
set-php-serialize-handler.txt

Don't forget to flush the redis session storage after applying it in production.

For reference, the initial comment: #34758 (comment)

@webtekindo
Copy link

@Yonn-Trimoreau ok perfect we will try without the module and thanks for the updated patches.

@webtekindo
Copy link

webtekindo commented Nov 16, 2024

@Yonn-Trimoreau we are already running the patch in production and I have to say it works perfectly, nice job!

We use the old patch that you shared here #34758 (comment) and we had issue on the checkout because of the function _arrayRemoveRecursiv after replacing the function with the one in the new patch the problem was solved.

We cannot directly apply the latest version of your patch directly, because probably we still use an older Magento version (2.4.2), is there any other part of the new patch (except _arrayRemoveRecursiv) that is important to adjust?

And do you think it should be possible to make a Magernto module out of it instead of patching Cm/RedisSession, I give it a try but still have some issues (with session locking) here the Magento preference that we try:

Redis.txt

I will be curious if you think it may works that way (notice the read and write functions) ?

public function read($sessionId, bool $readOnly = true)
{
    try {
        $connection = $this->getConnection();
        $connection->setReadOnly($readOnly);
        $sessionData = $connection->read($sessionId);
        $this->_lastReadSessionDataById[$sessionId] = $sessionData;

        return $sessionData;

    } catch (ConcurrentConnectionsExceededException $e) {
        require $this->filesystem->getDirectoryRead(DirectoryList::PUB)->getAbsolutePath('errors/503.php');
    }
}

public function write($sessionId, $sessionData)
{
    $freshSession = $this->_getRefreshedSessionDataToWrite($sessionId);
    return $this->getConnection()->write($sessionId, $freshSession);
}

protected function _getRefreshedSessionDataToWrite(string $sessionId): string
{
    $currentSessionData = $this->read($sessionId, false);

    //$this->getConnection()->setReadOnly(false);
    //$currentSessionData = $this->getConnection()->read($sessionId);

    $lastReadSessionData = $this->_lastReadSessionDataById['sess_' . $sessionId];

    $freshSession = unserialize($currentSessionData) ?: [];
    $lastReadSession = unserialize($lastReadSessionData) ?: [];

    $diffToUnset = $this->_arrayDiffRecursive($lastReadSession, $_SESSION);
    if (!empty($diffToUnset)) {
        $freshSession = $this->_arrayRemoveRecursive($freshSession, $diffToUnset);
    }

    $diffToSet = $this->_arrayDiffRecursive($_SESSION, $lastReadSession);
    if (!empty($diffToSet)) {
        $freshSession = array_replace_recursive($freshSession, $diffToSet);
    }

    return serialize($freshSession);
}

Merci.

@colinmollenhour
Copy link

I started on a proof of concept that is similar to @Yonn-Trimoreau 's approach. It uses MongoDb with zero locking but uses $set and $unset operations to only update the changed keys on write() so that different processes for the same session making changes to different keys (or no changes) will not block each other but rather perform a sort of sloppy merge. There is still some potential for race conditions, but unless there are some really hot keys, it should be greatly reduced and overall consistency should be better in theory. Mainly, two processes can read a key that one wants to delete, and both will see it, but it will still be deleted for the third process even though they both read the key and only one deleted it. The bad is that "exactly once" delivery is not guaranteed.
It's not in a working state currently and would require overriding some core functionality in Magento, but I wanted to explore the idea and see how it could work. I'm not expecting to finish this any time soon or possibly ever, so if someone wants to take it over, please feel free.
https://github.com/colinmollenhour/php-mongodb-lockless-session

It may also be possible to do something similar using Lua with Redis, although that would just put more strain on Redis which is single-threaded so may not scale well. MongoDb should be perfect for this, although you would need to call the gc() method for garbage cleanup.

@Yonn-Trimoreau
Copy link

@webtekindo yes this probably would be a preference on \Magento\Framework\Session\SaveHandlerInterface, ultimately that's the way to go.

@Yonn-Trimoreau
Copy link

@colinmollenhour great! MongoDB would be a right fit since it supports entry-level locking and can read/write at an impressive speed (but not as fast as Postgres! and it also supports entry-level locking).
But adding MongoDB to the Magento stack seems overkill.
I mean, we already have OpenSearch, RabbitMQ, Redis (for cache), Varnish, nginx/Apache, MySQL and PHP... The usage of Redis is convenient since we already use it heavily for caching purposes.

@colinmollenhour
Copy link

@Yonn-Trimoreau Agreed that Postgres (even MySQL) should also be capable of handling this design, although the syntax for setting and deleting multiple keys deeply nested in a single query with SQL looks pretty ugly which is why I chose MongoDb for the proof of concept. One possibility would be in MySQL 9 to use a Javascript stored program to do the updates, but I have no idea how that performs compared to Redis and MongoDb. But yes, it could be done with any database that has decent JSON manipulation.

The performance of either Postgres or Mongo would probably be very good in terms of high concurrency, possibly even better than Redis given the removal of multiple round trips required for locking. Also agreed that I hate to add yet another database server to the stack, but I'd argue that the Redis instance should not be shared with the cache anyway. In my opinion they should be separate instances due to needing different eviction policies, and Redis being a single-threaded performance bottleneck - since cache and sessions are separate data types it is an obvious choice for horizontal scaling without going to a full cluster implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: APIs Component: Framework/Session Component: GraphQL GraphQL Issue: Confirmed Gate 3 Passed. Manual verification of the issue completed. Issue is confirmed Priority: P1 Once P0 defects have been fixed, a defect having this priority is the next candidate for fixing. Progress: ready for dev Reproduced on 2.4.x The issue has been reproduced on latest 2.4-develop branch Severity: S0 A problem that is blocking the ability to work. An immediate fix is needed.
Projects
Status: Ready for Development
Development

No branches or pull requests