Extra characters being inserted before and after HTTP response when X-Sendfile is used and there are many concurrent requests #218
Comments
When you upgraded, did you recompile mod_xsendfile, or update it to the latest corresponding version? Usually an Apache module compiled for an older Apache version should work on a newer version of the same major/minor version, but I have seen some instances where this is not the case because Linux distros backport patches to Apache, breaking its API forward compatibility. If this has happened, it is important that the Apache module be recompiled for the newer version of Apache. Also, what do you have and are files on a local filesystem or NFS server. The Finally, that |
By "upgrading" I meant that I went from using a Debian "jessie" virtual machine on Amazon Web Services to using a Debian "stretch" virtual machine on Amazon Web Services. Debian "stretch" uses this version of mod_wsgi, which is in a package maintained by Bernd Zeimetz. Based on the changelogs it looks like the Apache module was compiled on 12/29/2016 against Apache version 2.4.25. This is the same version of Apache currently used by Debian "stretch." However, it looks like the Apache installation has been modified since then, including the backporting of some security fixes from 2.4.26. I will try it again with a fresh recompile of mod_wsgi (and mod_xsendfile) to see if that changes things. EnableSendFile is not set, so the default setting of "off" would be used. In my application, the files referenced in the X-Sendfile headers are all on the local filesystem. Thanks very much for your help! |
I recompiled the wsgi and xsendfile modules against the Apache sources, and the 5-second delay and corruption with ten-byte sequences still occur. I'm not sure what the source of the 5 seconds is. I tried changing the I wouldn't think that mod_wsgi had anything to do with this, but for the fact that the problem only happens when there are a lot of requests coming in at the same time. Maybe the problem is that the xsendfile module hasn't been updated in seven years... |
Have you changed the value of |
I have not changed The KeepAlive mechanism might be the issue here. (See discussion of this issue regarding nginx.) |
Obvious thing to try then is to disable keep alive altogether. Which MPM are you using? If using the event MPM and the new way it handles keep alive connections, then maybe mod_xsendfile is incompatible with it. |
Yes, disabling KeepAlive makes the problem go away (but at a huge cost to performance). Another way to make the problem go away is to use HTTP instead of HTTPS. In both Debian jessie (Apache 2.4.10), which had no problems, and Debian stretch (Apache 2.4.25), the event MPM is used. I don't see any major changes to keepalive functionality mentioned in the changelog between 2.4.10 and 2.4.25, but I am not an expert in this area. I did an experiment to try to rule out mod_wsgi. I wrote a Perl script "file server" that prints an X-Sendfile header in order to serve files from my Flask static file directory. I also added a 250ms delay in the script. I then wrote an HTML file that includes the long list of static Javascript and CSS files that my web application calls, but it retrieves them through the Perl script. I enabled the cgid module and edited the Apache configuration to activate the Perl script in /usr/lib/cgi-bin. When I go to this HTML page in my web browser, the browser makes lots of simultaneous requests to the web server, just as it does when it is communicating with my web app through mod_wsgi. Interestingly, though, none of the responses gets corrupted or delayed. One difference between cgid and mod_wsgi is that cgid spawns separate processes for each call to the Perl script, whereas mod_wsgi (as I have it configured) uses one process with five threads. |
Is there any chance you could pull down mod_wsgi source code for version 4.3.0 and see if it still has the problem? If it doesn't, then try 4.3.1. |
Also, try setting |
Thanks so much for your help with this. When I compiled and installed 4.3.0 (using the standard When I did the same with 4.3.1, the problem reappeared. I tried setting |
Okay, seems I only added Anyway, at least I know what code has likely introduced the issue. I need to now work out why, but more likely work out how to identify the |
Can you set:
in the Apache configuration and tell me if that avoids the problem? |
Ok, I tried adding that line within the VirtualHost configuration. It did not avoid the problem. |
Can you tell me if the mod_xsendfile source code is the same as what is at: It mentions Apache 2.2, and looks like it may not have been changed for a very long time. |
Yes, the source code being used by Debian for the xsendfile module is exactly the same as the source code on that site. |
Regarding the static web page you said you created earlier to try and emulate this with a backend in Perl: can you provide me a version of that static HTML, even if it uses dummy static asset URLs, and a Flask app with the most minimal code required to handle X-Sendfile? If you already have a self-contained minimal example that triggers it, even better. I want to try and create a setup to emulate the problem. I suspect I am going to have to build it into a Docker image though, as more than likely I will not see it on MacOS X. |
Ok, a minimal example is here: https://github.com/jhpyle/testxsendfile Note that I have only been able to trigger the problem over HTTPS. (I used to be able to get HTTPS to work on a personal computer with a self-signed certificate, but I can't figure that out anymore so these days I just use Let's Encrypt on virtual machines in the cloud.) I have this running at https://test54.docassemble.org. It is easier to trigger the problem when the network connection is slower. Over slow WiFi, I get the problem every time, but on a desktop I have to press Ctrl-Shift-R in Firefox several times before I see the problem reflected in the Console. You might be able to simulate this "slow connection" factor by changing the |
Just to let you know, have had super busy week as need to get some stuff done before some trips. So haven't had a chance to look at it yet. Getting mod_xsendfile to compile on MacOS X is also a pain as MacOS X is broken and doesn't supply apr-config/apu-config scripts any more, thus apxs breaks if try and use it to compile modules. |
Thanks very much for looking into the issue. By the way, I took down the https://test54.docassemble.org site last night (to save money on my Amazon Web Services bill), but I can recreate it if that would be helpful to you. Thanks! |
When I upgrade from mod_wsgi 4.3.0 (i.e. Debian jessie with Apache 2.4.10) to mod_wsgi 4.5.11 (i.e., Debian stretch or Ubuntu zesty with Apache 2.4.25), I start to see the following scenario:
When many concurrent requests are made at the same time, some static files that are served through X-Sendfile are (1) delayed by more than 5 seconds and (2) corrupted with ten bytes before the HTTP response and ten bytes after the HTTP response. This ten-byte sequence consists of one byte with value 3 (ASCII character 3) followed by nine bytes with value 0 (ASCII character 0, i.e. null bytes). The sequence before the response appears immediately before the HTTP header, causing the browser to be unable to read the status code or content type of the HTTP response. The content of the response is there, and is correct (that is, the contents of the file referenced by X-Sendfile are there, and the content-type header is set correctly, etc.). The problem is that the response is sandwiched between these ten-byte sequences.
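For anyone trying to confirm the same symptom, here is a minimal sketch of how a captured response could be checked for the ten-byte sequence described above. It assumes you already have the decrypted response bytes (for example from a proxy or debugging capture); `MARKER` and `strip_marker` are names made up for this illustration and are not part of Apache, mod_wsgi, or mod_xsendfile.

```python
# The sequence described in the report: one byte with value 3,
# followed by nine bytes with value 0.
MARKER = b"\x03" + b"\x00" * 9


def strip_marker(raw):
    """Return (payload, had_prefix, had_suffix) for a raw response capture.

    `raw` is assumed to be the bytes the client received after TLS
    decryption, starting where the HTTP status line should begin.
    """
    had_prefix = raw.startswith(MARKER)
    if had_prefix:
        raw = raw[len(MARKER):]
    had_suffix = raw.endswith(MARKER)
    if had_suffix:
        raw = raw[:-len(MARKER)]
    return raw, had_prefix, had_suffix
```

If both flags come back true and the remaining payload is a well-formed HTTP response, the capture matches the pattern in this report.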
I am seeing this happen when a browser initially accesses my web application. Since the browser has not cached anything yet, approximately 20 CSS and JavaScript files are all requested at the same time. One or two of my CSS or JavaScript files will get delayed and corrupted. The other files are delivered properly within 1,300 milliseconds. After a page refresh, the corrupted files will load without a problem (during that page load, only a few concurrent requests are made because the rest of the static files are cached by the browser). Sometimes, no files get corrupted. Usually one is corrupted, but sometimes two. There is some randomness to it.
The five second delay happens during the "receiving" portion of the request, according to my browser's network timing meters.
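A rough way to look for the delay outside a browser is to request the static files concurrently and time each download. The sketch below is only an approximation of what the browser does: `urllib` opens a separate connection for each request, so it does not exercise KeepAlive connection reuse, and a response whose status line is corrupted would likely raise an exception instead of returning a body. The function names and the 5-second threshold are assumptions chosen for this illustration.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    """Download one URL and return (url, seconds_elapsed, body_bytes)."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=30) as resp:
        body = resp.read()
    return url, time.monotonic() - start, body


def fetch_all(urls, fetch_one=fetch, workers=20):
    """Request every URL at once, using up to `workers` threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of `urls` in its results.
        return list(pool.map(fetch_one, urls))


def slow_responses(results, threshold=5.0):
    """Return the URLs whose download took at least `threshold` seconds."""
    return [url for url, elapsed, _ in results if elapsed >= threshold]
```

Calling `slow_responses(fetch_all(urls))` with the list of roughly 20 CSS and JavaScript URLs would list any files that hit the delay on that run.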
The browser is talking directly to Apache over HTTPS. Apache is running on a machine with 1 core.
My Apache configuration looks like this:
I am using Flask. If I turn off the X-Sendfile feature in my Flask application, this problem goes away. The problem seems to be related to the empty-data nature of X-Sendfile responses.
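To illustrate what an "empty-data" X-Sendfile response means, here is a minimal standard-library WSGI sketch, not Flask's actual implementation. The application returns no body of its own and only names the file in an `X-Sendfile` header, leaving it to mod_xsendfile to deliver the contents. `STATIC_DIR` is a hypothetical path (the report does not give the real one), and the sketch assumes mod_xsendfile is enabled and permitted to read that directory.

```python
import os

# Hypothetical location of the static files; not taken from the report.
STATIC_DIR = "/var/www/static"


def application(environ, start_response):
    """Minimal WSGI app that hands file delivery over to mod_xsendfile."""
    # basename() drops any directory components from the requested path.
    name = os.path.basename(environ.get("PATH_INFO", ""))
    path = os.path.join(STATIC_DIR, name)
    headers = [
        ("Content-Type", "application/octet-stream"),
        ("X-Sendfile", path),
    ]
    start_response("200 OK", headers)
    # The body is empty: the web server is expected to replace it with
    # the contents of the file named in the X-Sendfile header.
    return [b""]
```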
I am using Flask in a virtualenv, so the Python packages (Flask, werkzeug, etc.) are the same both on the Debian jessie platform where I get no error and the Debian stretch platform where I see the error. I am using Python 2.7.
I am guessing the problem is not with the xsendfile Apache module because the version of this module did not change between Debian jessie, where I had no error, and Debian stretch, where the error appeared.
The error also happens on Debian sid, which has even more up-to-date versions of Apache, etc.
It could be that this is an Apache issue and not a mod_wsgi issue, but I thought it made sense to check with you first. I noticed that wsgi_thread.c was doing
memset(content, '\0', sizeof(content))
which made me think that maybe this was the source of those extraneous null bytes. And I looked on the internet for other people reporting this ten-byte corruption issue, and I couldn't find any other reports, which makes me think it is less likely to be a global issue with Apache. I can help you reproduce this if you think it might be a mod_wsgi issue. I can be reached at [email protected].
Thanks very much,
Jonathan Pyle