Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flush() after writing to gzip_file #2753

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

flush() after writing to gzip_file #2753

wants to merge 2 commits into from

Conversation

vin
Copy link

@vin vin commented Nov 15, 2024

Summary

In order to better support streaming responses where the chunks are smaller than the file buffer size, we flush after writing.

Without the explicit flush, the writes are buffered and the subsequent reads see an empty self.gzip_buffer until the file automatically flushes due to either (1) the 32KiB write buffer1 fills or (2) the file is closed because the streaming response is complete.

Without flushing, the GZipMiddleware doesn't work as expected for streaming responses, especially not for Server-Sent Events which are expected to be delivered immediately to clients. The code as written appears to intend to flush immediately rather than buffering, as it does immediately call await self.send(message), but in practice that message is often empty.

Checklist

  • I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.

Footnotes

  1. https://github.com/python/cpython/blob/main/Lib/gzip.py#L26

In order to better support streaming responses where the chunks are smaller than the file buffer size, we flush after writing.

Without the explicit flush, the writes are buffered and the subsequent reads see an empty self.gzip_buffer until the file automatically flushes due to either
(1) the write buffer fills, probably at 8kiB, or (2) the file is closed because the streaming response is complete.

Without flushing, the GZipMiddleware doesn't work as expected for streaming responses, especially not for Server-Sent Events which are expected to be delivered immediately to clients. The code as written appears to intend to flush immediately rather than buffering, as it does immediately call `await self.send(message)`, but in practice that `message` is often empty.
@Kludex
Copy link
Member

Kludex commented Nov 15, 2024

Can we add a test to prove your point?

@vin
Copy link
Author

vin commented Nov 15, 2024

I'll work on that today. I have a small repro case I'll share but I need to make it run as a test

@vin
Copy link
Author

vin commented Nov 16, 2024

I've added a test, but it's a bit complicated. Without the flush, the entire contents of the response are correct, but to show that they are received iteratively rather than all at once, I use a wrapping middleware to assert that GZipMiddleware isn't sending empty message bodies, which is what it does without the flush.

@vin
Copy link
Author

vin commented Nov 25, 2024

@Kludex could you take another look or recommend a good reviewer? Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants