Use buffer pool to avoid large buffer allocation for connections and responses #87

mfp · 2015-08-16T21:19:44Z

A trivial buffer pool is used to get rid of these buffer allocations:

per connection/receiver buffers (Ocsigen_http_com)
internal buffer in Ocsigen_senders.File_content
internal buffer in deflatemod

Moreover, some care is taken to avoid allocation for the last chunk of a file when the remaining data doesn't match the buffer size.

The changes are minimal, but reviews are appreciated. In particular:

I haven't tested deflatemod yet
I made a mistake in my first attempt at addressing the last-chunk issue (forgot that the buffer size could be rounded up to the next power of two), so any extra eyeballs to look for errors would help
I had to release the response stream resources (i.e. buffers) in Ocsigen_server; it's literally 3 lines, but it's quite big conceptually

Overall, these changes deliver up to ~25-30% more speed in some requests.

Performance

Effect of buffer pools

(buffer pool disabled with OCSIGEN_DISABLE_BUFFER_POOL -> enabled)
o=150 is the GC setting used in half of the runs

404
        12000 -> 12800
 o=150  12900 -> 13500


normal response (20 bytes)
        ~ 8100 -> 8900
  o=150 ~ 8400 -> 8900


16kb file ->
          6721 -> 7800
   o=150  7100 -> 7900

Effect of last-chunk optimization when serving files

PRE                                       POST
----------------------                    ----------------------
15kb file ->                              15kb file ->
          6200 -> 7400                              6400 -> 7950
   o=150  7000 -> 7800                       o=150  7200 -> 8200

7kb file ->                               7kb file ->
          7500 -> 8550                              7450 -> 9150
   o=150  7650 -> 8700                       o=150  7850 -> 9250


1kb file ->                               1kb file ->
          8100 -> 8750                              8100 -> 8750
   o=150  8400 -> 8800                       o=150  8550 -> 9000

----------------------                    ----------------------

Overall effect of `Lwt_chan` fix #49 + buffer pools + last-chunk optimization

(only for files): compare figures at the top + those in POST to these for 2.6:

2.6
--------------

404                                    16kb file ->
         11800                                   5500
 o=150   12500                            o=150  6100

20-byte response                       15kb file ->  
          7450                                   5200
   o=150  7900                            o=150  5900

                                       7kb file ->   
                                                 6000
                                          o=150  6700


                                       1kb file ->   
                                                 7650
                                          o=150  7950

mfp · 2015-08-16T21:41:50Z

AFAICS it builds & installs fine w/ jenkins (certainly builds for me). Same (phony?) uninstall error as in #84.

vasilisp · 2015-08-17T12:38:25Z

Thanks for doing this. I haven't read the last-chunk part carefully, but the rest of the code looks OK to me. The overhead (hashtable, stacks, release closures) should be insignificant for buffers of size 1K or more.

I can install and uninstall (the buffer-pool branch) just fine locally, it is hard to say why rm fails inside the docker container.

mfp · 2015-08-17T13:56:24Z

On Mon, Aug 17, 2015 at 05:38:28AM -0700, Vasilis Papavasileiou wrote:

Thanks for doing this. I haven't read the last-chunk part carefully, but the
rest of the code looks OK to me.

Thank you for taking a look :)

The overhead (hashtable, stacks, release closures) should be insignificant
for buffers of size 1K or more.

Indeed. In this case there's a speed increase across the board for any
response size (even 404s and 20-byte responses become a bit faster) --
ocsigenserver cannot know a priori that a connection will not make full use of
the connection buffers, so it's got to use large ones.

For hashtables of such size (a few dozen elements at most) I've measured over
4e7 lookups/s (we only need a handful of them per request, so it's over 3
orders of magnitude away :). The closures are essentially free (with Lwt,
we're building much heavier ones all the time anyway), and the only thing I
didn't know how to quantify easily was the caml_modify inherent in
Stack.push, but it clearly involves less work than the mark&sweeping we'd
have if we were allocating 4KB buffers left and right.

I can install and uninstall (the buffer-pool branch) just fine locally, it
is hard to say why rm fails inside the docker container.

I'm told it's an issue in the build script used for PRs.

Mauricio Fernández

mfp added 5 commits August 16, 2015 16:17

Add trivial buffer pool implementation (Ocsigen_buffer_pool).

c71eb6f

Use buffer pool in deflatemod.

f499fb6

Use buffer pool in Ocsigen_senders.File_content.

423285f

Use buffer pool in Ocsigen_http_com (per-connection/receiver buffers).

10393df

Ocsigen_senders.File_content: avoid allocation of last chunk.

62d4b86

mfp mentioned this pull request Aug 16, 2015

Large buffers allocated on each request #86

Open

Fix typo in Ocsigen_buffer_pool interface docs.

cfc8117

mfp mentioned this pull request Aug 17, 2015

Huge GC pressure when serving files with Cohttp_lwt_unix.Server.respond_file mirage/ocaml-cohttp#207

Open

Do not expose Ocsigen_buffer_pool.

cb960ae

vasilisp mentioned this pull request Nov 8, 2016

Release 2.8 #114

Closed

jrochel assigned vouillon May 21, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use buffer pool to avoid large buffer allocation for connections and responses #87

Use buffer pool to avoid large buffer allocation for connections and responses #87

mfp commented Aug 16, 2015

mfp commented Aug 16, 2015

vasilisp commented Aug 17, 2015

mfp commented Aug 17, 2015

Use buffer pool to avoid large buffer allocation for connections and responses #87

Are you sure you want to change the base?

Use buffer pool to avoid large buffer allocation for connections and responses #87

Conversation

mfp commented Aug 16, 2015

Performance

Effect of buffer pools

Effect of last-chunk optimization when serving files

Overall effect of Lwt_chan fix #49 + buffer pools + last-chunk optimization

mfp commented Aug 16, 2015

vasilisp commented Aug 17, 2015

mfp commented Aug 17, 2015

Overall effect of `Lwt_chan` fix #49 + buffer pools + last-chunk optimization