Support GZIP encoded POSTs/GETs via CURL #20

Open
staabm opened this issue Apr 1, 2020 · 6 comments

@staabm

staabm commented Apr 1, 2020

we use the API and send between 15k and 40k requests per hour, and things get pretty slow.

we already batch the requests to reduce the overhead.

after a bit of investigation it looks like requests/responses are not compressed/gzip-encoded. we only see Accept-Encoding: deflate, gzip but no Content-Encoding: gzip
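A minimal sketch of the header check described above, in Python for illustration (the client itself is PHP). It reads the `< Name: value` lines that curl's verbose mode prints for a response and reports whether a `Content-Encoding` of gzip or deflate was declared. The helper names `response_headers` and `is_compressed` are made up for this example; the sample lines are taken from the log in this comment.

```python
def response_headers(verbose_lines):
    """Collect response headers from curl verbose output ('< Name: value' lines)."""
    headers = {}
    for line in verbose_lines:
        # Skip request lines, '*' info lines, the status line and the bare '<' separator.
        if not line.startswith("< ") or ":" not in line:
            continue
        name, _, value = line[2:].partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def is_compressed(headers):
    """True if the response declares a gzip or deflate content coding."""
    codings = headers.get("content-encoding", "")
    return any(c.strip().lower() in ("gzip", "deflate") for c in codings.split(","))


log = [
    "< HTTP/1.1 200 OK",
    "< Server: nginx/1.14.1",
    "< Content-Type: application/json; charset=utf-8",
    "< Content-Length: 190555",
    "< Connection: close",
    "<",
]
headers = response_headers(log)
print(is_compressed(headers))  # False
```

For the responses logged here the check comes back negative: the client offers `Accept-Encoding: deflate, gzip`, but the server answers without a `Content-Encoding` header.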

also, when posting items to the webservice, it seems the request bodies go across the wire uncompressed.
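To make the request-side idea concrete, here is a rough sketch of what a gzip-encoded JSON POST body would look like, again in Python for illustration. The batch structure and the helper name `gzip_json_body` are assumptions made up for this example, not the client's actual format or API. Whether the server accepts `Content-Encoding: gzip` on requests is not confirmed anywhere in this thread, so this only shows the client half.

```python
import gzip
import json


def gzip_json_body(payload):
    """Serialize payload to JSON, gzip it, and return (body, headers)."""
    raw = json.dumps(payload).encode("utf-8")
    body = gzip.compress(raw)
    headers = {
        "Content-Type": "application/json",
        "Content-Encoding": "gzip",  # tells the server the body is gzip-compressed
        "Content-Length": str(len(body)),
    }
    return body, headers


# Hypothetical batch body; the real request format may differ.
batch = {
    "requests": [
        {"method": "DELETE", "path": f"/items/item-{i}"} for i in range(1000)
    ]
}
body, headers = gzip_json_body(batch)

# The compressed body round-trips back to the original payload.
assert json.loads(gzip.decompress(body)) == batch
print(len(json.dumps(batch)), "->", len(body), "bytes")
```

Repetitive JSON like a batch of near-identical requests tends to compress well, but the actual saving depends on the real payload.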

see the following output, after hacking curl_setopt($this->handle, CURLOPT_VERBOSE, 1); into the cURL.php of the http-client:

*   Trying 136.243.64.171...
* Connected to rapi.recombee.com (136.243.64.171) port 443 (#0)
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /cluster/www/www/www/telexa/vendor/rmccue/requests/library/Requests/Transport/cacert.pem
  CApath: /etc/ssl/certs
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
*        subject: OU=Domain Control Validated; CN=*.recombee.com
*        start date: Dec 30 10:26:13 2018 GMT
*        expire date: Feb 28 16:23:00 2021 GMT
*        subjectAltName: rapi.recombee.com matched
*        issuer: C=US; ST=Arizona; L=Scottsdale; O=GoDaddy.com, Inc.; OU=http://certs.godaddy.com/repository/; CN=Go Daddy Secure Certificate Authority - G2
*        SSL certificate verify ok.
> GET /telexa-dev/items/list/?hmac_timestamp=1585732533&hmac_sign=XXX HTTP/1.1
Host: rapi.recombee.com
Accept: */*
Accept-Encoding: deflate, gzip
Referer: https://rapi.recombee.com/telexa-dev/items/list/?hmac_timestamp=1585732533&hmac_sign=XXX
User-Agent: recombee-php-api-client/3.0.0
Connection: close

< HTTP/1.1 200 OK
< Server: nginx/1.14.1
< Date: Wed, 01 Apr 2020 09:15:33 GMT
< Content-Type: application/json; charset=utf-8
< Content-Length: 190555
< Connection: close
< X-Recombee-Request-Id: 2100e32cfc2677b0f662f571fccd69ea
< Cache-Control: no-cache
<
* Closing connection 0
DIE!mstaab@mst16:/cluster/www/www/www/telexa/scripts$ php recombee.php partner development 17
start:2020-04-01 11:16:37
*   Trying 136.243.64.171...
* Connected to rapi.recombee.com (136.243.64.171) port 443 (#0)
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /cluster/www/www/www/telexa/vendor/rmccue/requests/library/Requests/Transport/cacert.pem
  CApath: /etc/ssl/certs
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
*        subject: OU=Domain Control Validated; CN=*.recombee.com
*        start date: Dec 30 10:26:13 2018 GMT
*        expire date: Feb 28 16:23:00 2021 GMT
*        subjectAltName: rapi.recombee.com matched
*        issuer: C=US; ST=Arizona; L=Scottsdale; O=GoDaddy.com, Inc.; OU=http://certs.godaddy.com/repository/; CN=Go Daddy Secure Certificate Authority - G2
*        SSL certificate verify ok.
> GET /telexa-dev/items/list/?hmac_timestamp=1585732597&hmac_sign=XXX HTTP/1.1
Host: rapi.recombee.com
Accept: */*
Accept-Encoding: deflate, gzip
Referer: https://rapi.recombee.com/telexa-dev/items/list/?hmac_timestamp=1585732597&hmac_sign=XXX
User-Agent: recombee-php-api-client/3.0.0
Connection: close

< HTTP/1.1 200 OK
< Server: nginx/1.14.1
< Date: Wed, 01 Apr 2020 09:16:37 GMT
< Content-Type: application/json; charset=utf-8
< Content-Length: 190555
< Connection: close
< X-Recombee-Request-Id: 06730ddebc72699bdc2f513d5b3241d0
< Cache-Control: no-cache
<
* Closing connection 0
* Hostname rapi.recombee.com was found in DNS cache
*   Trying 136.243.64.171...
* Connected to rapi.recombee.com (136.243.64.171) port 443 (#0)
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /cluster/www/www/www/telexa/vendor/rmccue/requests/library/Requests/Transport/cacert.pem
  CApath: /etc/ssl/certs
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use http/1.1
* Server certificate:
*        subject: OU=Domain Control Validated; CN=*.recombee.com
*        start date: Dec 30 10:26:13 2018 GMT
*        expire date: Feb 28 16:23:00 2021 GMT
*        subjectAltName: rapi.recombee.com matched
*        issuer: C=US; ST=Arizona; L=Scottsdale; O=GoDaddy.com, Inc.; OU=http://certs.godaddy.com/repository/; CN=Go Daddy Secure Certificate Authority - G2
*        SSL certificate verify ok.
> POST /telexa-dev/batch/?hmac_timestamp=1585732597&hmac_sign=XXX HTTP/1.1
Host: rapi.recombee.com
Accept: */*
Accept-Encoding: deflate, gzip
Referer: https://rapi.recombee.com/telexa-dev/batch/?hmac_timestamp=1585732597&hmac_sign=XXX
Content-Type: application/json
User-Agent: recombee-php-api-client/3.0.0
Connection: close
Content-Length: 658192
Expect: 100-continue
@staabm staabm changed the title Support GZIP encoded posts Support GZIP encoded POSTs/GETs Apr 1, 2020
@staabm staabm changed the title Support GZIP encoded POSTs/GETs Support GZIP encoded POSTs/GETs via CURL Apr 1, 2020
@OndraFiedler
Member

Thanks for opening the ticket.
Support for GZIP definitely makes sense; however, I assume it is not the main bottleneck, as the requests & responses are rather small.

Ways to achieve higher throughput are:

  1. Assigning more computational resources for your database
  2. Moving your database to a datacenter near you - I assume to North America? Currently the databases are created in Europe by default.

Both can be done by Recombee support, so please contact [email protected].

@staabm
Author

staabm commented Apr 1, 2020

please see https://blackfire.io/profiles/6d579fdc-bb0a-4c92-aebe-bde02484ee61/graph for an in-depth look into our cronjob.

the curl requests take most of the time. not sure whether the time is spent in transit or in processing on your end.
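One way to narrow down where the time goes is to look at the cumulative timings curl records per transfer; PHP exposes them through `curl_getinfo()` under the keys `namelookup_time`, `pretransfer_time`, `starttransfer_time` and `total_time`. The sketch below (Python for illustration, with a made-up helper name `split_phases` and made-up sample numbers) turns those cumulative values into per-phase durations. Note that the phase between `pretransfer_time` and `starttransfer_time` covers both sending the request body and waiting for the server, so it cannot separate the two on its own.

```python
def split_phases(info):
    """Turn curl's cumulative timings (in seconds) into per-phase durations."""
    return {
        # Time until the hostname was resolved.
        "dns": info["namelookup_time"],
        # TCP connect plus TLS handshake, up to the point the request can be sent.
        "connect_and_tls": info["pretransfer_time"] - info["namelookup_time"],
        # Sending the request and waiting for the first response byte.
        "upload_and_server_wait": info["starttransfer_time"] - info["pretransfer_time"],
        # Receiving the response body.
        "download": info["total_time"] - info["starttransfer_time"],
    }


# Made-up numbers for illustration only, not measurements from this issue.
sample = {
    "namelookup_time": 0.004,
    "pretransfer_time": 0.120,
    "starttransfer_time": 1.500,
    "total_time": 1.750,
}
phases = split_phases(sample)
for name, seconds in phases.items():
    print(f"{name}: {seconds:.3f} s")
```

Since every request in the verbose log is sent with `Connection: close`, the `connect_and_tls` phase would be paid again on each request, which may be worth checking alongside compression.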

I will contact support for now and see what I can achieve.

thx for the fast feedback

@staabm
Author

staabm commented Apr 1, 2020

> Support for GZIP definitely makes sense, however I assume it is not the main bottleneck as the requests & responses are rather small.

I guess in our case, where we send 10,000 items for deactivation, having them GZIP'ed (or at least somehow compressed) could make a big difference.

as you can see in the Blackfire profile above, we spent ~40 seconds deleting 10,000 items:
[screenshot of the Blackfire profile]
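As a rough sanity check on how much of those ~40 seconds the body transfer alone could explain, the estimate below takes the `Content-Length: 658192` of the batch POST in the verbose log from the first comment and divides it by a few assumed uplink speeds. The bandwidth values are assumptions, not measurements, latency and TLS overhead are ignored, and the thread does not say whether that logged POST is one of the 10,000-item delete batches.

```python
def transfer_seconds(num_bytes, mbit_per_s):
    """Time to push num_bytes through a link of the given bandwidth, ignoring latency."""
    return num_bytes * 8 / (mbit_per_s * 1_000_000)


body_bytes = 658_192  # Content-Length of the batch POST in the verbose log

for mbit in (1, 10, 100):  # assumed uplink speeds, not measured
    print(f"{mbit:>3} Mbit/s: {transfer_seconds(body_bytes, mbit):.2f} s")
# prints:
#   1 Mbit/s: 5.27 s
#  10 Mbit/s: 0.53 s
# 100 Mbit/s: 0.05 s
```

Under these assumptions the uncompressed upload accounts for a few seconds at most, so compression would help but would not by itself explain the full duration; per-request curl timings would be needed to confirm where the rest goes.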

@staabm
Author

staabm commented Apr 1, 2020

just got some numbers from production. there we have 4 delete commands of ~40 seconds each, one per batch of 10,000 items:

[image]

@staabm
Author

staabm commented Oct 19, 2020

it seems we are again running into this problem.

> Support for GZIP definitely makes sense, however I assume it is not the main bottleneck as the requests & responses are rather small.

was GZIP support activated on your end since then?

@Pazekal90

> it seems we are again running into this problem.
>
> > Support for GZIP definitely makes sense, however I assume it is not the main bottleneck as the requests & responses are rather small.
>
> was GZIP support activated on your end since then?

Ping @OndraFiedler. Is there something new?
