Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the experimental name space from cookie parameters #144

Open
wants to merge 19 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 10 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 22 additions & 3 deletions CHANGES.rst
Original file line number Diff line number Diff line change
@@ -1,12 +1,31 @@
Changes
=======

TBR
---
0.13.0 (unreleased)
-------------------

* Cookie support is no longer experimental:

* If the ``COOKIES_ENABLED`` setting is ``True`` (default), automatic request
parameter mapping now sets ``responseCookies`` to ``True`` and maps request
cookies to ``requestCookies``.
Comment on lines +9 to +11
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that, by default, all Zyte API requests with automap will start including "responseCookies": True among their parameters. Similar to how we do with httpResponseHeaders, but in this case the behavior also affects browser rendering and automatic extraction scenarios. The test expectation updates are a great way to get an idea of the impact.

This is a big one, the reason I went for 0.13.0 instead of 0.12.3, and I wonder whether or not this is the right call. I wonder if we should implement an opt-in setting for responseCookies, or make it so that it is only added to requests if requestCookies is also added (either manually by the user our automatically mapped).


* The ``ZYTE_API_EXPERIMENTAL_COOKIES_ENABLED`` setting is now deprecated.
When enabled, however, the ``experimental`` name space is still used for
automatic request parameter mapping.

* If you use ``requestCookies``, ``responseCookies``, or ``cookieManagement``
within the ``experimental`` name space in request parameters, a deprecation
warning is now logged.

* The ``responseCookies`` response parameter is now handled the same as
``experimental.responseCookies``; the latter still works but is deprecated.

* Zyte API Request IDs are now included in the error logs.

* Bump the zyte-api dependency: 0.4.7 → 0.4.8.


0.12.2 (2023-10-19)
-------------------

Expand Down Expand Up @@ -42,7 +61,7 @@ TBR
Experimental is treated as a namespace, and its parameters are the ones
counted, i.e. there is no ``scrapy-zyte-api/request_args/experimental`` stat,
but there are stats like
``scrapy-zyte-api/request_args/experimental.responseCookies``.
``scrapy-zyte-api/request_args/experimental.foo``.


0.11.1 (2023-08-25)
Expand Down
46 changes: 20 additions & 26 deletions README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -323,10 +323,10 @@ possible:

- ``statusCode`` becomes ``response.status``.

- ``httpResponseHeaders`` and ``experimental.responseCookies`` become
- ``httpResponseHeaders`` and ``responseCookies`` become
``response.headers``.

- ``experimental.responseCookies`` is also mapped into the request cookiejar.
- ``responseCookies`` is also mapped into the request cookiejar.

- ``browserHtml`` and ``httpResponseBody`` are mapped into both
``response.text`` (``str``) and ``response.body`` (``bytes``).
Expand Down Expand Up @@ -412,17 +412,15 @@ parameters are chosen as follows by default:

- ``Request.body`` becomes ``httpRequestBody``.

- If the ``ZYTE_API_EXPERIMENTAL_COOKIES_ENABLED`` Scrapy setting is
``True``, the COOKIES_ENABLED_ Scrapy setting is ``True`` (default), and
provided request metadata does not set dont_merge_cookies_ to ``True``:
- If the COOKIES_ENABLED_ Scrapy setting is ``True`` (default), and provided
request metadata does not set dont_merge_cookies_ to ``True``:

.. _COOKIES_ENABLED: https://docs.scrapy.org/en/latest/topics/downloader-middleware.html#std-setting-COOKIES_ENABLED
.. _dont_merge_cookies: https://docs.scrapy.org/en/latest/topics/request-response.html#std-reqmeta-dont_merge_cookies

- ``experimental.responseCookies`` is set to ``True``.
- ``responseCookies`` is set to ``True``.

- Cookies from the request `cookie jar`_ become
``experimental.requestCookies``.
- Cookies from the request `cookie jar`_ become ``requestCookies``.

.. _cookie jar: https://docs.scrapy.org/en/latest/topics/downloader-middleware.html#std-reqmeta-cookiejar

Expand All @@ -434,10 +432,10 @@ parameters are chosen as follows by default:
If the cookies to be set exceed the limit defined in the
``ZYTE_API_MAX_COOKIES`` setting (100 by default), a warning is logged,
and only as many cookies as the limit allows are set for the target
request. To silence this warning, set ``experimental.requestCookies``
manually, e.g. to an empty dict. Alternatively, if Zyte API starts
supporting more than 100 request cookies, update the
``ZYTE_API_MAX_COOKIES`` setting accordingly.
request. To silence this warning, set ``requestCookies`` manually, e.g.
to an empty dict. Alternatively, if Zyte API starts supporting more
than 100 request cookies, update the ``ZYTE_API_MAX_COOKIES`` setting
accordingly.

If you are using a custom downloader middleware to handle request
cookiejars, you can point the ``ZYTE_API_COOKIE_MIDDLEWARE`` setting to
Expand Down Expand Up @@ -511,20 +509,18 @@ following parameters:
"value": "application/json"
}
],
"experimental": {
"requestCookies": [
{
"name": "a",
"value": "b",
"domain": ""
}
],
"responseCookies": true
},
"httpResponseBody": true,
"httpResponseHeaders": true,
"httpRequestBody": "eyJmb28iOiAiYmFyIn0=",
"httpRequestMethod": "POST",
"requestCookies": [
{
"name": "a",
"value": "b",
"domain": ""
}
],
"responseCookies": true,
"url": "https://httpbin.org/anything"
}

Expand Down Expand Up @@ -553,10 +549,8 @@ following parameters:

{
"browserHtml": true,
"experimental": {
"responseCookies": true
},
"requestHeaders": {"referer": "https://example.com/"},
"responseCookies": true,
"url": "https://quotes.toscrape.com"
}

Expand Down Expand Up @@ -815,7 +809,7 @@ The following Zyte API parameters are *not* taken into account for request
fingerprinting:

- Request header parameters (``customHttpRequestHeaders``,
``requestHeaders``)
``requestHeaders``, ``requestCookies``)

- Metadata parameters (``echoData``, ``jobId``)

Expand Down
3 changes: 2 additions & 1 deletion scrapy_zyte_api/_cookies.py
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,8 @@ def _process_cookies(
):
if not cookie_jars:
return
response_cookies = api_response.get("experimental", {}).get("responseCookies")
old_response_cookies = api_response.get("experimental", {}).get("responseCookies")
response_cookies = api_response.get("responseCookies", old_response_cookies)
if not response_cookies:
return
cookie_jar = _get_cookie_jar(request, cookie_jars)
Expand Down
Loading