feat(mediatypes): reimplement (and unvendor) mimeparse #2348

Merged on Oct 5, 2024 (32 commits)
Commits
664211b
feat: WiP reimplement mimeparse
vytas7 Jul 5, 2024
468c90a
Merge branch 'master' into reimplement-mimeparse
vytas7 Aug 28, 2024
34f5f4b
feat(mediatypes): add some skeletons for mediatype parsing
vytas7 Aug 28, 2024
cada7b6
Merge branch 'master' into reimplement-mimeparse
vytas7 Sep 4, 2024
1ff911d
chore: fix up after master merge
vytas7 Sep 4, 2024
763a973
Merge branch 'falconry:master' into reimplement-mimeparse
vytas7 Sep 26, 2024
e6d164d
Merge branch 'master' into reimplement-mimeparse
vytas7 Sep 28, 2024
4243b7b
feat(mimeparse): wip doodles
vytas7 Sep 28, 2024
2a22a85
Merge branch 'master' into reimplement-mimeparse
vytas7 Sep 29, 2024
b3150a2
feat(mediatypes): implement computation of best quality
vytas7 Sep 29, 2024
cefa80a
feat(mediatypes): remove vendored mimeparse
vytas7 Sep 30, 2024
379998f
docs: add a newsfragment for one of the issues
vytas7 Sep 30, 2024
aa138b0
refactor: remove debug `print()`s
vytas7 Sep 30, 2024
75bd57b
feat(mediatypes): add specialized mediatype/range errors, coverage
vytas7 Oct 1, 2024
33afb39
Merge branch 'master' into reimplement-mimeparse
vytas7 Oct 1, 2024
69bd926
docs(newsfragments): add a newsfragment for #1367
vytas7 Oct 1, 2024
9fdc9ce
test(mediatypes): add more tests
vytas7 Oct 1, 2024
7912924
feat(mediatypes): improve docstring, simplify behaviour
vytas7 Oct 1, 2024
2f11234
refactor(mediatypes): use a stricter type annotation
vytas7 Oct 1, 2024
be2f155
chore: remove an unused import
vytas7 Oct 1, 2024
317109c
chore: fix docstring style violation D205
vytas7 Oct 1, 2024
84bbf39
Merge branch 'master' into reimplement-mimeparse
vytas7 Oct 4, 2024
4caed99
chore(docs): apply review suggestion to `docs/ext/rfc.py`
vytas7 Oct 4, 2024
42d2cea
docs(newsfragments): apply review suggestion for `docs/_newsfragments…
vytas7 Oct 4, 2024
ec4c557
Merge branch 'master' into reimplement-mimeparse
vytas7 Oct 4, 2024
f6e2ef3
refactor(mediatypes): address some review comments
vytas7 Oct 4, 2024
6f7afc2
perf(mediatypes): short-circuit if q is absent as per review comment
vytas7 Oct 4, 2024
017f75e
docs: explain how to mitigate a potentially breaking change
vytas7 Oct 4, 2024
94495f6
docs: add a note that we continue to maintain python-mimeparse
vytas7 Oct 4, 2024
b0e3829
refactor(mediatypes): convert _MediaType and _MediaRange to dataclasses
vytas7 Oct 5, 2024
8618640
fix(mediatypes): only use dataclass(slots=True) where supported (>=py…
vytas7 Oct 5, 2024
532b44e
refactor(mediatypes): a yet another attempt to make dataclasses work …
vytas7 Oct 5, 2024
3 changes: 3 additions & 0 deletions docs/_newsfragments/1367.newandimproved.rst
@@ -0,0 +1,3 @@
The new implementation of :ref:`media type utilities <mediatype_util>`
(Falcon was using the ``python-mimeparse`` library before) now always favors
the exact media type match, if one is available.
36 changes: 36 additions & 0 deletions docs/_newsfragments/864.breakingchange.rst
@@ -0,0 +1,36 @@
Falcon is no longer vendoring the
`python-mimeparse <https://github.com/falconry/python-mimeparse>`__ library;
the relevant functionality has instead been reimplemented in the framework
itself, fixing a handful of long-standing bugs in the new implementation.

If you use standalone
`python-mimeparse <https://github.com/falconry/python-mimeparse>`__ in your
project, do not worry! We will continue to maintain it as a separate package
under the Falconry umbrella (we took over about 3 years ago).

The following new behaviors are considered breaking changes:

* Previously, the iterable passed to
:meth:`req.client_prefers <falcon.Request.client_prefers>` had to be sorted in
the order of increasing desirability.
:func:`~falcon.mediatypes.best_match`, and by proxy
:meth:`~falcon.Request.client_prefers`, now consider the provided media types
to be sorted in the (more intuitive, we hope) order of decreasing
desirability.

* Unlike ``python-mimeparse``, the new
:ref:`media type utilities <mediatype_util>` consider media types with
different values for the same parameters as non-matching.

One theoretically possible scenario where this change can affect you is only
installing a :ref:`media <media>` handler for a content type with parameters;
it then may not match media types with conflicting values (that used to match
before Falcon 4.0).
If this turns out to be the case, also
:ref:`install the same handler <custom_media_handlers>` for the generic
``type/subtype`` without parameters.

The new functions,
:func:`falcon.mediatypes.quality` and :func:`falcon.mediatypes.best_match`,
otherwise have the same signature as the corresponding methods from
``python-mimeparse``.
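
To illustrate the ordering change described in this newsfragment, here is a toy sketch of content negotiation in which ties are resolved in favor of the media type listed first, that is, in order of decreasing desirability. It is not Falcon's actual implementation: `parse_accept`, `toy_quality`, and `toy_best_match` are hypothetical names, and media type parameters other than `q` are simply ignored.

```python
def parse_accept(header):
    """Parse an Accept header into (type, subtype, q) tuples.

    Simplification: parameters other than q are ignored.
    """
    ranges = []
    for item in header.split(','):
        parts = [p.strip() for p in item.split(';')]
        main_type, _, subtype = parts[0].partition('/')
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition('=')
            if name.strip().lower() == 'q':
                q = float(value.strip())
        ranges.append((main_type.lower(), subtype.lower(), q))
    return ranges


def toy_quality(media_type, header):
    """Return the q value of the most specific matching media range."""
    main_type, _, subtype = media_type.lower().partition('/')
    best_specificity, best_q = -1, 0.0
    for r_type, r_subtype, q in parse_accept(header):
        if r_type not in ('*', main_type):
            continue
        if r_subtype not in ('*', subtype):
            continue
        # An exact type/subtype range outranks type/*, which outranks */*.
        specificity = (r_type != '*') + (r_subtype != '*')
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def toy_best_match(media_types, header):
    """Pick the best media type; on equal quality, the earlier one wins."""
    best_type, best_q = '', 0.0
    for media_type in media_types:
        q = toy_quality(media_type, header)
        # Strict comparison: a later candidate must be strictly better.
        if q > best_q:
            best_type, best_q = media_type, q
    return best_type
```

With a wildcard `Accept: */*` header, both candidates score the same quality, so the result depends only on the order of the supplied list. Code that relied on the old increasing-desirability convention would need to reverse that list.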
4 changes: 4 additions & 0 deletions docs/api/util.rst
@@ -33,10 +33,14 @@ HTTP Status
.. autofunction:: falcon.http_status_to_code
.. autofunction:: falcon.code_to_http_status

.. _mediatype_util:

Media types
-----------

.. autofunction:: falcon.parse_header
.. autofunction:: falcon.mediatypes.quality
.. autofunction:: falcon.mediatypes.best_match

Async
-----
6 changes: 3 additions & 3 deletions docs/ext/rfc.py
@@ -22,6 +22,7 @@

import re

+ IETF_DOCS = 'https://datatracker.ietf.org/doc/html'
RFC_PATTERN = re.compile(r'RFC (\d{4}), Section ([\d\.]+)')


@@ -39,11 +40,10 @@ def _process_line(line):
section = m.group(2)

template = (
- '`RFC {rfc}, Section {section} '
- '<https://tools.ietf.org/html/rfc{rfc}#section-{section}>`_'
+ '`RFC {rfc}, Section {section} <{ietf_docs}/rfc{rfc}#section-{section}>`__'
)

- rendered_text = template.format(rfc=rfc, section=section)
+ rendered_text = template.format(rfc=rfc, section=section, ietf_docs=IETF_DOCS)

return line[: m.start()] + rendered_text + line[m.end() :]
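
As a rough, self-contained illustration of what the updated helper does, the sketch below rewrites an `RFC NNNN, Section X.Y` mention into a reStructuredText link that points at the IETF Datatracker. The constants and the template are taken from the diff above. The `RFC_PATTERN.search()` call and the early return on no match are assumptions, since the hunk does not show that part of `_process_line`, and the function is named `process_line` here to mark it as a stand-in.

```python
import re

IETF_DOCS = 'https://datatracker.ietf.org/doc/html'
RFC_PATTERN = re.compile(r'RFC (\d{4}), Section ([\d\.]+)')


def process_line(line):
    # Assumption: the real helper locates the first match in a similar way.
    m = RFC_PATTERN.search(line)
    if not m:
        return line

    rfc = m.group(1)
    section = m.group(2)

    template = (
        '`RFC {rfc}, Section {section} <{ietf_docs}/rfc{rfc}#section-{section}>`__'
    )
    rendered_text = template.format(rfc=rfc, section=section, ietf_docs=IETF_DOCS)

    # Splice the rendered link in place of the plain-text mention.
    return line[: m.start()] + rendered_text + line[m.end() :]


print(process_line('See RFC 7231, Section 5.3.2 for details.'))
```

Note that the trailing double underscore produces an anonymous hyperlink in reStructuredText, which avoids duplicate target warnings when the same RFC section is referenced more than once.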

6 changes: 6 additions & 0 deletions falcon/__init__.py
@@ -84,6 +84,7 @@
'http_status_to_code',
'IS_64_BITS',
'is_python_func',
'mediatypes',
'misc',
'parse_header',
'reader',
@@ -138,6 +139,8 @@
'HTTPUnsupportedMediaType',
'HTTPUriTooLong',
'HTTPVersionNotSupported',
'InvalidMediaRange',
'InvalidMediaType',
'MediaMalformedError',
'MediaNotFoundError',
'MediaValidationError',
@@ -388,6 +391,8 @@
from falcon.errors import HTTPUnsupportedMediaType
from falcon.errors import HTTPUriTooLong
from falcon.errors import HTTPVersionNotSupported
from falcon.errors import InvalidMediaRange
from falcon.errors import InvalidMediaType
from falcon.errors import MediaMalformedError
from falcon.errors import MediaNotFoundError
from falcon.errors import MediaValidationError
@@ -617,6 +622,7 @@
from falcon.util import http_status_to_code
from falcon.util import IS_64_BITS
from falcon.util import is_python_func
from falcon.util import mediatypes
from falcon.util import misc
from falcon.util import parse_header
from falcon.util import reader
12 changes: 7 additions & 5 deletions falcon/app_helpers.py
@@ -291,12 +291,14 @@ def default_serialize_error(req: Request, resp: Response, exception: HTTPError)
resp: Instance of ``falcon.Response``
exception: Instance of ``falcon.HTTPError``
"""
- predefined = [MEDIA_XML, 'text/xml', MEDIA_JSON]
+
+ predefined = [MEDIA_JSON, 'text/xml', MEDIA_XML]
media_handlers = [mt for mt in resp.options.media_handlers if mt not in predefined]
- # NOTE(caselit) add all the registered before the predefined ones. This ensures that
- # in case of equal match the last one (json) is selected and that the q= is taken
- # into consideration when selecting the media
- preferred = req.client_prefers(media_handlers + predefined)
+ # NOTE(caselit,vytas): Add the registered handlers after the predefined
+ # ones. This ensures that in the case of an equal match, the first one
+ # (JSON) is selected and that the q parameter is taken into consideration
+ # when selecting the media handler.
+ preferred = req.client_prefers(predefined + media_handlers)

if preferred is None:
# NOTE(kgriffs): See if the client expects a custom media
2 changes: 1 addition & 1 deletion falcon/asgi/app.py
@@ -554,7 +554,7 @@ async def __call__( # type: ignore[override] # noqa: C901
data = resp._data

if data is None and resp._media is not None:
- # NOTE(kgriffs): We use a special MISSING singleton since
+ # NOTE(kgriffs): We use a special _UNSET singleton since
# None is ambiguous (the media handler might return None).
if resp._media_rendered is _UNSET:
opt = resp.options
2 changes: 1 addition & 1 deletion falcon/asgi/request.py
@@ -169,7 +169,7 @@ def __init__(

self.uri_template = None
# PERF(vytas): Fall back to class variable(s) when unset.
- # self._media = MISSING
+ # self._media = _UNSET
# self._media_error = None

# TODO(kgriffs): ASGI does not specify whether 'path' may be empty,
2 changes: 1 addition & 1 deletion falcon/asgi/response.py
@@ -207,7 +207,7 @@ async def render_body(self) -> Optional[bytes]: # type: ignore[override]
data = self._data

if data is None and self._media is not None:
- # NOTE(kgriffs): We use a special MISSING singleton since
+ # NOTE(kgriffs): We use a special _UNSET singleton since
# None is ambiguous (the media handler might return None).
if self._media_rendered is _UNSET:
if not self.content_type:
10 changes: 10 additions & 0 deletions falcon/errors.py
@@ -88,6 +88,8 @@ def on_get(self, req, resp):
'HTTPUnsupportedMediaType',
'HTTPUriTooLong',
'HTTPVersionNotSupported',
'InvalidMediaRange',
'InvalidMediaType',
'MediaMalformedError',
'MediaNotFoundError',
'MediaValidationError',
@@ -111,6 +113,14 @@ class CompatibilityError(ValueError):
"""The given method, value, or type is not compatible."""


class InvalidMediaType(ValueError):
"""The provided media type cannot be parsed into type/subtype."""


class InvalidMediaRange(InvalidMediaType):
"""The media range contains an invalid media type and/or the q value."""


class UnsupportedScopeError(RuntimeError):
"""The ASGI scope type is not supported by Falcon."""

7 changes: 3 additions & 4 deletions falcon/http_error.py
@@ -19,8 +19,7 @@
import xml.etree.ElementTree as et

from falcon.constants import MEDIA_JSON
- from falcon.util import code_to_http_status
- from falcon.util import http_status_to_code
+ from falcon.util import misc
from falcon.util import uri

if TYPE_CHECKING:
@@ -136,7 +135,7 @@ def __init__(
# we'll probably switch over to making everything code-based to more
# easily support HTTP/2. When that happens, should we continue to
# include the reason phrase in the title?
- self.title = title or code_to_http_status(status)
+ self.title = title or misc.code_to_http_status(status)

self.description = description
self.headers = headers
@@ -161,7 +160,7 @@ def status_code(self) -> int:
"""HTTP status code normalized from the ``status`` argument passed
to the initializer.
""" # noqa: D205
- return http_status_to_code(self.status)
+ return misc.http_status_to_code(self.status)

def to_dict(
self, obj_type: Type[MutableMapping[str, Union[str, int, None, Link]]] = dict
4 changes: 2 additions & 2 deletions falcon/media/handlers.py
@@ -30,8 +30,8 @@
from falcon.media.multipart import MultipartFormHandler
from falcon.media.multipart import MultipartParseOptions
from falcon.media.urlencoded import URLEncodedFormHandler
+ from falcon.util import mediatypes
from falcon.util import misc
- from falcon.vendor import mimeparse


class MissingDependencyHandler(BinaryBaseHandlerWS):
@@ -186,7 +186,7 @@ def _best_match(media_type: str, all_media_types: Sequence[str]) -> Optional[str
try:
# NOTE(jmvrbanac): Mimeparse will return an empty string if it can
# parse the media type, but cannot find a suitable type.
- result = mimeparse.best_match(all_media_types, media_type)
+ result = mediatypes.best_match(all_media_types, media_type)
except ValueError:
pass

6 changes: 3 additions & 3 deletions falcon/request.py
@@ -53,10 +53,10 @@
from falcon.typing import ReadableIO
from falcon.util import deprecation
from falcon.util import ETag
+ from falcon.util import mediatypes
from falcon.util import structures
from falcon.util.uri import parse_host
from falcon.util.uri import parse_query_string
- from falcon.vendor import mimeparse

DEFAULT_ERROR_LOG_FORMAT = '{0:%Y-%m-%d %H:%M:%S} [FALCON] [ERROR] {1} {2}{3} => '

@@ -1167,7 +1167,7 @@ def client_accepts(self, media_type: str) -> bool:

# Fall back to full-blown parsing
try:
- return mimeparse.quality(media_type, accept) != 0.0
+ return mediatypes.quality(media_type, accept) != 0.0
except ValueError:
return False

@@ -1187,7 +1187,7 @@ def client_prefers(self, media_types: Iterable[str]) -> Optional[str]:

try:
# NOTE(kgriffs): best_match will return '' if no match is found
- preferred_type = mimeparse.best_match(media_types, self.accept)
+ preferred_type = mediatypes.best_match(media_types, self.accept)
except ValueError:
# Value for the accept header was not formatted correctly
preferred_type = ''
2 changes: 1 addition & 1 deletion falcon/response.py
@@ -276,7 +276,7 @@ def render_body(self) -> Optional[bytes]:
data = self._data

if data is None and self._media is not None:
- # NOTE(kgriffs): We use a special MISSING singleton since
+ # NOTE(kgriffs): We use a special _UNSET singleton since
# None is ambiguous (the media handler might return None).
if self._media_rendered is _UNSET:
if not self.content_type: