[3.12] Improve pathname2url() and url2pathname() docs (GH-127125) (#127233)

Improve `pathname2url()` and `url2pathname()` docs (GH-127125)

These functions have long sown confusion among Python developers. The
existing documentation says they deal with URL path components, but that
doesn't fit the evidence on Windows:

    >>> pathname2url(r'C:\foo')
    '///C:/foo'
    >>> pathname2url(r'\\server\share')
    '////server/share'  # or '//server/share' as of quite recently

If these were URL path components, they would imply complete URLs like
`file://///C:/foo` and `file://////server/share`. Clearly this isn't right.
Yet the implementation in `nturl2path` is deliberate, and the
`url2pathname()` function correctly inverts it.
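
The mismatch can be checked by parsing both readings with `urllib.parse.urlsplit()`. This is a sketch only: it hard-codes the Windows return values quoted above so that it runs on any platform, and the variable names are illustrative.

```python
from urllib.parse import urlsplit

# Return value quoted above for pathname2url(r'C:\foo') on Windows.
value = '///C:/foo'

# Reading 1: the value is a URL path component, so 'file://' goes in front.
as_path = urlsplit('file://' + value)

# Reading 2: the value is everything after 'file:'.
after_scheme = urlsplit('file:' + value)

print(repr(as_path.netloc), as_path.path)            # empty authority, path '///C:/foo'
print(repr(after_scheme.netloc), after_scheme.path)  # empty authority, path '/C:/foo'

# A UNC value in the newer '//server/share' form parses with an authority.
unc = urlsplit('file:' + '//server/share')
print(repr(unc.netloc), unc.path)                    # authority 'server', path '/share'
```

Only the second reading yields the conventional path `/C:/foo` and places `server` in the authority section.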

On non-Windows platforms, the behaviour until quite recently was to simply
quote/unquote the path without adding or removing any leading slashes. This
behaviour is compatible with *both* interpretations -- 1) the value is a
URL path component (existing docs), and 2) the value is everything
following `file:` (this commit).
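
A minimal sketch of why both readings hold on POSIX, assuming the older behaviour described above in which the conversion is nothing more than `quote()`. `posix_pathname2url_sketch` is a hypothetical stand-in, not the library function, whose current behaviour may differ:

```python
from urllib.parse import quote, urlsplit


def posix_pathname2url_sketch(path):
    """Hypothetical stand-in for the older non-Windows behaviour: quote only."""
    return quote(path)


value = posix_pathname2url_sketch('/home/user/my file.txt')
print(value)  # '/home/user/my%20file.txt'

# Reading 1 (path component) and reading 2 (everything after 'file:')
# give different strings that parse to the same path with no authority.
reading_1 = urlsplit('file://' + value)
reading_2 = urlsplit('file:' + value)
print(reading_1.path == reading_2.path)
```

Because the value has a single leading slash, prefixing either `file://` or `file:` gives a valid URL, which is why POSIX alone cannot settle the question.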

The conclusion I draw is that these functions operate on everything after
the `file:` prefix, which may include an authority section. This is the
only explanation that fits both the Windows and non-Windows behaviour.
It's also a better match for the function names.
(cherry picked from commit 307c633)
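
Under that reading, complete `file:` URLs can be handled with two thin wrappers. This is a sketch, not part of the commit: `path_to_file_url` and `file_url_to_path` are hypothetical helper names, and the exact URL text printed depends on the platform and Python version.

```python
import os
from urllib.request import pathname2url, url2pathname


def path_to_file_url(path):
    """Hypothetical helper: build a complete file: URL from a local path."""
    return 'file:' + pathname2url(path)


def file_url_to_path(url):
    """Hypothetical helper: recover a local path from a complete file: URL."""
    if not url.startswith('file:'):
        raise ValueError(f'not a file: URL: {url!r}')
    # url2pathname() expects everything after the 'file:' prefix.
    return url2pathname(url.removeprefix('file:'))


# Use an absolute path in the current platform's own syntax.
path = os.path.abspath(os.path.join(os.sep, 'tmp', 'my file.txt'))
url = path_to_file_url(path)
print(url)
print(file_url_to_path(url) == path)
```

`str.removeprefix()` needs Python 3.9 or later; on older versions, slicing with `url[len('file:'):]` does the same job.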

Co-authored-by: Barney Gale <[email protected]>
miss-islington and barneygale authored Nov 24, 2024
1 parent e26ba96 commit b52ab48
Showing 1 changed file with 19 additions and 7 deletions.
26 changes: 19 additions & 7 deletions Doc/library/urllib.request.rst
@@ -160,16 +160,28 @@ The :mod:`urllib.request` module defines the following functions:

 .. function:: pathname2url(path)

-   Convert the pathname *path* from the local syntax for a path to the form used in
-   the path component of a URL. This does not produce a complete URL. The return
-   value will already be quoted using the :func:`~urllib.parse.quote` function.
+   Convert the given local path to a ``file:`` URL. This function uses
+   :func:`~urllib.parse.quote` function to encode the path. For historical
+   reasons, the return value omits the ``file:`` scheme prefix. This example
+   shows the function being used on Windows::

+      >>> from urllib.request import pathname2url
+      >>> path = 'C:\\Program Files'
+      >>> 'file:' + pathname2url(path)
+      'file:///C:/Program%20Files'

-.. function:: url2pathname(path)

-   Convert the path component *path* from a percent-encoded URL to the local syntax for a
-   path. This does not accept a complete URL. This function uses
-   :func:`~urllib.parse.unquote` to decode *path*.
+.. function:: url2pathname(url)

+   Convert the given ``file:`` URL to a local path. This function uses
+   :func:`~urllib.parse.unquote` to decode the URL. For historical reasons,
+   the given value *must* omit the ``file:`` scheme prefix. This example shows
+   the function being used on Windows::

+      >>> from urllib.request import url2pathname
+      >>> url = 'file:///C:/Program%20Files'
+      >>> url2pathname(url.removeprefix('file:'))
+      'C:\\Program Files'

 .. function:: getproxies()
