
[3.12] Improve pathname2url() and url2pathname() docs (GH-127125) #127233

Merged
merged 1 commit into from
Nov 24, 2024

Conversation

miss-islington
Contributor

@miss-islington miss-islington commented Nov 24, 2024

These functions have long sown confusion among Python developers. The
existing documentation says they deal with URL path components, but that
doesn't fit the evidence on Windows:

>>> pathname2url(r'C:\foo')
'///C:/foo'
>>> pathname2url(r'\\server\share')
'////server/share'  # or '//server/share' as of quite recently

If these were URL path components, they would imply complete URLs like
file://///C:/foo and file://////server/share. Clearly this isn't right.
Yet the implementation in nturl2path is deliberate, and the
url2pathname() function correctly inverts it.
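The slash counting above can be checked with `urllib.parse.urlsplit()`, which behaves the same on every platform. This is only a sketch: it uses the literal values quoted above instead of calling `pathname2url()` (whose output depends on platform and Python version), and the two helper names are made up for illustration.

```python
from urllib.parse import urlsplit

# Literal values from the Windows examples above (not computed here).
VALUES = ["///C:/foo", "////server/share"]


def as_path_component(value):
    # Interpretation 1 (hypothetical helper): the value is only the URL
    # path, so "file:" plus an empty authority marker "//" is prepended.
    return "file://" + value


def as_scheme_remainder(value):
    # Interpretation 2 (hypothetical helper): the value is everything
    # that follows "file:".
    return "file:" + value


for value in VALUES:
    for build in (as_path_component, as_scheme_remainder):
        url = build(value)
        parts = urlsplit(url)
        print(f"{build.__name__:20} {url:26} "
              f"netloc={parts.netloc!r} path={parts.path!r}")
```

Under the second interpretation the drive-letter value parses to the path `/C:/foo`, whereas the first leaves three leading slashes in the path.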

On non-Windows platforms, the behaviour until quite recently was to simply
quote/unquote the path without adding or removing any leading slashes. This
behaviour is compatible with both interpretations -- 1) the value is a
URL path component (existing docs), and 2) the value is everything
following file: (this commit).
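To see why plain quoting fits both readings, here is a small sketch that uses `urllib.parse.quote()` as a stand-in for the older non-Windows behaviour described above. The example path is hypothetical, and this is not the actual `pathname2url()` implementation.

```python
from urllib.parse import quote, unquote, urlsplit

path = "/tmp/my file.txt"   # hypothetical POSIX path
value = quote(path)         # stand-in for the older pathname2url() result

url_1 = "file://" + value   # interpretation 1: value is a URL path component
url_2 = "file:" + value     # interpretation 2: value is everything after "file:"

# Both URLs parse to the same path, so this behaviour alone cannot
# distinguish between the two interpretations.
assert urlsplit(url_1).path == urlsplit(url_2).path == value

# Unquoting inverts the conversion, as url2pathname() is described to do.
assert unquote(value) == path
```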

The conclusion I draw is that these functions operate on everything after
the file: prefix, which may include an authority section. This is the
only explanation that fits both the Windows and non-Windows behaviour.
It's also a better match for the function names.
(cherry picked from commit 307c633)

Co-authored-by: Barney Gale [email protected]


📚 Documentation preview 📚: https://cpython-previews--127233.org.readthedocs.build/

@bedevere-app bedevere-app bot added docs Documentation in the Doc dir skip news labels Nov 24, 2024
@barneygale barneygale enabled auto-merge (squash) November 24, 2024 17:34
@barneygale barneygale merged commit b52ab48 into python:3.12 Nov 24, 2024
28 checks passed
Labels
docs Documentation in the Doc dir skip issue skip news
Projects
Status: Done

2 participants