Skip to content

Commit

Permalink
Merge pull request #3 from ngtcp2/update-doc
Browse files Browse the repository at this point in the history
Update doc
  • Loading branch information
tatsuhiro-t authored Nov 13, 2024
2 parents ad65582 + d6a0958 commit 2dcded2
Showing 1 changed file with 28 additions and 1 deletion.
29 changes: 28 additions & 1 deletion README.rst
Original file line number Diff line number Diff line change
Expand Up @@ -3,4 +3,31 @@ urlparse

urlparse is a URL parser compatible to `url_parser_parse_url()` from
`nodejs/http-parser <https://github.com/nodejs/http-parser>`_ which
has been archived since 2022.
has been archived since 2022. urlparse only implements the strict
mode.

There is a slight difference in a return code when they fail.
`url_parser_parse_url()` returns nonzero if it fails.
`urlparse_parse_url()` returns the negative error code
``URLPARSE_ERR_PARSE`` if it fails.

`url_parser_parse_url()` historically does not follow any standards
like RFC 3986. Here is the allowed characters in each URL component:

- scheme: ``A-Za-z``
- userinfo: ``A-Za-z!$%&'()*+,-.:;=_~``
- host: ``a-zA-Z0-9-.``
- IPv6 host: ``A-Fa-f0-9.:``

- optionally followed by zone info which starts ``%`` and can
contain: ``A-Za-z0-9%-._~``

- and IPv6 host must be enclosed by ``[`` and ``]``

- port: ``0-9``
- path: ``A-Za-z0-9!"$%&'()*+,-./:;<=>@[\]^_`{|}~``
- query: ``A-Za-z0-9!"$%&'()*+,-./:;<=>?@[\]^_`{|}~``
- fragment: ``A-Za-z0-9!"#$%&'()*+,-./:;<=>?@[\\]^_\`{|}~``

- all consecutive ``#`` characters that precede a fragment are
treated as a single ``#``. A fragment cannot start with ``#``.

0 comments on commit 2dcded2

Please sign in to comment.