Remaining corner cases between Host and :authority #905

Closed
wtarreau opened this issue Aug 19, 2021 · 5 comments · Fixed by #968

Comments

@wtarreau
Contributor

Adding here the link to yesterday's report to help with tracking:

https://lists.w3.org/Archives/Public/ietf-http-wg/2021JulSep/0237.html

I can also paste the contents and/or open one or more issues if needed, though I don't think it is needed for now.

@daurnimator

I think #768 is related.

@icing

icing commented Aug 23, 2021

In hindsight, it is easy to see that we need an "HTTP info set", i.e. the pure semantics, from which the protocol version serializations are derived. As things went, implementations are now seeking ways to build their own internal representation of semantics, as @wtarreau describes for haproxy.

The same is happening in httpd, where one carefully tries to separate the cases that apply in general from the ones specific to a version. This is tricky for backward-compatible releases.

The worst, to me, seem to be the rules in standards that are specified for messages "from another http version". That seems to imply an "older, existing" version. Will it apply to a future http version? Who knows?

But this question needs an answer soon, as implementations will need to cope with incoming QUIC traffic and forward it to an H2 server. I believe implementations will need to

  • define their own internal representation of HTTP semantics
  • decide how to map incoming traffic to it
  • sift through the rules in the standard and determine which can be applied safely to the generic semantics and which need to stay at "if this was originally from version x.y"
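
To make the first two points concrete, here is a minimal sketch of what such an internal representation and its mappings could look like. It is not taken from haproxy or httpd; `RequestInfo`, `from_h1` and `from_h2` are hypothetical names, and only origin-form and absolute-form HTTP/1.1 request targets are handled.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple
from urllib.parse import urlsplit

Headers = List[Tuple[str, str]]


@dataclass
class RequestInfo:
    """Hypothetical version-independent view of a request."""
    method: str
    scheme: Optional[str]
    authority: Optional[str]    # from the request target or :authority
    path: str
    host_header: Optional[str]  # kept apart so a mismatch stays visible
    origin_version: str         # remembered for version-specific rules
    fields: Headers = field(default_factory=list)


def _split_host(headers: Headers):
    """Return (first Host value or None, remaining fields with lowercased names)."""
    host = next((v for n, v in headers if n.lower() == "host"), None)
    rest = [(n.lower(), v) for n, v in headers
            if n.lower() != "host" and not n.startswith(":")]
    return host, rest


def from_h1(method: str, target: str, headers: Headers) -> RequestInfo:
    """Map an HTTP/1.1 request line and header list to RequestInfo."""
    host, rest = _split_host(headers)
    if "://" in target:  # absolute-form: the target carries an authority
        parts = urlsplit(target)
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        return RequestInfo(method, parts.scheme, parts.netloc, path,
                           host, "1.1", rest)
    # origin-form: the only authority information is the Host header
    return RequestInfo(method, None, None, target, host, "1.1", rest)


def from_h2(headers: Headers) -> RequestInfo:
    """Map an HTTP/2 header list (pseudo-headers included) to RequestInfo."""
    pseudo = {n: v for n, v in headers if n.startswith(":")}
    host, rest = _split_host(headers)
    return RequestInfo(pseudo[":method"], pseudo.get(":scheme"),
                       pseudo.get(":authority"), pseudo.get(":path", ""),
                       host, "2", rest)
```

Keeping `authority` and `host_header` as separate members, together with `origin_version`, is one way to leave room for the third point: rules that only apply "if this was originally from version x.y" can still be evaluated after the mapping.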

Standard revisions could help here by clarifying whether the "from another HTTP version" rules apply in general to HTTP semantics or only to a specific range of HTTP versions. It's probably not easy.

@mnot
Member

mnot commented Sep 1, 2021

Willy, we need to be careful here -- this is a pretty broad issue, touching many points and a fair amount of history. We can't reopen HTTP-core right now, so we're constrained in how well we can integrate things here.

Looking at what you bring up, I think we can make some editorial improvements here; I've suggested some in #961. There's a very fine line to walk here between respecting the abstraction that semantics provides and making it useful to implementers.

The other thing that might be worth discussing is this requirement:

"An intermediary that forwards a request to HTTP/2 MUST retain any Host header field, even if an authority is part of control data."

My recollection is that this was advocated as helping in the reconstruction of the request-line, for purposes like bot detection, debugging, etc.; it gives fidelity to what can be put on the wire in HTTP/1 (although whether that's a good goal is another discussion -- as I've complained elsewhere, h1, Host and authority are not necessarily... sane).

In retrospect, that MUST could be too strong; given that h2 is effectively always-encrypted, most intermediaries who participate in the protocol do so on behalf of the origin server, so they can coordinate if this information is necessary (as is wide practice for other bits like this, in various ad hoc headers). Also, it creates a (fallacious) expectation in servers that they can rely on Host being there.

For those reasons, I'd (personally) be open to considering dropping this MUST down to a MAY... but if we saw pushback (even mildly so), I'd be concerned enough to back down; I don't think it's important enough to spend too much time on. Regardless of how that ended up, we might add a few words to explain why there's a requirement here.

Thoughts?

@wtarreau
Contributor Author

wtarreau commented Sep 1, 2021

I understand your points. But right now my point is that this maintains a real security issue. Routing has been done on Host for two decades now. It's hard-coded everywhere in applications and components. In HTTP/1 it was usual to make sure that Host and authority exactly match, or else reject the request (and it's in HTTP/1.1 messaging#3.2).

There's no such rule in H2, resulting in two possibly different authorities being present and used differently along the chain. I would be fine keeping it as-is if we enforce the same rule as in H1, which is that if both are present, they MUST match according to RFC3986's rules on scheme-based normalization. But being allowed to have diverging Host and :authority in H2 is a serious concern to me.
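
As a rough illustration of the check proposed here, the sketch below compares Host and :authority after a simplified scheme-based normalization: the host is lowercased and an empty or default port is dropped. The function names and the `DEFAULT_PORTS` table are assumptions made for this example, and percent-encoding and IDNA handling are deliberately left out, so this is not a complete RFC 3986 implementation.

```python
from typing import Optional

# Default ports used for scheme-based normalization (assumed table).
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_authority(value: str, scheme: str) -> str:
    """Lowercase the host and drop an empty or scheme-default port."""
    value = value.strip()
    host, sep, port = value.rpartition(":")
    # No colon at all, or a bare IPv6 literal such as "[::1]": no port.
    if not sep or value.endswith("]"):
        host, port = value, ""
    host = host.lower()
    if port.isdigit():
        port = str(int(port))  # "0443" and "443" compare equal
    default = DEFAULT_PORTS.get(scheme.lower())
    if not port or (default is not None and port == str(default)):
        return host
    return f"{host}:{port}"


def host_matches_authority(host: Optional[str],
                           authority: Optional[str],
                           scheme: str) -> bool:
    """True when both are present and equivalent, or when either is absent."""
    if host is None or authority is None:
        return True  # nothing to compare against
    return normalize_authority(host, scheme) == normalize_authority(authority, scheme)
```

Under this sketch, a component receiving an H2 request with both fields would reject it when `host_matches_authority(...)` returns `False`, mirroring the H1 behavior described above.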

@wtarreau
Contributor Author

wtarreau commented Sep 1, 2021

BTW your update in #961 looks way better to me (aside from the point being discussed above, of course).

martinthomson added a commit to martinthomson/http2v2 that referenced this issue Sep 14, 2021
This makes a few changes, restricting things further than before.  For
the most part, this removes an allowance in the original specification
that had Host and :authority potentially differing.  The goal of that
was - from memory - to preserve some of the inherent quirks in HTTP/1.1.
That turns out to be more of a liability than an asset and far less
important now that we have a more formal understanding of the structure
of requests.

Closes httpwg#905.