-
-
Notifications
You must be signed in to change notification settings - Fork 856
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for unix domain sockets #511
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the lead on a proposal for this. I submitted a general question, didn't look at the code too much. Maybe we can revive discussion on #133 first.
httpx/concurrency/asyncio.py
Outdated
@@ -232,7 +237,7 @@ def loop(self) -> asyncio.AbstractEventLoop: | |||
) -> BaseTCPStream: | |||
try: | |||
stream_reader, stream_writer = await asyncio.wait_for( # type: ignore | |||
asyncio.open_connection(hostname, port, ssl=ssl_context), | |||
self._open_connection(hostname, port, ssl_context), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way we could generalize the backend interface so that it exposes a way to open a unix socket, rather than create a separate backend?
This way, users won't need to use a specific backend, and we can later look at adding this piece of functionality to the trio backend, too.
I'm thinking, maybe a new .open_unix_socket()
method similar to the current .open_tcp_socket()
, and then make the switch between the two in HTTPConnection.connect()
(if I remember correctly that that's where we create a stream instance).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great feedback! I was thinking the same but wasn't sure if stream dispatching would fit that high in the stack.
If we dispatch between tcp and unix connection in HttpConnection
, we need to determine what makes it select one or the other.
One example could be the origin.scheme
being unix
or http+unix
to favour the unix socket route. But that affects other bits too much maybe, and also the path to the socket file needs to be passed some way, e.g. as origin.host
which in turn hides any wanted real hostname to be passed through the unix socket.
For now, it might be better to add the unix socket path as an optional argument to the base class ConcurrencyBackend
and implement the dispaching there in new .open_stream()
method, which in turn calls your suggested .open_tcp_stream()
/ .open_unix_stream()
.
That way both AsyncioBackend
, TrioBackend
etc gets support for the socket file feature, but one needs to instantiate the backend with the socket file arg as my current implementation in the separate backend does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Well, if we put aside the whole concurrency backend aspect (which is only us trying to abstract I/O functionality so that we aren't particularly tied to any async library), maybe there's a way to, as @tomchristie suggested, to implement this as a separate, very narrow-focused dispatcher.
It's clearer to me why it should be doable if we look at requests-unixsocket, which provides unix sockets in the form of a Session
subclass that uses its own adapter (a concept somewhat similar to dispatchers in HTTPX). The implementation is based on HTTP connection and connection pool subclasses. In theory we should be able to take the same path in HTTPX.
And then in terms of API, maybe aim at something like…
import httpx
from httpx_unixsocket import UnixSocketAsyncDispatcher
dispatch = UnixSocketAsyncDispatcher("/var/run/docker.sock")
async with httpx.AsyncClient(dispatch=dispatch) as client:
r = await client.get("http+unix://docker/version")
Similarly to #509 (comment) I’d probably like to see any support here via a third party package first, rather than including it in the core package. I was also slightly surprised to see that the implementation was a ComcurrencyBackend class, rather than a Dispatcher implementation, which is what I would have expected to see, but I’ve not dug into that. |
It turns out that opening a Unix socket is very much an async library-level piece of functionality (here you see that we must call That said, I think #133 initially discussed using a special |
It's true that this could be implemented as a Dispatcher, but that would prevent the use of other existing dispatchers through a socket file, like if you're running some |
Except I think it's also fine that clients equiped with a Unix socket dispatcher can't make regular |
Maybe I'm misunderstanding, but tcp or unix socket is just the connections. A client with an available unix socket, probably needs the surrounding logic being triggered as is, like connection pooling, proxies etc, only that the actual communication in the end is over the socket file rather than over tcp? |
Yeah I think you're mostly right there @lundberg. :-) Unix vs TCP is merely a matter of transport, which ideally should be transparent to the end user in terms of dispatchers. But I also kind of agree with @tomchristie that for HTTPX's own sustainability, pushing for building third-party packages is definitely worthwhile. I reckon that currently HTTPX doesn't provide enough hooks to allow adding Unix socket support as a third party package. But it never will expose such hooks that might be useful for plenty of other use cases if we don't actually challenge the status quo and do the work of uncovering where those hooks or entrypoints should be. At least that's how I'm seeing it. :-) So, just as a thought experiment, this is the place where I think the switch between TCP or Unix would make sense: httpx/httpx/dispatch/connection.py Line 88 in 2fcf23b
If this line was refactored into a Then, we could also refactor this part: httpx/httpx/dispatch/connection_pool.py Lines 136 to 145 in 2fcf23b
into a At that point, basic needs would be covered as we'd be able to use Then, if we decide to bring the functionality in core, we'll be able to use the newly-created hooks to simplify the implementation, and we'd have pre-existing knowledge about how things work, should be tested, etc. So, what do you all think about the idea of introducing these hooks? ( (My core intuition is that there might be a fundamental abstraction we're missing on. I think @tomchristie mentioned before that there may be key insights to learn from ASGI in that regard. I agree, but the to-be-discovered abstraction isn't here yet.) |
These are great ideas and would make extending connections a lot easier @florimondmanca! One thought only; So even with the hooks in place, the |
Well, I think that's how we'd do things if we were to implement this in core, but that's not an immediate requirement for a 3rd party implementation. The 3rd party package could just have its own It doesn't even have to support |
The only bit that might be off is that right now HTTP connections manipulate
Just to solidify this claim, trio also has |
This is true @florimondmanca , but i really think that an http client should have the option of connecting through a unix socket in its core, since it's part of pythons core to support both tcp and unix socket connections, which in turn is transparent of what protocol being used, HTTP in httpx's case.
Let's take That said, I think the URL shouldn't be used to hold information about any socket file or mess with the scheme, like Another example is |
I'm certainly okay with us considering this functionality for core (tho we may want to think about the top-level API if that's what we're doing.) I'd expect us to have an |
9c4abdd
to
4d8d6fa
Compare
Spot on @tomchristie , this was my initial idea but started easy with the previous backend. I have now refactored accordingly! Any suggestions of more tests that should be done?
Since backends now return both tcp and uds streams, your renaming suggestion |
Thanks for suggesting! If you’re up for it let’s try to get the rename in via a separate PR maybe? :) I’ll try to have a new look at this one soon! |
Well, a new PR might be ok I guess, but since the only code instantiating and returning Question, do we need to do this rename as a non-breaking change, like keeping the old class and extend the new named one with a deprecation warning? Any 3rd party package might directly import Possible deprecation solution: class BaseSocketStream:
...
class BaseTCPStream(BaseSocketStream):
def __new__(
cls: typing.Type["BaseTCPStream"], *args: typing.Any, **kwargs: typing.Any
) -> "BaseTCPStream":
warnings.warn(
"BaseTCPStream is deprecated in favour of BaseSocketStream",
DeprecationWarning,
stacklevel=2,
)
return super().__new__(cls) This would yield a deprecation warning if any 3rd party package uses |
It would probably be easier for all to propose this change separately. We tend to encourage focused, incremental, isolated PRs as much as possible, even if that means having to deal with changes/conflicts with other pending PRs, as it facilitates reviewing and auditing. Two isolated PRs tend to be reviewed more efficiently, and merged more quickly, than a single PRs that tries to add/change two things at once. :-) FYI, a rename has already happened in the past, if it can be of inspiration: #339
Probably not — the only public aspect about concurrency backends is that class name and the import path; methods and internal classes like |
Note taken, I'll do a separate PR for the renaming then.
Ok, great. Only raised the question since it is exposed in |
76968d1
to
c066d3e
Compare
Rebased and in line with #517 |
@lundberg Sorry to have let this drag a bit. I'm just a bit wary on the attempt to DRY out the "open_x_stream" methods — that's probably one level of indirection too much. Having to pass awaitables (being paranoid — who knows if they'll actually be awaited?) also feels a bit strange. Can we keep them separate, for the sake of explicitness and clarity? |
Just to clarify, I moved the timeout and ssl logic into the generic "private" But, I have no problem removing the |
I think so, if that sounds OK to you. :) |
c066d3e
to
1b0f54c
Compare
Fixed and rebased @florimondmanca |
1b0f54c
to
45e49bc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're getting there! The implementation looks top notch to me. I left several final comments on the tests, but nothing critical.
Co-Authored-By: Florimond Manca <[email protected]>
Legit comments, thanks. Refactored and pushed @florimondmanca . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stellar 🌟 Well done and thanks @lundberg!
We may want to add some prose to the "Advanced" section of the docs, to quickly demonstrate why and how to use Unix domain sockets. A short paragraph and a code snippet should do the trick. Would you like to tackle that as well, @lundberg? Up to you whether to add it to this PR, or we can merge and add the docs via a separate PR. Let me know! |
Let's merge this. I'll happily open a separate PR for the docs section, will start working on it now. |
This PR adds support for connecting through unix domain sockets.
New client argument
uds
is added to define unix domain socket path to use.Usage:
Solves previously discussed topic in #133