Add support for unix domain sockets #511

lundberg · 2019-11-06T18:20:52Z

This PR adds support for connecting through unix domain sockets.

New client argument uds is added to define unix domain socket path to use.

Usage:

async with httpx.AsyncClient(uds="/var/run/docker.sock") as client:
    # This request will connect through the socket file
    r = await client.get("http://docker/version")

Solves previously discussed topic in #133

florimondmanca

Thanks for taking the lead on a proposal for this. I submitted a general question, didn't look at the code too much. Maybe we can revive discussion on #133 first.

florimondmanca · 2019-11-06T18:26:49Z

httpx/concurrency/asyncio.py

@@ -232,7 +237,7 @@ def loop(self) -> asyncio.AbstractEventLoop:
    ) -> BaseTCPStream:
        try:
            stream_reader, stream_writer = await asyncio.wait_for(  # type: ignore
-                asyncio.open_connection(hostname, port, ssl=ssl_context),
+                self._open_connection(hostname, port, ssl_context),


Is there a way we could generalize the backend interface so that it exposes a way to open a unix socket, rather than create a separate backend?

This way, users won't need to use a specific backend, and we can later look at adding this piece of functionality to the trio backend, too.

I'm thinking, maybe a new .open_unix_socket() method similar to the current .open_tcp_socket(), and then make the switch between the two in HTTPConnection.connect() (if I remember correctly that that's where we create a stream instance).

Great feedback! I was thinking the same but wasn't sure if stream dispatching would fit that high in the stack.

If we dispatch between tcp and unix connection in HttpConnection, we need to determine what makes it select one or the other.

One example could be the origin.scheme being unix or http+unix to favour the unix socket route. But that affects other bits too much maybe, and also the path to the socket file needs to be passed some way, e.g. as origin.host which in turn hides any wanted real hostname to be passed through the unix socket.

For now, it might be better to add the unix socket path as an optional argument to the base class ConcurrencyBackend and implement the dispaching there in new .open_stream() method, which in turn calls your suggested .open_tcp_stream() / .open_unix_stream().

That way both AsyncioBackend, TrioBackend etc gets support for the socket file feature, but one needs to instantiate the backend with the socket file arg as my current implementation in the separate backend does.

Hmm. Well, if we put aside the whole concurrency backend aspect (which is only us trying to abstract I/O functionality so that we aren't particularly tied to any async library), maybe there's a way to, as @tomchristie suggested, to implement this as a separate, very narrow-focused dispatcher.

It's clearer to me why it should be doable if we look at requests-unixsocket, which provides unix sockets in the form of a Session subclass that uses its own adapter (a concept somewhat similar to dispatchers in HTTPX). The implementation is based on HTTP connection and connection pool subclasses. In theory we should be able to take the same path in HTTPX.

And then in terms of API, maybe aim at something like…

import httpx from httpx_unixsocket import UnixSocketAsyncDispatcher dispatch = UnixSocketAsyncDispatcher("/var/run/docker.sock") async with httpx.AsyncClient(dispatch=dispatch) as client: r = await client.get("http+unix://docker/version")

tomchristie · 2019-11-06T19:28:53Z

Similarly to #509 (comment) I’d probably like to see any support here via a third party package first, rather than including it in the core package.

I was also slightly surprised to see that the implementation was a ComcurrencyBackend class, rather than a Dispatcher implementation, which is what I would have expected to see, but I’ve not dug into that.

florimondmanca · 2019-11-06T19:40:32Z

I was also slightly surprised to see that the implementation was a ComcurrencyBackend class, rather than a Dispatcher implementation

It turns out that opening a Unix socket is very much an async library-level piece of functionality (here you see that we must call asyncio.open_unix_connection() instead of asyncio.open_connection()), so it kind of makes sense that adding support for Unix sockets requires us to play with concurrency backends.

That said, I think #133 initially discussed using a special http+unix scheme, which doesn't seem to be the way we're heading at here.

lundberg · 2019-11-06T19:47:57Z

I was also slightly surprised to see that the implementation was a ComcurrencyBackend class, rather than a Dispatcher implementation, which is what I would have expected to see, but I’ve not dug into that.

It's true that this could be implemented as a Dispatcher, but that would prevent the use of other existing dispatchers through a socket file, like if you're running some wsgi server bound to a unix socket and want to use WSGIDispatch.

florimondmanca · 2019-11-06T19:55:06Z

like if you're running some wsgi server bound to a unix socket and want to use WSGIDispatch.

Except WSGIDispatch doesn't actually make network calls, does it? (I think it merely passes the correct environ/start_response components to the WSGI app.)

I think it's also fine that clients equiped with a Unix socket dispatcher can't make regular http:// requests. We aren't preventing users from using multiple clients, one for each use case (e.g. HTTP/TCP, HTTP/Unix requests, HTTP/WSGI, etc).

lundberg · 2019-11-06T20:20:47Z

I think it's also fine that clients equiped with a Unix socket dispatcher can't make regular http:// requests.

Maybe I'm misunderstanding, but tcp or unix socket is just the connections. A client with an available unix socket, probably needs the surrounding logic being triggered as is, like connection pooling, proxies etc, only that the actual communication in the end is over the socket file rather than over tcp?

florimondmanca · 2019-11-06T20:54:15Z

Yeah I think you're mostly right there @lundberg. :-) Unix vs TCP is merely a matter of transport, which ideally should be transparent to the end user in terms of dispatchers.

But I also kind of agree with @tomchristie that for HTTPX's own sustainability, pushing for building third-party packages is definitely worthwhile. I reckon that currently HTTPX doesn't provide enough hooks to allow adding Unix socket support as a third party package. But it never will expose such hooks that might be useful for plenty of other use cases if we don't actually challenge the status quo and do the work of uncovering where those hooks or entrypoints should be. At least that's how I'm seeing it. :-)

So, just as a thought experiment, this is the place where I think the switch between TCP or Unix would make sense:

httpx/httpx/dispatch/connection.py

Line 88 in 2fcf23b

stream = await self.backend.open_tcp_stream(host, port, ssl_context, timeout)

If this line was refactored into a .open_stream() method on HTTPConnection, we'd be able to build a subclass of it, say UnixHTTPConnection that opens a Unix socket stream instead of a TCP stream.

Then, we could also refactor this part:

httpx/httpx/dispatch/connection_pool.py

Lines 136 to 145 in 2fcf23b

    
           connection = HTTPConnection( 
        
               origin, 
        
               verify=self.verify, 
        
               cert=self.cert, 
        
               timeout=self.timeout, 
        
               http_versions=self.http_versions, 
        
               backend=self.backend, 
        
               release_func=self.release_connection, 
        
               trust_env=self.trust_env, 
        
           )

into a .create_connection(origin) method, that could be overridden in a UnixConnectionPool subclass that creates an UnixHTTPConnection and also stores the socket path.

At that point, basic needs would be covered as we'd be able to use dispatch=UnixConnectionPool() to make connections over a Unix socket, even if we'd have a split between TCP and Unix. (Except that that functionality would be maintained by the community. HTTPX maintainers do have limited resources as well. 😉)

Then, if we decide to bring the functionality in core, we'll be able to use the newly-created hooks to simplify the implementation, and we'd have pre-existing knowledge about how things work, should be tested, etc.

So, what do you all think about the idea of introducing these hooks? (HTTPConnection.open_stream(), ConnectionPool.create_connection())

(My core intuition is that there might be a fundamental abstraction we're missing on. I think @tomchristie mentioned before that there may be key insights to learn from ASGI in that regard. I agree, but the to-be-discovered abstraction isn't here yet.)

lundberg · 2019-11-06T22:15:34Z

So, what do you all think about the idea of introducing these hooks? (HTTPConnection.open_stream(), ConnectionPool.create_connection())

These are great ideas and would make extending connections a lot easier @florimondmanca!

One thought only;
Let's say your suggested hooks exists, and in a future core/3rd UnixHTTPConnection class we can override the suggested open_stream() method to open a unix socket connection. We still would want to use the client's given/selected backend, like AsyncioBackend or TrioBackend, leading us to use self.backend.open??? in the implementation of ???Connection.open_stream(), right?

So even with the hooks in place, the ConcurrencyBackend still, in my opinion, needs both tcp and unix interface methods and implementations in related backends, since it's where the actual transport connections currently are being made.

florimondmanca · 2019-11-06T22:29:15Z

We still would want to use the client's given/selected backend, like AsyncioBackend or TrioBackend

Well, I think that's how we'd do things if we were to implement this in core, but that's not an immediate requirement for a 3rd party implementation.

The 3rd party package could just have its own open_unix_socket() helper, and use it there instead of calling into the concurrency backend.

It doesn't even have to support trio right away — asyncio-only can be defined as a constraint, which might be resolved by someone contributing trio support later. :-)

florimondmanca · 2019-11-06T22:38:09Z

The only bit that might be off is that right now HTTP connections manipulate TCPStream instances. There's nothing particularly "TCP" about this interface — what seems to be important is that it's a socket of some sort, that exposes read/write operations. So maybe we should rename that to SocketStream or something? (This is the terminology that trio uses.)

It doesn't even have to support trio right away — asyncio-only can be defined as a constraint, which might be resolved by someone contributing trio support later. :-)

Just to solidify this claim, trio also has open_unix_socket() so extending an initial asyncio-only implementation should be easy enough. :)

lundberg · 2019-11-07T08:11:03Z

The 3rd party package could just have its own open_unix_socket() helper, and use it there instead of calling into the concurrency backend.

This is true @florimondmanca , but i really think that an http client should have the option of connecting through a unix socket in its core, since it's part of pythons core to support both tcp and unix socket connections, which in turn is transparent of what protocol being used, HTTP in httpx's case.

I’d probably like to see any support here via a third party package first, rather than including it in the core package.

Let's take curl as an example, which has the client option --unix-socket to set transport over a specific unix socket file instead of TCP.

That said, I think the URL shouldn't be used to hold information about any socket file or mess with the scheme, like requests-unixsocket has to implement it in a rather "hacky" way, due to lack of hooks or core functionality in the requests lib.

Another example is uvicorn, which has the option --uds to bind to a unix domain socket, why wouldn't we want httpx to have core support to request uvicorn that way @tomchristie ?
By the way, this is how I test unix socket in this PR, i.e. by starting uvicorn bound to an uds file.

tomchristie · 2019-11-07T12:45:20Z

I'm certainly okay with us considering this functionality for core (tho we may want to think about the top-level API if that's what we're doing.) I'd expect us to have an open_uds_connection on the concurrency backend, and then expose usd=... on the client, and have the connection pool switch between open_tcp_connection or open_uds_connection depending on if uds is set.

lundberg · 2019-11-07T23:26:33Z

I'd expect us to have an open_uds_connection on the concurrency backend, and then expose usd=... on the client, and have the connection pool switch between open_tcp_connection or open_uds_connection depending on if uds is set.

Spot on @tomchristie , this was my initial idea but started easy with the previous backend. I have now refactored accordingly! Any suggestions of more tests that should be done?

The only bit that might be off is that right now HTTP connections manipulate TCPStream instances. There's nothing particularly "TCP" about this interface — what seems to be important is that it's a socket of some sort, that exposes read/write operations. So maybe we should rename that to SocketStream or something? (This is the terminology that trio uses.)

Since backends now return both tcp and uds streams, your renaming suggestion TCPStream -> SocketStream makes sense @florimondmanca . Should I do this aswell?

florimondmanca · 2019-11-08T07:54:01Z

Should I do this aswell?

Thanks for suggesting! If you’re up for it let’s try to get the rename in via a separate PR maybe? :)

I’ll try to have a new look at this one soon!

lundberg · 2019-11-08T08:20:24Z

If you’re up for it let’s try to get this in via a separate PR maybe? :)

Well, a new PR might be ok I guess, but since the only code instantiating and returning TCPStream are these open_???_stream methods in the backends.
I'd say it belongs to this PR, OK @florimondmanca ?

Question, do we need to do this rename as a non-breaking change, like keeping the old class and extend the new named one with a deprecation warning? Any 3rd party package might directly import httpx.BaseTCPStream. Thoughts on that @tomchristie ?

Possible deprecation solution:

class BaseSocketStream:
    ...

class BaseTCPStream(BaseSocketStream):
    def __new__(
        cls: typing.Type["BaseTCPStream"], *args: typing.Any, **kwargs: typing.Any
    ) -> "BaseTCPStream":
        warnings.warn(
            "BaseTCPStream is deprecated in favour of BaseSocketStream",
            DeprecationWarning,
            stacklevel=2,
        )
        return super().__new__(cls)

This would yield a deprecation warning if any 3rd party package uses BaseTCPStream.

florimondmanca · 2019-11-08T11:20:33Z

I'd say it belongs to this PR, OK @florimondmanca ?

It would probably be easier for all to propose this change separately.

We tend to encourage focused, incremental, isolated PRs as much as possible, even if that means having to deal with changes/conflicts with other pending PRs, as it facilitates reviewing and auditing. Two isolated PRs tend to be reviewed more efficiently, and merged more quickly, than a single PRs that tries to add/change two things at once. :-)

FYI, a rename has already happened in the past, if it can be of inspiration: #339

do we need to do this rename as a non-breaking change

Probably not — the only public aspect about concurrency backends is that class name and the import path; methods and internal classes like TCPStream aren't used directly by our users. (Anyway we're in 0.x, so I think backwards compatibility / strict semver isn't a concern yet. :-) e.g. #482.)

lundberg · 2019-11-08T11:30:42Z

It would be easier for all to propose this change separately.

Note taken, I'll do a separate PR for the renaming then.

FYI, a rename has already happened in the past, if it can be of inspiration: #339

Stream -> TCPStream -> SocketStream ;-)

do we need to do this rename as a non-breaking change

Probably not

Ok, great. Only raised the question since it is exposed in __init__, i.e. httpx.BaseTCPStream

lundberg · 2019-11-08T22:07:45Z

Rebased and in line with #517

florimondmanca · 2019-11-16T20:31:41Z

@lundberg Sorry to have let this drag a bit. I'm just a bit wary on the attempt to DRY out the "open_x_stream" methods — that's probably one level of indirection too much. Having to pass awaitables (being paranoid — who knows if they'll actually be awaited?) also feels a bit strange. Can we keep them separate, for the sake of explicitness and clarity?

lundberg · 2019-11-18T07:40:03Z

I'm just a bit wary on the attempt to DRY out the "open_x_stream" methods

Just to clarify, I moved the timeout and ssl logic into the generic "private" _open_stream method just to not duplicate that code, since it's the same for both the uds and tcp coroutines, in both backends. But intentionally did not add it to the "interface" ConcurrencyBackend, to not force it to be implemented like that.

But, I have no problem removing the _open_stream methods. Should I @florimondmanca?

florimondmanca · 2019-11-18T07:57:00Z

Should I?

I think so, if that sounds OK to you. :)

lundberg · 2019-11-18T08:49:40Z

Fixed and rebased @florimondmanca

florimondmanca

I think we're getting there! The implementation looks top notch to me. I left several final comments on the tests, but nothing critical.

tests/test_concurrency.py

tests/client/test_client.py

tests/client/test_async_client.py

httpx/client.py

Co-Authored-By: Florimond Manca <[email protected]>

lundberg · 2019-11-19T11:30:05Z

I left several final comments on the tests, but nothing critical.

Legit comments, thanks. Refactored and pushed @florimondmanca .

florimondmanca

Stellar 🌟 Well done and thanks @lundberg!

florimondmanca · 2019-11-19T21:18:05Z

We may want to add some prose to the "Advanced" section of the docs, to quickly demonstrate why and how to use Unix domain sockets. A short paragraph and a code snippet should do the trick. Would you like to tackle that as well, @lundberg? Up to you whether to add it to this PR, or we can merge and add the docs via a separate PR. Let me know!

lundberg · 2019-11-19T21:59:44Z

We may want to add some prose to the "Advanced" section of the docs....Up to you whether to add it to this PR, or we can merge and add the docs via a separate PR. Let me know!

Let's merge this. I'll happily open a separate PR for the docs section, will start working on it now.

florimondmanca reviewed Nov 6, 2019

View reviewed changes

lundberg force-pushed the unix-socket-backend branch 2 times, most recently from 9c4abdd to 4d8d6fa Compare November 7, 2019 23:26

lundberg mentioned this pull request Nov 8, 2019

Rename TCPStream to SocketStream #517

Merged

lundberg force-pushed the unix-socket-backend branch 4 times, most recently from 76968d1 to c066d3e Compare November 8, 2019 22:04

lundberg force-pushed the unix-socket-backend branch from c066d3e to 1b0f54c Compare November 18, 2019 08:46

lundberg added 3 commits November 18, 2019 15:02

Add and implement open_uds_stream in concurrency backends

3ad7144

Add uds arg to BaseClient and select tcp or uds in HttpConnection

4f79f24

Make open stream methods in backends more explicit

45e49bc

lundberg force-pushed the unix-socket-backend branch from 1b0f54c to 45e49bc Compare November 18, 2019 14:02

florimondmanca reviewed Nov 19, 2019

View reviewed changes

lundberg and others added 3 commits November 19, 2019 12:07

Close sentence

9513520

Co-Authored-By: Florimond Manca <[email protected]>

Refactor uds concurrency test

902b393

Remove redundant uds test assertions

b367a91

florimondmanca approved these changes Nov 19, 2019

View reviewed changes

florimondmanca merged commit 7a96a2c into encode:master Nov 19, 2019

florimondmanca mentioned this pull request Nov 29, 2019

Drop nox in favor of vanilla scripts #566

Merged

florimondmanca mentioned this pull request Jan 6, 2020

Consider pushing back on UDS support. #723

Closed

tomchristie mentioned this pull request May 13, 2020

Advanced connection options. encode/httpcore#88

Closed

tomchristie mentioned this pull request Aug 6, 2020

UDS Support encode/httpcore#138

Closed

florimondmanca mentioned this pull request Aug 6, 2020

Add Unix Domain Socket support encode/httpcore#139

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for unix domain sockets #511

Add support for unix domain sockets #511

lundberg commented Nov 6, 2019 •

edited

Loading

florimondmanca left a comment

florimondmanca Nov 6, 2019

lundberg Nov 6, 2019

florimondmanca Nov 6, 2019

tomchristie commented Nov 6, 2019

florimondmanca commented Nov 6, 2019

lundberg commented Nov 6, 2019

florimondmanca commented Nov 6, 2019 •

edited

Loading

lundberg commented Nov 6, 2019

florimondmanca commented Nov 6, 2019

lundberg commented Nov 6, 2019

florimondmanca commented Nov 6, 2019 •

edited

Loading

florimondmanca commented Nov 6, 2019 •

edited

Loading

lundberg commented Nov 7, 2019 •

edited

Loading

tomchristie commented Nov 7, 2019

lundberg commented Nov 7, 2019

florimondmanca commented Nov 8, 2019 •

edited

Loading

lundberg commented Nov 8, 2019 •

edited

Loading

florimondmanca commented Nov 8, 2019 •

edited

Loading

lundberg commented Nov 8, 2019 •

edited

Loading

lundberg commented Nov 8, 2019

florimondmanca commented Nov 16, 2019

lundberg commented Nov 18, 2019

florimondmanca commented Nov 18, 2019

lundberg commented Nov 18, 2019

florimondmanca left a comment •

edited

Loading

lundberg commented Nov 19, 2019

florimondmanca left a comment •

edited

Loading

florimondmanca commented Nov 19, 2019

lundberg commented Nov 19, 2019

Add support for unix domain sockets #511

Add support for unix domain sockets #511

Conversation

lundberg commented Nov 6, 2019 • edited Loading

florimondmanca left a comment

Choose a reason for hiding this comment

florimondmanca Nov 6, 2019

Choose a reason for hiding this comment

lundberg Nov 6, 2019

Choose a reason for hiding this comment

florimondmanca Nov 6, 2019

Choose a reason for hiding this comment

tomchristie commented Nov 6, 2019

florimondmanca commented Nov 6, 2019

lundberg commented Nov 6, 2019

florimondmanca commented Nov 6, 2019 • edited Loading

lundberg commented Nov 6, 2019

florimondmanca commented Nov 6, 2019

lundberg commented Nov 6, 2019

florimondmanca commented Nov 6, 2019 • edited Loading

florimondmanca commented Nov 6, 2019 • edited Loading

lundberg commented Nov 7, 2019 • edited Loading

tomchristie commented Nov 7, 2019

lundberg commented Nov 7, 2019

florimondmanca commented Nov 8, 2019 • edited Loading

lundberg commented Nov 8, 2019 • edited Loading

florimondmanca commented Nov 8, 2019 • edited Loading

lundberg commented Nov 8, 2019 • edited Loading

lundberg commented Nov 8, 2019

florimondmanca commented Nov 16, 2019

lundberg commented Nov 18, 2019

florimondmanca commented Nov 18, 2019

lundberg commented Nov 18, 2019

florimondmanca left a comment • edited Loading

Choose a reason for hiding this comment

lundberg commented Nov 19, 2019

florimondmanca left a comment • edited Loading

Choose a reason for hiding this comment

florimondmanca commented Nov 19, 2019

lundberg commented Nov 19, 2019

lundberg commented Nov 6, 2019 •

edited

Loading

florimondmanca commented Nov 6, 2019 •

edited

Loading

florimondmanca commented Nov 6, 2019 •

edited

Loading

florimondmanca commented Nov 6, 2019 •

edited

Loading

lundberg commented Nov 7, 2019 •

edited

Loading

florimondmanca commented Nov 8, 2019 •

edited

Loading

lundberg commented Nov 8, 2019 •

edited

Loading

florimondmanca commented Nov 8, 2019 •

edited

Loading

lundberg commented Nov 8, 2019 •

edited

Loading

florimondmanca left a comment •

edited

Loading

florimondmanca left a comment •

edited

Loading