pre-proposal: CurveZMQ #75

minrk · 2021-09-21T09:16:03Z

ZeroMQ has a transport-level encryption and authentication protocol called CurveZMQ.

I've just landed support for CurveZMQ in IPython Parallel, and I think it's worth talking about in Jupyter.

The gist of the most basic implementation:

server sockets set a public and private key, issued e.g. via zmq.curve_keypair() from pyzmq, and set CURVE_SERVER=1 to enable auth. (Using the sockets that bind as the servers makes the most sense, but this is not strictly a requirement.)
client sockets use the server's public key as a server key (CURVE_SERVERKEY) - anyone with this key can connect to the server and send/receive messages
the client's own private/public keys can be any key pair, and are used exclusively for encryption; they have no role in authentication

In Jupyter, we already have a key distribution mechanism, which is the HMAC message-signing key in connection files. We can use the same key distribution for Curve keys. In the context of Jupyter, it's a little weird, because it's usually the client (KernelManager) that sets the credentials, which means the client issues the kernel's private key as well, and needs to pass the private key to the kernel. This being the case, the absolute simplest version is to use the same private/public keypair for both ends.

Sketch:

KernelManager issues a private/public keypair with zmq.curve_keypair()
store private_key and public_key in connection file
kernel sockets set CURVE_SECRETKEY, CURVE_PUBLICKEY, CURVE_SERVER=1
client sockets set the same CURVE_SECRETKEY, CURVE_PUBLICKEY, and use the public key in CURVE_SERVERKEY (alternative: client sockets issue new private/public keys, but if the private key is already in the connection file, I see no benefit to issuing more keypairs)
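
The steps in this sketch could look roughly like the following on the Python side. This is only an illustration: the connection-file field names `curve_publickey` and `curve_secretkey` and both helper functions are hypothetical, not an existing Jupyter API. Real keys would come from `zmq.curve_keypair()`, which returns `(public, secret)`; the placeholder strings below stand in for the 40-character Z85 keys so the snippet runs without pyzmq.

```python
import json
import os
import tempfile


def write_connection_file(path, public_key, secret_key, **extra):
    """Store the Curve keypair next to the usual connection info.

    The two field names are assumptions made for this sketch.
    """
    info = dict(extra)
    info["curve_publickey"] = public_key
    info["curve_secretkey"] = secret_key
    # the file holds a secret key, so create it readable by the owner only
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        json.dump(info, f)


def curve_options(info, server):
    """Map connection info to the three socket options named in the sketch."""
    public = info["curve_publickey"].encode("ascii")
    secret = info["curve_secretkey"].encode("ascii")
    opts = {"CURVE_PUBLICKEY": public, "CURVE_SECRETKEY": secret}
    if server:
        # kernel sockets act as the curve server
        opts["CURVE_SERVER"] = True
    else:
        # client sockets name the server's public key (the same key here)
        opts["CURVE_SERVERKEY"] = public
    return opts


# placeholder keys; real ones would come from zmq.curve_keypair()
public_key, secret_key = "P" * 40, "S" * 40
path = os.path.join(tempfile.mkdtemp(), "kernel-example.json")
write_connection_file(path, public_key, secret_key, transport="tcp", ip="127.0.0.1")

with open(path) as f:
    info = json.load(f)

kernel_opts = curve_options(info, server=True)
client_opts = curve_options(info, server=False)
```

With pyzmq, each entry could then be applied to a socket with `setattr(socket, name, value)`, which is equivalent to the attribute assignments in the pyzmq example that follows.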

Here's an example of an authenticated socket pair in pyzmq:

curve socket example

import asyncio
import zmq
import zmq.asyncio as zaio

async def main():
    public, private = zmq.curve_keypair()
    ctx = zaio.Context()
    server = ctx.socket(zmq.ROUTER)
    # server socket is a 'curve server'
    server.CURVE_SECRETKEY = private
    server.CURVE_PUBLICKEY = public
    server.CURVE_SERVER = True

    url = "tcp://127.0.0.1:5555"
    server.bind(url)

    no_auth_client = ctx.socket(zmq.DEALER)

    auth_client = ctx.socket(zmq.DEALER)
    auth_client.CURVE_SECRETKEY = private
    auth_client.CURVE_PUBLICKEY = public
    auth_client.CURVE_SERVERKEY = public # this authenticates the client

    auth_client.connect(url)
    no_auth_client.connect(url)


    for i in range(5):
        # messages from 'auth_client' will be received
        asyncio.ensure_future(auth_client.send(b'auth'))
        # messages from 'no_auth_client' will never be delivered
        asyncio.ensure_future(no_auth_client.send(b'noauth'))
        msg = await server.recv_multipart()
        print("Received", msg)

    ctx.destroy(linger=0)

if __name__ == "__main__":
    asyncio.run(main())

Benefits

transport-level encryption, which was not previously available. CurveZMQ provides forward secrecy, meaning that later access to the keys is not sufficient to decrypt previously captured traffic. Access to the keys is, however, sufficient for a live man-in-the-middle attack.
connection-level authentication removes the need to check message signatures (the less security-related code we implement ourselves, the better!), and is vastly simpler to implement for both kernels and clients
connection-level authentication prevents non-authenticated clients with access to ports from monitoring IOPub, which is only currently preventable when using ipc:// transport and file permissions
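
For context, this is roughly the per-message check that connection-level authentication would make unnecessary. It is a simplified sketch of the signing scheme in the Jupyter messaging protocol (an HMAC-SHA256 hex digest over the serialized header, parent header, metadata, and content); the function names, key, and sample frames are illustrative and not jupyter_client's actual code.

```python
import hashlib
import hmac
import json


def sign(key, frames):
    """Hex HMAC-SHA256 digest over the serialized message parts, in order."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for frame in frames:
        mac.update(frame)
    return mac.hexdigest().encode("ascii")


def verify(key, signature, frames):
    """The check a receiver performs on every incoming message today."""
    return hmac.compare_digest(signature, sign(key, frames))


key = b"key-from-connection-file"  # placeholder value
# order of parts: header, parent_header, metadata, content
frames = [
    json.dumps(part).encode("utf8")
    for part in ({"msg_type": "execute_request"}, {}, {}, {"code": "1 + 1"})
]
signature = sign(key, frames)

ok = verify(key, signature, frames)
# changing any frame invalidates the signature
tampered = verify(key, signature, frames[:-1] + [b'{"code": "2 + 2"}'])
```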

Caveats

Severe crash issues in libzmq regarding thread safety on systems without getrandom, closed as "wontfix: use getrandom" (zeromq/libzmq#4241, "threadsafety issue with curve_keypair, libsodium, randombytes_close"). I think this is a huge mistake, and have backported the opt-in patches on libzmq to both pyzmq's bundled libzmq and conda-forge libzmq, but it may turn out to be a major problem.
Requires implementation by kernels, and capability advertisement in kernelspecs; always a high hurdle
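
To make the capability-advertisement point concrete, here is one way a client might check a kernelspec before handing out Curve keys. Everything specific here is an assumption: the `supported_encryption` metadata key and the `supports_curve` helper are invented for illustration, and no such field exists in kernelspecs today.

```python
def supports_curve(kernelspec):
    """Return True if the kernelspec advertises CurveZMQ support.

    'supported_encryption' is a made-up metadata key for this sketch.
    """
    metadata = kernelspec.get("metadata") or {}
    return "curve" in metadata.get("supported_encryption", [])


legacy_spec = {
    "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
    "display_name": "Python 3",
    "language": "python",
}
curve_spec = dict(legacy_spec, metadata={"supported_encryption": ["curve"]})

# a client would only add Curve keys to the connection info when this is True,
# and fall back to the current unencrypted setup otherwise
use_curve_for_legacy = supports_curve(legacy_spec)
use_curve_for_new = supports_curve(curve_spec)
```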

Alternatives:

broader support for zmq auth, including PLAIN (username:password, authentication but not encryption), GSSAPI, etc.
full support for ZAP (zmq auth protocol), which is a much bigger burden on kernels, but would enable more sophisticated access control

Both of these would require more generic exposure of general options for zmq, whereas the proposal as it is only requires sharing a single string (or two strings) for the key pair, under the exact same model we already have, and setting only three socket options (same values on all sockets), a fairly minor change in practice. Plus, all changes are in socket creation, nowhere else in the protocol implementation.


minrk · 2021-09-21T10:27:28Z

we can avoid passing the private key in the connection file by specifying that it will be in an environment variable, e.g. $JUPYTER_CURVE_SECRETKEY. In which case, clients should assume that they will need to issue their own private/public keypair. This doesn't really make any difference to clients, but it means the connection file is no longer enough info for a man-in-the-middle attack. It's also no longer enough info for kernels to set up their sockets.
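
A kernel-side lookup for this variant might look like the sketch below. The environment variable name is the one suggested in this comment; the `curve_secretkey` connection-file field, the helper name, and the choice to prefer the environment over the file are assumptions for illustration.

```python
import os

ENV_NAME = "JUPYTER_CURVE_SECRETKEY"  # name suggested in the comment above


def load_secret_key(connection_info, environ=None):
    """Find the kernel's Curve secret key.

    Prefers the environment variable and falls back to a hypothetical
    'curve_secretkey' field in the connection info.
    """
    environ = os.environ if environ is None else environ
    key = environ.get(ENV_NAME) or connection_info.get("curve_secretkey")
    if key is None:
        raise ValueError(
            f"no Curve secret key in ${ENV_NAME} or in the connection file"
        )
    return key.encode("ascii")


# placeholder keys standing in for 40-character Z85 strings
from_env = load_secret_key({}, environ={ENV_NAME: "E" * 40})
from_file = load_secret_key({"curve_secretkey": "F" * 40}, environ={})
```

Passing `environ` explicitly is only there to keep the sketch easy to exercise; a kernel would normally rely on the `os.environ` default.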

minrk mentioned this issue Jun 16, 2022

Support encryption for the kernel protocol over ZMQ jupyter/jupyter_client#808

Open
