doh: use tls session cache #523

Merged on Nov 5, 2024 (1 commit)

ignoramous
Contributor

In our experiments with rethinkdns, employing a session cache reduces data consumed by DoH by roughly 3x (500 MB/mo down to 180 MB/mo) and latency by up to 4x.

@ignoramous
Contributor Author

cc: @fortuna. Unsure if this is better or worse from an anti-censorship POV. I imagine it shouldn't affect it at all (given session resumption has no effect on the SNI extension).

@fortuna fortuna requested a review from jyyi1 November 5, 2024 23:12
@fortuna fortuna merged commit 5a48272 into Jigsaw-Code:master Nov 5, 2024
2 checks passed
@link2xt

link2xt commented Nov 6, 2024

Session resumption has an effect on privacy, as it allows tracking the same user over time. A session ticket is essentially a cookie.

@fortuna
Contributor

fortuna commented Nov 6, 2024

In this case the session is gone after the session is terminated, so it's not long-lived, similar to a browser.

I believe the DNS client also issues multiple queries on the same connection, which already lets you correlate queries. But IETF RFCs generally recommend connection reuse:

@fortuna
Contributor

fortuna commented Nov 6, 2024

It seems like we need to better understand where the performance increase comes from. Perhaps the connection reuse is not working?

@fortuna
Contributor

fortuna commented Nov 6, 2024

This article provides some helpful privacy context, with a link to related research: https://venafi.com/blog/tls-session-resumption/

Perhaps we should clear the cache every minute or so. It's not clear how much more privacy one would get though, especially if we can properly rely on the connection reuse.

@ignoramous
Contributor Author

Perhaps we should clear the cache every minute or so.

It is typical for some public DoH / DoT resolvers to allow session resumption for multiple days.

`openssl s_client -connect one.one.one.one:853 -reconnect` shows 2 days, for example.

Unsure what Private DNS does (I believe it was implemented by Benjamin?), but it'd be interesting to look.

if we can properly rely on the connection reuse

From what I've observed, Go stdlib does reuse connections for http, though on phones especially, one'd want to steer clear of longer keepalives.

@link2xt

link2xt commented Nov 10, 2024

After trying to get any latency reduction using TLS 1.3 session resumption in deltachat/deltachat-core-rust#6182 and looking at the diagram at https://www.rfc-editor.org/rfc/rfc8446#section-2.2, I don't understand where 4x latency reduction can come from. It could be that session establishment is slow on the server side, so clients requesting a new session get some delay, but in terms of RTT you only gain 1 RTT if you send your request in early_data. Otherwise, with a normal TLS 1.3 handshake you can send the request in response to Server Hello, which is as good as it can be without 0-RTT.

Bandwidth reduction makes sense, especially if server certificates are not compressed.

@ignoramous
Contributor Author

ignoramous commented Nov 10, 2024

I don't understand where 4x latency reduction can come from

Which resolvers are you testing with? We observed up to 4x reduction for Rethink's upstreams, which are not as expansively deployed as some of the other public DoH/DoT resolvers, nor run on machines as powerful (1/16th of a vCPU per VM and 40 such VMs across 20+ regions). For Rethink DNS, my theory is that the efficiency comes from a combination of things, including 1-RTT (resumption) and adaptive TLS record sizing (set to 1280 minus TLS+TCP+IP header overheads).

Without session resumption (+ adaptive resizing), it was very common to see 3KB/4KB requests (using the DoH client from our fork of Intra) per DNS query (sans conn reuse).

1 RTT if you send your request in early_data

I believe TLS v1.3 early data (which is 0-RTT, but not as efficient if TCP is Nagle'd) is not implemented, or is disabled, by most HTTP servers & reverse proxies (due to request idempotency concerns?).
