feat: auto-tune (dynamic) stream receive window #176
Depend on connection window only.
Did a preliminary review, results look great, well done!
I can see that you've spent a fair bit of effort to make the implementation of the algorithm easy to understand. I think it might be worth improving on that further by decoupling it from the `Stream` struct. We should be able to implement the algorithm as a pure function and simply call it from `Stream` by passing in the required state. That should also allow you to test it more easily using `quickcheck`.
Does need a changelog entry!
Great work @mxinden!
Thank you for your help @thomaseizinger!
libp2p/rust-yamux#176 enables auto-tuning for the Yamux stream receive window. While preserving small buffers on low-latency and/or low-bandwidth connections, this change allows for high-latency and/or high-bandwidth connections to exhaust the available bandwidth on a single stream. Using the [libp2p perf](https://github.com/libp2p/test-plans/blob/master/perf/README.md) benchmark tools (60ms, 10Gbit/s) shows an **improvement from 33 Mbit/s to 1.3 Gbit/s** in single stream throughput. See libp2p/rust-yamux#176 for details. To ship the above Rust Yamux change in a libp2p patch release (non-breaking), this pull request uses `yamux` `v0.13` (new version) by default and falls back to `yamux` `v0.12` (old version) when setting any configuration options. Thus default users benefit from the increased performance, while power users with custom configurations maintain the old behavior. Pull-Request: #4970.
Using the libp2p perf tests, this change improves rust-libp2p's TCP/TLS/Yamux single stream performance from ~33 Mbit/s to 1.3 Gbit/s. See pull request description above for details. I expect this change to have a large throughput performance impact on various live networks as well. Thus tagging folks here for visibility. For those using Please reach out in case you have any questions.
@mxinden great work. Is there a plan to update the Yamux specification to cover this?
Just to avoid confusion, none of this pull request requires changes at the protocol level. In other words, Rust Yamux followed the libp2p Yamux specification before this change and still does after it. Round-trip time is measured via Yamux's PING message. Window updates are granted via Yamux's WINDOW_UPDATE message. I am not planning to update the specification any time soon, though I am happy to review a pull request. Want to give it a shot @diegomrsantos? Suggestions to add:
Slightly off-topic, but I think this is actually a great showcase of what good protocol & API design requires: specification of messages and their associated behaviour. Simply by specifying the ping & window update frames, it is possible to build quite elaborate algorithms for improving performance, whilst being entirely backwards compatible. I don't mind if this gets mentioned in the spec as a "further resource" or something like that, but it should be very clear that when window updates are sent and how much credit is given to the remote is entirely an implementation detail.
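To make the point about the two frames concrete, here is a minimal sketch of how a ping and a window update could be encoded. It assumes the 12-byte header layout from the Yamux specification as I understand it (version, type, flags, stream ID, length, all big-endian), with the length field carrying the opaque ping value or the granted credit. The function names (`encode_header`, `ping_request`, `ping_response`, `window_update`) are made up for illustration and are not the `rust-yamux` API.

```rust
/// Frame types as listed in the Yamux specification.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
enum FrameType {
    Data = 0,
    WindowUpdate = 1,
    Ping = 2,
    GoAway = 3,
}

/// Flags used in this sketch.
const FLAG_SYN: u16 = 0x1;
const FLAG_ACK: u16 = 0x2;

/// Encode a 12-byte header: version (1), type (1), flags (2),
/// stream ID (4), length (4), all in network byte order.
fn encode_header(ty: FrameType, flags: u16, stream_id: u32, length: u32) -> [u8; 12] {
    let mut buf = [0u8; 12];
    buf[0] = 0; // protocol version
    buf[1] = ty as u8;
    buf[2..4].copy_from_slice(&flags.to_be_bytes());
    buf[4..8].copy_from_slice(&stream_id.to_be_bytes());
    buf[8..12].copy_from_slice(&length.to_be_bytes());
    buf
}

/// Ping request on the session (stream ID 0). The length field carries an
/// opaque value that the remote echoes back, which allows measuring the RTT.
fn ping_request(opaque: u32) -> [u8; 12] {
    encode_header(FrameType::Ping, FLAG_SYN, 0, opaque)
}

/// Ping response echoing the opaque value of the request.
fn ping_response(opaque: u32) -> [u8; 12] {
    encode_header(FrameType::Ping, FLAG_ACK, 0, opaque)
}

/// Window update granting `credit` additional bytes on `stream_id`.
fn window_update(stream_id: u32, credit: u32) -> [u8; 12] {
    encode_header(FrameType::WindowUpdate, 0, stream_id, credit)
}
```

Nothing in these bytes says when an update is sent or how much credit it grants. That decision stays on the receiver side, which is why auto-tuning needs no protocol change.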
[litep2p](https://github.com/altonen/litep2p) is a libp2p-compatible P2P networking library. It supports all of the features of `rust-libp2p` that are currently being utilized by Polkadot SDK.
Compared to `rust-libp2p`, `litep2p` has a quite different architecture, which is why the new `litep2p` network backend is only able to use a little of the existing code in `sc-network`. The design has been mainly influenced by how we'd wish to structure our networking-related code in Polkadot SDK: independent higher-level protocols directly communicating with the network over links that support bidirectional backpressure. A good example would be the `NotificationHandle`/`RequestResponseHandle` abstractions which allow, e.g., `SyncingEngine` to directly communicate with peers to announce/request blocks.
I've tried running `polkadot --network-backend litep2p` with a few different peer configurations and there is a noticeable reduction in networking CPU usage. For high load (`--out-peers 200`), networking CPU usage goes down from ~110% to ~30% (80 pp) and for normal load (`--out-peers 40`), the usage goes down from ~55% to ~18% (37 pp). These should not be taken as final numbers because:
a) there are still some low-hanging optimization fruits, such as enabling [receive window auto-tuning](libp2p/rust-yamux#176), integrating `Peerset` more closely with `litep2p` or improving memory usage of the WebSocket transport
b) fixing bugs/instabilities that incorrectly cause `litep2p` to do less work will increase the networking CPU usage
c) verification in a more diverse set of tests/conditions is needed
Nevertheless, these numbers should give an early estimate for CPU usage of the new networking backend.
This PR consists of three separate changes:
* introduce a generic `PeerId` (wrapper around `Multihash`) so that we don't have to use `NetworkService::PeerId` in every part of the code that uses a `PeerId`
* introduce `NetworkBackend` trait, implement it for the libp2p network stack and make Polkadot SDK generic over `NetworkBackend`
* implement `NetworkBackend` for litep2p
The new library should be considered experimental, which is why `rust-libp2p` will remain the default option for the time being.
This PR currently depends on the master branch of `litep2p` but I'll cut a new release for the library once all review comments have been addressed.
---------
Signed-off-by: Alexandru Vasile
Co-authored-by: Dmitry Markin
Co-authored-by: Alexandru Vasile
Motivation
Yamux provides flow control on the stream level, preventing a fast sender from overwhelming a slow receiver. In other words, it lets the receiver exert backpressure on the sender. The receiver does so by communicating a flow control window in-band through `WindowUpdate` messages, granting the sender send-credit on a continuous basis.
The big question is how large the window should be. In other words, how much data should the sender be able to send before waiting for another `WindowUpdate` from the receiver? Choosing the window too small prevents the sender from using the entire bandwidth. Choosing it too large results in a high memory footprint, in the form of a receiver-side buffer the size of the window, and delays the backpressure signal from the receiver to the sender.
Status quo
Today the Rust Yamux window size is fixed. The default value is 256 KB. Assuming a connection round-trip time of 60 ms, this allows for up to 33 Mbit/s throughput per stream (`256 * 1024 * 8 / 0.06 / 1024^2 ≈ 33.33`). This is far from ideal when running on, e.g., a cloud instance with up to 10 Gbit/s.
One can bump the default window size to, e.g., 16 MB. But this might be too large for some deployments (low latency and/or low throughput) and still too small for others (high latency and/or high throughput).
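The arithmetic above can be sketched as two small helpers. The names `max_throughput_bits_per_sec` and `bdp_bytes` are hypothetical and only serve this illustration. The sketch assumes the sender can have at most one full window in flight per round trip.

```rust
/// Upper bound on single-stream throughput in bit/s when at most
/// `window_bytes` can be in flight per round trip of `rtt_secs` seconds.
fn max_throughput_bits_per_sec(window_bytes: u64, rtt_secs: f64) -> f64 {
    (window_bytes as f64) * 8.0 / rtt_secs
}

/// Bandwidth-delay product in bytes: the window needed to keep a link of
/// `bandwidth_bits_per_sec` busy at a round-trip time of `rtt_secs` seconds.
fn bdp_bytes(bandwidth_bits_per_sec: f64, rtt_secs: f64) -> f64 {
    bandwidth_bits_per_sec * rtt_secs / 8.0
}
```

With a 256 KiB window and a 60 ms round-trip time, `max_throughput_bits_per_sec(256 * 1024, 0.06)` gives roughly 34.95 * 10^6 bit/s, which is the ~33.33 figure when divided by 1024^2. Conversely, `bdp_bytes(10e9, 0.06)` is 75 MB, so even a fixed 16 MB window would not saturate a 10 Gbit/s link at this latency.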
See #162 for a larger discussion on the various options.
Single stream throughput on a 60ms TCP connection with up to 5 Gbit/s machine bandwidth.
https://observablehq.com/@libp2p-workspace/performance-dashboard?branch=9a4c96952368e6432fe51fd12008d31ca30f63ea#branch
Solution
The solution is to auto-tune the stream receive window, making it dynamic based on the round-trip time and the estimated bandwidth, in other words aiming for the bandwidth-delay-product. This is nothing novel. In fact the Linux kernel auto-tunes the TCP receive buffer, QUIC implementations do it and the Go Yamux implementation does it.
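As a rough illustration of the idea, here is a minimal sketch of one common auto-tuning heuristic: if the sender used up the current window in less than two round trips, the window is probably the bottleneck, so double it up to a cap. This is not necessarily the algorithm implemented in this pull request. The function name `next_receive_window`, the `MIN_WINDOW` constant and the factor of two are assumptions made for the example. It is written as a pure function so it can be tested in isolation.

```rust
use std::time::Duration;

/// Hypothetical lower bound for this sketch, matching the 256 KB default
/// mentioned above.
const MIN_WINDOW: u32 = 256 * 1024;

/// Compute the next receive window for a stream.
///
/// * `current`: the window granted so far.
/// * `since_last_update`: time it took the sender to use up that window.
/// * `rtt`: current round-trip time estimate of the connection.
/// * `max`: upper bound, e.g. derived from a connection-wide budget.
fn next_receive_window(
    current: u32,
    since_last_update: Duration,
    rtt: Duration,
    max: u32,
) -> u32 {
    // A window exhausted within two round trips suggests the sender is
    // limited by the window rather than by the network.
    let grown = if since_last_update < rtt * 2 {
        current.saturating_mul(2)
    } else {
        current
    };
    // Never go below the minimum and never exceed the cap.
    grown.clamp(MIN_WINDOW, max.max(MIN_WINDOW))
}
```

On a slow or low-latency connection the window is rarely exhausted within two round trips, so it stays small. On a fast, high-latency connection it keeps doubling until it approaches the bandwidth-delay product or hits the cap.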
Implementation
This pull request does the following:
Result
The results show that this pull request is a game-changer for Yamux deployments. While low-resource deployments keep the benefit of small buffers, high-resource deployments eventually end up with a window of roughly the bandwidth-delay product (ideal).
Same setup. Single stream throughput on a 60ms TCP connection with up to 5 Gbit/s machine bandwidth.
https://observablehq.com/@libp2p-workspace/performance-dashboard?branch=9a4c96952368e6432fe51fd12008d31ca30f63ea#branch
(As one can see, there is still room for improvement, i.e. 1.3 Gbit/s is not 5 Gbit/s. Though that might be due to some other parts of our stack. More to come in the future.)
Fixes #162.
Preliminary performance results: libp2p/test-plans#332