
Better connection management #261

Merged 10 commits into scylladb:main from better-connection-pooling on Nov 3, 2021

Conversation

@piodul (Collaborator) commented on Jul 4, 2021

Reworks how the driver manages per-node connections.

A new object, NodeConnectionPool, is introduced, which takes over the responsibility of managing connections from Node and ConnectionKeeper. ConnectionKeeper was not removed, as it is still used for managing the control connection.

The new management scheme has the following advantages over the previous approach:

  • The number of connections per node is now configurable.
  • Per-shard connections can be established over the non-shard-aware port (9042).
  • The shard-aware port is detected automatically on a per-node basis; the driver uses it unless explicitly disabled in the configuration.
  • If the shard-aware port is unreachable or hidden behind a source-port-translating NAT, the driver falls back to the non-shard-aware port.

TODO: I did some minimal testing (node restart, node add/remove) but I should probably do some more (resharding, shard-aware port unreachable, shard-aware port disabled/enabled after restart). For now, marking as draft.

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass tests.
  • PR description sums up the changes and reasons why they should be introduced.
  • I added appropriate Fixes: annotations to the PR description.

@piodul force-pushed the better-connection-pooling branch from 4228019 to eb11747 on July 4, 2021 14:23
@piodul force-pushed the better-connection-pooling branch from eb11747 to f8424e6 on July 17, 2021 11:07
@piodul force-pushed the better-connection-pooling branch from f8424e6 to fbe83d1 on October 11, 2021 07:42
@Jasperav (Contributor) commented:

@piodul Do you know when this PR can be merged?

@psarna (Contributor) commented on Oct 15, 2021

@Jasperav it's under review right now, and it also needs a little cleanup with respect to commits and commit messages. Overall, next week is very likely.

@havaker (Contributor) left a comment

Why did you choose to keep ConnectionKeeper for managing control connections? It could be replaced with the new NodeConnectionPool (and by replacing it, we wouldn't have to maintain two separate connection management implementations).

Except for ConnectionKeeper not being removed, this PR LGTM. I left some minor comments.


// If this fails try getting any other in random order
let mut shards_to_try: Vec<u16> =
    (shard..nr_shards.get()).chain(0..shard).skip(1).collect();
@havaker (Contributor) commented:

I don't think it's clear that .skip(1) is used here to exclude shard from the sequence. Have you considered using .filter() instead? E.g. (0..nr_shards.get()).filter(|i| *i != shard)

@piodul (Collaborator, Author) replied:

Using filter has the following drawbacks:

  • There will be an additional check performed for each shard ID.
  • The iterator won't be able to provide an accurate size_hint due to the use of filter, so collect won't be able to pre-allocate the right amount of memory for the Vec.

How about this instead (this is what I pushed in the updated version):

let mut shards_to_try: Vec<u16> = (0..shard).chain(shard + 1..nr_shards.get()).collect();
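The size_hint difference is easy to see directly. A minimal sketch, assuming hypothetical shard and nr_shards values (not taken from the driver):

use std::num::NonZeroU16;

fn main() {
    // Hypothetical values for illustration only.
    let shard: u16 = 2;
    let nr_shards = NonZeroU16::new(4).unwrap();

    // filter() loses the exact length: the lower bound drops to 0.
    let filtered = (0..nr_shards.get()).filter(|i| *i != shard);
    assert_eq!(filtered.size_hint(), (0, Some(4)));

    // chain() of two ranges keeps an exact size_hint, so collect()
    // can pre-allocate the Vec with the right capacity.
    let chained = (0..shard).chain(shard + 1..nr_shards.get());
    assert_eq!(chained.size_hint(), (3, Some(3)));

    let shards_to_try: Vec<u16> = chained.collect();
    assert_eq!(shards_to_try, vec![0, 1, 3]);
}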

Initializing,

// The pool is empty and is waiting to be refilled
Pending,
@havaker (Contributor) commented:

The Pending state is caused by failing to create a working connection; it is not the successor of Initializing under favorable conditions. I suggest renaming it to Broken or Failing, or adding a comment noting that this state is undesirable.

@piodul (Collaborator, Author) replied:

OK, changed to Broken.
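For context, a minimal sketch of the renamed state, with an assumed third variant just to make the enum complete (the real driver enum differs):

// Sketch following the review thread; the variant set is illustrative.
enum MaybeConnectionPool {
    // The pool was just created and is opening its first connections.
    Initializing,
    // The pool is empty because creating working connections failed,
    // and it is waiting to be refilled. Renamed from `Pending` to make
    // clear that this state is undesirable.
    Broken,
    // A working pool (payload omitted; assumed variant for this sketch).
    Working,
}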

@piodul force-pushed the better-connection-pooling branch from fbe83d1 to a0560cb on October 20, 2021 20:17

Shard count cannot be zero. Let's use Rust's standard NonZeroU16 type to
represent that fact.

While the usage of the ShardInfo::nr_shards field became a bit less
ergonomic, we get rid of the assert in ShardInfo's constructor.
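A small illustration of the guarantee NonZeroU16 gives, independent of the driver's code:

use std::num::NonZeroU16;

fn main() {
    // Construction fails for zero, so "shard count >= 1" is enforced
    // by the type instead of by an assert in a constructor.
    assert!(NonZeroU16::new(0).is_none());

    let nr_shards = NonZeroU16::new(4).expect("shard count cannot be zero");
    // Call sites now need .get() to read the raw value - the slight
    // ergonomic cost mentioned above.
    assert_eq!(nr_shards.get(), 4);
}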
It wasn't used anywhere. It was just returning the `nr_shards` field, which is public and can be accessed directly.

ShardInfo represents sharding-related information parsed from the SUPPORTED response sent by the database. It contains the id of the connection's shard, the total number of shards, and the msb_ignore parameter used to calculate the shard for a token.

ShardInfo has multiple methods used for calculations related to
sharding, e.g. to calculate the shard for a token or iterate over source
ports which can be used to connect to shard-aware port for a given
shard.

However, none of those methods use the first field of ShardInfo which
represents the id of the connection's shard. If a method needs to
calculate something for a particular shard id, the id is passed as an
argument.

This commit moves most of ShardInfo's methods to a new struct, Sharder. The new struct is identical to ShardInfo except that it lacks the connection's shard id field.
Adds a method which returns the shard-aware port reported by the connected node.

Adds a method to ConnectionConfig which returns whether the configuration is for an SSL connection. This method works both with and without the "ssl" feature enabled.

A Scylla node can have two shard-aware ports enabled - one for unencrypted connections and the other for SSL connections. They are reported separately in the SUPPORTED response under two keys:

- SCYLLA_SHARD_AWARE_PORT
- SCYLLA_SHARD_AWARE_PORT_SSL

When establishing SSL connections, we didn't consider the second one at all and only looked at the first one. This commit fixes that, and now the `is_shard_aware` method correctly detects shard-aware SSL connections.
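The gist of the fix, as a hedged sketch - the helper name is illustrative and not the driver's actual API:

// Pick which SUPPORTED key to look up, depending on whether the
// connection configuration uses SSL.
fn shard_aware_port_key(is_ssl: bool) -> &'static str {
    if is_ssl {
        "SCYLLA_SHARD_AWARE_PORT_SSL"
    } else {
        "SCYLLA_SHARD_AWARE_PORT"
    }
}

fn main() {
    assert_eq!(shard_aware_port_key(true), "SCYLLA_SHARD_AWARE_PORT_SSL");
    assert_eq!(shard_aware_port_key(false), "SCYLLA_SHARD_AWARE_PORT");
}
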
@piodul force-pushed the better-connection-pooling branch 2 times, most recently from 8bc64c3 to 4919730 on November 3, 2021 13:26
@piodul (Collaborator, Author) commented on Nov 3, 2021

> Why did you choose to keep ConnectionKeeper for managing control connections? It could be replaced with the new NodeConnectionPool (and by replacing it, we wouldn't have to maintain two separate connection management implementations).

@havaker In a previous version of this PR, the PoolSize::PerHost strategy used to behave differently: if it encountered a Scylla node, it rounded the connection count up to the nearest multiple of the shard count and tried to maintain that number of connections. This would not be suitable for keeping a control connection, because we want to have only one such connection - so that we don't get duplicate events. The PerHost strategy no longer behaves that way, so I replaced the ConnectionKeeper with NodeConnectionPool.

As a future improvement, I think that TopologyReader (which was the only user of ConnectionKeeper before my recent changes) should manage a Connection directly. The reason is that in case of a connection breakage, we want to immediately switch to a different host. Currently, we only do that if there are no live connections available at the time of a topology refresh, which happens automatically every 60 seconds - the node handling the control connection might go down at the beginning of that period, the control connection breaks, and we won't get any events for the next 60 seconds.

@piodul (Collaborator, Author) commented on Nov 3, 2021

v2:

  • Slightly changed how the shards_to_try vector is populated.
  • Renamed MaybeConnectionPool::Pending to MaybeConnectionPool::Broken.
  • Got rid of ConnectionKeeper.
  • Slightly changed the refilling algorithm: now, if the pool is empty and it starts filling, it always opens exactly one connection. This is done to reduce the number of connection attempts while the target node is down (see the sketch below).
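
A minimal sketch of that refill heuristic; the function and parameter names are illustrative, not the driver's:

// How many connections to open in a single refill round.
fn connections_to_open(alive: usize, target: usize) -> usize {
    if alive == 0 {
        // Empty pool: the node may be down, so probe with a single
        // connection instead of a burst of doomed attempts.
        1
    } else {
        target.saturating_sub(alive)
    }
}

fn main() {
    assert_eq!(connections_to_open(0, 8), 1);
    assert_eq!(connections_to_open(3, 8), 5);
    assert_eq!(connections_to_open(8, 8), 0);
}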

@piodul marked this pull request as ready for review on November 3, 2021 13:31
@piodul requested a review from havaker on November 3, 2021 13:33

Adds NodeConnectionPool - an object which manages a pool of connections
to a single node. It takes over this responsibility from Node and
ConnectionKeeper - previously, the Node kept a collection of
ConnectionKeepers, and each keeper managed a connection to a single
shard.

The connection pool has the following advantages over the previous
approach:

- It is able to effectively use the regular port for connections to
  Scylla. Sometimes, the shard-aware port might not be available - not
  supported in an old version of Scylla, not exposed/enabled. It might
  also be hidden behind a source-port translating NAT which renders the
  shard choosing mechanism useless.

  A pool is a better abstraction for connecting to the regular port.
  When connecting to that port, Scylla assigns the shard for the
  connection and the driver has no control over the decision. The pool,
  when being refilled, tries to open as many connections as are needed
  to fill the remaining shards, and assigns the resulting connections
  to their shards (see the sketch after this commit message).

  The connection keeper manages only one connection to only one shard,
  and it won't be able to do it as effectively as the pool.

  It is worth mentioning that the pool _will_ use the shard-aware port
  if it is available. All things considered, the shard-aware port
  provides a superior method for establishing connections evenly
  distributed over all shards - the regular port should only be used if
  the shard-aware port cannot be used reliably. The pool is still able
  to use the shard-aware port as effectively as the ConnectionKeepers.

- It detects the shard-aware port automatically. Currently, the Session
  probes a random node to see if the shard-aware port is reachable and,
  based on that check, either assumes that no nodes support it, or all
  nodes support it and have the same port number configured.

  The pool learns about the shard-aware port from the first connection
  and uses it for further connections. Because of this, the driver will
  now be able to use the new port in rolling upgrade scenarios.

- It naturally supports more than one connection per shard for Scylla
  (and more than one connection per node in the case of Cassandra).
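
The regular-port refill idea from the first bullet above, as a standalone sketch. Connection and open_connection are illustrative stand-ins, not the driver's types:

use std::collections::HashMap;

// Stand-in for the driver's connection type.
struct Connection {
    shard: u16,
}

// Stand-in: over the regular port (9042) the node assigns the shard;
// the driver only learns which shard it got after connecting.
fn open_connection() -> Connection {
    Connection { shard: 0 }
}

// Open as many connections as shards are still missing, then bucket
// each one under whatever shard the node happened to assign it to.
fn refill(pool: &mut HashMap<u16, Vec<Connection>>, missing: usize) {
    for _ in 0..missing {
        let conn = open_connection();
        pool.entry(conn.shard).or_default().push(conn);
    }
}

fn main() {
    let mut pool = HashMap::new();
    refill(&mut pool, 3);
    assert_eq!(pool[&0].len(), 3);
}
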
Exposes two new options in the configuration:

- connection_pool_size - the target connection pool size.
- disallow_shard_aware_port - setting it to true prevents the driver
  from using the shard-aware port at all.

Both options can also be set in the SessionBuilder.
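
A hedged usage sketch: the builder method names follow the option names above and the PoolSize strategies mentioned in this PR, but the exact API surface may differ from what was merged:

use scylla::transport::session::PoolSize;
use scylla::SessionBuilder;
use std::num::NonZeroUsize;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let _session = SessionBuilder::new()
        .known_node("127.0.0.1:9042")
        // Target pool size; PerShard is multiplied by the node's shard count.
        .pool_size(PoolSize::PerShard(NonZeroUsize::new(2).unwrap()))
        // Force the regular port even if the shard-aware one is reachable.
        .disallow_shard_aware_port(true)
        .build()
        .await?;
    Ok(())
}
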
Because the NodeConnectionPool automatically detects the shard-aware
port of the node, there is no need to probe it during creation of the
Session.

This commit removes the probe.

Removes the ConnectionKeeper struct. Its only user was the
TopologyReader, and now the new NodeConnectionPool is used in its place.

@piodul force-pushed the better-connection-pooling branch from 4919730 to 427a935 on November 3, 2021 17:13
@piodul (Collaborator, Author) commented on Nov 3, 2021

v2.1:

  • Reduced the verbosity of log messages which appear when the pool is being refilled and the shard-aware port is being used.
  • Corrected the maximum excess pool size calculation: it is now node_shard_count * 10 (as in gocql); previously it was per_shard_target * 10.

@piodul (Collaborator, Author) commented on Nov 3, 2021

I did a series of manual tests - I used an application which writes at a steady rate and calls use_keyspace at startup.

Ran this scenario twice, with two pool sizes - PerHost and PerShard:

  • Start with one node
  • Add a new node
  • Restart one node with fewer shards
  • Restart one node with more shards
  • Restart one node with shard-aware port disabled
  • Restart one node with shard awareness disabled
  • Restart one node, while the node is down restart the application
  • Decommission node

Ran this scenario with PerShard strategy:

  • Deliberately introduce a bug into the source-port choosing algorithm, connect to a node, and observe how the fallback-to-non-shard-aware-port logic behaves

Ran this scenario with PerShard strategy, shard-aware port disallowed:

  • Start multiple instances of the application and observe the connection storm, which eventually passes

The application behaved in a sane way in all of the scenarios.

@Jasperav (Contributor) commented on Nov 3, 2021

I think it's not too hard to spin up additional nodes with Docker via Command within a Rust test. Maybe add a test like the one mentioned above while fixing #307.

@psarna (Contributor) commented on Nov 3, 2021

Adding a simple distributed test in order to solve #307 is definitely a good idea, but anything more complex is very likely to burn our CI credits very quickly, so for now a set of manual tests is also acceptable. I'll proceed with merging, and after that we can start planning a 0.3.0 release, once we browse the statuses of the other pending pull requests and move them forward as well.

@psarna merged commit 15f8faf into scylladb:main on Nov 3, 2021
@Jasperav (Contributor) commented on Nov 3, 2021

@psarna CI credits? GitHub Actions minutes are completely free for public repositories: https://github.com/pricing

@psarna (Contributor) commented on Nov 3, 2021

Isn't there an upper bound on minutes used monthly, though? It would make sense for GitHub to place some sane limits in order to prevent abuse. If there isn't, then indeed we can add a bunch of distributed tests.

@Jasperav (Contributor) commented on Nov 3, 2021

I don't think so! Yes, distributed tests are missing for this crate; adding them would make it production-ready a lot faster :)

@QuentinPerez (Contributor) commented:

so cool, thank you :)
