
Better connection management #261

Merged 10 commits into scylladb:main from better-connection-pooling on Nov 3, 2021

Conversation

@piodul (Collaborator) commented on Jul 4, 2021

Reworks how the driver manages per-node connections.

A new object, NodeConnectionPool, is introduced, which takes over the responsibility of managing connections from Node and ConnectionKeeper. ConnectionKeeper was not removed, as it is still used for managing the control connection.

The new management scheme has the following advantages over the previous approach:

  • The number of connections per node is now configurable.
  • Per-shard connections can be established over the non-shard-aware port (9042).
  • The shard-aware port is detected automatically on a per-node basis; the driver uses it unless explicitly disabled in the configuration.
  • If the shard-aware port is unreachable or hidden behind a source-port-translating NAT, the driver falls back to the non-shard-aware port.

TODO: I did some minimal testing (node restart, node add/remove) but I should probably do some more (resharding, shard-aware port unreachable, shard-aware port disabled/enabled after restart). For now, marking as draft.

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass tests.
  • PR description sums up the changes and reasons why they should be introduced.
  • I added appropriate Fixes: annotations to the PR description.

@piodul force-pushed the better-connection-pooling branch from 4228019 to eb11747 on July 4, 2021 14:23
@piodul force-pushed the better-connection-pooling branch from eb11747 to f8424e6 on July 17, 2021 11:07
@piodul force-pushed the better-connection-pooling branch from f8424e6 to fbe83d1 on October 11, 2021 07:42
@Jasperav (Contributor) commented:

@piodul Do you know when this PR can be merged?

@psarna (Contributor) commented on Oct 15, 2021

@Jasperav it's under review right now, and it also needs a little cleanup with respect to commits and commit messages. Overall, next week is very likely.

@havaker (Contributor) left a comment

Why did you choose to keep ConnectionKeeper for managing control connections? It could be replaced with the new NodeConnectionPool (and by replacing it, we wouldn't have to maintain two separate connection management implementations).

Except for ConnectionKeeper not being removed, this PR LGTM. I left some minor comments.


// If this fails try getting any other in random order
let mut shards_to_try: Vec<u16> =
    (shard..nr_shards.get()).chain(0..shard).skip(1).collect();
@havaker (Contributor) commented:

I don't think it's clear that .skip(1) is used here to exclude shard from the sequence. Have you considered using .filter() instead? E.g. (0..nr_shards.get()).filter(|i| *i != shard)

@piodul (Collaborator, Author) replied:

Using filter has the following drawbacks:

  • There will be an additional check performed for each shard ID.
  • The iterator won't be able to provide an accurate size_hint due to the use of filter, so collect won't be able to pre-allocate the right amount of memory for the Vec.

How about this instead (this is what I pushed in the updated version):

let mut shards_to_try: Vec<u16> = (0..shard).chain(shard + 1..nr_shards.get()).collect();
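The size_hint difference is easy to see directly. A minimal sketch, assuming hypothetical shard and nr_shards values (not taken from the driver):

use std::num::NonZeroU16;

fn main() {
    // Hypothetical values for illustration only.
    let shard: u16 = 2;
    let nr_shards = NonZeroU16::new(4).unwrap();

    // filter() loses the exact length: the lower bound drops to 0.
    let filtered = (0..nr_shards.get()).filter(|i| *i != shard);
    assert_eq!(filtered.size_hint(), (0, Some(4)));

    // chain() of two ranges keeps an exact size_hint, so collect()
    // can pre-allocate the Vec with the right capacity.
    let chained = (0..shard).chain(shard + 1..nr_shards.get());
    assert_eq!(chained.size_hint(), (3, Some(3)));

    let shards_to_try: Vec<u16> = chained.collect();
    assert_eq!(shards_to_try, vec![0, 1, 3]);
}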

Initializing,

// The pool is empty and is waiting to be refilled
Pending,
@havaker (Contributor) commented:

The Pending state is caused by failing to create a working connection; it is not the successor of Initializing under favorable conditions. I suggest renaming it to Broken or Failing, or adding a comment noting that this state is undesirable.

@piodul (Collaborator, Author) replied:

OK, changed to Broken.
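For context, a minimal sketch of the renamed state, with an assumed third variant just to make the enum complete (the real driver enum differs):

// Sketch following the review thread; the variant set is illustrative.
enum MaybeConnectionPool {
    // The pool was just created and is opening its first connections.
    Initializing,
    // The pool is empty because creating working connections failed,
    // and it is waiting to be refilled. Renamed from `Pending` to make
    // clear that this state is undesirable.
    Broken,
    // A working pool (payload omitted; assumed variant for this sketch).
    Working,
}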

@piodul force-pushed the better-connection-pooling branch from fbe83d1 to a0560cb on October 20, 2021 20:17

Shard count cannot be zero. Let's use Rust's standard NonZeroU16 type to
represent that fact.

While the usage of the ShardInfo::nr_shards field became a bit less
ergonomic, we get rid of the assert in ShardInfo's constructor.
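A small illustration of the guarantee NonZeroU16 gives, independent of the driver's code:

use std::num::NonZeroU16;

fn main() {
    // Construction fails for zero, so "shard count >= 1" is enforced
    // by the type instead of by an assert in a constructor.
    assert!(NonZeroU16::new(0).is_none());

    let nr_shards = NonZeroU16::new(4).expect("shard count cannot be zero");
    // Call sites now need .get() to read the raw value - the slight
    // ergonomic cost mentioned above.
    assert_eq!(nr_shards.get(), 4);
}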
It wasn't used anywhere. It was just returning the `nr_shards` field, which is public and can be accessed directly.

ShardInfo represents sharding-related information parsed from the SUPPORTED response sent by the database. It contains the id of the connection's shard, the total number of shards, and the msb_ignore parameter used to calculate the shard for a token.

ShardInfo has multiple methods used for calculations related to
sharding, e.g. to calculate the shard for a token or iterate over source
ports which can be used to connect to shard-aware port for a given
shard.

However, none of those methods use the first field of ShardInfo which
represents the id of the connection's shard. If a method needs to
calculate something for a particular shard id, the id is passed as an
argument.

This commit moves most of ShardInfo's methods to a new struct, Sharder. The new struct is identical to ShardInfo except that it lacks the connection's shard id field.
Adds a method which returns the shard-aware port reported by the connected node.

Adds a method to ConnectionConfig which returns whether the configuration is for an SSL connection. This method works both with and without the "ssl" feature enabled.

A Scylla node can have two shard-aware ports enabled - one for unencrypted connections and the other for SSL connections. They are reported separately in the SUPPORTED response under two keys:

- SCYLLA_SHARD_AWARE_PORT
- SCYLLA_SHARD_AWARE_PORT_SSL

When establishing SSL connections, we didn't consider the second one at all and only looked at the first one. This commit fixes that, and now the `is_shard_aware` method correctly detects shard-aware SSL connections.
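The gist of the fix, as a hedged sketch - the helper name is illustrative and not the driver's actual API:

// Pick which SUPPORTED key to look up, depending on whether the
// connection configuration uses SSL.
fn shard_aware_port_key(is_ssl: bool) -> &'static str {
    if is_ssl {
        "SCYLLA_SHARD_AWARE_PORT_SSL"
    } else {
        "SCYLLA_SHARD_AWARE_PORT"
    }
}

fn main() {
    assert_eq!(shard_aware_port_key(true), "SCYLLA_SHARD_AWARE_PORT_SSL");
    assert_eq!(shard_aware_port_key(false), "SCYLLA_SHARD_AWARE_PORT");
}
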
@piodul force-pushed the better-connection-pooling branch 2 times, most recently from 8bc64c3 to 4919730 on November 3, 2021 13:26
@piodul (Collaborator, Author) commented on Nov 3, 2021

> Why did you choose to keep ConnectionKeeper for managing control connections? It could be replaced with the new NodeConnectionPool (and by replacing it, we wouldn't have to maintain two separate connection management implementations).

@havaker In a previous version of this PR, the PoolSize::PerHost strategy used to behave differently: if it encountered a Scylla node, it rounded the connection count up to the nearest multiple of the shard count and tried to maintain that number of connections. This would not be suitable for keeping a control connection, because we want to have only one such connection - so that we don't get duplicate events. The PerHost strategy no longer behaves that way, so I replaced the ConnectionKeeper with NodeConnectionPool.

As a future improvement, I think that TopologyReader (which was the only user of ConnectionKeeper before my recent changes) should manage a Connection directly. The reason is that in case of a connection breakage, we want to immediately switch to a different host. Currently, we only do that if there are no live connections available at the time of a topology refresh, which happens automatically every 60 seconds - the node handling the control connection might go down at the beginning of that period, the control connection breaks, and we won't get any events for the next 60 seconds.

@piodul (Collaborator, Author) commented on Nov 3, 2021

v2:

  • Slightly changed how the shards_to_try vector is populated.
  • Renamed MaybeConnectionPool::Pending to MaybeConnectionPool::Broken.
  • Got rid of ConnectionKeeper.
  • Slightly changed the refilling algorithm: now, if the pool is empty and it starts filling, it always opens exactly one connection. This is done to reduce the number of connection attempts while the target node is down (see the sketch below).
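
A minimal sketch of that refill heuristic; the function and parameter names are illustrative, not the driver's:

// How many connections to open in a single refill round.
fn connections_to_open(alive: usize, target: usize) -> usize {
    if alive == 0 {
        // Empty pool: the node may be down, so probe with a single
        // connection instead of a burst of doomed attempts.
        1
    } else {
        target.saturating_sub(alive)
    }
}

fn main() {
    assert_eq!(connections_to_open(0, 8), 1);
    assert_eq!(connections_to_open(3, 8), 5);
    assert_eq!(connections_to_open(8, 8), 0);
}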

@piodul marked this pull request as ready for review on November 3, 2021 13:31
@piodul requested a review from havaker on November 3, 2021 13:33

Adds NodeConnectionPool - an object which manages a pool of connections
to a single node. It takes over this responsibility from Node and
ConnectionKeeper - previously, the Node kept a collection of
ConnectionKeepers, and each keeper managed a connection to a single
shard.

The connection pool has the following advantages over the previous
approach:

- It is able to effectively use the regular port for connections to
  Scylla. Sometimes, the shard-aware port might not be available - not
  supported in an old version of Scylla, not exposed/enabled. It might
  also be hidden behind a source-port translating NAT which renders the
  shard choosing mechanism useless.

  A pool is a better abstraction for connecting to the regular port.
  When connecting to that port, Scylla assigns the shard for the
  connection and the driver has no control over the decision. The pool,
  when being refilled, tries to open as many connections as are needed
  to fill the remaining shards, and assigns the resulting connections
  to their shards (see the sketch after this commit message).

  The connection keeper manages only one connection to only one shard,
  and it won't be able to do it as effectively as the pool.

  It is worth mentioning that the pool _will_ use the shard-aware port
  if it is available. All things considered, the shard-aware port
  provides a superior method for establishing connections evenly
  distributed over all shards - the regular port should only be used if
  the shard-aware port cannot be used reliably. The pool is still able
  to use the shard-aware port as effectively as the ConnectionKeepers.

- It detects the shard-aware port automatically. Currently, the Session
  probes a random node to see if the shard-aware port is reachable and,
  based on that check, either assumes that no nodes support it, or all
  nodes support it and have the same port number configured.

  The pool learns about the shard-aware port from the first connection
  and uses it for further connections. Because of this, the driver will
  now be able to use the new port in rolling upgrade scenarios.

- It naturally supports more than one connection per shard for Scylla
  (and more than one connection per node in the case of Cassandra).
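
The regular-port refill idea from the first bullet above, as a standalone sketch. Connection and open_connection are illustrative stand-ins, not the driver's types:

use std::collections::HashMap;

// Stand-in for the driver's connection type.
struct Connection {
    shard: u16,
}

// Stand-in: over the regular port (9042) the node assigns the shard;
// the driver only learns which shard it got after connecting.
fn open_connection() -> Connection {
    Connection { shard: 0 }
}

// Open as many connections as shards are still missing, then bucket
// each one under whatever shard the node happened to assign it to.
fn refill(pool: &mut HashMap<u16, Vec<Connection>>, missing: usize) {
    for _ in 0..missing {
        let conn = open_connection();
        pool.entry(conn.shard).or_default().push(conn);
    }
}

fn main() {
    let mut pool = HashMap::new();
    refill(&mut pool, 3);
    assert_eq!(pool[&0].len(), 3);
}
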
Exposes two new options in the configuration:

- connection_pool_size - the target connection pool size.
- disallow_shard_aware_port - setting it to true prevents the driver
  from using the shard-aware port at all.

Both options can also be set in the SessionBuilder.
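
A hedged usage sketch: the builder method names follow the option names above and the PoolSize strategies mentioned in this PR, but the exact API surface may differ from what was merged:

use scylla::transport::session::PoolSize;
use scylla::SessionBuilder;
use std::num::NonZeroUsize;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let _session = SessionBuilder::new()
        .known_node("127.0.0.1:9042")
        // Target pool size; PerShard is multiplied by the node's shard count.
        .pool_size(PoolSize::PerShard(NonZeroUsize::new(2).unwrap()))
        // Force the regular port even if the shard-aware one is reachable.
        .disallow_shard_aware_port(true)
        .build()
        .await?;
    Ok(())
}
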
Because the NodeConnectionPool automatically detects the shard-aware
port of the node, there is no need to probe it during creation of the
Session.

This commit removes the probe.

Removes the ConnectionKeeper struct. Its only user was the
TopologyReader, and now the new NodeConnectionPool is used in its place.

@piodul force-pushed the better-connection-pooling branch from 4919730 to 427a935 on November 3, 2021 17:13
@piodul (Collaborator, Author) commented on Nov 3, 2021

v2.1:

  • Reduced the verbosity of log messages which appear when the pool is being refilled and the shard-aware port is being used.
  • Corrected the maximum excess pool size calculation: it is now node_shard_count * 10 (as in gocql); previously it was per_shard_target * 10.

@piodul (Collaborator, Author) commented on Nov 3, 2021

I did a series of manual tests - I used an application which writes at a steady rate and calls use_keyspace at startup.

Ran this scenario twice, with two pool sizes - PerHost and PerShard:

  • Start with one node
  • Add a new node
  • Restart one node with fewer shards
  • Restart one node with more shards
  • Restart one node with shard-aware port disabled
  • Restart one node with shard awareness disabled
  • Restart one node, while the node is down restart the application
  • Decommission node

Ran this scenario with PerShard strategy:

  • Deliberately introduce a bug into the source-port choosing algorithm, connect to a node, and observe how the fallback-to-non-shard-aware-port logic behaves

Ran this scenario with PerShard strategy, shard-aware port disallowed:

  • Start multiple instances of the application and observe the connection storm, which eventually passes

The application behaved in a sane way in all of the scenarios.

@Jasperav (Contributor) commented on Nov 3, 2021

I think it's not too hard to spin up additional nodes with Docker via Command within a Rust test. Maybe add a test like the one mentioned above while fixing #307.

@psarna (Contributor) commented on Nov 3, 2021

Adding a simple distributed test in order to solve #307 is definitely a good idea, but anything more complex is very likely to burn our CI credits very quickly, so for now a set of manual tests is also acceptable. I'll proceed with merging, and after that we can start planning a 0.3.0 release, once we browse the statuses of the other pending pull requests and move them forward as well.

@psarna merged commit 15f8faf into scylladb:main on Nov 3, 2021
@Jasperav (Contributor) commented on Nov 3, 2021

@psarna CI credits? GitHub Actions minutes are completely free for public repositories: https://github.com/pricing

@psarna (Contributor) commented on Nov 3, 2021

Isn't there an upper bound on minutes used monthly, though? It would make sense for GitHub to place some sane limits in order to prevent abuse. If there isn't, then indeed we can add a bunch of distributed tests.

@Jasperav (Contributor) commented on Nov 3, 2021

I don't think so! Yes, distributed tests are missing for this crate; adding them would make it production-ready a lot faster :)

@QuentinPerez (Contributor) commented:

so cool, thank you :)
