Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Excess shard-aware port connection for shard #334

Closed
Jasperav opened this issue Nov 11, 2021 · 3 comments
Closed

Excess shard-aware port connection for shard #334

Jasperav opened this issue Nov 11, 2021 · 3 comments
Assignees

Comments

@Jasperav
Copy link
Contributor

Jasperav commented Nov 11, 2021

When running this code inside caching_session somewhere (and enable debug logging) this code:

#[tokio::test(flavor = "multi_thread")]
async fn test_full() {
    let session = create_caching_session().await;

    tokio::time::sleep(tokio::time::Duration::from_secs(10)).await;
}

produces endlessly the following logging:

ov 11 20:35:53.568 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 0; will retry with non-shard-aware port
Nov 11 20:35:53.568 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 0; will retry with non-shard-aware port
Nov 11 20:35:53.580 DEBUG scylla::transport::connection_pool: [127.0.0.1] Scheduling next refill in 50 ms
Nov 11 20:35:53.642 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 2; will retry with non-shard-aware port
Nov 11 20:35:53.643 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 0; will retry with non-shard-aware port
Nov 11 20:35:54.658 DEBUG scylla::transport::connection_pool: [127.0.0.1] Scheduling next refill in 50 ms
Nov 11 20:35:54.717 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 2; will retry with non-shard-aware port
Nov 11 20:35:54.718 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 0; will retry with non-shard-aware port
Nov 11 20:35:54.725 DEBUG scylla::transport::connection_pool: [127.0.0.1] Scheduling next refill in 50 ms
Nov 11 20:35:54.787 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 2; will retry with non-shard-aware port
Nov 11 20:35:54.788 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 0; will retry with non-shard-aware port
Nov 11 20:35:54.798 DEBUG scylla::transport::connection_pool: [127.0.0.1] Scheduling next refill in 50 ms
Nov 11 20:35:54.858 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 2; will retry with non-shard-aware port
Nov 11 20:35:54.860 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 0; will retry with non-shard-aware port
Nov 11 20:35:54.870 DEBUG scylla::transport::connection_pool: [127.0.0.1] Scheduling next refill in 50 ms
Nov 11 20:35:54.933 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 2; will retry with non-shard-aware port
Nov 11 20:35:54.934 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 0; will retry with non-shard-aware port
Nov 11 20:35:54.944 DEBUG scylla::transport::connection_pool: [127.0.0.1] Scheduling next refill in 50 ms
Nov 11 20:35:55.005 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 0; will retry with non-shard-aware port
Nov 11 20:35:55.005 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 2; will retry with non-shard-aware port
Nov 11 20:35:55.011 DEBUG scylla::transport::connection_pool: [127.0.0.1] Scheduling next refill in 50 ms
Nov 11 20:35:55.073 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 2; will retry with non-shard-aware port
Nov 11 20:35:55.073 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 0; will retry with non-shard-aware port
Nov 11 20:35:55.081 DEBUG scylla::transport::connection_pool: [127.0.0.1] Scheduling next refill in 50 ms
Nov 11 20:35:55.143 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 2; will retry with non-shard-aware port
Nov 11 20:35:55.144 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 0; will retry with non-shard-aware port
Nov 11 20:35:55.154 DEBUG scylla::transport::connection_pool: [127.0.0.1] Scheduling next refill in 50 ms
Nov 11 20:35:55.215 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 0; will retry with non-shard-aware port
Nov 11 20:35:55.215 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 2; will retry with non-shard-aware port
Nov 11 20:35:55.222 DEBUG scylla::transport::connection_pool: [127.0.0.1] Scheduling next refill in 50 ms

I am running a single node Scylla in Docker, nothing fancy.

@piodul
Copy link
Collaborator

piodul commented Nov 11, 2021

@Jasperav could you try re-running your test with logging level set to trace for the scylla crate and paste the results?

In the meantime, my wild guess would be that you are using the 19042 port (the shard-aware port) to connect and switching to 9042 (non-shard-aware port) will help. Could you try that?

Generally, with the new connection pool implementation, you should use the non-shard-aware port when connecting to the cluster. The driver detects the shard-aware port automatically and tries to use it if possible. However, if you are using Docker either on Windows or MacOS, then the driver might not be able to connect to some shards using the shard-aware port because of the specifics of Docker's networking implementation (see my comment here). Normally, in this situation the driver will fall back to the non-shard-aware port which doesn't suffer from this issue, however it's not able to detect that port automatically and assumes that the port you used to connect is the non-shard-aware port. If you pass the shard-aware port when connecting, the connection pool will always use the shard-aware port for connections and will get stuck in a retry loop because it will never manage to reach some of the shards.

@Jasperav
Copy link
Contributor Author

@piodul Alright that is too bad of Docker. Are there reasons to explicitly give the shard port to the driver? Because the driver automatically checks if the shard port is available.

This is the trace logging. Switching to the non-shard port works.

Nov 12 06:30:33.606 DEBUG scylla::transport::cluster: Requesting topology refresh
Nov 12 06:30:33.606 DEBUG scylla::transport::connection_pool: [127.0.0.1] Started asynchronous pool worker
Nov 12 06:30:33.609 TRACE scylla::transport::connection_pool: [127.0.0.1] Will open the first connection to the node
Nov 12 06:30:33.609 TRACE mio::poll: registering event source with poller: token=Token(0), interests=READABLE | WRITABLE    
Nov 12 06:30:33.618 DEBUG scylla::transport::connection_pool: [127.0.0.1] New sharder: Some(Sharder { nr_shards: 4, msb_ignore: 12 }), clearing all connections
Nov 12 06:30:33.618 DEBUG scylla::transport::connection_pool: [127.0.0.1] Updating shard aware port: Some(19042)
Nov 12 06:30:33.618 TRACE scylla::transport::connection_pool: [127.0.0.1] Adding connection 0x7fd80bb05090 to shard 2 pool, now there are 1 for the shard, total 1
Nov 12 06:30:33.618 DEBUG scylla::transport::connection_pool: [127.0.0.1] Pool is full, clearing 0 excess connections
Nov 12 06:30:33.629 DEBUG scylla::transport::connection_pool: [172.18.0.2] Started asynchronous pool worker
Nov 12 06:30:33.630 DEBUG scylla::transport::connection_pool: [172.18.0.5] Started asynchronous pool worker
Nov 12 06:30:33.630 TRACE scylla::transport::connection_pool: [172.18.0.2] Will open the first connection to the node
Nov 12 06:30:33.631 TRACE mio::poll: registering event source with poller: token=Token(1), interests=READABLE | WRITABLE    
Nov 12 06:30:33.631 DEBUG scylla::transport::connection_pool: [127.0.0.1] Started asynchronous pool worker
Nov 12 06:30:33.632 TRACE scylla::transport::connection_pool: [172.18.0.5] Will open the first connection to the node
Nov 12 06:30:33.632 TRACE mio::poll: registering event source with poller: token=Token(2), interests=READABLE | WRITABLE    
Nov 12 06:30:33.632 TRACE scylla::transport::connection_pool: [127.0.0.1] Will open the first connection to the node
Nov 12 06:30:33.632 TRACE mio::poll: registering event source with poller: token=Token(3), interests=READABLE | WRITABLE    
Nov 12 06:30:33.643 DEBUG scylla::transport::connection_pool: [127.0.0.1] New sharder: Some(Sharder { nr_shards: 4, msb_ignore: 12 }), clearing all connections
Nov 12 06:30:33.643 DEBUG scylla::transport::connection_pool: [127.0.0.1] Updating shard aware port: Some(19042)
Nov 12 06:30:33.643 TRACE scylla::transport::connection_pool: [127.0.0.1] Adding connection 0x7fd80a50b760 to shard 2 pool, now there are 1 for the shard, total 1
Nov 12 06:30:33.643 TRACE mio::poll: deregistering event source from poller    
Nov 12 06:30:33.643 DEBUG scylla::transport::connection_pool: [127.0.0.1] Scheduling next refill in 50 ms
Nov 12 06:30:33.643 DEBUG scylla::transport::connection_pool: [172.18.0.2] Failed to open connection to the non-shard-aware port: IoError(Os { code: 61, kind: ConnectionRefused, message: "Connection refused" })
Nov 12 06:30:33.643 DEBUG scylla::transport::connection_pool: [172.18.0.2] Scheduling next refill in 100 ms
Nov 12 06:30:33.699 TRACE scylla::transport::connection_pool: [127.0.0.1] Will open 1 connections to shard 0
Nov 12 06:30:33.699 TRACE scylla::transport::connection_pool: [127.0.0.1] Will open 1 connections to shard 1
Nov 12 06:30:33.699 TRACE scylla::transport::connection_pool: [127.0.0.1] Will open 1 connections to shard 3
Nov 12 06:30:33.700 TRACE mio::poll: registering event source with poller: token=Token(16777217), interests=READABLE | WRITABLE    
Nov 12 06:30:33.700 TRACE mio::poll: registering event source with poller: token=Token(4), interests=READABLE | WRITABLE    
Nov 12 06:30:33.701 TRACE mio::poll: registering event source with poller: token=Token(5), interests=READABLE | WRITABLE    
Nov 12 06:30:33.709 TRACE scylla::transport::connection_pool: [127.0.0.1] Adding connection 0x7fd80ba09670 to shard 0 pool, now there are 1 for the shard, total 2
Nov 12 06:30:33.711 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 2; will retry with non-shard-aware port
Nov 12 06:30:33.711 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 2; will retry with non-shard-aware port
Nov 12 06:30:33.711 TRACE mio::poll: deregistering event source from poller    
Nov 12 06:30:33.711 TRACE mio::poll: registering event source with poller: token=Token(6), interests=READABLE | WRITABLE    
Nov 12 06:30:33.711 TRACE mio::poll: registering event source with poller: token=Token(16777220), interests=READABLE | WRITABLE    
Nov 12 06:30:33.712 TRACE mio::poll: deregistering event source from poller    
Nov 12 06:30:33.718 TRACE scylla::transport::connection_pool: [127.0.0.1] Storing excess connection 0x7fd80ba09cf0 for shard 0
Nov 12 06:30:33.721 TRACE scylla::transport::connection_pool: [127.0.0.1] Storing excess connection 0x7fd80b806b70 for shard 2
Nov 12 06:30:33.721 DEBUG scylla::transport::connection_pool: [127.0.0.1] Scheduling next refill in 50 ms
Nov 12 06:30:33.745 TRACE scylla::transport::connection_pool: [172.18.0.2] Will open the first connection to the node
Nov 12 06:30:33.745 TRACE mio::poll: registering event source with poller: token=Token(33554433), interests=READABLE | WRITABLE    
Nov 12 06:30:33.776 TRACE scylla::transport::connection_pool: [127.0.0.1] Will open 1 connections to shard 1
Nov 12 06:30:33.776 TRACE scylla::transport::connection_pool: [127.0.0.1] Will open 1 connections to shard 3
Nov 12 06:30:33.776 TRACE mio::poll: registering event source with poller: token=Token(7), interests=READABLE | WRITABLE    
Nov 12 06:30:33.776 TRACE mio::poll: registering event source with poller: token=Token(8), interests=READABLE | WRITABLE    
Nov 12 06:30:33.783 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 0; will retry with non-shard-aware port
Nov 12 06:30:33.784 TRACE mio::poll: registering event source with poller: token=Token(9), interests=READABLE | WRITABLE    
Nov 12 06:30:33.784 TRACE mio::poll: deregistering event source from poller    
Nov 12 06:30:33.789 DEBUG scylla::transport::connection_pool: [127.0.0.1] Excess shard-aware port connection for shard 2; will retry with non-shard-aware port
Nov 12 06:30:33.789 TRACE mio::poll: registering event source with poller: token=Token(16777223), interests=READABLE | WRITABLE    

@piodul
Copy link
Collaborator

piodul commented Nov 12, 2021

@piodul Alright that is too bad of Docker. Are there reasons to explicitly give the shard port to the driver? Because the driver automatically checks if the shard port is available.

@Jasperav At the moment, it's better to give the non-shard-aware port to the driver, let it detect the shard-aware port and let it choose which port to use. In older versions the driver behaved better if you passed the shard-aware port, but that's no longer the case.

@Lorak-mmk Lorak-mmk self-assigned this Nov 15, 2023
@avelanarius avelanarius closed this as not planned Won't fix, can't repro, duplicate, stale Jan 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants