
WIP pool changes #3582

Open · abonander wants to merge 15 commits into main

Conversation

abonander (Collaborator) commented Oct 29, 2024

  • Use a separate waiting queue for new connections.
  • Pool inheritance (used for testing) only steals connect permits, not acquire permits.
  • Spawn connection attempts as their own task so they may complete even if the acquire() call is cancelled.
  • Race opening a new connection with acquiring one from the idle queue.
  • acquire() should now be completely cancel-safe.
  • Separate timeout for connecting.
  • New PoolConnector trait superseding both before_connect (requested but not yet implemented) and after_connect callbacks (see the sketch after this list).
    • Implemented for closures returning Future, albeit with a 'static requirement for the returned Future (instead of BoxFuture).
    • May be updated to use async closures in a future release (hopefully backwards compatible but will require an MSRV bump): https://blog.rust-lang.org/inside-rust/2024/08/09/async-closures-call-for-testing.html
    • Can be used to support high availability, or implement custom backoff or connection throttling schemes (e.g. token bucket).
  • Use usize for all connection counts to get rid of weird inconsistencies.
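
Purely as illustration of the PoolConnector bullets above, here is a minimal sketch of what such a trait and a high-availability connector could look like. The trait shape, names, and error type are assumptions for the sketch, not the PR's actual definition:

```rust
use std::future::Future;
use std::sync::atomic::{AtomicUsize, Ordering};

// Placeholder types so the sketch stands alone; the real trait in SQLx is
// generic over the database driver and its connection and error types.
struct Connection;
#[derive(Debug)]
struct ConnectError;

// Assumed shape of a PoolConnector-style trait; names and signature are
// illustrative only.
trait PoolConnector: Send + Sync + 'static {
    fn connect(&self) -> impl Future<Output = Result<Connection, ConnectError>> + Send;
}

// High-availability example from the list above: rotate through several
// hosts so one dead host doesn't take the whole pool down.
struct RoundRobinConnector {
    hosts: Vec<String>,
    next: AtomicUsize,
}

impl PoolConnector for RoundRobinConnector {
    async fn connect(&self) -> Result<Connection, ConnectError> {
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.hosts.len();
        let _host = &self.hosts[i];
        // ... dial `_host` here; a custom backoff or token-bucket scheme
        // would also live in this method, before the dial ...
        Ok(Connection)
    }
}
```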

Breaking Changes

  • Pool::set_connect_options() and get_connect_options() have been removed. Instead, implement the new PoolConnector trait (or use a closure) over shared state such as Arc<RwLock<impl ConnectOptions>> (see the sketch after this list).
  • PoolOptions::after_connect() has been removed. Instead, implement PoolConnector (or use a closure), open the connection, and then apply whatever setup it needs before returning it.
  • PoolOptions::min_connections(), PoolOptions::max_connections(), and Pool::size() now use usize instead of u32.
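
As a rough sketch of the replacement pattern named in the first item: shared, swappable options behind Arc<RwLock<...>>, read by a connector closure on every attempt. The closure shape and type names here are assumptions, not this PR's final API:

```rust
use std::sync::Arc;
use tokio::sync::RwLock;

// Stand-in for a driver's ConnectOptions type.
#[derive(Clone)]
struct ConnectOptions {
    url: String,
}

#[tokio::main]
async fn main() {
    let options = Arc::new(RwLock::new(ConnectOptions {
        url: "postgres://primary.example/db".into(),
    }));

    // The connector closure reads the *current* options on every attempt,
    // replacing the removed set_connect_options()/get_connect_options().
    let connect = {
        let options = Arc::clone(&options);
        move || {
            let options = Arc::clone(&options);
            async move {
                let opts = options.read().await.clone();
                // ... a real connector would dial using `opts` here ...
                opts.url
            }
        }
    };

    assert_eq!(connect().await, "postgres://primary.example/db");

    // Rotating credentials or failing over is just a write lock away;
    // no pool rebuild required.
    options.write().await.url = "postgres://standby.example/db".into();
    assert_eq!(connect().await, "postgres://standby.example/db");
}
```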

Fixes #3513
Fixes #3315
Fixes #3132
Fixes #3117
Fixes #2848

jplatte (Contributor) commented on the diff:

```rust
}

#[cfg(not(feature = "_rt-async-std"))]
missing_rt((duration, f))
```

When playing around with this PR locally (to see if it fixes an acquire-timeout issue, which it unfortunately doesn't), I found that this causes a compile error. I think it should be:

Suggested change:

```diff
-missing_rt((duration, f))
+missing_rt((deadline, f))
```
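
For context on why `duration` doesn't compile: the hunk sits in a runtime-dispatch helper whose first parameter is named `deadline`. A hedged reconstruction of that shape follows; the real function dispatches over SQLx's runtime features (the hunk shows _rt-async-std) and may differ in detail, while this sketch uses only tokio:

```rust
use std::future::Future;
use std::time::Instant;

// Stand-in for SQLx's helper that panics when no runtime is enabled; it
// takes the otherwise-unused bindings so the compiler sees them consumed.
fn missing_rt<T>(_unused: T) -> ! {
    panic!("no async runtime enabled")
}

// The parameter is named `deadline`, so the fallback arm has to pass
// `deadline`; `missing_rt((duration, f))` names a binding that doesn't
// exist in this scope, hence the compile error.
pub async fn timeout_at<F: Future>(deadline: Instant, f: F) -> Option<F::Output> {
    #[cfg(feature = "_rt-tokio")]
    return tokio::time::timeout_at(deadline.into(), f).await.ok();

    #[cfg(not(feature = "_rt-tokio"))]
    missing_rt((deadline, f))
}
```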

abonander (Collaborator Author):

@jplatte if you have a solid repro for acquire timeouts, I'd love to add it as a test here.

jplatte (Contributor):

I wish. It's in the proprietary version of the main work codebase, and somehow only happens w/ hyper 1.0 / axum 0.7. But if other debugging approaches don't work out, I can try the hyper upgrade on the much smaller OSS version of the codebase and reduce from there next week.

abonander (Collaborator Author):

One thing that Axum does is cancel the handler future if the client disconnects. I wonder if it's triggering a cancellation bug somewhere.

Do you have a before_acquire callback set?

abonander (Collaborator Author) commented Nov 8, 2024:

I did some digging a few weeks back and realized that connections could potentially get stuck in return_to_pool because there's no timeout: estuary/flow#1676 (comment)

That's a change I was meaning to add to this PR but hadn't gotten to yet. There's a timeout when it goes to close the connection, but no timeout for the task as a whole.
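
A minimal sketch of the missing piece as described, assuming a tokio runtime; the function and type names are illustrative, not SQLx internals:

```rust
use std::future::Future;
use std::time::Duration;

// Spawn the return-to-pool work as its own task (so it survives the caller
// being cancelled), but bound the *whole* task with a deadline rather than
// only the final close step, so a stuck rollback or ping can't pin the
// pool slot forever.
fn spawn_bounded_return<F>(return_to_pool: F, deadline: Duration)
where
    F: Future<Output = ()> + Send + 'static,
{
    tokio::spawn(async move {
        if tokio::time::timeout(deadline, return_to_pool).await.is_err() {
            // Deadline elapsed: a real pool would hard-close the connection
            // here and release its permit so waiting acquire() calls can
            // make progress instead of hanging.
        }
    });
}
```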

jplatte (Contributor):

I don't think it's a cancellation bug. It happens in a test that does a bunch of requests in parallel (50 originally; I can turn it down to 20 and still reliably reproduce the hang, but at 18 it succeeded).

abonander (Collaborator Author):

What's the max size of the test pool?

And what's the acquire timeout set at?

svix-jplatte commented Nov 11, 2024:

Hmmm, the max size of the pool is exactly 20, and once I use that amount of parallelism it breaks. Tried 19 too and that works. Acquire timeout is 20s, much longer than it takes the test to run to completion with up to 19 parallel requests.

I also tried raising the pool size to 50; exactly the same thing: once the number of parallel requests is at least as big as the pool size, it hangs (until timeout).

Further, I was using a tokio::sync::Barrier and separate reqwest::Clients in separate tokio tasks, so the requests happen as closely together as possible (this test was originally written to catch another race). If I don't make the tasks wait on the barrier before making the request, that seems to already mix things up enough for the test to succeed, even at a pool size of 20 and 50 parallel requests.
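
For reference, a sketch of that style of test; the endpoint URL and counts are placeholders:

```rust
use std::sync::Arc;
use tokio::sync::Barrier;

#[tokio::test]
async fn parallel_requests_hit_pool_simultaneously() {
    // Set this equal to the pool's max size to reproduce the hang described.
    const N: usize = 20;
    let barrier = Arc::new(Barrier::new(N));

    let tasks: Vec<_> = (0..N)
        .map(|_| {
            let barrier = Arc::clone(&barrier);
            tokio::spawn(async move {
                // A separate client per task avoids reqwest's own connection
                // pooling serializing the requests.
                let client = reqwest::Client::new();
                barrier.wait().await; // release all requests at once
                client
                    .get("http://localhost:8080/uses-db") // placeholder URL
                    .send()
                    .await
            })
        })
        .collect();

    for task in tasks {
        task.await.unwrap().expect("request failed or timed out");
    }
}
```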

svix-jplatte:

I found the bug; it has nothing to do with SQLx itself. The test was deadlocking the server in a really weird way (related to the DB pool).
