Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proof of Concept: Release GVL in Statement#step #528

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jhawthorn
Copy link
Member

Refs #287, but I don't think this is ready to merge. As described below the performance impact is very significant. I think this is something the library should do, but it's a penalty that's hard to just eat.

This allows other threads to execute Ruby code (including performing more work sqlite operations if on another DB connection). This should significantly improve multithreaded performance, however as @tenderlove correctly predicted in #287 adds significant overhead to single-threaded performance.

This required wrapping all callbacks which can happen within statement execution with rb_thread_call_with_gvl.

I've written two benchmarks, one which shows why this could be a good idea and one which shows why it's a bad idea. It's unfortunately not clear cut.

The good

https://gist.github.com/jhawthorn/ae84ee2723ff4930344778cfe40205a2

I made a benchmark based on making sqlite actually do work (a poorly optimized query) before returning, and then having multiple threads perform the same operation. This shows a very significant speed up as we are actually able to run the sqlite operations in parallel.

main

$ bundle exec ruby test.rb
single threaded:
0.9432485040742904
multi threaded:
1.1921860249713063

This branch:

$ bundle exec ruby test.rb
single threaded:
0.927121008047834
multi threaded:
0.17125224182382226

The bad

https://gist.github.com/jhawthorn/492b6a9c5fb6bebe1ebfe985f52495de

When running a completely trivial query in a single thread, this is about 60% slower due to the need to release and re-acquire the GVL on every call to Statement#step. I don't think there's an easy way around this as even if our API was a more "bulk" operation (ie "return all rows" rather than "return one row") I believe we'll need the GVL every time we want to allocate a Ruby string and our memory buffer returned from SQLite is only good until our next call to sqlite3_step.

Comparison:
SELECT * LIMIT 1000 (gvl):     4336.3 i/s
SELECT * LIMIT 1000 (nogvl):     2705.2 i/s - 1.60x  slower

I do think this is capturing the absolute worst case as the query is completely trivial and doesn't even make allocations other than the one array per-row.

So that's where we're at. There might be some sort of compromise we could do like making the GVL releasing optional or based on a heuristic. I haven't yet investigated any way to make this faster, but it does seem unlikely we'll find a way to do this without a performance penalty.

TODO

  • execute_batch2 doesn't yet release the GVL (which probably makes it crashy if any callbacks exist)
  • prepare and some other operations should likely also release the GVL
  • It might be good to provide a UBF to rb_thread_call_without_gvl

This allows other threads to execute Ruby code (including performing
more work sqlite operations if on another DB connection). This should
significantly improve multithreaded performance, however will add
overhead to single-threaded performance.

This required wrapping all callbacks which can happen within statement
execution with rb_thread_call_with_gvl.
@flavorjones
Copy link
Member

Do you think it might be useful to allow users to make this decision, e.g. via a Database constructor option like Database.new(release_gvl: true)?

Allowing folks who are knowledgeable about their production runtime to decide whether to release the GVL or not might still be a huge step forward, and allow higher-level abstractions (like the Rails database adapter) to evolve their own policies.

@jhawthorn
Copy link
Member Author

jhawthorn commented Apr 22, 2024

A configuration option is probably the safest bet, at least for step. Though I'd love if the option was kept open to make release_gvl: true the default in the future if we are either able to reduce the overhead or find that in real-world scenarios it's the best option.

I'm not sure immediately why this is failing CI on the Windows --disable-system-libraries builds specifically. I'll look into that when I get the chance.

@byroot
Copy link
Contributor

byroot commented Apr 23, 2024

Something we mentioned in the past was to only release the GVL around the first call to step. From my understanding, that's the one that perform the query hence is significantly "slower". Supposedly the following calls are just fetching the data and don't allow for much concurrency anyway.

@jhawthorn
Copy link
Member Author

Something we mentioned in the past was to only release the GVL around the first call to step. From my understanding, that's the one that perform the query hence is significantly "slower". Supposedly the following calls are just fetching the data and don't allow for much concurrency anyway.

In the general case this isn't true, unfortunately. Building on the benchmark in my gist:

db.prepare("SELECT * FROM t2 WHERE b>=2 AND b<500") do |stmt|
  10.times do
    puts Benchmark.realtime { stmt.step } * 1000
  end
end
0.05734805017709732
0.017974060028791428
0.012603937648236752
0.25284592993557453
0.04639802500605583
0.07501093205064535
0.008465955033898354
0.1521760132163763
0.01405598595738411
0.0023850006982684135

stmt.step just allows sqlite's VM to continue running, so it depends whether it has work to do between the first row returned and the second (ie. continue scanning the table).

It's possible that releasing once is a good heuristic though for queries which have an index?

@fractaledmind
Copy link
Contributor

In the Extralite gem, @noteflakes iterated on this problem and landed with a configurable gvl_release_threshold that defaults to 1000:

In order to support multi-threading, Extralite releases the Ruby GVL periodically while running queries. This allows other threads to run while the underlying SQLite library is busy preparing queries, fetching records and backing up databases. By default, the GVL is when preparing the query, and once for every 1000 iterated records. The GVL release threshold can be set separately for each database:

# release the GVL on preparing the query, and every 10 records
db.gvl_release_threshold = 10 

# release the GVL only when preparing the query
db.gvl_release_threshold = 0 

# never release the GVL (for single-threaded apps only)
db.gvl_release_threshold = -1

db.gvl_release_threshold = nil # use default value (currently 1000)

For most applications, there's no need to tune the GVL threshold value, as it provides excellent performance characteristics for both single-threaded and multi-threaded applications.

In a heavily multi-threaded application, releasing the GVL more often (lower threshold value) will lead to less latency (for threads not running a query), but will also hurt the throughput (for the thread running the query). Releasing the GVL less often (higher threshold value) will lead to better throughput for queries, while increasing latency for threads not running a query. The following diagram demonstrates the relationship between the GVL release threshold value, latency and throughput:

less latency & throughput <<< GVL release threshold >>> more latency & throughput

https://github.com/digital-fabric/extralite?tab=readme-ov-file#the-ruby-gvl

This is the pull request that added the feature, and this is the pull request that was closed in favor of this approach.

Just offering some additional context and food for thought.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants