
Replace Connection::query_all with Connection::query_iter #645

Merged: 13 commits into scylladb:main on Mar 2, 2023

Conversation

@piodul (Collaborator) commented Feb 17, 2023

For internal queries that can return a large number of rows (topology info, schema), we currently use Connection::query_all to perform them. That method fetches the results of the query in multiple pages and then returns them as a single combined QueryResult. This PR replaces it with a new Connection::query_iter method whose semantics are more similar to the iterator-based query methods from the public interface (Session::query_iter, Session::execute_iter).

The motivations for this change are as follows:

  • Although the new method will use slightly more memory if the query returns only one page (because of the need to spawn a new task, channels, etc.), it should consume less memory if multiple pages are fetched, because rows are processed online as new pages appear and only 1-2 pages need to exist at a given time.
  • The logic that combines multiple pages into a single QueryResult will be hard to translate to use the upcoming iterator-based deserialization interface (Switch deserialization to an iterator-based interface to avoid allocations #462). Although the old interface will still be available as a deprecated fallback, I would like to adapt those methods as a form of dogfooding test for the new interface.

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass tests.
  • PR description sums up the changes and reasons why they should be introduced.
  • I added appropriate Fixes: annotations to PR description.

The common part of new_for_query and new_for_prepared_statement, which
handles spawning a task and returning a non-empty RowIterator, is
extracted into a shared function. This will also help implement the
query_iter method for Connection.
We have two places in the iterator worker code that attempt to send an
empty page, and we will have one more in the upcoming commits, so let's
abstract this pattern away into a method.
@piodul (Collaborator, Author) commented Feb 17, 2023

https://github.com/scylladb/scylla-rust-driver/actions/runs/4203056593/jobs/7291978661

running 2 tests
test retries_occur ... ok
test speculative_execution_is_fired ... FAILED

failures:

---- speculative_execution_is_fired stdout ----
Unique name: test_rust_1676631776_1
thread '<unnamed>' panicked at 'Worker failed: Node 172.42.0.2:9042 disconnected', scylla/tests/retries.rs:95:21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'speculative_execution_is_fired' panicked at 'timeout: the function call took 3772 ms. Max time 30000 ms', scylla/tests/retries.rs:20:1


failures:
    speculative_execution_is_fired

test result: FAILED. 1 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 3.77s

Strange; it sounds a bit as if a node crashed. It's hard to tell for sure because we don't print logs from the scylla docker containers at all.

I'll try to improve the CI so that it does print the logs from the cluster. In the meantime, I'll just re-run and hope that the error was an unimportant fluke.

@wprzytula (Collaborator) left a comment


Very good overall, but some old naming is still left.

Review threads:
  • scylla/src/transport/iterator.rs (×2)
  • scylla/src/transport/topology.rs (outdated)
  • scylla/src/transport/connection.rs (×2, outdated)
@piodul (Collaborator, Author) commented Feb 20, 2023

v2:

  • Changed the buffer_unordered parameter from 64 -> 256 to allow more concurrency when translating addresses,
  • Updated some leftover comments/function names/string literals not to mention query_all but rather query_iter.

@piodul piodul requested a review from wprzytula February 20, 2023 15:38
@piodul (Collaborator, Author) commented Feb 20, 2023

@cvybhu ping

@cvybhu (Contributor) left a comment


Code looks ok, left a few nits.

Given that, I'm not a fan of this change. The streams are very fancy, but IMO they are much harder to read and reason about. Previously there were a few clearly defined steps where one thing happened at a time and I could easily follow it step after step. With streams it feels different - it's like first we prepare what to do and then do everything at once.

With normal code there are usually a few intermediate variables with clearly defined types, but here all the chained stream combinators make it impossible to be sure about the type of things at each step. The error handling is also much more complicated: all the errors must be propagated in the stream, and then all the try_something functions hopefully handle them, but I find it very hard to reason about. Then there are also the complications with .buffer_unordered(256).

IMO it would be better to keep using the query_all function. It could be implemented using query_iter, although I guess it doesn't make much sense.
I'm a big fan of KISS.

I might be complaining too much, maybe streams are just a Rust thing that I'm not used to yet, but my initial feeling is that they introduce new complexity that could easily be avoided.

In the PR description you mentioned that it will be hard to keep query_all after the deserialization refactor, but why is that? Will it be impossible to simply collect the rows to a Vec?

Review threads (outdated):
  • scylla/src/transport/iterator.rs (×2)
Comment on lines +718 to +720
/// A massively simplified version of the RowIteratorWorker. It does not have
/// any complicated logic related to retries, it just fetches pages from
/// a single connection.
@cvybhu (Contributor) commented:

It would be nice to use a RetryPolicy for retries. I think it wouldn't be too hard to incorporate it.

@piodul (Collaborator, Author) replied:

I'm not sure whether it makes sense for queries that are limited to a single connection and only read data from a single node. For example: what should happen on RetryNextNode?

@piodul (Collaborator, Author) commented Feb 27, 2023

> Code looks ok, left a few nits.
>
> Given that, I'm not a fan of this change. The streams are very fancy, but IMO they are much harder to read and reason about. Previously there were a few clearly defined steps where one thing happened at a time and I could easily follow it step after step. With streams it feels different - it's like first we prepare what to do and then do everything at once.

I assume that your comment is about the changes in query_peers from topology.rs.

I'm not sure if I agree. The biggest advantage of streams here is that they reduce the peak memory use. Instead of loading all results from a query into memory, we only need to keep 1-2 pages in memory at a given time. I'll admit that the benefit is not huge because we are constructing the metadata anyway, but it's not only about making the code look more "fancy".

> With normal code there are usually a few intermediate variables with clearly defined types, but here all the chained stream combinators make it impossible to be sure about the type of things at each step. The error handling is also much more complicated, all the errors must be propagated in the stream, and then all the try_something functions hopefully handle them, but I find it very hard to reason about. Then there are also the complications with .buffer_unordered(256)

We are already using iterators extensively in the codebase, so I thought that streams would be familiar. They have a slightly different API due to their asynchronicity, but the concept is the same.

I'm not sure what you mean by the complications with .buffer_unordered(). Is it about the need to explicitly limit parallelism? Limiting parallelism is not a bad thing.

> IMO it would be better to keep using the query_all function. It could be implemented using query_iter, although I guess it doesn't make much sense. I'm a big fan of KISS.

With query_iter, you can choose either to collect everything to a Vec and then process, or do the processing on the fly, as pages are fetched.

> I might be complaining too much, maybe streams are just a Rust thing that I'm not used to yet, but my initial feeling is that they introduce new complexity that could easily be avoided.
>
> In the PR description you mentioned that it will be hard to keep query_all after the deserialization refactor, but why is that? Will it be impossible to simply collect the rows to a Vec?

I guess it will be possible to change query_all to just return a Vec of parsed rows, not a full QueryResponse. I just felt that solving the problem that way would prevent us from utilizing the new interface to its full potential and would deny us the chance to test its ergonomics on our internal code.

In general, the new deserialization interface will not return pre-parsed rows in the style of Vec<Vec<Option<CqlValue>>>, but will return an iterator that lazily deserializes the contents of a response. If we wanted query_all to keep returning QueryResponse, we would have to concatenate the contents of the pages into a large blob, which is even worse with regards to memory use.

Adds the RowIterator::new_for_connection_query_iter method which will
be the main workhorse of the Connection::query_iter method. It uses
a new iterator worker which is a much simpler version of the proper,
multi-connection iterator worker.
The new Connection::query_iter method will serve as a replacement for
the query_all method. Contrary to query_all, query_iter will not
materialize all query results as a single QueryResult but will allow
processing the results row by row.
In order to use Connection::query_iter it is necessary to pass self
wrapped in an Arc - the iterator worker is spawned as a separate tokio
task, so it needs to co-own the Connection.
Thanks to scylladb#628, it is now possible to provide an explicit path to a
crate which provides necessary items for the FromRow macro. It is now
possible to use this macro from the `scylla` crate itself, as we just
need to point it to the `scylla-cql` crate instead.

The NodeInfoRow struct is introduced to represent rows fetched from
system.local and system.peers. It looks nicer than using a 5-element
tuple and will reduce some noise when performing refactors in the
commits that follow.
An enum expresses intent a little more clearly than a bool.
Conversion from NodeInfoRow to Peer is done across two lambdas applied
to the untranslated_rows iterator. Let's simplify it a bit and extract
this logic to a new separate function create_peer_from_row.

It is recommended to hide whitespace changes when reviewing the diff of
this commit.
Previously, query_peers used query_all to fetch topology information and
then processed it using iterators. Now, this method is changed to use
query_iter and rows are processed on the fly using asynchronous streams.

In order to preserve the behavior of system.peers and system.local being
queried in parallel, stream::select is used which merges results of both
streams as they become available. Previously, we just put results of one
query after results of the other one. Because of this, the order of
Peers in the final result might be slightly different on each fetch -
however, this order was never specified and could actually change e.g.
when the control connection changes, so I think this should be OK.
There are other functions in topology.rs that use query_all, but they
are less complicated and changing them to use query_iter is relatively
straightforward. Therefore, we do it in a single commit.
This method is a sibling of query_all that was never actually used. In
the future we will consider switching to prepared statements for
internal queries, but 1) this will be done via not-yet-existing
execute_iter and 2) we might do it by converting query_iter to use
prepared statements. Therefore, there is no need to keep the dead code.

This commit gets rid of execute_all and all its occurrences in the
connection_query_all_execute_all_test (now renamed to:
connection_query_all_test).
The query_iter method is meant to serve a similar purpose to what
query_all does currently - fetching data on the control connection for
internal queries - so it makes sense to adapt the (only) existing test
for the new method.

The last assertion (4) had to be removed because the new method does not
check that the page size is set by the caller.
Now that all uses of query_all in the code are gone, we can finally
remove it, along with all the other code that depended on it - most
notably QueryResult::merge_with_next_page_res, which won't be possible
to translate to the upcoming iterator-based deserialization interface.
@piodul (Collaborator, Author) commented Feb 27, 2023

v3:

  • renamed new_for_query_iter -> new_for_connection_query_iter
  • got rid of an unnecessary .clone() in a lambda defined in RowIterator::new_for_connection_query_iter

@piodul piodul requested a review from cvybhu February 28, 2023 10:51
@piodul (Collaborator, Author) commented Mar 2, 2023

@cvybhu review ping

@cvybhu cvybhu merged commit 739c4f8 into scylladb:main Mar 2, 2023