-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query paging API: hardening, robustness, explicitness #1061
Conversation
See the following report for details: cargo semver-checks output
|
0966edf
to
76b1fae
Compare
009283e
to
ee73b22
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nits, otherwise LGTM
ee73b22
to
a7aa56f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the first commit. Using invalid values for page size was never shown to be a problem / footgun for our users, so advantages from strong typing are quite limited. Otoh it introduces unnecessary complications in each call to set_page_size.
Imo in case of such obscure error, which also implies a bug in user code, a panic wuld be better - we still catch the error, but don't introduce complications for users.
So let's drop PageSize type and perform panicking check in set_page_size methods.
PR description is out of date - it mentions PagingContinuation
.
In checklist you marked that All commits compile, pass static checks and pass test.
- is that really the case? I would expect commit session: make {query,execute} methods unpaged
to break something if it really prevents usage of unpaged queries. If it doesn't break any test then it sounds like our tests are lacking.
I suggest a slightly different thing: let's make |
Good catch. Let's see if any test breaks. |
Makes sense |
Sad news... everything has passed. On a second thought, it's not strange that we haven't been testing improper behaviour. After all, |
a7aa56f
to
6248a1c
Compare
v1.1: addressed all Karol's comments apart from PageSize panics, which is TODO. |
6248a1c
to
0fa3118
Compare
f5b1839
to
a139ca5
Compare
v 1.2.1: mentioned that setters panic with non-positive numbers given. |
Before, a bare i32 was passed. Some API methods would accept nonpositive integers, while others would panic on such input. Both those kinds of behaviour are bad, and the inconsistency between them is even worse. For robustness the popular Rust tactics is employed, namely strong typing. PageSize's public constructor returns error for nonpositive integers, ensuring that is an instance of that type exists, its validity is guaranteed. I originally wanted to put burden of creating a correct PageSize and handle possible errors on user, but is would introduce a boilerplate (`try_into().unwrap()`) that is deemed to be unacceptable. Therefore, PageSize is kept pub(crate) and it's constructed by driver's code, panicking on invalid user's input. As page size is always chosen by user explicitly (not by some convoluted logic), such panics should appear early and be clear to users.
The value of everything being empty does not seem to be a reasonable default value for both QueryResult and ResultMetadata. It's better to have a mock_empty() constructor for the specific case when such special value is needed.
As an effort of more robust paging API, paging state is now made typed. Two new types are introduced: PagingState and PagingStateResponse. PagingState is, underneath, just an Arc over the raw bytes from the DB. The idea is that PagingState is passed as an argument to request execution API to ask for resuming the query (_continue it_) from the given PagingState. PagingStateResponse is returned always by the DB from paged requests - with information about either more or no more pages available. PagingState can be easily retrieved from PagingStateResponse (if it contains some), so that user can use it in a loop. The select_paging example is modernised to showcase the idea.
As users' experience showed that it optional page size on statements is error-prone, it is made mandatory. This means that at the moment no unpaged queries are possible. However, in the next commits it is going to be brought back again.
Some methods on Connection return QueryResponse instead of QueryResult. To make those methods stand out, they get "_raw" particle put into their names.
Without a clear reason, Connection::execute_raw would take PreparedStatement by value, involving a clone. The signature was changed to accept a shared reference and its usages are adjusted.
These methods are analogous to `Session`'s `{query,execute}`. Similarly, they don't accept a non-start PagingState.
The logic of {query,execute} is extracted to {query_inner,execute_inner} pub(crate) methods, respectively. {query,execute}_paged are unchanged. {query,execute} now are unpaged unconditionally, i.e. they ignore the page size set on a statement and pass None to the connection layer. In the next commit, both {query,execute} are appended the `_unpaged` suffix to explicitly state that they perform unpaged queries.
In order to make it explicit that the requests using those methods are performed without paging, their names are adjusted.
As these methods are analogous to those on Session, they are made unpaged in a manner similar to those. Inthe next commit, both all of them are appended the `_unpaged` suffix to explicitly state that they perform unpaged queries.
In order to make it explicit that the requests using those methods are performed without paging, their names are adjusted.
In no case for internal queries should we fetch only one page and ignore possible further ones. As both queries that were using query_single_page would return only one row anyway, the change should not affect semantics after all.
The name %_paged has long confused users. The new name indicates explicitly that only a single page is fetched with a single call to those methods.
As an attempt to make paged queries' API more explicit, robust and self-explanatory, PagingState is decoupled from QueryResult. Instead, it's returned from {query,execute}_single_page as a second field of a pair, helping users not to forget about it. Moreover, internal and external APIs which expect that the query is not paged now issue an error tracing message and return ProtocolError error if a "more pages" page state is returned from the unpaged query.
It appears that query_single_page's previous semantics was to only fetch the first page and ignore others. It makes much more sense, though, to have its semantics consistent with Session::query_single_page: to take PagingState and query a single page pointed by that state. This allows using it in a loop, similarly to Session::query_single_page.
As {query,execute} methods got the _unpaged prefix, now the inner methods can drop the _inner prefix. This gives us one subtle advantage: users now see "method {query,execute} is private" instead of "{query,execute} method not found", which makes them more likely to think that the API was changed on purpose. Assuming that users see docstrings of private methods when they hover a call to them, this also lets them read the new docstrings of {query,execute} that explain the reasons behind the change and point to proper public methods: {query,execute}_{unpaged,single_page,iter}.
After the API changes regarding paging, docstrings are updated with those changes in mind.
a139ca5
to
a1286de
Compare
v1.2.2: fully removed |
Motivation
Query paging API has shown to be unclear for users:
{query,execute}_paged
had misleading name (query_paged is a misleading name #940),paging_state
was a magicBytes
instance (Make paging state strongly typed #987),{query,execute}
would silently return only the first page if page size was specified on a statement,{query,execute}
as the methods with the simplest names, even though they should be used with SELECTs with caution:paging_state
, as it was just a field onQueryResult
,{query,execute}_paged
.This PR seeks to address those problems before we attack the Giant Request Execution Refactor™ (GRER™) (#978).
What's done
The core idea incorporated:
Shift responsibility about whether to page or not from statements (by setting
page_size
toSome
orNone
) toSession's
methods.Separate paged and unpaged API promotes conscious choice by users and thus prevents API misuse.
Detailed changes
PageSize
is introduced, which restricts the inneri32
's range to positive integers. This is important, because passing nonpositive integer as a page size silently makes the query unpaged. All APIs now acceptPageSize
, and it's user's responsibility to create aPageSize
, handling possible errors thrown upon its construction.PagingState
andPagingStateResponse
.PagingState
is, underneath, just anOption<Arc>
over the raw bytes from the DB. The idea is thatPagingState
is passed as an argument to request execution API to ask for resuming the query (continuing it) from the point specified by givenPagingState
.PagingStateResponse
is returned always by the DB from paged requests - with information about either more or no more pages being available.PagingStateResponse
is easily convertible intoPagingState
, so that user can use it in in a loop.PageSize
is made mandatory on statements. Henceforth, paged API will always use it and unpaged API will always ignore it.{query,execute}
methods have_unpaged
prefix appended and they always perform an unpaged query (ignoring thePageSize
specified on a statement).{query,execute}_paged
are renamed to{query,execute}_single_page
. The name%_paged
has long confused users. The new name indicates explicitly that only a single page is fetched with a single call to those methods.{query,execute}_single_page
pass page state explicitly. As an attempt to make paged queries' API more explicit, robust and self-explanatory,PagingStateResponse
is decoupled fromQueryResult
. Instead, it's returned from{query,execute}_single_page
as a second field of a pair, helping users not to forget about it.Moreover, internal and external APIs which expect that the query is not paged now warn if a "more pages" page state is returned from a query.
select-paging
example is modified to showcase the suggested way to use{query,execute}_single_page
.Fixes: #940
Fixes: #987
Pre-review checklist
[ ] I added relevant tests for new features and bug fixes../docs/source/
.Fixes:
annotations to PR description.