Perform inventory fetch queries in batches #4932
Conversation
@@ -1939,18 +2047,6 @@ mod test {
assert_eq!(collection2.cabooses.len(), coll_counts.cabooses);
assert_eq!(collection2.rot_pages.len(), coll_counts.rot_pages);

// Check that we get an error on the limit being reached for
How about adding a test that uses a batch size of 1 and makes sure we get the same result for some reasonably representative collection? (That way we know the pagination stuff is working.)
Hmm, if we follow your advice from above about not making the batch size
configurable, we'd have to factor out a private function that takes a batch size
parameter...do you think it's worth doing that just for testing this?
Yeah, I think that's easy enough to do and worthwhile. I've been burned by a surprising number of pagination bugs in the past!
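For illustration, here's a minimal sketch of the factoring discussed above: the public method stays pinned to the default batch size and delegates to a private helper that takes the batch size as a parameter, which is what lets a test force pagination with a batch size of 1. The names, signatures, and default value here are hypothetical stand-ins, not the actual omicron code:

```rust
const DEFAULT_BATCH_SIZE: usize = 1000; // assumed default, not the real value

#[derive(Debug, PartialEq)]
pub struct Collection {
    pub cabooses: Vec<String>,
    pub rot_pages: Vec<String>,
}

pub struct DataStore;

impl DataStore {
    /// Public API: the batch size is not configurable by callers.
    pub fn inventory_collection_read(&self) -> Collection {
        self.inventory_collection_read_batched(DEFAULT_BATCH_SIZE)
    }

    /// Private helper taking the batch size as a parameter, so tests
    /// can exercise the pagination path with tiny batches.
    fn inventory_collection_read_batched(&self, batch_size: usize) -> Collection {
        // ...paginated reads go here, at most `batch_size` rows per query...
        let _ = batch_size;
        Collection { cabooses: vec![], rot_pages: vec![] }
    }
}

#[cfg(test)]
mod test {
    use super::*;

    // With a batch size of 1, every read must paginate, so off-by-one
    // bugs in the pagination logic show up as a mismatch here.
    #[test]
    fn batched_read_matches_default() {
        let datastore = DataStore;
        assert_eq!(
            datastore.inventory_collection_read(),
            datastore.inventory_collection_read_batched(1),
        );
    }
}
```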
The changes look good! Were you also going to add the new test?
Yup, I just didn't get to it yesterday. Planning to add that today!
Oops -- didn't mean to jump the gun!
It just set the default batch size, which we now do with the public API.
Force-pushed from 02f39d2 to abac2a0
@davepacheco added a test as of abac2a0 --- let me know what you think?
When querying the database for inventory data, we read a large number of
rows from many different tables, so we want to guarantee an upper bound
on the amount of data read per query. Currently that bound is enforced
with hard limits: we add a limit to each query, and if more rows would
be returned, the query fails. This does bound the data read, but the
failure mode is the problem: if the chosen limit turns out to be too
small, the query simply fails once the limit is hit, and there's no way
to know whether a given limit is big enough until we hit it at runtime.
Instead of using hard limits, it's preferable to use pagination. By
paginating these queries, we still set an upper bound on the number of
rows returned by a single query, but rather than failing when that
bound is reached, we perform multiple queries.
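To make the mechanism concrete, here is a simplified, self-contained sketch of a keyset-pagination loop like the one described. The real implementation uses the pagination utilities from #4632; the types and helper names below are hypothetical:

```rust
/// Hypothetical stand-in for one database row; assume `table` is
/// sorted ascending by `id`, as a paginated query would guarantee.
#[derive(Clone)]
struct Row {
    id: u64,
    data: String,
}

/// Stand-in for a single LIMITed query: up to `batch_size` rows whose
/// `id` is strictly greater than the `after` marker, in ascending order.
fn fetch_page(table: &[Row], after: Option<u64>, batch_size: usize) -> Vec<Row> {
    table
        .iter()
        .filter(|row| after.map_or(true, |marker| row.id > marker))
        .take(batch_size)
        .cloned()
        .collect()
}

/// Reads the whole table with a bounded result size per query: rather
/// than failing once `batch_size` rows are reached, issue another
/// query that resumes after the last row seen.
fn fetch_all(table: &[Row], batch_size: usize) -> Vec<Row> {
    assert!(batch_size > 0);
    let mut rows = Vec::new();
    let mut marker = None;
    loop {
        let page = fetch_page(table, marker, batch_size);
        let full_page = page.len() == batch_size;
        marker = page.last().map(|row| row.id);
        rows.extend(page);
        // A short (or empty) page means the table is drained.
        if !full_page {
            break;
        }
    }
    rows
}
```

Resuming from a marker (rather than an OFFSET) keeps each query's cost bounded even deep into the table, which is why keyset pagination is the usual choice for this kind of scan.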
This branch changes inventory fetch queries to use pagination with a
batch size limit, rather than a hard limit on query size, using the
pagination utilities added in #4632. I've replaced the
`inventory_collection_read_best_effort` and
`inventory_collection_read_all_or_nothing` methods, which limit the
size of the query, with a new `inventory_collection_read_batched`
method, which uses the limit parameter as a maximum size per batch.
I've then changed the existing `inventory_get_latest_collection`
method to use `inventory_collection_read_batched` rather than
`inventory_collection_read_all_or_nothing`.

Closes #4629
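As a rough before/after of the method surface (the argument lists, error type, and default batch size below are assumptions, not the branch's real signatures):

```rust
struct Collection;
struct Error;
struct DataStore;

impl DataStore {
    /// Old shape (removed): a hard limit on the whole query, failing
    /// outright if the collection exceeds it.
    fn inventory_collection_read_all_or_nothing(
        &self,
        limit: usize,
    ) -> Result<Collection, Error> {
        let _ = limit;
        unimplemented!("replaced by the batched reader")
    }

    /// New shape: the same parameter bounds each batch instead, so
    /// oversized collections paginate rather than fail.
    fn inventory_collection_read_batched(
        &self,
        batch_size: usize,
    ) -> Result<Collection, Error> {
        let _ = batch_size; // ...paginated reads, per the sketch above...
        Ok(Collection)
    }

    /// Now delegates to the batched reader instead of the
    /// all-or-nothing one.
    fn inventory_get_latest_collection(&self) -> Result<Collection, Error> {
        self.inventory_collection_read_batched(1000) // assumed default
    }
}
```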