-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DEV-1096 Rights API: translate OFFSET queries into inequalities #13
base: main
Are you sure you want to change the base?
Conversation
- Add `QueryOptimizer` class which operates mainly on a `QueryParser` and emits a modified `offset` and additional `where` clauses. - Add `Cache` class to record last model object found, from which the new `where` can be constructed. - `default_order` changed to Array and always applied even when redundant, enforcing an intrinsic order. FIXME: - When items age out of the cache, removal is O(n) operation. - `QueryOptimizer` unit tests are a dismal placeholder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd like to back up and rethink this some. Looking at the design decisions here I think the assumption was that we needed to keep the API the same (being able to request limit and offset).
Would it simplify things and avoid needing to use the cache or a separate query optimizer class if instead we changed the API so that you specified a number of results and an ID to start with?
Basically, we'd just end up with the SQL query:
SELECT * from rights_current WHERE namespace >= $LAST_NAMESPACE AND id > $LAST_ID ORDER by namespace, id, timestamp
(For rights_log we might want to consider just always sending all rows for a given ID regardless of page size, or we could in addition send the last timestamp as a parameter).
Let me know what you think or if you'd like to discuss, or if I've completely misunderstood this...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't approve this documentation for cursor.rb. I barely understand it, so I'm requesting some non-specific changes.
I think you need to simplify the documentation significantly. You have focused on the low-level detail, and you need more context for the details to make sense.
I think you should put a CONTEXT block up top, that states what this class is what does, what the reader needs to know in order to use it. Then details such as "ivar semantics" and "cursor lifecycle" can follow.
I'd be happy to work with you on that, if you want.
# ORDER BY a, b, c TRANSLATES TO | ||
# WHERE (a > 1) | ||
# OR (a = 1 AND b > 2) | ||
# OR (a = 1 AND b = 2 AND c > 3) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to see this, in more detail, up top in the documentation.
Because this is the meat and potatoes of this class, no?
With an actual 5-ish line example using the code...
# Example:
require "rights_api/cursor"
cursor = RightsAPI.Cursor.new("x y z")
# ...
cursor.where(x,y,z) # -> [x OR y OR z]
# @param order [Array<RightsAPI::Order>] the current query's ORDER BY | ||
# @param rows [Sequel::Dataset] the result of the current query | ||
# @return [String] | ||
def encode(order:, rows:) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get sweaty seeing def encode
and def self.encode
in the same class.
If both methods have to be called encode
I would mention the justification for that.
If they don't have to, I would name them more differently in order to avoid operator errors down the line.
Cursor
class inspired by Solr's CursorToken.Order
class that should support some cleaner logic, push theSequel
primitives closer to point of use.cursor.rb
make any kind of sense. Not interested at this point in testing, integration, or bigger-picture stuff because this PR is a hot mess that needs to be reissued.