Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEV-1096 Rights API: translate OFFSET queries into inequalities #13

Draft
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

moseshll
Copy link
Contributor

@moseshll moseshll commented Mar 29, 2024

  • Add Cursor class inspired by Solr's CursorToken.
  • New Order class that should support some cleaner logic, push the Sequel primitives closer to point of use.
  • Reviewer: I am interested in whether the comments in cursor.rb make any kind of sense. Not interested at this point in testing, integration, or bigger-picture stuff because this PR is a hot mess that needs to be reissued.

- Add `QueryOptimizer` class which operates mainly on a `QueryParser` and emits a modified `offset` and additional `where` clauses.
- Add `Cache` class to record last model object found, from which the new `where` can be constructed.
- `default_order` changed to Array and always applied even when redundant, enforcing an intrinsic order.

FIXME:
- When items age out of the cache, removal is O(n) operation.
- `QueryOptimizer` unit tests are a dismal placeholder.
@moseshll moseshll marked this pull request as draft March 29, 2024 15:38
@moseshll moseshll marked this pull request as ready for review April 16, 2024 13:44
@moseshll moseshll requested a review from aelkiss April 16, 2024 13:44
Copy link
Member

@aelkiss aelkiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd like to back up and rethink this some. Looking at the design decisions here I think the assumption was that we needed to keep the API the same (being able to request limit and offset).

Would it simplify things and avoid needing to use the cache or a separate query optimizer class if instead we changed the API so that you specified a number of results and an ID to start with?

Basically, we'd just end up with the SQL query:

SELECT * from rights_current WHERE namespace >= $LAST_NAMESPACE AND id > $LAST_ID ORDER by namespace, id, timestamp

(For rights_log we might want to consider just always sending all rows for a given ID regardless of page size, or we could in addition send the last timestamp as a parameter).

Let me know what you think or if you'd like to discuss, or if I've completely misunderstood this...

spec/unit/models/access_profile_spec.rb Show resolved Hide resolved
spec/integration/rights_spec.rb Show resolved Hide resolved
lib/rights_api/query_optimizer.rb Outdated Show resolved Hide resolved
@moseshll moseshll marked this pull request as draft April 23, 2024 21:04
@moseshll moseshll requested a review from mwarin June 14, 2024 19:12
Copy link

@mwarin mwarin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't approve this documentation for cursor.rb. I barely understand it, so I'm requesting some non-specific changes.

I think you need to simplify the documentation significantly. You have focused on the low-level detail, and you need more context for the details to make sense.

I think you should put a CONTEXT block up top, that states what this class is what does, what the reader needs to know in order to use it. Then details such as "ivar semantics" and "cursor lifecycle" can follow.

I'd be happy to work with you on that, if you want.

Comment on lines +67 to +70
# ORDER BY a, b, c TRANSLATES TO
# WHERE (a > 1)
# OR (a = 1 AND b > 2)
# OR (a = 1 AND b = 2 AND c > 3)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see this, in more detail, up top in the documentation.
Because this is the meat and potatoes of this class, no?

With an actual 5-ish line example using the code...

# Example:
require "rights_api/cursor"
cursor = RightsAPI.Cursor.new("x y z")
# ...
cursor.where(x,y,z) # -> [x OR y OR z]

# @param order [Array<RightsAPI::Order>] the current query's ORDER BY
# @param rows [Sequel::Dataset] the result of the current query
# @return [String]
def encode(order:, rows:)
Copy link

@mwarin mwarin Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get sweaty seeing def encode and def self.encode in the same class.
If both methods have to be called encode I would mention the justification for that.
If they don't have to, I would name them more differently in order to avoid operator errors down the line.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants