Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encapsulate query construction logic in QueryConstructor class #134

Closed
lu-pl opened this issue Nov 7, 2024 · 3 comments · Fixed by #175
Closed

Encapsulate query construction logic in QueryConstructor class #134

lu-pl opened this issue Nov 7, 2024 · 3 comments · Fixed by #175
Assignees
Labels
refactor Implementation changes that preserve external interfaces and the behavior of the software.

Comments

@lu-pl
Copy link
Contributor

lu-pl commented Nov 7, 2024

Dynamic SPARQL query construction is essential for how rdfproxy works and also the (conceptually) most difficult and potentially error-prone part of the library.

Basically, the SPARQL adapter has to construct two queries, an items_query and a count_query - exactly how those queries are constructed and what modifications and injections have to happen depends on the query and on the supplied model.

Currently, the logic for query construction is expressed as a set of functions in rdfproxy.utils.sparql_utils.

Query construction logic should utilize a protocol-based constructor dispatch mechanism and could be bundled in a QueryConstructor class.

sparqlmodeladapter drawio

This would significantly help with keeping the query modification logic organized and also allow for much cleaner code in rdfproxy.SPARQLModelAdapter, e.g.:

class SPARQLModelAdapter(Generic[_TModelInstance]):
    # ...
    def query(self, *, page: int = 1, size: int = 100) -> Page[_TModelInstance]:
        query_constructor = QueryConstructor(query=self._query, model=self._model)

        count_query: str = query_constructor.get_count_query()
        items_query: str = query_constructor.get_items_query()

        # ...
@lu-pl lu-pl self-assigned this Nov 7, 2024
@lu-pl lu-pl added the refactor Implementation changes that preserve external interfaces and the behavior of the software. label Nov 7, 2024
@lu-pl
Copy link
Contributor Author

lu-pl commented Nov 7, 2024

Depends on #125.

@lu-pl
Copy link
Contributor Author

lu-pl commented Nov 9, 2024

Quick sketch of the general query construction and constructor dispatch mechanism:

constructor drawio

@lu-pl lu-pl mentioned this issue Nov 25, 2024
3 tasks
@lu-pl
Copy link
Contributor Author

lu-pl commented Nov 28, 2024

Note: Probably only normalizers should run in QueryConstructor, checkers are another responsibility that does not only concerns queries but e.g. also models.

Sketch:

class QueryConstructor:
    ...

    @staticmethod
    def normalize_select_query(select_query: str) -> str:
        normalized_select_query = re.sub(
            r"([{}.,;()?])", r" \1", select_query, flags=re.IGNORECASE
        )
        return " ".join(normalized_select_query.split())

lu-pl added a commit that referenced this issue Dec 5, 2024
Currently, query constructors naively inject the 'group_by' value from
the model_config into SPARQL queries; this works until a field name does not correspond to a SPARQL binding name i.e. until rdfproxy.SPARQLBinding aliases are used.

So, field-based grouping obviously requires 'group_by' values to
resolve SPARQL binding aliases - which this change implements.

Note: The change introduces duplication for resolving SPARQL binding
aliases. The planned QueryConstructor class (issue #134) will get rid of
this and other design flaws.

Fixes #161
lu-pl added a commit that referenced this issue Dec 5, 2024
Currently, query constructors naively inject the 'group_by' value from
the model_config into SPARQL queries.
This works until a field name does not correspond to a SPARQL binding
name, i.e. until rdfproxy.SPARQLBinding aliases are used.

So, field-based grouping obviously requires 'group_by' values to
resolve SPARQL binding aliases - which this change implements.

Note: The change introduces duplication for resolving SPARQL binding
aliases. The planned QueryConstructor class (issue #134) will get rid of
this and other design flaws.

Fixes #161
kevinstadler pushed a commit that referenced this issue Dec 5, 2024
Currently, query constructors naively inject the 'group_by' value from
the model_config into SPARQL queries.
This works until a field name does not correspond to a SPARQL binding
name, i.e. until rdfproxy.SPARQLBinding aliases are used.

So, field-based grouping obviously requires 'group_by' values to
resolve SPARQL binding aliases - which this change implements.

Note: The change introduces duplication for resolving SPARQL binding
aliases. The planned QueryConstructor class (issue #134) will get rid of
this and other design flaws.

Fixes #161
@kevinstadler kevinstadler added this to the v0.2.0 release milestone Dec 18, 2024
lu-pl added a commit that referenced this issue Dec 20, 2024
The change introduces a QueryConstructor class that encapsulates all
SPARQL query construction functionality. This also leads to a significant
cleanup of rdfproxy.utils.sparql_utils module (utils in general) and the
SPARQLModelAdapter class.

Query result ordering for ungrouped models is now implemented to
default to the first binding of the projection as ORDER BY value. This
might still be discussed in the future, but the decision seems
reasonable at this point.

Closes #128.
Closes #134.
Closes #168.
lu-pl added a commit that referenced this issue Dec 23, 2024
The change introduces a QueryConstructor class that encapsulates all
SPARQL query construction functionality. This also leads to a significant
cleanup of rdfproxy.utils.sparql_utils module (utils in general) and the
SPARQLModelAdapter class.

Query result ordering for ungrouped models is now implemented to
default to the first binding of the projection as ORDER BY value. This
might still be discussed in the future, but the decision seems
reasonable at this point.

Closes #128.
Closes #134.
Closes #168.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor Implementation changes that preserve external interfaces and the behavior of the software.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants