Blank node skolemization #6

Open
mielvds opened this issue Dec 2, 2015 · 11 comments

@mielvds
Contributor

mielvds commented Dec 2, 2015

HDT files that contain blank nodes produce invalid Turtle output. Can we create a skolemized Model somehow?

@RubenVerborgh
Member

On the one hand, this seems to be a problem with the HDT library not correctly converting blank nodes to the corresponding Jena representation. On the other hand, the TPF spec clearly says that components must not be blank nodes, so we should indeed skolemize them in any case, like the JavaScript implementation does.

@mielvds
Contributor Author

mielvds commented Dec 2, 2015

This might help, although reprocessing all nodes doesn't seem very performant.

@RubenVerborgh
Member

We should be able to do the same as in the JavaScript code by just changing this function (probably on the base class level even).

@mielvds
Contributor Author

mielvds commented Dec 2, 2015

Right, that would be fairly easy, but it would be datasource-specific, though. Another option would be to create a decorator for https://jena.apache.org/documentation/javadoc/arq/org/apache/jena/riot/WriterGraphRIOT.html

@RubenVerborgh
Member

Well, in the JavaScript version, it's implemented on the base class, so not source-specific. I still think this is possible. The additional complexity here is that dictionary.getNode seems to have a bug so that it does not return blank nodes but IRIs that start with _:. The generic solution would be to work around that, and then everything works with a generic base method (or RIOT decorator). But then we have some performance loss, because it would first (incorrectly) convert to IRI, then to blank, then to IRI again. So it might be best to have a one-off solution here for performance.
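As a rough illustration of the one-off approach described above, the sketch below skolemizes a term at the string level, before any node object is built, so the IRI-to-blank-to-IRI round trip is avoided. It is not taken from the HDT or Jena code: the class name `Skolemizer`, the method `skolemize`, and the `http://example.org` base are hypothetical, and it assumes that blank node labels arrive as strings starting with `_:` and contain only IRI-safe characters. The `/.well-known/genid/` path is the one RDF 1.1 Concepts suggests for skolem IRIs.

```java
// Minimal sketch, assuming terms are plain strings and that blank nodes
// are recognizable by a "_:" prefix, as described in the comment above.
final class Skolemizer {
    // Hypothetical base; a real server would use its own authority here.
    private static final String GENID_BASE = "http://example.org/.well-known/genid/";

    private Skolemizer() {}

    /** Returns a skolem IRI for "_:"-prefixed terms, and the term unchanged otherwise. */
    static String skolemize(String term) {
        if (term != null && term.startsWith("_:")) {
            // Drop the "_:" prefix and append the label to the genid base.
            return GENID_BASE + term.substring(2);
        }
        return term;
    }
}
```

Called on each subject, predicate, and object string, this would leave ordinary IRIs and literals untouched and rewrite only the `_:` terms.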

@mielvds
Contributor Author

mielvds commented Dec 2, 2015

So we'll have to improve the Java HDT code no matter what, which gives us the opportunity to move to Jena 3.

mielvds pushed a commit that referenced this issue Jan 9, 2016
merging in latest changes from origin
@larsgsvensson
Contributor

larsgsvensson commented Jan 10, 2019

Hi all, I just stumbled over this and it still seems to be an issue. I'm working on some other things in Server.java (including support for quad formats), so if you can give me any hints on how to fix this issue, I can give it a try.
A related note: the TPF specification says that bNodes SHOULD be skolemized, not that it is mandatory. Does anyone here know if e.g. Comunica requires bNodes to be skolemized?
And for TPF to work with bNodes, a TPF server MUST have bNode identifiers that are consistent over consecutive requests. I don't think that's explicit in the spec.
Thanks,
Lars
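
One way to read the consistency requirement above: if the skolem IRI is a pure function of the label stored in the data source, the same blank node gets the same IRI on every request, and a client can send that IRI back as a pattern component. The sketch below shows such a reversible mapping. All names (`SkolemRoundTrip`, `toSkolemIri`, `toBlankLabel`, the `http://example.org` base) are hypothetical, and it assumes the underlying store keeps its blank node labels stable between requests.

```java
// Minimal sketch of a stateless, reversible label <-> skolem IRI mapping.
// Assumption: the store's blank node labels do not change between requests.
final class SkolemRoundTrip {
    // Hypothetical base; the genid path follows the RDF 1.1 Concepts suggestion.
    private static final String GENID_BASE = "http://example.org/.well-known/genid/";

    private SkolemRoundTrip() {}

    /** Maps a bare blank node label (without "_:") to its skolem IRI. */
    static String toSkolemIri(String label) {
        return GENID_BASE + label;
    }

    /** Maps a skolem IRI from a request back to a "_:" label; other IRIs pass through. */
    static String toBlankLabel(String iri) {
        if (iri.startsWith(GENID_BASE)) {
            return "_:" + iri.substring(GENID_BASE.length());
        }
        return iri;
    }
}
```

Because no per-request state is involved, two consecutive responses would expose the same IRI for the same label, and a follow-up request using that IRI could be resolved against the original blank node.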

@larsgsvensson
Contributor

larsgsvensson commented Jan 18, 2019

As per comunica/comunica#375 the spec now says that data triples MUST NOT contain blank nodes and that the RECOMMENDED way of removing them is skolemization.

@larsgsvensson
Contributor

Is there such a thing as a conformance test suite for TPF servers?

@RubenVerborgh
Member

Not yet unfortunately, but that would indeed be very nice to have.

@mielvds
Contributor Author

mielvds commented Jan 18, 2019

Sounds like fun. You can assign it to me; I'll try it as a Friday afternoon thing.
