Import death factoids to RDF data #1
Factoid data attached |
The facts of the deaths themselves are already in the database; here we are parsing and adding the date information. We can discuss the details further on Wednesday, and make notes in this issue. |
The spreadsheet with death records has been updated with the sources on which I based the datings where my name is the authority. The file from 21.11.2023 is therefore superseded by the file named "C11 PBW Death records, AA_revised version_09.01.2024.xlsx", accessible here https://ucloud.univie.ac.at/index.php/f/797833040 |
Report from @lu-pl 💯 Note that some SPARQL queries return empty, in which case no RDF is generated. See the logs. |
Update: Implemented the missing P14 assertions, see output. |
Some of these are expected (where they are based on sources that we ended up not using), but others have to do with the fact that the For sanity-checking purposes, it might be helpful to keep a list of the sources we aren't using; these include |
Update: Parenthetical text in Name fields is now ignored, and unused Source values are skipped (see the log). The script now generates a trig file deaths.trig with a named graph for every table partition. I also investigated the empty queries; some of those were caused by typos or incomplete PBW strings in the tables. For the remaining empty queries, in most cases the PBW data is missing in the triplestore, so I don't really know what to do about that. |
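For readers following along, here is a minimal sketch of what "ignore parenthetical text in Name fields and skip unused Source values" could look like. It is not the actual conversion script: the row layout (`name` and `source` keys), the helper names, and the `UNUSED_SOURCES` placeholder are all assumptions made for illustration.

```python
import re

# Hypothetical placeholder: the real list of unused sources is not reproduced here.
UNUSED_SOURCES = {"ExampleUnusedSource"}

# Matches one non-nested parenthetical remark plus any whitespace before it.
_PARENTHETICAL = re.compile(r"\s*\([^()]*\)")


def clean_name(raw: str) -> str:
    """Drop parenthetical remarks and collapse leftover whitespace."""
    without_remarks = _PARENTHETICAL.sub("", raw)
    return " ".join(without_remarks.split())


def usable_rows(rows):
    """Yield rows with a cleaned name, skipping rows whose source is unused."""
    for row in rows:
        if row["source"] in UNUSED_SOURCES:
            continue  # a real script would presumably log the skipped row
        yield {**row, "name": clean_name(row["name"])}
```

Nested parentheses are not handled by this pattern; whether the tables contain any is not something the thread states.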
Note: I would like to/will port the metadata schema used in the r11cli application to the table conversion at some point, if that is alright. |
I've now looked at the empty queries, which have three causes:
|
I forgot the fourth case, which was a death record for Symbatios 101 from Iveron 2.178.5; this is from a document in the Iveron archive that was produced in 1098, which is past our cutoff point of 1095. |
All empty query cases are handled now (see logs), and I updated the script to the new metadata schema. The way this is implemented now, a named graph + metadata is generated for every table partition, see deaths.trig. Another option would be to merge all graphs into a single named graph and generate metadata only for that graph. |
Note: Metadata is of course generated only once per software execution, but every named graph is registered as an output of that software execution, see the metadata graph. |
The script now produces a single turtle file with all subgraphs merged, see deaths.ttl. I had to slightly modify the metadata schema: metadata assertions now point to E13 subject nodes instead of named graphs along L11_had_output. Since the range of L11 is D1_Digital_Object, this implies (and a reasoner would infer) that E13 assertions are D1s, i.e. E73_Information_Objects, which is not wrong but maybe something worth pointing out. |
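To make the inference point concrete, here is a toy sketch of the two RDFS rules involved (range and subclass entailment), written over plain string triples rather than a real triplestore or reasoner. The node names `ex:execution` and `ex:assertion` are hypothetical; the class and property names are the ones mentioned in this comment.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"
RDFS_SUBCLASS = "rdfs:subClassOf"


def infer(triples):
    """Apply the RDFS range and subclass rules until no new triples appear."""
    closure = set(triples)
    while True:
        ranges = {(s, o) for s, p, o in closure if p == RDFS_RANGE}
        subclasses = {(s, o) for s, p, o in closure if p == RDFS_SUBCLASS}
        new = set()
        for s, p, o in closure:
            # Range rule: the object of a property is typed with the property's range.
            for prop, cls in ranges:
                if p == prop:
                    new.add((o, RDF_TYPE, cls))
            # Subclass rule: an instance of a class is an instance of its superclasses.
            if p == RDF_TYPE:
                for sub, sup in subclasses:
                    if o == sub:
                        new.add((s, RDF_TYPE, sup))
        if new <= closure:
            return closure
        closure |= new


graph = {
    ("crmdig:L11_had_output", RDFS_RANGE, "crmdig:D1_Digital_Object"),
    ("crmdig:D1_Digital_Object", RDFS_SUBCLASS, "crm:E73_Information_Object"),
    ("ex:execution", "crmdig:L11_had_output", "ex:assertion"),  # hypothetical nodes
    ("ex:assertion", RDF_TYPE, "crm:E13_Attribute_Assignment"),
}
inferred = infer(graph)
```

After running this, `ex:assertion` carries three types: the asserted E13 plus the inferred D1 and E73, which is the side effect described above.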
Meeting notes: Lukas has changed the metadata schema, which Tara will put on the graph database. A new issue might be necessary for converting all old metadata into the new metadata schema. |
Ingested deaths data to https://r11.eu/rdf/resource/deaths. |
Note: Consolidation/merging of named graphs into another named graph can be automated using SPARQL update (INSERT) requests. This should be implemented in r11cli.
Edit: DROPing a named graph would not be reflected in the merged graph, though, so one would need to delete the merged triples from the target graph before dropping the named graph:
delete { ?s ?p ?o . }
where {
  graph <named_graph> {
    ?s ?p ?o .
  }
} ;
drop graph <named_graph> |
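As a rough sketch of how the merge and the "remove before drop" step might be generated programmatically, the helpers below build SPARQL 1.1 Update strings for an explicit target graph. The function names and the example graph IRIs are hypothetical and are not part of r11cli; the strings would still need to be sent to the triplestore's update endpoint.

```python
def merge_update(source: str, target: str) -> str:
    """Build an update that copies every triple of `source` into `target`."""
    return (
        f"INSERT {{ GRAPH <{target}> {{ ?s ?p ?o . }} }}\n"
        f"WHERE {{ GRAPH <{source}> {{ ?s ?p ?o . }} }}"
    )


def unmerge_and_drop_update(source: str, target: str) -> str:
    """Build an update that removes `source`'s triples from `target`, then drops `source`."""
    return (
        f"DELETE {{ GRAPH <{target}> {{ ?s ?p ?o . }} }}\n"
        f"WHERE {{ GRAPH <{source}> {{ ?s ?p ?o . }} }} ;\n"
        f"DROP GRAPH <{source}>"
    )


# Hypothetical graph IRIs, for illustration only.
SOURCE = "https://example.org/graph/partition-1"
TARGET = "https://example.org/graph/merged"

print(merge_update(SOURCE, TARGET))
print(unmerge_and_drop_update(SOURCE, TARGET))
```

One caveat with this approach: a triple that was also contributed by a different named graph would be deleted from the merged graph as well, so the remaining source graphs may need to be re-merged afterwards.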
Hi @lu-pl , concerning the metadata schema, I've just noticed a problem with the timestamps...
The first issue is that
|
Hi @tla, the metadata issue should be fixed, see deaths.ttl. LODKit now has a feature for ontology-derived ClosedNamespaces, so at least typos won't be an issue anymore. |