Oort Extracting Modules
This page describes how to extract subsets of external ontologies (modules) with OWLTools, which uses the OWLAPI SyntacticLocalityModuleExtractor.
This functionality is not yet in Oort; for now it is necessary to run owltools on the command line.
To fetch a class (or classes) and all descendants:
owltools http://purl.obolibrary.org/obo/cl.owl --extract-module -d CL:0000540 -o neuron.owl
Assume as a starting point you have an ontology my-edit.owl (or in obo format). Assume this has axioms that point to an ontology 'foo':
E.g.
Ontology: my.owl
Class: FOO:1
Class: MY:1
SubClassOf: part_of some FOO:1
Note that external classes must be declared if they are not imported.
In obo format you may have something like:
ontology: my
[Term]
id: MY:1
relationship: part_of FOO:1
In obo-format parlance, this is known as a 'dangling' relationship, because FOO:1 is not defined in the file.
To generate a module "foo_import.owl" do this:
export OBO=http://purl.obolibrary.org/obo
owltools my-edit.owl $OBO/foo.owl --add-imports-from-supports --extract-module -c -s $OBO/foo.owl --set-ontology-id $OBO/my/foo_import.owl -o foo_import.owl
Remember, in owltools, commands are handled sequentially - a single line such as this can be an entire pipeline chain. The first part of the above chain attaches foo.owl as an import to my-edit.owl.
The second part uses the OWLAPI to extract a module (using the BOT strategy by default), taking all classes in the signature of my-edit.owl (but not its import closure) as the seed. This module becomes the new source ontology.
The final part renames the ontology and saves a local copy.
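To make the three stages easier to see, here is the same chain laid out as a small shell function. This is only a sketch: the function name bootstrap_import and the DRY_RUN switch are made up for illustration, and by default it just prints the command rather than running owltools.

```shell
#!/bin/sh
# Sketch only: bootstrap_import and DRY_RUN are hypothetical names, not part of owltools.
OBO=http://purl.obolibrary.org/obo

bootstrap_import() {
    # Stage 1: load my-edit.owl and foo.owl, then attach foo.owl as an import.
    # Stage 2: extract a module from foo.owl, seeded by the classes in my-edit.owl.
    # Stage 3: rename the resulting ontology and save a local copy.
    set -- owltools my-edit.owl "$OBO/foo.owl" --add-imports-from-supports \
        --extract-module -c -s "$OBO/foo.owl" \
        --set-ontology-id "$OBO/my/foo_import.owl" -o foo_import.owl
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"    # print the command instead of running it
    else
        "$@"         # actually invoke owltools
    fi
}

bootstrap_import
```

Set DRY_RUN=0 to execute the chain for real once the printed command looks right.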
You can now add an imports directive to my-edit.owl. You may also want to manually add an entry to your catalog.
Assume now your ontology looks like this:
Ontology: my.owl
Import: http://..../my/foo_import.owl
Class: MY:1
SubClassOf: part_of some FOO:1
Assume also you have a catalog entry like this:
<uri id="User Entered Import Resolution" name="http://purl.obolibrary.org/obo/my/foo_import.owl" uri="foo_import.owl"/>
Remember, with owltools you can specify this using --catalog-xml FILE or simply --use-catalog to use the default.
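For reference, a complete catalog file wrapping the entry above might look like the sketch below. The file name catalog-v001.xml follows the Protege convention and is assumed here to be what --use-catalog picks up by default; check this against your owltools version.

```shell
#!/bin/sh
# Sketch: write a minimal XML catalog that resolves the import IRI to a local file.
# Assumption: catalog-v001.xml (the Protege convention) is the default catalog name.
cat > catalog-v001.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<catalog prefer="public" xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <uri id="User Entered Import Resolution" name="http://purl.obolibrary.org/obo/my/foo_import.owl" uri="foo_import.owl"/>
</catalog>
EOF
```

If the file lives elsewhere or has a different name, pass it explicitly with --catalog-xml FILE instead.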
What happens if you add new FOO classes to MY, or you simply want to regenerate foo_import based on changes made to FOO-central?
The challenge here is that you are already importing a possibly stale subset of FOO - you don't want this to be included. You could simply re-bootstrap. However, an alternate method is to map the IRI of your import to the central location of the ontology:
owltools --use-catalog --map-ontology-iri $OBO/my/foo_import.owl $OBO/foo.owl my-edit.owl --extract-module -c -s $OBO/foo.owl --set-ontology-id $OBO/my/foo_import.owl -o foo_import.owl
To seed the module with FOO classes that my-edit.owl does not reference, one method is to declare their IRIs in my-edit.owl, but this is awkward.
Classes can be hand-edited into foo_import.owl, but this too is less than ideal.
Another method is to keep a file foo_seed.owl in a hackable syntax like Manchester or obo:
Class: FOO:5
Class: FOO:6
Then just merge this in to make your seed set and regenerate the import:
owltools --use-catalog --map-ontology-iri $OBO/my/foo_import.owl $OBO/foo.owl my-edit.owl foo_seed.owl --merge-support-ontologies --extract-module -c -s $OBO/foo.owl --set-ontology-id $OBO/my/foo_import.owl -o foo_import.owl
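If the list of extra classes changes often, the seed file can be generated rather than hand-edited. The sketch below writes an obo-format variant of the seed file; make_seed and the file name foo_seed.obo are hypothetical, and it assumes owltools will load the .obo file in place of foo_seed.owl in the command above.

```shell
#!/bin/sh
# Sketch only: make_seed is a hypothetical helper, not part of owltools.
# It emits one stub [Term] stanza per class ID given on the command line.
make_seed() {
    printf 'ontology: foo_seed\n'
    for id in "$@"; do
        printf '\n[Term]\nid: %s\n' "$id"
    done
}

# The IDs are the ones from the example seed file above.
make_seed FOO:5 FOO:6 > foo_seed.obo
cat foo_seed.obo
```

Each stanza only declares the class, which is all the seed set needs; the actual axioms come from FOO-central during extraction.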
Of course, the ideal way to do this would be via a Protege plugin (TODO: investigate existing plugins).
The above commands can be included in a Makefile to allow for easier execution and dependency management.
It is recommended that import modules be saved as RDF/XML OWL in order to maximize interoperability. This can be very verbose, especially where axiom annotations are concerned, and these may not be required in the importing ontology.
Additional commands can be added to the chain above to trim down the axioms included. For example, before the "--set-ontology-id" command, you can add
--extract-mingraph
to get a minimal set of axioms (labels and SubClassOf)
or
--remove-annotation-assertions -l
to remove all annotation assertions, preserving labels
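As an illustration of where the trimming step sits in the chain, the sketch below splices one of the two options in just before --set-ontology-id in the regeneration command. The helper names trim_step and regen_cmd and the mode names (mingraph, labels, none) are invented for this example, and the script only prints the resulting command.

```shell
#!/bin/sh
# Sketch only: trim_step, regen_cmd and the mode names are hypothetical.
OBO=http://purl.obolibrary.org/obo

# Map a mode name to the owltools trimming step described above.
trim_step() {
    case "$1" in
        mingraph) echo "--extract-mingraph" ;;
        labels)   echo "--remove-annotation-assertions -l" ;;
        none)     echo "" ;;
        *)        echo "unknown trim mode: $1" >&2; return 1 ;;
    esac
}

# Print the regeneration command with the trimming step placed
# immediately before --set-ontology-id.
regen_cmd() {
    trim=$(trim_step "$1") || return 1
    # $trim is deliberately unquoted so that "-l" becomes its own argument.
    set -- owltools --use-catalog \
        --map-ontology-iri "$OBO/my/foo_import.owl" "$OBO/foo.owl" \
        my-edit.owl --extract-module -c -s "$OBO/foo.owl" \
        $trim --set-ontology-id "$OBO/my/foo_import.owl" -o foo_import.owl
    echo "$*"
}

regen_cmd labels
```

Paste the printed line into a terminal (or a Makefile rule) to run it.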
Naming/URI conventions for import modules have yet to be standardized.
Sometimes the import modules go in a subdirectory called "imports", so the URI looks like ".../obo/my/imports/foo_import.owl".
In general, all ontology IRIs should use lowercase as specified in id-policy - e.g. "foo_import" rather than "FOO_import".
Sometimes ontologies will have mutual dependencies - e.g. uberon, go, cl. TODO docs
Oort should be extended to handle the full cycle.
The module type can be controlled by adding
-m MODULE-TYPE
to the arguments. See ModuleType (OWLAPI docs)
The default is BOT (upper modules). In practice this works best for most Bio-ontologies supported by Oort at this time. Please consult the relevant papers for a formal treatment, but for practical purposes, this strategy includes everything reachable via the seed set across EL-type constructs. Experience shows this is generally sufficient for classification purposes for GO, CL, Uberon and other ontologies with similar levels of axiomatization.