Merge pull request #1 from biolink/doc_updates
make the infores catalog itself downloadable and viewable from homepa…
sierra-moxon authored Mar 8, 2024
2 parents a3d59a7 + 0ba2f18 commit 1b791fd
Showing 20 changed files with 216 additions and 329 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/pr-codespell.yaml
@@ -18,5 +18,5 @@ jobs:
- name: Codespell
uses: codespell-project/actions-codespell@v1
with:
ignore_words_list: brite,BRITE
skip: .idea,.venv,.git,*.pdf,*.svg,context.*,project
ignore_words_list: amination,ehr,mor,nin,brite,mirgate,BRITE
skip: SEMMEDDB*,semmed*,.idea,.venv,.git,*.pdf,*.svg,context.*
1 change: 0 additions & 1 deletion .github/workflows/pr-verify-pull-request.yaml
@@ -39,7 +39,6 @@ jobs:
run: |
make test
poetry run codespell
poetry run yamllint -c .yamllint-config biolink-model.yaml
poetry run yamllint -c .yamllint-config infores_catalog.yaml
make validate_infores
6 changes: 5 additions & 1 deletion mkdocs.yml
@@ -14,7 +14,11 @@ plugins:
nav:
# - Home: home.md
- Overview: index.md
- Infores Identifier Registry: information-resource-registry.md
- View the Infores Catalog: https://biolink.github.io/information-resource-registry/infores_catalog.yaml
- Download the Infores Catalog: infores_catalog.yaml
- Implementation Guidance: implementation-guidance.md
- Catalog Metamodel Details: biolink-infores-parameters.md
- Addition information: appendices.md
site_url: https://biolink.github.io/information-resource-registry
repo_url: https://github.com/biolink/information-resource-registry
extra_css:
2 changes: 1 addition & 1 deletion project/information_resource_registry.py
@@ -1,5 +1,5 @@
# Auto generated from information_resource_registry.yaml by pythongen.py version: 0.0.1
# Generation date: 2024-03-08T19:35:06
# Generation date: 2024-03-08T12:38:21
# Schema: Information-Resource-Registry-Schema
#
# id: https://w3id.org/biolink/information_resource_registry.yaml
@@ -1,7 +1,7 @@
{
"comments": {
"description": "Auto generated by LinkML jsonld context generator",
"generation_date": "2024-03-08T19:35:05",
"generation_date": "2024-03-08T12:38:19",
"source": "information_resource_registry.yaml"
},
"@context": {
4 changes: 2 additions & 2 deletions project/jsonld/information_resource_registry.jsonld
@@ -612,9 +612,9 @@
],
"metamodel_version": "1.7.0",
"source_file": "information_resource_registry.yaml",
"source_file_date": "2024-03-08T19:34:18",
"source_file_date": "2024-03-08T12:18:13",
"source_file_size": 7890,
"generation_date": "2024-03-08T19:35:05",
"generation_date": "2024-03-08T12:38:19",
"@type": "SchemaDefinition",
"@context": [
"project/jsonld/information_resource_registry.context.jsonld",
@@ -100,7 +100,7 @@
}
},
"$id": "https://w3id.org/biolink/information_resource_registry.yaml",
"$schema": "https://json-schema.org/draft/2019-09/schema",
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": true,
"description": "A collection of information resources",
"metamodel_version": "1.7.0",
62 changes: 31 additions & 31 deletions project/owl/information_resource.owl.ttl
@@ -12,10 +12,10 @@
infores:InformationResourceContainer a owl:Class ;
rdfs:label "InformationResourceContainer" ;
rdfs:subClassOf [ a owl:Restriction ;
owl:minCardinality 0 ;
owl:allValuesFrom infores:InformationResource ;
owl:onProperty infores:information_resources ],
[ a owl:Restriction ;
owl:allValuesFrom infores:InformationResource ;
owl:minCardinality 0 ;
owl:onProperty infores:information_resources ],
linkml:ClassDefinition ;
skos:definition "A collection of information resources" ;
@@ -39,71 +39,71 @@ biolink:information_resource_registry.yaml.owl.ttl a owl:Ontology ;
infores:InformationResource a owl:Class ;
rdfs:label "InformationResource" ;
rdfs:subClassOf [ a owl:Restriction ;
owl:maxCardinality 1 ;
owl:allValuesFrom infores:KnowledgeLevelEnum ;
owl:onProperty infores:knowledge_level ],
[ a owl:Restriction ;
owl:minCardinality 0 ;
owl:onProperty infores:name ],
[ a owl:Restriction ;
owl:minCardinality 0 ;
owl:onProperty infores:synonym ],
[ a owl:Restriction ;
owl:minCardinality 0 ;
owl:onProperty infores:agent_type ],
[ a owl:Restriction ;
owl:allValuesFrom xsd:string ;
owl:onProperty infores:description ],
[ a owl:Restriction ;
owl:maxCardinality 1 ;
owl:onProperty infores:agent_type ],
owl:allValuesFrom xsd:string ;
owl:onProperty infores:name ],
[ a owl:Restriction ;
owl:allValuesFrom infores:KnowledgeLevelEnum ;
owl:maxCardinality 1 ;
owl:onProperty infores:knowledge_level ],
[ a owl:Restriction ;
owl:minCardinality 0 ;
owl:onProperty infores:synonym ],
owl:onProperty infores:knowledge_level ],
[ a owl:Restriction ;
owl:allValuesFrom xsd:string ;
owl:onProperty infores:id ],
owl:onProperty infores:xref ],
[ a owl:Restriction ;
owl:minCardinality 1 ;
owl:allValuesFrom xsd:string ;
owl:onProperty infores:id ],
[ a owl:Restriction ;
owl:minCardinality 0 ;
owl:onProperty infores:name ],
owl:onProperty infores:xref ],
[ a owl:Restriction ;
owl:allValuesFrom infores:InformationResourceStatusEnum ;
owl:minCardinality 0 ;
owl:onProperty infores:status ],
[ a owl:Restriction ;
owl:minCardinality 0 ;
owl:maxCardinality 1 ;
owl:onProperty infores:description ],
[ a owl:Restriction ;
owl:allValuesFrom xsd:string ;
owl:onProperty infores:name ],
[ a owl:Restriction ;
owl:maxCardinality 1 ;
owl:onProperty infores:status ],
owl:onProperty infores:agent_type ],
[ a owl:Restriction ;
owl:allValuesFrom infores:AgentTypeEnum ;
owl:onProperty infores:agent_type ],
[ a owl:Restriction ;
owl:maxCardinality 1 ;
owl:minCardinality 1 ;
owl:onProperty infores:id ],
[ a owl:Restriction ;
owl:allValuesFrom xsd:string ;
owl:onProperty infores:xref ],
owl:onProperty infores:synonym ],
[ a owl:Restriction ;
owl:minCardinality 0 ;
owl:maxCardinality 1 ;
owl:onProperty infores:status ],
[ a owl:Restriction ;
owl:minCardinality 0 ;
owl:onProperty infores:xref ],
[ a owl:Restriction ;
owl:allValuesFrom xsd:string ;
owl:onProperty infores:synonym ],
[ a owl:Restriction ;
owl:maxCardinality 1 ;
owl:onProperty infores:knowledge_level ],
owl:onProperty infores:id ],
[ a owl:Restriction ;
owl:minCardinality 0 ;
owl:onProperty infores:agent_type ],
owl:allValuesFrom infores:InformationResourceStatusEnum ;
owl:onProperty infores:status ],
[ a owl:Restriction ;
owl:maxCardinality 1 ;
owl:minCardinality 0 ;
owl:onProperty infores:description ],
[ a owl:Restriction ;
owl:minCardinality 0 ;
owl:onProperty infores:knowledge_level ],
owl:maxCardinality 1 ;
owl:onProperty infores:name ],
linkml:ClassDefinition ;
skos:altLabel "knowledgebase" ;
skos:definition "A database or knowledgebase and its supporting ecosystem of interfaces and services that deliver content to consumers (e.g. web portals, APIs, query endpoints, streaming services, data downloads, etc.). A single Information Resource by this definition may span many different datasets or databases, and include many access endpoints and user interfaces. Information Resources include project-specific resources such as a Translator Knowledge Provider, and community knowledgebases like ChemBL, OMIM, or DGIdb." ;
32 changes: 16 additions & 16 deletions project/shacl/information_resource_registry.shacl.ttl
@@ -19,7 +19,22 @@ infores:InformationResource a sh:NodeShape ;
sh:closed true ;
sh:description "A database or knowledgebase and its supporting ecosystem of interfaces and services that deliver content to consumers (e.g. web portals, APIs, query endpoints, streaming services, data downloads, etc.). A single Information Resource by this definition may span many different datasets or databases, and include many access endpoints and user interfaces. Information Resources include project-specific resources such as a Translator Knowledge Provider, and community knowledgebases like ChemBL, OMIM, or DGIdb." ;
sh:ignoredProperties ( rdf:type ) ;
sh:property [ sh:datatype xsd:string ;
sh:property [ sh:description "The level of knowledge that supports an edge or node. This is a general categorization of the type of evidence that supports a statement, and is not intended to be a comprehensive description of the evidence. For example, a statement may be supported by a single publication, but that publication may contain multiple types of evidence, such as a computational prediction and a manual curation. In this case, the knowledge level would be \"curated\", and the evidence would be described in more detail in the evidence graph." ;
sh:in ( "curated" "predicted" "text_mined" "correlation" "observed" "other" "mixed" ) ;
sh:maxCount 1 ;
sh:order 6 ;
sh:path infores:knowledge_level ],
[ sh:description "the status of the infores identifier, the default is \"released\"" ;
sh:in ( "released" "deprecated" "draft" "modified" ) ;
sh:maxCount 1 ;
sh:order 0 ;
sh:path infores:status ],
[ sh:datatype xsd:string ;
sh:description "A free-text description of an entity or attribute." ;
sh:maxCount 1 ;
sh:order 5 ;
sh:path rdfs:comment ],
[ sh:datatype xsd:string ;
sh:description "A unique identifier for an entity. Must be either a CURIE shorthand for a URI or a complete URI" ;
sh:maxCount 1 ;
sh:minCount 1 ;
@@ -39,21 +54,6 @@ infores:InformationResource a sh:NodeShape ;
sh:maxCount 1 ;
sh:order 1 ;
sh:path rdfs:label ],
[ sh:datatype xsd:string ;
sh:description "A free-text description of an entity or attribute." ;
sh:maxCount 1 ;
sh:order 5 ;
sh:path rdfs:comment ],
[ sh:description "The level of knowledge that supports an edge or node. This is a general categorization of the type of evidence that supports a statement, and is not intended to be a comprehensive description of the evidence. For example, a statement may be supported by a single publication, but that publication may contain multiple types of evidence, such as a computational prediction and a manual curation. In this case, the knowledge level would be \"curated\", and the evidence would be described in more detail in the evidence graph." ;
sh:in ( "curated" "predicted" "text_mined" "correlation" "observed" "other" "mixed" ) ;
sh:maxCount 1 ;
sh:order 6 ;
sh:path infores:knowledge_level ],
[ sh:description "the status of the infores identifier, the default is \"released\"" ;
sh:in ( "released" "deprecated" "draft" "modified" ) ;
sh:maxCount 1 ;
sh:order 0 ;
sh:path infores:status ],
[ sh:datatype xsd:string ;
sh:description "Alternate human-readable names for a thing" ;
sh:order 4 ;
16 changes: 3 additions & 13 deletions project/shex/information_resource_registry.shex
@@ -44,24 +44,14 @@ linkml:Jsonpath xsd:string
linkml:Sparqlpath xsd:string

<InformationResource> CLOSED {
( $<InformationResource_tes> ( <status> [ <https://w3id.org/biolink/infores/InformationResourceStatusEnum#released>
<https://w3id.org/biolink/infores/InformationResourceStatusEnum#deprecated>
<https://w3id.org/biolink/infores/InformationResourceStatusEnum#draft>
<https://w3id.org/biolink/infores/InformationResourceStatusEnum#modified> ] ? ;
( $<InformationResource_tes> ( <status> @<InformationResourceStatusEnum> ? ;
rdfs:label @linkml:String ? ;
<id> @linkml:String ;
<xref> @linkml:String * ;
<synonym> @linkml:String * ;
rdfs:comment @linkml:String ? ;
<knowledge_level> [ <https://w3id.org/biolink/infores/KnowledgeLevelEnum#curated>
<https://w3id.org/biolink/infores/KnowledgeLevelEnum#predicted>
<https://w3id.org/biolink/infores/KnowledgeLevelEnum#text_mined>
<https://w3id.org/biolink/infores/KnowledgeLevelEnum#correlation>
<https://w3id.org/biolink/infores/KnowledgeLevelEnum#observed>
<https://w3id.org/biolink/infores/KnowledgeLevelEnum#other>
<https://w3id.org/biolink/infores/KnowledgeLevelEnum#mixed> ] ? ;
<agent_type> [ <https://w3id.org/biolink/infores/AgentTypeEnum#not_provided>
<https://w3id.org/biolink/infores/AgentTypeEnum#computational_model> ] ?
<knowledge_level> @<KnowledgeLevelEnum> ? ;
<agent_type> @<AgentTypeEnum> ?
) ;
rdf:type [ <InformationResource> ] ?
)
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -42,4 +42,4 @@ docs = ["mkdocs-material"]

[tool.codespell]
skip = '.idea,.git,SEMMEDDB*,semmed*,*.svg,docs'
ignore-words-list = 'amination,ehr,mor,brite,nin,mirgate'
ignore-words-list = 'amination,ehr,mor,brite,nin,mirgate,MiRgate,EHR,nin,miR,miRNA,miRBase'
7 changes: 5 additions & 2 deletions src/doc-templates/index.md.jinja2
@@ -1,8 +1,12 @@
# {{ schema.name }}: a repository for information resources used in NCATS Data Translator.
# {{ schema.name }}

The information resource registry is a listing of data sources present in the NCATS Data Translator system.
Each information resource has an identifier, a short description, and a URL to more information about that resource.

[View the Infores Catalog](https://biolink.github.io/information-resource-registry/infores_catalog.yaml)

[Download the Infores Catalog](infores_catalog.yaml)

## Classes

| Class | Description |
@@ -19,7 +23,6 @@ Each information resource has an identifier, a short description, and an URL to
| {{gen.link(np)}} | {{np.description}} |
{% endfor %}


## Enumerations

| Enumeration | Description |
46 changes: 46 additions & 0 deletions src/docs/appendices.md
@@ -0,0 +1,46 @@
## InfoRes Catalog Metadata Dictionary

At present, the InfoRes Registry collects a minimal set of metadata using the following Biolink node
properties. This set will be expanded in future iterations.

* **name**:
* a fully informative name for the resource. We recommend a name that is as informative and unambiguous as
possible: spell out all acronyms that are not common knowledge, and include the name of the owning organization
when the name alone may be ambiguous.
* **id**:
* the CURIE form of the InfoRes identifier, in which a short-form name of the resource serves as the human-readable
identifier component (e.g. 'infores:dgidb')
* **synonym**:
* other names for the resource (these facilitate search and discovery)
* **url**:
* a URL describing the resource, preferably its primary home page (if one exists)
* **description**:
* a free-text description of the resource

## Rules for Minting InfoRes Names and Identifiers

A short, human-understandable name or abbreviation serves as the identifier component of an InfoRes IRI.
It should follow these conventions:

* Use lowercase characters only.
* Keep as short as possible while remaining understandable and unambiguous.
* Acronyms are good where they are well-established and used in practice in our domain (e.g. infores:omim, not infores:online-mendelian-inheritance-in-man). Otherwise, spell out the name to the extent needed to be understood by the user (e.g. infores:drug-repurposing-hub, not infores:drh).
* Where it makes sense to do so, adopt the base URL of the resource's home web address, or its registered prefix in an authority like identifiers.org.
* Use a hyphen (-) to separate words where needed (e.g. infores:drug-repurposing-hub), unless the words are not separated in common practice or the website url (e.g. we use infores:monarchinitiative, not infores:monarch-initiative, because their website is https://monarchinitiative.org/).
* No other non-alphanumeric characters are allowed as delimiters.
* If we begin creating version-specific identifiers, a dot (.) will be reserved as a separator between the base resource name and its version. Versions will be specified using either dot-separated numerals (e.g. '1.1.2') or release dates in ISO 8601 format (e.g. '2021-04-18').
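
The conventions above can be approximated with a simple pattern check. The following is a minimal sketch, not part of the registry tooling: the function name `is_valid_infores` and the exact regular expression are assumptions made for illustration. It also assumes that digits are permitted alongside lowercase letters, and it rejects the dot separator because version-specific identifiers are not yet in use.

```python
import re

# Assumed pattern: the "infores:" prefix followed by lowercase alphanumeric
# words joined by single hyphens. This is an illustrative reading of the
# conventions, not an official grammar.
INFORES_PATTERN = re.compile(r"infores:[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_infores(curie: str) -> bool:
    """Return True if the CURIE matches the assumed InfoRes naming pattern."""
    return INFORES_PATTERN.fullmatch(curie) is not None


if __name__ == "__main__":
    for candidate in (
        "infores:omim",
        "infores:drug-repurposing-hub",
        "infores:Monarch_Initiative",
    ):
        print(candidate, is_valid_infores(candidate))
```

A check like this only covers the mechanical rules (case and delimiters). Whether an acronym is well established, or whether a name is short yet unambiguous, still requires human review.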


## Conventions for Crafting Identifiers for Translator Registry Resources

Translator applies many services that wrap or annotate existing resources in APIs to serve content that is better
aligned with Translator standards. This practice can lead to confusion around what represents a separate Information
Resource, and how resources may be related to each other. Below we describe conventions we apply for InfoRes creation
for different scenarios / use cases we encounter in the registry.

Aggregator Scenario: KPs/ARAs that aggregate content from one or more existing resources and transform the semantics
and/or structure of the data to be better aligned with Translator standards

Resource Examples:

* Molecular Data Provider, Biolink, ROBOKOP, SRI Reference KG, RTX KG2 (these aggregate content from multiple sources into a single KG / API)
* Automat APIs (stands up a separate API per source, each of which gets its own InfoRes)
35 changes: 35 additions & 0 deletions src/docs/biolink-infores-parameters.md
@@ -0,0 +1,35 @@
## Biolink Edge Properties for Source Retrieval Provenance

We define the following hierarchy of edge properties in Biolink for capturing Information Resources through which
knowledge expressed in a given edge was retrieved on its way to its presently serialized form (e.g. a TRAPI message
sent to an ARA). Full definitions and metadata for each can be found in the Biolink Model.

* **biolink:knowledge_source**:
* An Information Resource from which the knowledge expressed in an Association was retrieved,
directly or indirectly
* **biolink:primary_knowledge_source**:
* The most upstream source of the knowledge expressed in an Association that a
knowledge provider can identify.
* **biolink:aggregator_knowledge_source**:
* An intermediate aggregator resource from which knowledge expressed in an
Association was retrieved downstream of the original source, on its path to its current serialized form.
* **biolink:supporting_data_source**:
* An Information Resource from which data was retrieved and subsequently used as
evidence to generate the knowledge expressed in an Association (e.g. through computation on, or reasoning or inference
over, the retrieved data).
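
To make the hierarchy concrete, the sketch below shows one way these properties might be read from an edge. The dictionary layout, the example values, and the `retrieval_sources` helper are all assumptions for illustration; this is not the TRAPI or Biolink serialization.

```python
from typing import Dict, List, Union

# The three specific properties described above; all are sub-properties of
# biolink:knowledge_source in the hierarchy this section defines.
SOURCE_PROPERTIES = (
    "biolink:primary_knowledge_source",
    "biolink:aggregator_knowledge_source",
    "biolink:supporting_data_source",
)


def retrieval_sources(edge: Dict[str, Union[str, List[str]]]) -> Dict[str, List[str]]:
    """Collect the source properties present on an edge, normalized to lists."""
    found: Dict[str, List[str]] = {}
    for prop in SOURCE_PROPERTIES:
        value = edge.get(prop)
        if value is None:
            continue
        found[prop] = [value] if isinstance(value, str) else list(value)
    return found


# Hypothetical edge: the identifiers are illustrative values only.
example_edge = {
    "biolink:primary_knowledge_source": "infores:omim",
    "biolink:aggregator_knowledge_source": ["infores:monarchinitiative"],
}

if __name__ == "__main__":
    print(retrieval_sources(example_edge))
```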


## The Information Resource Registry (infores) data model

An information resource is defined as a web-accessible resource that provides data. An InformationResource
(designated by its identifier in CURIE form, e.g. 'infores:monarchinitiative') is a Biolink Model class that provides
a standard way to identify and describe information resources. The InformationResource class details can be found in
[information_resource_registry.yaml](information_resource_registry.yaml); the class contains the following properties:

- **id**: the identifier of the information resource (e.g. 'infores:monarchinitiative')
- **name**: the name of the information resource (e.g. 'Monarch Initiative')
- **description**: a description of the information resource (e.g. 'Monarch is a platform for biomedical data discovery
and integration')
- **url**: the URL of the information resource (e.g. 'https://monarchinitiative.org/')
- **status**: the status of the information resource (e.g. 'released' or 'deprecated'; see the enumeration
listed in the model YAML for the full set of values)
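
As a rough illustration of the properties listed above, a catalog entry could be modeled as follows. This is a sketch under stated assumptions, not the generated LinkML class: the name `InformationResourceRecord` is hypothetical, the status values are taken from the SHACL shape in this commit, and the default of 'released' follows the status description there.

```python
from dataclasses import dataclass
from typing import Optional

# Status values as listed in the SHACL shape in this commit.
ALLOWED_STATUS = {"released", "deprecated", "draft", "modified"}


@dataclass
class InformationResourceRecord:
    """Hypothetical stand-in for one catalog entry (not the generated class)."""

    id: str
    name: Optional[str] = None
    description: Optional[str] = None
    url: Optional[str] = None
    status: str = "released"

    def __post_init__(self) -> None:
        # The id is expected in CURIE form with the infores prefix.
        if not self.id.startswith("infores:"):
            raise ValueError(f"id must start with 'infores:': {self.id!r}")
        if self.status not in ALLOWED_STATUS:
            raise ValueError(f"unknown status: {self.status!r}")


if __name__ == "__main__":
    record = InformationResourceRecord(
        id="infores:monarchinitiative",
        name="Monarch Initiative",
        url="https://monarchinitiative.org/",
    )
    print(record)
```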