01 Home

#Semantic Metabolomics

Welcome to the wiki!

This wiki is a hub containing supplemental information about the mass spectrometry ontology mbco, which currently contains a proof-of-concept RDFication of MassBank records.

Contents:

1. Scope of this wiki
    1. Overall set-up
2. [Core model](https://github.com/sneumann/SemanticMetabolomics/wiki/01-Home#core-model)
    1. [MassBank](https://github.com/sneumann/SemanticMetabolomics/wiki/01-Home#massbank)
    2. [Metabolights](https://github.com/sneumann/SemanticMetabolomics/wiki/01-Home#metabolights)
    3. [Chebi](https://github.com/sneumann/SemanticMetabolomics/wiki/01-Home#chebi)
    4. [SPARQL-Endpoint](https://github.com/sneumann/SemanticMetabolomics/wiki/01-Home#sparql-endpoint)
3. [Generalized Workflow of creating an rdf-Resource](https://github.com/sneumann/SemanticMetabolomics/wiki/01-Home#generalized-workflow-of-creating-an-rdf-resource)
4. [Supplemental material](https://github.com/sneumann/SemanticMetabolomics/wiki/01-Home#supplemental-material)
    1. [Links and Tools](https://github.com/sneumann/SemanticMetabolomics/wiki/01-Home#links-and-tools)

##Scope of this wiki

This wiki captures the basic setup for a prototype RDF resource that mirrors essential MassBank data, following the Semantic Web and Linked Open Data (LOD) paradigm. It focuses first on mass spectrometry use cases and can later serve as a test case for wider efforts (i.e. COSMOS) towards a semantic web of metabolomics data resources. The production version will therefore need to be reimplemented in a more decentralized way and will be expanded, e.g. to fulfill the requirements of the NMR world as well, for instance by linking to HMDB data.

####Overall set-up

We describe here an initial experimental RDF dump of MassBank core data. It was generated in a centralized, local approach via makefiles that automatically convert selected parts of MassBank into one large RDF triple store. Semantic Web best practices were followed, based on the sources provided below. We were guided in particular by the Bio2RDF community and the RDF resources available at the EBI.
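The wiki does not show the makefile-driven conversion itself. As a rough illustration only, the following Python sketch turns a few fields of a MassBank-style record into N-Triples. The `http://example.org/massbank/` namespace, the `vocab#name` and `vocab#exactMass` predicates, and the sample record are assumptions made for this example, not the project's actual vocabulary or data.

```python
# Hypothetical namespace; the real project namespace is not stated in this wiki.
MB = "http://example.org/massbank/"
XSD = "http://www.w3.org/2001/XMLSchema#"


def escape_literal(value):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def parse_record(text):
    """Collect 'KEY: value' lines of a MassBank-style record into a dict of lists."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip lines that are not 'KEY: value' pairs
            fields.setdefault(key, []).append(value.strip())
    return fields


def record_to_ntriples(text):
    """Emit one triple per compound name and exact mass found in the record."""
    fields = parse_record(text)
    subject = f"<{MB}{fields['ACCESSION'][0]}>"
    triples = []
    for name in fields.get("CH$NAME", []):
        triples.append(f'{subject} <{MB}vocab#name> "{escape_literal(name)}" .')
    for mass in fields.get("CH$EXACT_MASS", []):
        triples.append(f'{subject} <{MB}vocab#exactMass> "{mass}"^^<{XSD}double> .')
    return triples


# Made-up sample record, used only to exercise the functions above.
RECORD = """ACCESSION: XX000001
CH$NAME: Example compound
CH$EXACT_MASS: 180.06339"""

for triple in record_to_ntriples(RECORD):
    print(triple)
```

A build step such as a makefile rule could call a script like this once per record file and concatenate the output before loading it into the triple store.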

##Core model

In concept, this project consists mainly of four parts:

1. The RDFication of MassBank
2. The RDFication of Metabolights
3. The interlinking of the aforementioned resources using Bio2RDF Chebi
4. Setting up a SPARQL-Endpoint using Virtuoso

####MassBank

04 RDF MassBank Resource Module

For a current state of MassBank interlinking, visit the RDF MassBank wikipage.

####Metabolights

06 RDF Metabolights Resource Module

The RDFication of Metabolights is in an advanced state and will be added to the internal SPARQL-Endpoint in the course of the next week (proof-of-concept data sample). When this is done, we will add more information.

####Chebi

05 RDF Chebi and Chembl Resource Modules

Because Chebi is used as an interlinking tool, data samples of the databases to be interlinked must exist first. This part (like Metabolights) will therefore be added shortly.

####SPARQL-Endpoint

As the SPARQL-Endpoint depends on data, this section will be expanded greatly. The current idea is to create a simplified query interface with examples, like the Chembl-Endpoint.
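A simplified interface with examples could start from a small library of named queries. The sketch below is only an illustration of that idea: the endpoint URL and the query name are placeholders, and the single query is deliberately generic because the final data model is not fixed here. It relies on the SPARQL protocol convention of passing the query text in a `query` parameter of a GET request.

```python
from urllib.parse import urlencode

# Placeholder endpoint; the address of the internal endpoint is not given in this wiki.
ENDPOINT = "http://example.org/sparql"

# Named example queries that a simplified interface could offer.
EXAMPLE_QUERIES = {
    "ten_triples": "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10",
}


def query_url(name, endpoint=ENDPOINT):
    """Build a GET URL that submits the named example query to the endpoint."""
    return endpoint + "?" + urlencode({"query": EXAMPLE_QUERIES[name]})


print(query_url("ten_triples"))
```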

##Generalized workflow of creating an rdf Resource

Abstracted (general) workflow for creating Linked Data resources (inspired by http://www.w3.org/2001/sw/hcls/notes/hcls-rdf-guide/):

(add subheaders, make it a conditional graph?)

1. Review external existing RDF resources to re-use and integrate with
2. Define Use case: Determines scope
3. Define Competency Questions (CQs): Defines domain dependent content links
4. Select the data sources or portions thereof to be converted to RDF in the case where re-use is no option.
5. Identify the items of interest in your domain, the things whose properties and relationships we want to describe
    1. Agree on which items should be URIs and which stay literals, e.g. float or string values
6. Identify persistent HTTP URIs for information & non-information resources (use hashed URIs here)
    1. Choose your robust namespace
    2. Use http://identifiers.org/
    3. Agree on the MIME types you want to provide for HTTP content-negotiated additional representations presented upon dereferencing the URI, e.g. HTML in addition to RDF/XML
7. Generate RDF model
    1. First sketch handwritten graph models for all modules/namespaces envisioned, with the links/edges between resources needed to answer the CQs
    2. Add core predicates/edges/relations incl. inverses/backlinks, and the namespaces (NS) to take them from
    3. Add a 3-node-spanning link along multiple namespaces, i.e. to show nested queries & how URIs are used as primary keys to pass information along
    4. List which essential literals in resource modules can become robust URIs
        1. Must be translatable with high confidence (e.g. SpeciesLabel=”brassicaceae” → NCITax:ID723345)
    5. URI examples in accordance with ID.org (dereferenceable via content negotiation) for all key nodes/predicates in all NS modules, existing ones AND own ones (get server name here for HTTP URL to serve own NS)
    6. To align ChemicalIDs use https://www.ebi.ac.uk/unichem/
8. Generate ontology defining the formal semantics of the RDF model.
9. Generate example RDF triples in turtle or RDF/XML syntax and along the CQ scope, i.e.
    1. using own NS-URI (Massbank) to literal (e.g. Mass value as float?)
    2. using own NS-URI (Massbank) to own NS-URI (Massbank)
    3. using own NS-URI (Massbank) to established external resource URI (e.g. Chebi from Bio2RDF or Bioportal)
10. Publish the RDF data as Linked Data or through a SPARQL endpoint, e.g. set up a Virtuoso server with an endpoint & configure it.
11. Agree on information from different sources that merges in naturally & allows synergistic insights (context enrichment)
12. Set RDF links between data from different external sources
13. Create Semantic Web applications using the published data.
    1. Add example SPARQL queries along use cases in a) human readable form and b) in annotated SPARQL, e.g. turtle syntax
    2. Queries should leverage the example triple store and document accompanying result data sets.
    3. Build and add query library
14. Write Documentation
15. Make your LOD known with sem web crawlers/tools/Websites
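Two of the workflow steps, promoting literals to URIs only when the mapping is trustworthy and writing the three kinds of example triples, can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the `http://example.org/massbank/` namespace, the `vocab#...` predicates, the curated lookup table and its entry, and the `CHEBI:12345` identifier, which is a placeholder rather than a real mapping.

```python
MB = "http://example.org/massbank/"             # hypothetical own namespace
IDORG_CHEBI = "http://identifiers.org/chebi/"   # identifiers.org-style prefix
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"
RDFS_SEEALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"

# Curated literal-to-URI table (hypothetical entry); only exact matches are converted.
CURATED = {"example species": "http://example.org/taxon/0001"}


def to_object(label):
    """Return a URI term for curated labels, otherwise keep the plain literal."""
    target = CURATED.get(label.strip().lower())
    return f"<{target}>" if target else f'"{label}"'


record = f"<{MB}XX000001>"             # the record document
compound = f"<{MB}XX000001#compound>"  # hash URI for the thing the record describes

triples = [
    # 1. own NS-URI to literal
    f'{record} <{MB}vocab#exactMass> "180.06339"^^<{XSD_DOUBLE}> .',
    # 2. own NS-URI to own NS-URI
    f"{record} <{MB}vocab#describes> {compound} .",
    # 3. own NS-URI to established external resource URI
    f"{compound} <{RDFS_SEEALSO}> <{IDORG_CHEBI}CHEBI:12345> .",
    # A curated label becomes a URI; an unknown label stays a literal.
    f'{record} <{MB}vocab#species> {to_object("Example species")} .',
    f'{record} <{MB}vocab#species> {to_object("unknown plant")} .',
]

for triple in triples:
    print(triple)
```

Keeping unknown labels as literals avoids asserting links that cannot be translated with high confidence. They can be promoted later once the lookup table covers them.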
