BioGrakn DN is a single knowledge graph of biomedical data describing disease networks, ingested from Uniprot, Reactome, DGIdb, DisGeNET, HPA-Tissue, EBI IntAct, Kaneko, Gene Expression Omnibus (GSE27876, GSE43696, GSE63142) and TissueNet.
BioGrakn DN provides an intuitive way to query interconnected, heterogeneous biomedical data in a single place. The schema that models the underlying knowledge graph, together with the descriptive query language Graql, makes writing complex queries straightforward. Furthermore, Grakn's automated reasoning allows BioGrakn DN to act as an intelligent database of biomedical data that infers implicit knowledge from the explicitly stored data. BioGrakn DN can understand biological facts, infer based on new findings and enforce research constraints, all at query (run) time.
- Download the latest release (size: 2.5 GB).
- Unzip the downloaded file.
- cd into the unzipped folder, via terminal or command prompt.
- Run ./grakn server start
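Once the server has been started, it can be handy to confirm that something is actually listening before running queries. The following is a minimal sketch that uses only the Python standard library. The host and port (localhost:48555) are assumptions based on the usual Grakn 1.x default and may differ in your configuration; the function name `is_listening` is made up for this illustration.

```python
import socket


def is_listening(host="localhost", port=48555, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    ASSUMPTION: 48555 is the default Grakn 1.x server port; adjust it if
    your server is configured differently.
    """
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_listening()` right after starting the server may return False for a short while, since the server can take some time to come up.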
Queries can be run over BioGrakn DN, via Graql Console, Grakn Clients and Grakn Workbase.
Download the latest release of Grakn Workbase, install and run it.
Read the documentation on Workbase or watch a short series of videos about using Workbase with the Grakn <> BLAST integration example.
While inside the unzipped folder, via terminal or command prompt, run ./graql console -k biograkn_dn. The console is now ready to answer your queries.
Grakn Clients are available for Java, Node.js and Python. Using these clients, you will be able to perform read and write operations over BioGrakn DN. See an example of how this is done in the Grakn <> BLAST integration example, using the Python client.
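As a rough illustration of a read operation from Python, the sketch below builds a gene-to-protein query as a string and sends it to the biograkn_dn keyspace. The client calls (`grakn.Grakn`, `session.transaction(grakn.TxType.READ)`, `answer.map()`) are assumptions based on the 1.4-era Python client; later client releases renamed several of these, so check them against the client version you have installed. The helper names `gene_protein_query` and `run_read_query` are hypothetical, and the URI assumes the usual default port.

```python
def gene_protein_query(entrez_id, limit=10):
    """Build a Graql query for the proteins encoded by one gene."""
    entrez_id = str(entrez_id)
    if not entrez_id.isdigit():
        # Refuse anything that could break out of the quoted string
        raise ValueError("entrez_id must consist of digits only")
    return (
        "match "
        "$gpe (encoding-gene: $ge, encoded-protein: $pr) isa gene-protein-encoding; "
        '$ge isa gene has entrez-id "{}"; '
        "limit {}; get;"
    ).format(entrez_id, int(limit))


def run_read_query(query, keyspace="biograkn_dn", uri="localhost:48555"):
    """Run a read query and return one {variable: concept id} dict per answer.

    ASSUMPTION: the API below follows the 1.4-era `grakn` Python package.
    The import is local so the query builder works without the client.
    """
    import grakn

    client = grakn.Grakn(uri=uri)
    with client.session(keyspace=keyspace) as session:
        with session.transaction(grakn.TxType.READ) as tx:
            # Read the ids while the transaction is still open
            return [
                {var: concept.id for var, concept in answer.map().items()}
                for answer in tx.query(query)
            ]
```

With the server running, `run_read_query(gene_protein_query("100137049"))` would be one way to issue the query from a script instead of the console.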
The schema for the BioGrakn DN knowledge graph defines how the knowledge graph is modelled to represent the reality of its dataset. To understand the underlying data structure, you may read through schema.gql or view the visualised schema.
Which proteins are encoded by the gene with entrez-id of 100137049?
match
$gpe (encoding-gene: $ge, encoded-protein: $pr) isa gene-protein-encoding;
$ge isa gene has entrez-id "100137049";
limit 10; get;
Which diseases are associated with the appendix tissue?
Note that the data to answer this question is not explicitly stored in the knowledge graph. The protein-disease-association-and-tissue-enhancement-implies-disease-tissue-association Rule enables us to get the answer to this question using the following query.
match
$ti isa tissue has tissue-name "appendix";
$dta (associated-disease: $di, associated-tissue: $ti) isa disease-tissue-association;
limit 10; get;
Which proteins are associated with Asthma?
Note that the data to answer this question is not explicitly stored in the knowledge graph. The gene-disease-association-and-gene-protein-encoding-protein-disease-association Rule enables us to get the answer to this question using the following query.
match
$di isa disease has disease-name "Asthma";
$dda (associated-protein: $pr, associated-disease: $di) isa protein-disease-association;
limit 10; get;
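To make the idea of an inferred protein-disease-association more concrete, here is a toy sketch in plain Python. It is not Grakn's reasoner and does not reproduce the rule's actual definition, which lives in the schema; it only assumes, going by the rule's name, that a protein is linked to a disease whenever the gene encoding that protein is associated with the disease. All gene and protein names are placeholders, not data from BioGrakn DN.

```python
def infer_protein_disease(gene_disease, gene_protein):
    """Join (gene, disease) and (gene, protein) pairs on the shared gene.

    Returns the set of (protein, disease) pairs implied by the inputs.
    ASSUMPTION: this mirrors the rule only as far as its name suggests.
    """
    proteins_by_gene = {}
    for gene, protein in gene_protein:
        proteins_by_gene.setdefault(gene, set()).add(protein)
    return {
        (protein, disease)
        for gene, disease in gene_disease
        for protein in proteins_by_gene.get(gene, ())
    }


# Hypothetical explicit facts; nothing links a protein to a disease directly
gene_disease = [("gene-A", "Asthma"), ("gene-B", "Asthma"), ("gene-C", "other-disease")]
gene_protein = [("gene-A", "protein-1"), ("gene-A", "protein-2"), ("gene-C", "protein-3")]

inferred = infer_protein_disease(gene_disease, gene_protein)
asthma_proteins = sorted(p for p, d in inferred if d == "Asthma")
print(asthma_proteins)  # ['protein-1', 'protein-2']
```

The difference in BioGrakn DN is that this derivation happens inside the database at query time, so the query itself only needs to ask for protein-disease-association.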
Which diseases are associated with protein interactions taking place in the liver?
This query also makes use of the gene-disease-association-and-gene-protein-encoding-protein-disease-association Rule.
match
$ti isa tissue, has tissue-name "liver";
$pr isa protein;
$pr2 isa protein;
$pr != $pr2;
$di isa disease;
$pl (tissue-context: $ti, biomolecular-process: $ppi) isa process-localisation;
$ppi (interacting-protein: $pr, interacting-protein: $pr2) isa protein-protein-interaction;
$pda (associated-protein: $pr, associated-disease: $di) isa protein-disease-association;
limit 30; get;
Which drugs and diseases are associated with the same differentially expressed gene from comparisons made in geo-series with id of GSE27876?
match
$geo-se isa geo-series has GEOStudy-id "GSE27876";
$comp (compared-groups: $geo-comp, containing-study: $geo-se) isa comparison;
$def (conducted-analysis: $geo-comp, differentially-expressed-gene: $ge) isa differentially-expressed-finding;
$dgi (target-gene: $ge, interacted-drug: $dr) isa drug-gene-interaction;
$gda (associated-gene: $ge, associated-disease: $di) isa gene-disease-association;
limit 10; get;
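Since the knowledge graph includes three GEO series (GSE27876, GSE43696 and GSE63142), the same question can be asked of each one by swapping the series id. The sketch below generates the query text for a given series; it only builds strings and does not talk to Grakn. The names `drug_disease_query` and `GEO_SERIES` are made up for this illustration.

```python
import re

# Query text taken from the example above, with the series id and limit
# left as placeholders
DRUG_DISEASE_QUERY = (
    "match\n"
    '$geo-se isa geo-series has GEOStudy-id "{series_id}";\n'
    "$comp (compared-groups: $geo-comp, containing-study: $geo-se) isa comparison;\n"
    "$def (conducted-analysis: $geo-comp, differentially-expressed-gene: $ge) "
    "isa differentially-expressed-finding;\n"
    "$dgi (target-gene: $ge, interacted-drug: $dr) isa drug-gene-interaction;\n"
    "$gda (associated-gene: $ge, associated-disease: $di) isa gene-disease-association;\n"
    "limit {limit}; get;"
)

# The GEO series listed as sources for BioGrakn DN
GEO_SERIES = ("GSE27876", "GSE43696", "GSE63142")


def drug_disease_query(series_id, limit=10):
    """Return the drug/disease query for one GEO series id."""
    if not re.fullmatch(r"GSE\d+", series_id):
        raise ValueError("expected an id such as GSE27876")
    return DRUG_DISEASE_QUERY.format(series_id=series_id, limit=int(limit))


queries = {series: drug_disease_query(series) for series in GEO_SERIES}
```

Each value in `queries` can be pasted into the Graql console or sent through one of the Grakn clients.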