NeDRexDB

This repository contains the code for building and running NeDRexDB. It was originally based on the repotrial/repodb_v2 repository.

Unlike the repotrial/repodb_v2 repository, this repository contains only the code related to building and running the NeDRex database.

Changes in v2.0.0

Note that the first release from this repository is v2.0.0, reflecting the fact that a first version of NeDRexDB already exists (from the repodb_v2 repository).

General

All code has been refactored and commented to reduce coupling, reduce redundancy, and improve cohesion. This should improve maintainability and readability.
Python code has been updated to use Python 3.9 features.
Tests have been added to ensure database quality.

Changes in node and edge types

A new node type, Phenotype, has been added to NeDRexDB.
A new node type, GenomicVariant, has been added to NeDRexDB.
A new edge type, DisorderHasPhenotype, has been added to NeDRexDB.
A new edge type, VariantAssociatedWithDisorder, has been added to NeDRexDB.
A new edge type, VariantAffectsGene, has been added to NeDRexDB.

Changes in databases

Data from BioGRID is now parsed and integrated into NeDRexDB, adding ProteinInteractsWithProtein edges.
Data from IntAct is now parsed and integrated into NeDRexDB, adding ProteinInteractsWithProtein edges.
Data from the Comparative Toxicogenomics Database (CTD) is now parsed and integrated into NeDRexDB, adding DrugHasIndication edges.
Data from the Human Phenotype Ontology (HPO) is now parsed and integrated into NeDRexDB, adding Phenotype nodes and DisorderHasPhenotype edges.
Data from ClinVar is now parsed and integrated into NeDRexDB, adding GenomicVariant nodes, VariantAssociatedWithDisorder edges, and VariantAffectsGene edges.
Data from dbSNP is now parsed and integrated into NeDRexDB, adding prevalence data to some GenomicVariant nodes.
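
The additions listed above can be summarised as a small lookup table. The sketch below is only an illustration built from that list; `SOURCE_CONTRIBUTIONS` and `sources_for` are hypothetical names and are not part of the NeDRexDB code.

```python
# Illustrative summary of the v2.0.0 source additions listed above.
# SOURCE_CONTRIBUTIONS and sources_for are hypothetical names, not NeDRexDB API.
SOURCE_CONTRIBUTIONS = {
    "BioGRID": ["ProteinInteractsWithProtein"],
    "IntAct": ["ProteinInteractsWithProtein"],
    "CTD": ["DrugHasIndication"],
    "HPO": ["Phenotype", "DisorderHasPhenotype"],
    "ClinVar": [
        "GenomicVariant",
        "VariantAssociatedWithDisorder",
        "VariantAffectsGene",
    ],
    # dbSNP adds no new type; it annotates some existing GenomicVariant nodes.
    "dbSNP": ["GenomicVariant"],
}


def sources_for(type_name):
    """Return the newly integrated sources that touch a node or edge type."""
    return sorted(
        source
        for source, types in SOURCE_CONTRIBUTIONS.items()
        if type_name in types
    )


if __name__ == "__main__":
    print(sources_for("ProteinInteractsWithProtein"))
    print(sources_for("GenomicVariant"))
```

Keeping the mapping in one place makes it easy to ask which of the new sources would need to be re-parsed if a given node or edge type changes.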

Changes in existing parsers and integrations

MONDO integration now includes a check for obsolete disorders and no longer includes these nodes.
Display names on Protein nodes are now the value displayed in the 'Protein' field on the UniProt website (e.g., "Cystic fibrosis transmembrane conductance regulator").
Drug Central integration now works on the SQL dump download (rather than requiring a separate export to CSV outside of the NeDRexDB code).
All integrations for edges now include a check to ensure that both nodes involved (source/target or memberOne/memberTwo, as appropriate) exist in NeDRexDB.
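
The endpoint check described in the last item can be sketched roughly as follows. This is a minimal illustration, not the code used in NeDRexDB: it assumes edges are plain dictionaries keyed by `source`/`target` or `memberOne`/`memberTwo`, and that the IDs of all existing nodes are available as a set. The function names and the example IDs are hypothetical.

```python
def endpoint_ids(edge):
    """Return the two endpoint IDs of an edge, whichever key pair it uses."""
    if "source" in edge and "target" in edge:
        return edge["source"], edge["target"]
    if "memberOne" in edge and "memberTwo" in edge:
        return edge["memberOne"], edge["memberTwo"]
    raise ValueError(f"edge has no recognised endpoint keys: {edge!r}")


def filter_edges(edges, known_node_ids):
    """Keep only edges whose two endpoints are both in known_node_ids."""
    kept = []
    for edge in edges:
        first, second = endpoint_ids(edge)
        if first in known_node_ids and second in known_node_ids:
            kept.append(edge)
    return kept


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    nodes = {"node-1", "node-2", "node-3"}
    edges = [
        {"source": "node-1", "target": "node-2"},
        {"memberOne": "node-2", "memberTwo": "node-3"},
        {"source": "node-1", "target": "node-9"},  # node-9 does not exist
    ]
    for edge in filter_edges(edges, nodes):
        print(edge)
```

Dropping an edge when either endpoint is missing keeps the graph free of dangling references, at the cost of silently discarding some source records; a real integration might also want to log what was skipped.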

Database orchestration

Database orchestration changes have been added to facilitate automatic updates. This is implemented by setting up a second instance of MongoDB in which NeDRexDB is rebuilt; the database volume of this container then replaces the database volume of the main MongoDB instance^$, effecting the update.

Code to orchestrate docker containers and volumes is now included in this repository.
Code to download the latest version of source databases is now included in this repository.

^$ The scheduling of the NeDRexDB update is not included in this repository, because this depends on external factors (e.g., whether any tasks are running in the API).
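
The rebuild-then-replace idea can be modelled in a few lines. The sketch below is a simplified, in-memory illustration and does not call Docker or MongoDB: `Instance`, `update`, `rebuild`, and `api_is_idle` are hypothetical names, and handing the old volume back to the build instance for reuse is an assumption rather than something the repository is known to do.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """A MongoDB container together with the name of the volume it mounts."""
    name: str
    volume: str


def update(live, build, rebuild, api_is_idle):
    """Rebuild into the second instance, then replace the live volume.

    rebuild: callable that repopulates the volume of `build`.
    api_is_idle: callable returning True when no API tasks are running.
    Returns True if the live volume was replaced, False otherwise.
    """
    rebuild(build)
    if not api_is_idle():
        # Leave the live instance untouched; the caller can retry later.
        return False
    # The rebuilt volume becomes the live one; the old volume is handed
    # back to the build instance (assumption: it is reused next time).
    live.volume, build.volume = build.volume, live.volume
    return True


if __name__ == "__main__":
    live = Instance("live", "volume_a")
    build = Instance("build", "volume_b")
    swapped = update(live, build, rebuild=lambda inst: None, api_is_idle=lambda: True)
    print(swapped, live.volume, build.volume)
```

Separating the idleness check from the rebuild mirrors the note above: deciding when it is safe to replace the volume depends on external factors, so it is passed in rather than hard-coded.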

Licence

This code is released under a GPLv3 licence (see the LICENSE file in this repository). Please note that this licence applies ONLY to the code. An instance of NeDRexDB may require a different licence, depending in part on the databases integrated as part of the NeDRex instance and the intended use of the NeDRex instance. Repotrial cannot advise you on what a suitable licence is for your circumstances.
