Workflow KG Plants Taxon Compound

This workflow chains several processes to build a knowledge graph from a list of scientific article DOIs. It links articles indexed in PubMed to taxon/metabolite pairs through the "produces" relationship. This work is based on the repository "Relation Extraction in underexplored biomedical domains: A diversity-optimised sampling and synthetic data generation approach".

1 - Building DOI list file

Search in PubMed for articles related to a taxon of the Brassicaceae family and glucosinolate compounds.

curl -s 'https://pubmed.ncbi.nlm.nih.gov/?term=brassica+glucosinolate&format=pubmed&size=200' | grep "\[doi\]" | cut -d" " -f3 > data/brassicale_glucosinolate.txt
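The `grep`/`cut` pipeline above keeps the third space-separated field of every line tagged `[doi]`. A minimal Python sketch of the same filtering is given here for cases where the PubMed-format text is already on disk or in memory. The function name `extract_dois` and the sample record are illustrative assumptions (the PMIDs and the `[pii]` value are placeholders), and unlike the shell pipeline this version also drops duplicates, since a DOI may appear on both a `LID` and an `AID` line of the same record.

```python
import re

# Matches PubMed-format identifier lines such as "LID - 10.1021/jf401802n [doi]"
DOI_LINE = re.compile(r"^(?:LID|AID)\s*-\s+(\S+)\s+\[doi\]\s*$")


def extract_dois(pubmed_text):
    """Return the DOIs found in PubMed-format text, in order, without duplicates."""
    seen = set()
    dois = []
    for line in pubmed_text.splitlines():
        match = DOI_LINE.match(line.strip())
        if match and match.group(1) not in seen:
            seen.add(match.group(1))
            dois.append(match.group(1))
    return dois


# Placeholder record text (hypothetical PMIDs and pii), for illustration only
sample = """PMID- 00000001
LID - 10.1021/jf401802n [doi]
AID - 10.1021/jf401802n [doi]
PMID- 00000002
AID - S0000-0000(00)00000-0 [pii]
AID - 10.1021/jf405538d [doi]
"""

for doi in extract_dois(sample):
    print(doi)
```

Writing the result one DOI per line gives a file with the same layout as the one the `curl` command produces.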

2 - a) Building the article base from a list of DOIs

python src/api_doi.py --list_doi "10.1021/jf401802n,10.1021/jf405538d" --output test.json

2 - b) Building the article base from a list of DOIs in a file

python src/api_doi.py --list_doi_file data/list_doi_example.txt --output test.json
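The two variants above differ only in how the DOIs are supplied: a comma-separated string for `--list_doi`, or a file for `--list_doi_file`. The sketch below shows one way both inputs could be normalised into the same Python list. It is not taken from `src/api_doi.py`: the helper names are hypothetical, and the file layout (one DOI per line, as produced in step 1) is an assumption.

```python
import tempfile
from pathlib import Path


def parse_doi_string(value):
    """Split a comma-separated DOI string, ignoring empty items and stray spaces."""
    return [doi.strip() for doi in value.split(",") if doi.strip()]


def read_doi_file(path):
    """Read DOIs from a text file, assuming one DOI per line (blank lines skipped)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


# Usage: both inputs end up as the same list
from_string = parse_doi_string("10.1021/jf401802n, 10.1021/jf405538d")

with tempfile.TemporaryDirectory() as tmp:
    doi_file = Path(tmp) / "list_doi_example.txt"
    doi_file.write_text("10.1021/jf401802n\n\n10.1021/jf405538d\n", encoding="utf-8")
    from_file = read_doi_file(doi_file)

print(from_string == from_file)
```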

2 - c) Building the article base from PDF articles

TODO

3 - IDIAP Workflow to generate Taxon / Metabolite "produces" associations

Working with a GPU environment

Genouest Org

ssh $USER@genossh
srun --gpus 1 -p gpu --pty bash
. /local/env/envpython-3.9.5.sh
virtualenv ~/env-idiap ## only the first time !!
source ~/env-idiap/bin/activate 
export PATH=/home/genouest/inra_umr1349/$USER/.local/bin:$PATH

python src/workflow_idap.py --dump igepp.json

References

4 - Build RDF Graph

pip install pygbif rdflib
python src/build_rdf_graph.py --dump_doi test.json --dump_taxon_compound test_taxon_metabolite_associations_idiap.json
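To make the target of this step concrete, here is a rough sketch of what a "produces" association could look like as RDF triples. It does not reproduce `src/build_rdf_graph.py`: to stay dependency-free it writes N-Triples lines by hand rather than using rdflib, and the `http://example.org/kg/` namespace, the `produces` and `mentions` predicates, the helper names, and the sample DOI are all hypothetical placeholders.

```python
from urllib.parse import quote

# Hypothetical namespace and predicates, for illustration only
EX = "http://example.org/kg/"
PRODUCES = "<" + EX + "produces>"
MENTIONS = "<" + EX + "mentions>"


def iri(base, label):
    """Build an IRI in angle brackets, percent-encoding the label."""
    return "<" + base + quote(label, safe="") + ">"


def produces_triples(doi, taxon, compound):
    """Return N-Triples lines linking an article, a taxon and a compound."""
    article = "<https://doi.org/" + quote(doi, safe="/") + ">"
    taxon_iri = iri(EX + "taxon/", taxon)
    compound_iri = iri(EX + "compound/", compound)
    return [
        f"{taxon_iri} {PRODUCES} {compound_iri} .",
        f"{article} {MENTIONS} {taxon_iri} .",
        f"{article} {MENTIONS} {compound_iri} .",
    ]


# Placeholder DOI; the taxon and compound names are illustrative values
triples = produces_triples("10.0000/example", "Brassica oleracea", "glucoraphanin")
print("\n".join(triples))
```

In an actual graph the taxon and compound nodes would more likely point at established identifiers than at ad hoc IRIs; the choice of vocabulary is left open here.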

Note about relations to build/infer

gist
