This repository contains the source code for the paper: Alimova I., Tutubalina E. Multiple features for clinical relation extraction: A machine learning approach // Journal of Biomedical Informatics. – 2020. – Vol. 103. – P. 103382.
@article{alimova2020multiple,
title={Multiple features for clinical relation extraction: A machine learning approach},
author={Alimova, Ilseyar and Tutubalina, Elena},
journal={Journal of Biomedical Informatics},
volume={103},
pages={103382},
year={2020},
publisher={Elsevier}
}
The MADE corpus is taken from Jagannatha A. et al. Overview of the first natural language processing challenge for extracting medication, indication, and adverse drug events from electronic health record notes (MADE 1.0) // Drug safety. – 2019. – Vol. 42. – No. 1. – P. 99-111.
The n2c2 corpus is taken from Henry S. et al. 2018 n2c2 shared task on adverse drug events and medication extraction in electronic health records // Journal of the American Medical Informatics Association. – 2019.
Word2Vec models
PubMed+PMC+Wikipedia - Moen S., Ananiadou T. S. S. Distributional semantics resources for biomedical text processing // Proceedings of LBM. – 2013. – P. 39-44.
BioWordVec - Zhang Y. et al. BioWordVec, improving biomedical word embeddings with subword information and MeSH // Scientific data. – 2019. – Vol. 6. – No. 1. – P. 52.
Concept embeddings - Beam A. L. et al. Clinical Concept Embeddings Learned from Massive Sources of Multimodal Medical Data //arXiv preprint arXiv:1804.01486. – 2018.
Sent2Vec model
BioSentVec - Chen Q., Peng Y., Lu Z. BioSentVec: creating sentence embeddings for biomedical texts // 2019 IEEE International Conference on Healthcare Informatics (ICHI). – IEEE, 2019. – P. 1-5.
UMLS semantic types
Unified Medical Language System - https://www.nlm.nih.gov/research/umls/
MeSH concept types
Medical Subject Headings - https://www.nlm.nih.gov/mesh/meshhome.html
features - directory with the feature implementations
models - directory with the auxiliary model implementations
relation_classification.py - classifier implementation
utils.py - additional functions for resource loading
- install the requirements
- download the necessary additional resources listed in the Resources section
- download the dataset
- add the paths to the downloaded resources and dataset to the relation_classification.py file
- run relation_classification.py
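
Before running the classifier, it can help to confirm that every path added in the previous step points to something that exists. The snippet below is a minimal sketch of such a check and is not part of this repository: the `RESOURCE_PATHS` dictionary, its keys, the placeholder locations, and the `find_missing_paths` helper are all hypothetical names chosen for illustration. How relation_classification.py actually stores its paths may differ.

```python
from pathlib import Path

# Hypothetical resource names and placeholder locations (assumptions, not
# taken from the repository). Replace them with the paths on your machine.
RESOURCE_PATHS = {
    "word2vec_model": "path/to/word2vec_model.bin",
    "sent2vec_model": "path/to/sent2vec_model.bin",
    "dataset_dir": "path/to/dataset",
}


def find_missing_paths(paths):
    """Return the sorted names of resources whose path does not exist."""
    return sorted(
        name for name, location in paths.items()
        if not Path(location).exists()
    )


if __name__ == "__main__":
    missing = find_missing_paths(RESOURCE_PATHS)
    if missing:
        print("Missing resources: " + ", ".join(missing))
    else:
        print("All configured resources were found.")
```

If the script reports missing resources, correct the corresponding paths before running relation_classification.py.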