This project uses the NCBI dataset to build a knowledge graph that links disease-disease, chemical-chemical, and disease-chemical entity pairs. The graph is intended to support chatbots, question-answering systems, and relation extraction for complex biomedical queries and drug discovery.

Notebook summary
04_ner notebook: Named-entity recognition (NER) is a subtask of information extraction that locates named entities mentioned in unstructured text and classifies them into predefined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, and percentages.
01_coref notebook: Coreference resolution is the task of finding all expressions in a text that refer to the same entity. It is an important step for many higher-level NLP tasks that involve natural language understanding, such as document summarization, question answering, and information extraction.
02_simplification notebook: Text simplification aims to reduce the linguistic complexity of content to make it easier to understand while retaining the original information and meaning.
03_Proj_minIE: Extracts relation triples.
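To make the named-entity recognition step concrete, here is a minimal dictionary-lookup sketch that tags disease and chemical mentions in a sentence. It is not the method used in the 04_ner notebook. The `GAZETTEER` entries and the function names `build_pattern` and `tag_entities` are hypothetical, chosen only for illustration, and a real pipeline would normally use a trained model rather than a fixed word list.

```python
import re

# Hypothetical mini-gazetteer for illustration only; a real system would
# learn entity boundaries from an annotated corpus.
GAZETTEER = {
    "Disease": ["breast cancer", "hypertension", "cancer"],
    "Chemical": ["tamoxifen", "aspirin"],
}


def build_pattern(terms):
    """Compile one case-insensitive regex that matches any of the terms."""
    # Longest terms first, so "breast cancer" is preferred over "cancer".
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def tag_entities(text, gazetteer=GAZETTEER):
    """Return (start, end, mention, label) tuples sorted by position."""
    spans = []
    for label, terms in gazetteer.items():
        for match in build_pattern(terms).finditer(text):
            spans.append((match.start(), match.end(), match.group(0), label))
    return sorted(spans)


sentence = "Tamoxifen is used to treat breast cancer."
for start, end, mention, label in tag_entities(sentence):
    print(f"{label}: {mention} [{start}:{end}]")
```

This sketch does not resolve overlaps between different labels and cannot recognize mentions missing from the word list, which is the main reason statistical or neural NER models are preferred in practice.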
Data: triplets for Neo4j GraphXR. Input: justttxt.txt
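As a rough illustration of how extracted triplets could be turned into graph statements for Neo4j, the sketch below converts (subject, relation, object) rows into Cypher `MERGE` statements. The layout of the data file is not described here, so the tab-separated format, the generic `Entity` label, the sample rows, and the function names `parse_triples`, `relation_type`, and `triples_to_cypher` are all assumptions made for this example.

```python
import json
import re


def parse_triples(text):
    """Parse lines assumed to be 'subject<TAB>relation<TAB>object'."""
    triples = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("\t")]
        # Skip blank or malformed rows instead of guessing their meaning.
        if len(fields) == 3 and all(fields):
            triples.append(tuple(fields))
    return triples


def relation_type(relation):
    """Map free text to a Cypher-safe relationship type, e.g. MAY_INDUCE."""
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", relation).strip("_").upper()
    if not cleaned:
        return "RELATED_TO"
    # Unquoted Cypher identifiers may not start with a digit.
    return "REL_" + cleaned if cleaned[0].isdigit() else cleaned


def triples_to_cypher(triples):
    """Build one MERGE statement per triple (nodes are merged by name)."""
    statements = []
    for subj, rel, obj in triples:
        statements.append(
            f"MERGE (a:Entity {{name: {json.dumps(subj)}}}) "
            f"MERGE (b:Entity {{name: {json.dumps(obj)}}}) "
            f"MERGE (a)-[:{relation_type(rel)}]->(b)"
        )
    return statements


# Hypothetical sample rows, not taken from the project data.
sample = "aspirin\ttreats\theadache\n\nmalformed line\ncisplatin\tmay induce\tnephrotoxicity\n"
for statement in triples_to_cypher(parse_triples(sample)):
    print(statement)
```

Using `MERGE` rather than `CREATE` keeps repeated entities from producing duplicate nodes. When loading through a Neo4j driver, passing the names as query parameters is generally safer than embedding them in the statement text as done here.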