Negation Cue Detection

Code repository for negation cue detection, a project for the Applied Text Mining course at VU.

Description

This repository contains data, code, trained models and results for the negation cue detection task. We present an approach based on fine-tuning a pretrained BERT model on the SEM-2012 Shared Task dataset for negation cue detection. We use BERT as a token classifier, similarly to the Named Entity Recognition task described in the original BERT paper. Besides the baseline model, we add POS tags and prefixes/suffixes as extra features to improve the model's performance (the baseline+lexicals model).

The repository also contains code for generating other lexical features, as well as the results of an annotation study (annotations folder).
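
The NER-style setup mentioned above needs one label per BERT sub-token, while the dataset is labelled per word. The sketch below shows one common way to bridge the two, following the BERT paper's convention of classifying only the first sub-token of each word. It is not taken from this repository: the function name `align_labels`, the label names `O` and `CUE`, and the example word-piece split are all assumptions made for illustration.

```python
# Sketch only: align_labels, the O/CUE label names and the example split
# below are hypothetical and not taken from this repository.

IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def align_labels(pieces_per_word, word_labels, label_to_id):
    """Expand word-level labels to sub-token level.

    pieces_per_word holds one list of sub-tokens per word. Only the first
    sub-token of each word keeps the word's label id; the remaining
    sub-tokens get IGNORE_INDEX so that a loss function can skip them.
    """
    if len(pieces_per_word) != len(word_labels):
        raise ValueError("need exactly one label per word")
    tokens, label_ids = [], []
    for pieces, label in zip(pieces_per_word, word_labels):
        for i, piece in enumerate(pieces):
            tokens.append(piece)
            label_ids.append(label_to_id[label] if i == 0 else IGNORE_INDEX)
    return tokens, label_ids


if __name__ == "__main__":
    # Hypothetical word-piece split of "it was unpleasant".
    pieces = [["it"], ["was"], ["un", "##pleasant"]]
    labels = ["O", "O", "CUE"]
    print(align_labels(pieces, labels, {"O": 0, "CUE": 1}))
```

At prediction time the same mapping is read in reverse: the prediction for a word is the prediction made on its first sub-token.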

Prerequisites

Python >= 3.6
Python libraries:
- transformers - pretrained BERT model
- SpaCy - lexical features
- PyTorch - deep learning backend, data loaders
- pandas - data processing
- scikit-learn - evaluation tools
- numpy - math
- nltk - stemmers
- tqdm - progress bar
Optional: git-lfs - if you want to use the pretrained models

All dependencies can be installed with:

pip install -r requirements.txt

If you run into installation problems, please consult the official websites of the respective libraries.
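
To see which of the libraries listed above are still missing after installation, a small check along the following lines can help. This is a sketch rather than part of the repository; the helper name `missing_modules` is hypothetical, and the list uses the import names of the packages (scikit-learn imports as `sklearn`, PyTorch as `torch`, SpaCy as `spacy`).

```python
import importlib.util

# Import names of the libraries listed in the prerequisites.
REQUIRED_MODULES = [
    "transformers", "spacy", "torch", "pandas",
    "sklearn", "numpy", "nltk", "tqdm",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only confirms that each module can be located; it does not verify versions.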

Code Usage

End-to-end pipeline

Run the end-to-end experiment pipeline: data preprocessing, feature generation, training of the baseline and baseline+lexicals models, and evaluation on the dev and test sets.

python main.py

Features generation

Generate the features and store them as *-features.tsv files inside the data folder. The features are already precomputed and stored in this repository.

python run_generate_features.py
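
As an illustration of what a prefix/suffix feature for negation cues might look like, here is a minimal sketch. It does not reproduce the logic of run_generate_features.py: the affix lists, the function name `affix_features` and the `min_stem_len` parameter are assumptions chosen for the example.

```python
# Hypothetical affix inventories; the lists used in this repository may differ.
NEG_PREFIXES = ("un", "in", "im", "ir", "dis", "non")
NEG_SUFFIXES = ("less",)


def affix_features(token, min_stem_len=3):
    """Return the negation-like prefix and suffix of a token ('' if none).

    An affix only counts if at least min_stem_len characters remain
    after stripping it, which filters out very short words.
    """
    word = token.lower()
    prefix = next(
        (p for p in NEG_PREFIXES
         if word.startswith(p) and len(word) - len(p) >= min_stem_len),
        "",
    )
    suffix = next(
        (s for s in NEG_SUFFIXES
         if word.endswith(s) and len(word) - len(s) >= min_stem_len),
        "",
    )
    return {"prefix": prefix, "suffix": suffix}


if __name__ == "__main__":
    for tok in ["unhappy", "careless", "not", "under"]:
        print(tok, affix_features(tok))
```

A purely string-based check like this also fires on words such as "under", so the feature is best read as a hint for the model and not as a decision on its own.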

Training

Train both baseline and baseline+lexicals models.

python train.py

Evaluation

Generate error analysis reports and calculate metrics. Results are stored in the reports folder. Our own results are included in the repository.

python run_evaluate.py
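
For readers unfamiliar with the metrics, the sketch below computes token-level precision, recall and F1 for a single positive label in plain Python. It is not the repository's evaluation code, which relies on scikit-learn; the function name `precision_recall_f1` and the label names `O` and `CUE` are assumptions made for illustration.

```python
def precision_recall_f1(gold, pred, positive="CUE"):
    """Token-level precision, recall and F1 for one positive label.

    gold and pred are equally long sequences of label strings.
    Returns 0.0 for a score whose denominator would be zero.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


if __name__ == "__main__":
    gold = ["O", "CUE", "O", "CUE", "O"]
    pred = ["O", "CUE", "CUE", "O", "O"]
    print(precision_recall_f1(gold, pred))
```

In the toy example one cue is found, one is missed and one is predicted wrongly, so all three scores come out at 0.5.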

Pre-trained models

We include pre-trained models in the repository with git-lfs.

Baseline model: neg_cue_detection_model_baseline
Baseline+lexicals model: neg_cue_detection_model_lex

Results

All results can be found in reports/*metrics.txt files.

Error Analysis

Although we achieved very good F1 scores, our models still make errors. You can inspect them in the reports/*PPerror_analysis.txt files.

boczekbartek/negation_cue_detection

Folders and files

Latest commit

History

Repository files navigation

Negation Cue Detection

Description

Prerequisities

Code Usage

End2end pipeline

Features generation

Training

Evaluation

Pre-trained models

Results

Error Analysis

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 4

Languages

Packages