BioBERT Repro

Python · PyTorch · CUDA · Transformers

An attempt to reproduce the BioBERT (paper: https://arxiv.org/abs/1901.08746) results for the NER task. The original work of the paper is in TensorFlow; this repo is built with the 🤗 Hugging Face Transformers library and PyTorch.

Credits:

  • Code and project workflow heavily inspired by Abhishek Thakur's tutorials.
  • 🤗HuggingFace's Transformers documentation
  • Evaluation details and other specifics taken from the original paper

Steps:


Run setup.sh to download the paper's NER datasets and preprocess them.

bash setup.sh
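
For reference, these NER datasets are commonly distributed in a token-per-line BIO format, with a blank line separating sentences. Below is a minimal, self-contained sketch of loading such a file; the helper name read_conll and the exact file layout are assumptions for illustration, not necessarily what setup.sh produces.

def read_conll(path):
    # Parse a token-per-line BIO file: "token<whitespace>tag", blank line = sentence break.
    # NOTE: illustrative sketch -- the repo's preprocessing may use a different layout.
    sentences, tag_sequences = [], []
    words, tags = [], []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:  # sentence boundary
                if words:
                    sentences.append(words)
                    tag_sequences.append(tags)
                    words, tags = [], []
                continue
            token, tag = line.split()
            words.append(token)
            tags.append(tag)
    if words:  # file may not end with a blank line
        sentences.append(words)
        tag_sequences.append(tags)
    return sentences, tag_sequences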

Finetuning on the NER datasets:


Run train.py to finetune the model on the disease datasets of the paper. To finetune on other datasets, make the corresponding changes in config.py. There is an optional secondary-folderpath argument, which is useful when working on a remote server: when provided, the metadata and model are saved both in the local repository and at the secondary path (e.g., a Google Drive path when working on a Google Colab server).

cd src
python train.py [-f "SECONDARY_FOLDERPATH"]

The best model (according to validation score) will be saved in models/, and the metadata (containing the label-encoder object) will be saved in src/. The same is done at SECONDARY_FOLDERPATH if provided.
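
For illustration, config.py typically centralizes these choices in one place. The variable names and values below are assumptions for the sketch, not the repo's actual contents:

# config.py (illustrative sketch -- actual names and values may differ)
import transformers

MAX_LEN = 128                     # maximum WordPiece sequence length
TRAIN_BATCH_SIZE = 32
EPOCHS = 10                       # the results below come from a 10-epoch run
BASE_MODEL = "bert-base-uncased"  # swap here to finetune a different encoder
TRAINING_FILE = "../data/train.tsv"   # hypothetical dataset path
MODEL_PATH = "../models/model.bin"
TOKENIZER = transformers.BertTokenizer.from_pretrained(BASE_MODEL, do_lower_case=True)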

Evaluating a finetuned model:


Make sure your trained model is in models/ and your metadata dict containing the LabelEncoder object is in src/, then run evaluate.py. There is an optional argument -g which, if invoked, evaluates all metrics with exact entity-level matching; use python evaluate.py --help for more info on the arguments. You should see a classification report generated with the help of seqeval.

cd src
python evaluate.py [-g]
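
The classification report comes from seqeval, which computes entity-level precision, recall, and F1 from BIO tag sequences. A minimal example of the calls involved (the tag values are made up for illustration):

from seqeval.metrics import classification_report, f1_score

# One list of BIO tags per sentence; seqeval groups B-/I- spans into entities.
y_true = [["O", "B-Disease", "I-Disease", "O"]]
y_pred = [["O", "B-Disease", "I-Disease", "O"]]

print(classification_report(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))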

Results:


Classification report for the NER task from a model finetuned for 10 epochs on the disease datasets:

[Classification report: CR-10epoch-full-disease.png]


Since evaluation here covers a single entity type (disease), the micro and macro averages coincide.

Some important points:


  • In the original paper, the authors further pretrained BERT on biomedical corpora and then finetuned it for downstream tasks. Here we only finetune a bert-base-uncased model without any domain pretraining. The model can be changed in config.py with minimal changes to model.py.
  • evaluate.py contains custom methods for the entity-level exact NER evaluation stated in the paper: predictions on WordPiece tokens are first converted to word-level predictions, and exact matching then yields the NER metrics, namely F1 score, precision, and recall (a minimal sketch of this alignment step follows this list).
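
The sketch below shows one common way to do the WordPiece-to-word alignment: keep the prediction of each word's first subtoken and drop the rest. The function name and the first-subtoken strategy are assumptions for illustration; the repo's exact implementation may differ.

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

def word_level_predictions(words, subtoken_preds):
    # subtoken_preds: one predicted tag per WordPiece token (special tokens removed).
    # Strategy (assumed): the first subtoken's tag represents the whole word.
    word_preds, i = [], 0
    for word in words:
        n_pieces = len(tokenizer.tokenize(word))
        word_preds.append(subtoken_preds[i])
        i += n_pieces
    return word_preds

words = ["pheochromocytoma", "is", "rare"]
n = len(tokenizer.tokenize("pheochromocytoma"))  # the long word splits into n pieces
subtoken_preds = ["B-Disease"] * n + ["O", "O"]
print(word_level_predictions(words, subtoken_preds))  # ['B-Disease', 'O', 'O']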