Skip to content

danielvangelder/Applied-NLP-Project-IN4325

Repository files navigation

Applied-NLP-Project-IN4325

Authors: Daniël van Gelder ([email protected]) and Thomas Bos ([email protected])

Requirements:

  • Python
  • simpletransformers
  • transformers
  • numpy
  • pandas
  • jupyter notebook
  • scikit-learn
  • nltk
  • gensim
  • tqdm

Getting Started

1. Dataset and Baseline Model

The dataset for the FNC-1 challenge can be retrieved from the dataset repository. Place all dataset files in the /data/fnc-1/ directory. Download the baseline from the baseline repository and run the baseline model on the dataset. The code needs to be adapted so that it can output the results as a stance file. Change and add the following lines of code to the fnc_kfold.py file at line 97:

OUT_DIR = "DIRECTORY_TO_REPOSITORY_DATASET" # Change this to the directory where you want the predictions to be stored
df = pd.read_csv("fnc-1/competition_test_stances.csv", names=['Headline', 'Body ID', 'Stance'], header=0)
df['Stance'] = predicted
df.to_csv(OUT_DIR + "/baseline_output.csv")

2. Running ALBERT

The notebook albert_fnc1.ipynb containing further instructions can be opened in Google Colab, which was used to generate all our results regarding the use of ALBERT on the FNC-1 data set.

Contact:

If information is missing from this repository, please reach out to either of us so that we can clarify.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published