Disambiguation Study for Arabic Applied on Text Classification

Setup

This environment is set up to work on a Linux platform. On Windows, use WSL2.

  • Install developer tools for C++ package building.
sudo apt install build-essential
  • Install Java to work with AraBERT preprocessors.
sudo apt install default-jre
  • Download and install Miniconda
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p ./miniconda
  • Activate conda base env
source ./miniconda/bin/activate
  • Create disambg env from YAML file
conda env create -f environment.yml
conda activate disambg
  • Fill in the .ENV.EXAMPLE file and save it as .env

  • Download the pretrained FastText model for baseline calculations

cd models
python ../download_fasttext_model.py ar
rm cc.ar.300.bin.gz
cd ..
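
Once the steps above are done, a short pre-flight check can confirm that nothing was missed. The script below is a minimal sketch and is not part of the repository. It makes several assumptions: the function names are hypothetical, .ENV.EXAMPLE is taken to use plain KEY=VALUE lines, and the uncompressed model is taken to be models/cc.ar.300.bin (inferred from the cc.ar.300.bin.gz archive removed above).

```python
"""Pre-flight check for the setup steps (illustrative sketch, not part of the repository)."""
import shutil
from pathlib import Path


def missing_tools(tools):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def env_keys(path):
    """Map each KEY in a KEY=VALUE file to its value, skipping blanks and # comments."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def unfilled_keys(example_path, env_path):
    """Return keys listed in the example file that are absent or empty in the .env file."""
    filled = env_keys(env_path) if Path(env_path).exists() else {}
    return [key for key in env_keys(example_path) if not filled.get(key)]


if __name__ == "__main__":
    # gcc ships with build-essential, java with default-jre.
    print("missing tools:", missing_tools(["gcc", "java", "conda"]))
    if Path(".ENV.EXAMPLE").exists():
        print("unfilled .env keys:", unfilled_keys(".ENV.EXAMPLE", ".env"))
    # Assumed file name: the uncompressed counterpart of cc.ar.300.bin.gz.
    print("FastText model present:", Path("models/cc.ar.300.bin").exists())
```

Run it from the repository root with the disambg environment active. An empty list for tools and keys, and True for the model, suggests the setup is complete.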

Topic Classification Results

Accuracy

[Figure: Accuracy]

Macro-F1

[Figure: Macro F1]

W&B Workspace

[Figure: Lineage]