Pruning the Classification model

These scripts perform vocabulary pruning on the classification model (XLMRobertaForSequenceClassification) and evaluate the pruned model's performance.

We use the English and Chinese training sets as the vocabulary file, so the pruned model is meant to keep the tokens that occur in English and Chinese text.
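As a rough illustration of how a vocabulary file can drive pruning, the sketch below keeps only the embedding rows for tokens that appear in a corpus. It is a toy under stated assumptions, not the implementation in vocabulary_pruning.py: the function name `prune_vocabulary`, the whitespace tokenizer (the real model uses a subword tokenizer), and the tiny vocabulary and embedding table are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def prune_vocabulary(
    vocab: Dict[str, int],
    embeddings: List[List[float]],
    corpus: Iterable[str],
    special_tokens: Iterable[str] = ("<s>", "<pad>", "</s>", "<unk>"),
) -> Tuple[Dict[str, int], List[List[float]]]:
    """Keep only tokens seen in `corpus`, plus the special tokens."""
    seen = set(special_tokens)
    for line in corpus:
        # Toy whitespace tokenizer (assumption); a real model uses subwords.
        seen.update(line.split())

    # Sort by original id so kept tokens preserve their relative order.
    kept = sorted((idx, tok) for tok, idx in vocab.items() if tok in seen)

    # Re-number the kept tokens and copy the matching embedding rows.
    new_vocab = {tok: new_id for new_id, (_, tok) in enumerate(kept)}
    new_embeddings = [embeddings[old_id] for old_id, _ in kept]
    return new_vocab, new_embeddings


# Hypothetical toy vocabulary: four special tokens plus four words.
vocab = {"<s>": 0, "<pad>": 1, "</s>": 2, "<unk>": 3,
         "hello": 4, "bonjour": 5, "你好": 6, "hola": 7}
embeddings = [[float(i), float(i)] for i in range(len(vocab))]
corpus = ["hello 你好", "你好"]  # stands in for the vocabulary file

new_vocab, new_embeddings = prune_vocabulary(vocab, embeddings, corpus)
print(len(vocab), "->", len(new_vocab))  # 8 -> 6
```

In this toy run, "bonjour" and "hola" never occur in the corpus, so their rows are dropped and "你好" moves from id 6 to id 5 while keeping its original embedding.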

Download the fine-tuned model, or train your own model on the XNLI dataset, and save the files to ../models/xlmr_xnli.

Download link: Hugging Face Models

See the README in ../datasets/xnli for how to construct the dataset.

  • Prune the model with the Python script:
VOCABULARY_FILE=../datasets/xnli/multinli.train.en_zh.tsv
MODEL_PATH=../models/xlmr_xnli
python vocabulary_pruning.py $MODEL_PATH $VOCABULARY_FILE
  • Evaluate the model:

Set $PRUNED_MODEL_PATH to the directory where the pruned model is stored.

python measure_performance.py $PRUNED_MODEL_PATH
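What measure_performance.py reports is not described here. As a separate sanity check, and assuming the main visible effect of vocabulary pruning is a smaller checkpoint on disk, one could compare the size of the original and pruned model directories. The helpers `directory_size_bytes` and `size_reduction` below are hypothetical and not part of these scripts.

```python
from pathlib import Path


def directory_size_bytes(model_dir: str) -> int:
    """Sum the sizes of all regular files under `model_dir`."""
    return sum(p.stat().st_size for p in Path(model_dir).rglob("*") if p.is_file())


def size_reduction(original_dir: str, pruned_dir: str) -> float:
    """Fraction of on-disk size removed by pruning (0.0 means no change)."""
    original = directory_size_bytes(original_dir)
    if original == 0:
        raise ValueError(f"no files found under {original_dir!r}")
    return 1.0 - directory_size_bytes(pruned_dir) / original
```

For example, `size_reduction("../models/xlmr_xnli", pruned_model_path)` would return the fraction saved, where `pruned_model_path` stands for whatever directory $PRUNED_MODEL_PATH points to.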