LGEB

LGEB: Benchmark of Language Generation Evaluation

Currently supported tasks: Abstractive Summarization and Question Generation.

Usage:

  1. First, download the data into the corresponding folders.
    Extract tmp, data, and model into the unilm directory. Download link:
    https://pan.baidu.com/s/1nr9Xet0tx7bCI5UVICX1Pg
  2. Run the scripts.
    cd LGEB/baseline/unilm/scripts
    Training: run_giaword_as.sh / run_squad_qg.sh
    Evaluation: run_giaword_as_eval.sh / run_squad_qg_eval.sh
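
Before running the scripts, it can help to confirm that the three extracted folders actually landed under the unilm directory. The snippet below is a minimal sketch of such a check, not part of the repository: the helper name `missing_dirs` is hypothetical, and invoking the training script with `bash` is an assumption about how the scripts are meant to be launched.

```shell
#!/bin/sh
# Hypothetical pre-flight helper (not shipped with LGEB).
# Prints the name of each expected folder (tmp, data, model) that is
# absent under the directory given as $1. Prints nothing if all exist.
# The body runs in a subshell, so loop variables do not leak to the caller.
missing_dirs() (
    root="$1"
    for d in tmp data model; do
        [ -d "$root/$d" ] || echo "$d"
    done
)

# Example usage (left commented out; assumes the repository root is the
# current directory and that the scripts are launched with bash):
#   missing=$(missing_dirs LGEB/baseline/unilm)
#   if [ -z "$missing" ]; then
#       cd LGEB/baseline/unilm/scripts && bash run_giaword_as.sh
#   else
#       echo "Missing under unilm: $missing" >&2
#   fi
```

If the helper prints any folder names, re-extract the corresponding archive from the download link before starting training or evaluation.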

Current results (consistent with https://github.com/microsoft/unilm):

Abstractive Summarization - Gigaword (10K)

The data can be downloaded from here.

| Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
| --- | --- | --- | --- |
| Transformer | 10.97 | 2.23 | 10.42 |
| UniLM | 34.21 | 15.28 | 31.54 |

Question Generation - SQuAD

Our processed data can be downloaded from here.

| Model | BLEU-4 | METEOR | ROUGE-L |
| --- | --- | --- | --- |
| (Du and Cardie, 2018) | 15.16 | 19.12 | - |
| (Zhang and Bansal, 2019) | 18.37 | 22.65 | 46.68 |
| UniLM | 22.78 | 25.49 | 51.57 |

Note: If we directly use the tokenized references provided by Du et al. (2017), the results on the raw data split are 22.17 BLEU-4 / 25.47 METEOR / 51.53 ROUGE-L.