From 836145ae1addf927d0d56b9ff2796098fe8ce872 Mon Sep 17 00:00:00 2001
From: Mingyu Kim
Date: Fri, 25 Oct 2024 17:04:07 +0900
Subject: [PATCH] [wwb] typo from README.md (#1072)

---
 tools/who_what_benchmark/README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/who_what_benchmark/README.md b/tools/who_what_benchmark/README.md
index e57e9edd50..b5cad666c8 100644
--- a/tools/who_what_benchmark/README.md
+++ b/tools/who_what_benchmark/README.md
@@ -107,10 +107,10 @@ wwb --target-model sd-lcm-int8 --gt-data lcm_test/sd_xl.json --model-type text-t
 ### Supported metrics
 
 * `similarity` - averaged similarity measured by neural network trained for sentence embeddings. The best is 1.0, the minimum is 0.0, higher-better.
-* `FDT` - Average position of the first divergent token between sentences generated by differnrt LLMs. The worst is 0, higher-better. [Paper.](https://arxiv.org/abs/2311.01544)
-* `FDT norm` - Average share of matched tokens until first divergent one between sentences generated by differnrt LLMs. The best is 1, higher-better.[Paper.](https://arxiv.org/abs/2311.01544)
-* `SDT` - Average number of divergent tokens in the evaluated outputs between sentences generated by differnrt LLMs. The best is 0, lower-better. [Paper.](https://arxiv.org/abs/2311.01544)
-* `SDT norm` - Average share of divergent tokens in the evaluated outputs between sentences generated by differnrt LLMs. The best is 0, the maximum is 1, lower-better. [Paper.](https://arxiv.org/abs/2311.01544)
+* `FDT` - Average position of the first divergent token between sentences generated by different LLMs. The worst is 0, higher-better. [Paper.](https://arxiv.org/abs/2311.01544)
+* `FDT norm` - Average share of matched tokens until first divergent one between sentences generated by different LLMs. The best is 1, higher-better.[Paper.](https://arxiv.org/abs/2311.01544)
+* `SDT` - Average number of divergent tokens in the evaluated outputs between sentences generated by different LLMs. The best is 0, lower-better. [Paper.](https://arxiv.org/abs/2311.01544)
+* `SDT norm` - Average share of divergent tokens in the evaluated outputs between sentences generated by different LLMs. The best is 0, the maximum is 1, lower-better. [Paper.](https://arxiv.org/abs/2311.01544)
 
 ### Notes
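
For readers of this patch, the four divergence metrics described in the hunk can be illustrated for a single pair of outputs with the rough sketch below. It is not the wwb implementation. It assumes outputs are already split into token lists, that divergent tokens are counted as candidate tokens left unmatched by `difflib.SequenceMatcher`, and that both normalized values are divided by the candidate length. The function names are hypothetical. The README metrics are averages of such per-pair values over all evaluated sentences.

```python
from difflib import SequenceMatcher


def first_divergent_token(reference, candidate):
    """Return the index of the first position where the two token lists differ."""
    for i, (ref_tok, cand_tok) in enumerate(zip(reference, candidate)):
        if ref_tok != cand_tok:
            return i
    # No mismatch in the shared prefix: divergence starts where the shorter list ends.
    return min(len(reference), len(candidate))


def divergent_token_count(reference, candidate):
    """Count candidate tokens that fall outside every matching block (assumed definition)."""
    matcher = SequenceMatcher(None, reference, candidate, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return len(candidate) - matched


def divergence_metrics(reference, candidate):
    """Compute per-pair FDT, FDT norm, SDT and SDT norm under the assumptions above."""
    fdt = first_divergent_token(reference, candidate)
    sdt = divergent_token_count(reference, candidate)
    length = max(len(candidate), 1)  # avoid division by zero for empty outputs
    return {
        "FDT": fdt,
        "FDT norm": fdt / length,
        "SDT": sdt,
        "SDT norm": sdt / length,
    }


if __name__ == "__main__":
    # Whitespace tokenization is used here only for illustration.
    reference = "the cat sat on the mat".split()
    candidate = "the cat lay on the mat".split()
    print(divergence_metrics(reference, candidate))
```

With identical outputs this sketch gives `FDT norm` of 1 and `SDT` of 0, which matches the best values stated in the metric descriptions.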