forked from alibaba/FederatedScope
Showing 2 changed files with 12 additions and 8 deletions.
@@ -1,15 +1,19 @@
# Rouge-L

## Dolly-15K
To assess the performance of our fine-tuned model, we use the Rouge-L metric and run experiments with a large number of clients, with the Dolly-15K dataset as the training corpus. Dolly-15K contains 15,015 data points spread across eight distinct tasks. We hold out the final task for evaluation and use the remaining seven for training. The experimental setup involves 200 clients, and the data is partitioned with a Dirichlet distribution to emulate non-IID conditions across the client base.
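
The Dirichlet-based partitioning described above can be sketched as follows. This is a minimal illustration, not the repository's own splitter: the function name `dirichlet_partition`, the concentration value `alpha=0.5`, and the toy labels are all assumptions made for the example. For each task category, a share vector over clients is drawn from a Dirichlet distribution and that category's samples are divided accordingly, so smaller `alpha` values give more skewed clients.

```python
import numpy as np


def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Hypothetical helper: split sample indices across clients, with
    per-category shares drawn from a symmetric Dirichlet(alpha)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for label in np.unique(labels):
        idx = np.flatnonzero(labels == label)
        rng.shuffle(idx)
        # Share of this category's samples that each client receives.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Turn cumulative shares into split points within idx.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, part in enumerate(np.split(idx, cuts)):
            client_indices[client_id].extend(part.tolist())
    return client_indices


# Toy stand-in for the seven training categories (assumed data, not Dolly-15K).
toy_labels = np.random.default_rng(0).integers(0, 7, size=2000)
parts = dirichlet_partition(toy_labels, num_clients=200, alpha=0.5)
print(sum(len(p) for p in parts))  # total number of assigned samples
```

With this simple scheme some clients can end up with very few or no samples; a practical splitter would likely enforce a minimum size per client.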

To run the evaluation:
```bash
python federatedscope/eval/eval_for_rougel/eval_dolly.py --cfg federatedscope/llm/baseline/xxx.yaml
```
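
For reference, Rouge-L compares a generated answer with a reference answer through their longest common subsequence (LCS). The sketch below is a minimal illustration of that idea, assuming lowercased whitespace tokenization and an unweighted F1 score; the evaluation script may tokenize, stem, or weight precision and recall differently, and the function names here are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(prediction, reference):
    """Rouge-L F1 over whitespace tokens (assumed tokenization)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    lcs = lcs_length(pred_tokens, ref_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred_tokens)  # LCS share of the prediction
    recall = lcs / len(ref_tokens)      # LCS share of the reference
    return 2 * precision * recall / (precision + recall)


print(rouge_l_f1("the cat sat on the mat", "the cat is on the mat"))
```

In the example call, the LCS is "the cat on the mat" (5 tokens out of 6 on each side), so precision and recall are both 5/6.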

## Natural Instructions
We again use the Rouge-L metric and run experiments with a large number of clients, this time with the Natural Instructions (NI) dataset as the training corpus. Each of the 738 training tasks is assigned exclusively to a distinct client, which yields a non-IID setting characterized by feature distribution skew. Evaluation is performed on separate test tasks.
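
The one-task-per-client assignment can be sketched as below. This is only an illustration under assumptions: the record layout (a dict with a `task` field), the helper name `split_by_task`, and the 1-based client ids are hypothetical and not taken from the repository.

```python
from collections import defaultdict


def split_by_task(examples, task_key="task"):
    """Hypothetical helper: give every distinct task its own client."""
    grouped = defaultdict(list)
    for example in examples:
        grouped[example[task_key]].append(example)
    # Sort task names so the task-to-client mapping is deterministic.
    return {
        client_id: grouped[task]
        for client_id, task in enumerate(sorted(grouped), start=1)
    }


# Toy records standing in for NI training examples (assumed format).
toy_examples = [
    {"task": "task_b", "input": "x1", "output": "y1"},
    {"task": "task_a", "input": "x2", "output": "y2"},
    {"task": "task_b", "input": "x3", "output": "y3"},
]
clients = split_by_task(toy_examples)
print({cid: len(data) for cid, data in clients.items()})
```

Applied to the 738 NI training tasks, this kind of mapping would produce 738 clients, each holding data from exactly one task.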

To run the evaluation:
```bash
python federatedscope/eval/eval_for_rougel/eval_ni.py --cfg federatedscope/llm/baseline/xxx.yaml
```
File renamed without changes.