
[BUG] DatasetNotFoundError: Dataset 'asas-ai/AraTrust-categorized' doesn't exist on the Hub or cannot be accessed. #441

Closed
BobTsang1995 opened this issue Dec 12, 2024 · 18 comments
Labels: bug

@BobTsang1995 commented Dec 12, 2024

It seems that the asas-ai/AraTrust-categorized dataset does not exist on Hugging Face. Can you fix it? @alielfilali01

Describe the bug

When trying to run lighteval with custom Arabic evaluation tasks, it fails with a DatasetNotFoundError, indicating that the dataset 'asas-ai/AraTrust-categorized' cannot be found on the Hugging Face Hub.

To Reproduce

  1. Set up the conda environment with Python 3.10
  2. Install required packages
  3. Run the following command:
accelerate launch --multi_gpu --num_processes=8 -m lighteval \
accelerate "pretrained=/mnt/sg_nas/liheng/Marco_checkpoint/Qwen2-7B-mmmlu-latest/checkpoint-1150,dtype=bfloat16,max_length=16384" \
"examples/tasks/OALL_v2_tasks.txt" \
--custom-tasks "community_tasks/arabic_evals.py" \
--output-dir="./evals/"

Full Error Message

[rank3]: DatasetNotFoundError: Dataset 'asas-ai/AraTrust-categorized' doesn't exist on the Hub or cannot be accessed.
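For reference, a quick way to confirm whether the dataset itself is reachable from your environment (a minimal sketch independent of lighteval, assuming the huggingface_hub package is installed):

from huggingface_hub import dataset_info
from huggingface_hub.utils import RepositoryNotFoundError

try:
    # Raises RepositoryNotFoundError if the repo is private, deleted, or misspelled.
    info = dataset_info("asas-ai/AraTrust-categorized")
    print("Found:", info.id)
except RepositoryNotFoundError:
    print("Dataset is missing, private, or you lack access.")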

Version info

  • Operating System: Linux
  • Python Version: 3.10

@BobTsang1995 added the bug label on Dec 12, 2024
@clefourrier (Member)

Thanks for the issue! It would help a lot if you could follow the template and provide, each time you report a bug:

  • the command you run
  • the full stack trace of the error

@BobTsang1995 (Author)

> Thanks for the issue! It would help a lot if you could follow the template and provide, each time you report a bug:
>
>   • the command you run
>   • the full stack trace of the error

Sorry, my bad. I've added the detailed error information above.

@alielfilali01 (Contributor)

Hey @BobTsang1995 & @clefourrier
Please feel free to close this issue since it was not a bug.

The dataset is public again (someone switched it to private by mistake). Find it here.
Now the task should run as expected.

@BobTsang1995, can you please keep commenting here even after the issue gets closed if you face any further issues? Usually they are just minor things; if something is really wrong, we can open a separate issue for it and address it properly.
Thank you

@clefourrier (Member)

@BobTsang1995 thanks a lot for the update, much better! :)

@BobTsang1995 (Author)

> Hey @BobTsang1995 & @clefourrier
> Please feel free to close this issue since it was not a bug.
>
> The dataset is public again (someone switched it to private by mistake). Find it here. Now the task should run as expected.
>
> @BobTsang1995, can you please keep commenting here even after the issue gets closed if you face any further issues? Usually they are just minor things; if something is really wrong, we can open a separate issue for it and address it properly. Thank you

Thanks a lot, I will try it again. I appreciate your reply.

@alielfilali01 (Contributor)

One last thing.
I hope the name Qwen2-7B-mmmlu-latest doesn't mean it is trained on mmmlu >_<
Have a good day/night and please keep the comments coming!

@BobTsang1995 (Author)

> One last thing. I hope the name Qwen2-7B-mmmlu-latest doesn't mean it is trained on mmmlu >_< Have a good day/night and please keep the comments coming!

LOL! That's just my own naming scheme. We have indeed done a lot of work on the mmmlu benchmark. Our multilingual model will be coming soon, so you can look forward to it.

@BobTsang1995 (Author) commented Dec 12, 2024

Another error occurred: it seems that some entries do not have these keys. KeyError: 'sol3' @alielfilali01
[screenshot of the traceback]

@alielfilali01 (Contributor)

Oh shoot! That's because of the last PR #440!
The AlGhafa prompt function is applied to all 9 subsets of AlGhafa Native (here), but not all of the subsets share the same columns!
PR #442 should fix it once merged (make sure to apply the changes on your local machine so you don't have to wait for the PR to be merged).

@clefourrier (Member) commented Dec 12, 2024

The last PR fixed the issue that @BobTsang1995 had on the previous subset, because this line caused an error. Please find a way to provide the columns explicitly (for example, by providing the full list of all allowed columns instead of excluding some).
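For illustration, a minimal sketch of that explicit-columns approach (the column names query and label, and the Doc import path, are assumptions based on lighteval's community task style, not the actual fix in the PR):

from lighteval.tasks.requests import Doc  # import path used by community tasks

# List every allowed answer column up front and keep only the ones a given
# subset actually provides, so subsets missing sol3/sol4 no longer raise KeyError.
ALLOWED_ANSWER_COLUMNS = ["sol1", "sol2", "sol3", "sol4"]

def alghafa_pfn(line, task_name: str = None):
    choices = [str(line[col]) for col in ALLOWED_ANSWER_COLUMNS if col in line]
    return Doc(
        task_name=task_name,
        query=line["query"],            # assumed question column
        choices=choices,
        gold_index=int(line["label"]),  # assumed gold-label column
    )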

@alielfilali01 (Contributor)

Oh sorry! I got mixed up between the issues. OK, I will investigate this further and update the associated PR.

@BobTsang1995 (Author) commented Dec 12, 2024

> Oh sorry! I got mixed up between the issues. OK, I will investigate this further and update the associated PR.

There is still a problem with PR #444. The task_name passed to the alghafa_pfn() function has a community|xxx: prefix, so the key cannot be found in the dictionary. I think you should run the code before opening the PR.

At the same time, it seems that you passed an undefined parameter when constructing the Doc class, resulting in an error: TypeError: Doc.__init__() got an unexpected keyword argument 'target_for_fewshot_sorting'
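As a stopgap until the PR lands, the prefix can be stripped before the dictionary lookup; a minimal sketch (the helper name and the exact prefix format are assumptions inferred from the error above):

def bare_subset(task_name: str) -> str:
    # Turn e.g. "community|alghafa:mcq_exams_test_ar" into "mcq_exams_test_ar".
    name = task_name.split("|", 1)[-1]  # drop the "community" suite prefix
    return name.split(":", 1)[-1]       # drop the "alghafa:" task prefix

assert bare_subset("community|alghafa:mcq_exams_test_ar") == "mcq_exams_test_ar"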

@alielfilali01
Copy link
Contributor

Hey @BobTsang1995, thanks for the highlight.

Well, the suite "community" is already part of the config, so naturally it shouldn't be part of the task_name; plus, the main task "alghafa:" is already defined in the tasks list, which later goes into the final tasks table at the end of the script. So in theory it should be fine!

But you are correct, I hadn't tested the code before pushing, as I was actually leaving my desk when I checked the notification 🥲

I'll give it a look tomorrow and try to fix it ASAP. Thanks again for your feedback.

@BobTsang1995 (Author)

> Hey @BobTsang1995, thanks for the highlight.
>
> Well, the suite "community" is already part of the config, so naturally it shouldn't be part of the task_name; plus, the main task "alghafa:" is already defined in the tasks list, which later goes into the final tasks table at the end of the script. So in theory it should be fine!
>
> But you are correct, I hadn't tested the code before pushing, as I was actually leaving my desk when I checked the notification 🥲
>
> I'll give it a look tomorrow and try to fix it ASAP. Thanks again for your feedback.

Sorry to bother you, but do you have any plans to fix this issue today?

@alielfilali01 (Contributor)

It is fixed (the test went well) and is waiting for PR #444 to be merged.
In the meantime, feel free to run using this fork.
Thanks @BobTsang1995 for your feedback!

@BobTsang1995 (Author) commented Dec 22, 2024

> It is fixed (the test went well) and is waiting for PR #444 to be merged. In the meantime, feel free to run using this fork. Thanks @BobTsang1995 for your feedback!

@alielfilali01 another question: now I want to evaluate a 72B model, but accelerate always OOMs when I use model parallelism. I'm wondering how you evaluate large models with lighteval.

accelerate launch --multi_gpu --num_processes=8 -m lighteval accelerate "pretrained=/mnt/sg_nas/liheng/Marco_checkpoint/Qwen2-72B,dtype=bfloat16,max_length=2048,model_parallel=True" --override-batch-size 1 "examples/tasks/OALL_v1_tasks.txt" --custom-tasks "community_tasks/arabic_evals.py" --output-dir="./evals/"

@alielfilali01 (Contributor) commented Dec 23, 2024

Hey @BobTsang1995
You are already using DP by setting --multi_gpu --num_processes=8, so no GPUs are left for PP.
Consider setting --num_processes=4 for DP; that leaves 8 / 4 = 2 GPUs per replica, so you can still shard the model across 2 GPUs with PP.
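For a rough sense of scale (a back-of-the-envelope estimate, not a measurement from this thread): a 72B-parameter model in bfloat16 needs about 72e9 × 2 bytes ≈ 144 GB for the weights alone, before activations and KV cache, so a single replica cannot fit on one 80 GB GPU and has to be sharded across at least two.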

Run this instead:

accelerate launch --multi_gpu --num_processes=4 -m lighteval accelerate "pretrained=/mnt/sg_nas/liheng/Marco_checkpoint/Qwen2-72B,dtype=bfloat16,max_length=2048,model_parallel=True" --override-batch-size 1  "examples/tasks/OALL_v1_tasks.txt" --custom-tasks "community_tasks/arabic_evals.py" --output-dir="./evals/"

PS: please consider closing this issue.

@BobTsang1995 (Author)

> Hey @BobTsang1995
> You are already using DP by setting --multi_gpu --num_processes=8, so no GPUs are left for PP. Consider setting --num_processes=4 for DP; that leaves 8 / 4 = 2 GPUs per replica, so you can still shard the model across 2 GPUs with PP.
>
> Run this instead:
>
> accelerate launch --multi_gpu --num_processes=4 -m lighteval accelerate "pretrained=/mnt/sg_nas/liheng/Marco_checkpoint/Qwen2-72B,dtype=bfloat16,max_length=2048,model_parallel=True" --override-batch-size 1 "examples/tasks/OALL_v1_tasks.txt" --custom-tasks "community_tasks/arabic_evals.py" --output-dir="./evals/"
>
> PS: please consider closing this issue.

Thanks, have a good day!
