[BUG] DatasetNotFoundError: Dataset 'asas-ai/AraTrust-categorized' doesn't exist on the Hub or cannot be accessed. #441
Comments
Thanks for the issue! It would help a lot if you could follow the template, and provide, each time you report a bug:
|
Sorry, my bad, bro. I submitted the detailed error information above. |
Hey @BobTsang1995 & @clefourrier, the dataset is public again (someone switched it to private by mistake). Find here. @BobTsang1995, can you please keep commenting here, even after the issue gets closed, if you face any further issues? Usually they are just minor things; otherwise, if something is really wrong, we can open a separate issue for it and address it properly. |
@BobTsang1995 thanks a lot for the update, much better! :) |
Thanks a lot, I will try it again. I appreciate your reply. |
One last thing. |
LOL! This is my unique naming method. We have indeed done a lot of work on the mmmlu benchmark. Our multilingual model will be coming soon, so you can look forward to it. |
Another error occurred. It seems that some entries do not have these keys: KeyError: 'sol3' @alielfilali01 |
Oh shoot! That's because of the last PR, #440! |
The last PR fixed the issue that @BobTsang1995 had on the previous subset, because this line caused an error. Please find a way to explicitly provide the columns (for example, providing the full list of all allowed columns instead of excluding some). |
Oh sorry! I got mixed up between the issues! OK, I will investigate this further and update the associated PR. |
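The allow-list idea suggested above could look roughly like the sketch below. The column names are assumptions modelled on the 'sol3' key from the traceback, and `extract_choices` is a hypothetical helper, not a function from lighteval or from the PR.

```python
# Hypothetical column names, modelled on the 'sol3' key in the KeyError above.
ALLOWED_CHOICE_COLUMNS = ["sol1", "sol2", "sol3", "sol4", "sol5"]


def extract_choices(row, allowed=ALLOWED_CHOICE_COLUMNS):
    """Return the answer choices present in `row`, in allow-list order.

    Columns that are missing from the row (or hold None) are skipped instead
    of raising KeyError, so subsets with fewer choices still work.
    """
    return [row[col] for col in allowed if row.get(col) is not None]


# Made-up rows: one subset with two choices, one with four.
two_choice_row = {"query": "q", "label": "0", "sol1": "yes", "sol2": "no"}
four_choice_row = {"query": "q", "label": "2",
                   "sol1": "a", "sol2": "b", "sol3": "c", "sol4": "d"}

print(extract_choices(two_choice_row))
print(extract_choices(four_choice_row))
```

Listing the allowed columns explicitly means that unrelated columns such as `query` or `label` can never leak into the choices, which is the risk with an exclusion list.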
There is still a problem with PR #444. The task_name passed to the alghafa_pfn() function has a prefix of community|xxx:, so the key cannot be found in the dictionary. I think you should run the code before opening the PR. At the same time, it seems that you added an undefined parameter when inheriting from the Doc class, resulting in an error: TypeError: Doc.__init__() got an unexpected keyword argument 'target_for_fewshot_sorting' |
Hey @BobTsang1995, thanks for the highlight. Well, the suite "community" is already part of the config, so naturally it shouldn't be part of the task_name. Also, the main task "alghafa:" is already defined in the tasks list, which later goes into the final tasks table at the end of the script. So in theory it should be fine! But you are correct, I hadn't tested the code before pushing, as I was actually leaving my desk before I checked the notification 🥲 I'll give it a look tomorrow and try to fix it ASAP. Thanks again for your feedback. |
Sorry to bother you. Do you have any plans to fix this issue today? |
It is fixed (the test went well) and waiting for PR #444 to be merged. |
@alielfilali01 another question, bro: I now want to evaluate a 72B model, but accelerate always runs out of memory (OOM) when I use model parallelism. I'm wondering how you evaluate large models with lighteval.
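One defensive way to handle the prefixed name described above is to strip the suite and task-family prefixes before the dictionary lookup. This is only a sketch: `normalize_task_name`, the `SUBSET_SETTINGS` dictionary, and the subset name `subset_a` are hypothetical and do not come from the PR.

```python
def normalize_task_name(task_name):
    """Strip an optional 'suite|' prefix and an optional 'family:' prefix."""
    # "community|alghafa:subset_a" -> "alghafa:subset_a"
    without_suite = task_name.split("|", 1)[-1]
    # "alghafa:subset_a" -> "subset_a"
    return without_suite.split(":", 1)[-1]


# Hypothetical lookup table keyed by the bare subset name.
SUBSET_SETTINGS = {"subset_a": {"num_choices": 2}}

key = normalize_task_name("community|alghafa:subset_a")
print(key, SUBSET_SETTINGS[key])
```

Because both splits fall back to the full string when the separator is absent, the helper also accepts names that are already bare.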
|
Hey @BobTsang1995, run this instead: accelerate launch --multi_gpu --num_processes=4 -m lighteval accelerate "pretrained=/mnt/sg_nas/liheng/Marco_checkpoint/Qwen2-72B,dtype=bfloat16,max_length=2048,model_parallel=True" --override-batch-size 1 "examples/tasks/OALL_v1_tasks.txt" --custom-tasks "community_tasks/arabic_evals.py" --output-dir="./evals/" PS: please consider closing this issue. |
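For context on why a 72B model runs out of memory so easily, here is a back-of-the-envelope estimate of the weight footprint in bfloat16. It assumes a nominal 72e9 parameters, 2 bytes per parameter, and an even split across an assumed four GPUs; activations, the KV cache, and framework overhead are not counted, so real usage is higher.

```python
def weight_memory_gib(num_params, bytes_per_param=2):
    """Memory needed for the weights alone, in GiB (bfloat16 = 2 bytes each)."""
    return num_params * bytes_per_param / 2**30


total_gib = weight_memory_gib(72e9)   # nominal 72B parameters (assumption)
per_gpu_gib = total_gib / 4           # assumes an even split over 4 GPUs

print(f"{total_gib:.1f} GiB total, {per_gpu_gib:.1f} GiB per GPU")
# prints "134.1 GiB total, 33.5 GiB per GPU"
```

If each process loaded a full copy of the weights rather than a shard, every GPU would need the whole total, which may be one reason such a run fails.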
thx, bro have a good day~ |
It seems that the asas-ai/AraTrust-categorized dataset does not exist on Hugging Face. Can you fix it? @alielfilali01
Describe the bug
When trying to run lighteval with custom Arabic evaluation tasks, it fails with a DatasetNotFoundError, indicating that the dataset 'asas-ai/AraTrust-categorized' cannot be found on the Hugging Face Hub.
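A failure like this can be caught before a long evaluation starts by probing every dataset the task list depends on. The sketch below is an illustration only: `find_inaccessible` and `fake_probe` are hypothetical names, `some-org/some-dataset` is a placeholder, and the probe is a stand-in so the example runs offline.

```python
def find_inaccessible(repo_ids, probe):
    """Return (repo_id, error text) for every repo that `probe` fails on."""
    failures = []
    for repo_id in repo_ids:
        try:
            probe(repo_id)
        except Exception as err:  # broad on purpose: missing, private, network
            failures.append((repo_id, f"{type(err).__name__}: {err}"))
    return failures


def fake_probe(repo_id):
    """Offline stand-in for a real Hub lookup (assumption for this sketch)."""
    if repo_id == "asas-ai/AraTrust-categorized":
        raise LookupError("not found or private")


failures = find_inaccessible(
    ["asas-ai/AraTrust-categorized", "some-org/some-dataset"], fake_probe
)
for repo_id, reason in failures:
    print(f"{repo_id}: {reason}")
```

In a real setup the probe could wrap a Hub metadata call such as `huggingface_hub.HfApi().dataset_info`, which, as far as I know, raises an error for repositories that are missing or private.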
To Reproduce
Full Error Message
Version info