RuntimeError: 'weight' must be 2-D #3
Comments
DeepSpeed ZeRO-3 currently throws a lot of errors. We're working on it and will have a fix out soon.
@Hasan-Syed25 could you paste your environment and your code so I can dig into it?
The same problem happens to me. Has anyone solved it?
The same problem happens to me. Has anyone solved this problem?
I would like to add a +1 too. I guess it's probably because DeepSpeed is wrapping the student model but not the teacher model. A quick fix could be to store the teacher logits offline.
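The "store the logits offline" workaround suggested above could look roughly like the sketch below. It is not code from this repository: `cache_teacher_logits`, `distillation_loss`, the temperature value, and the toy `Linear` models are all hypothetical stand-ins. The idea is to run the teacher once in a plain, unwrapped process, save its logits to disk, and then train only the student under DeepSpeed or FSDP against the cached logits.

```python
import os
import tempfile

import torch
import torch.nn.functional as F


@torch.no_grad()
def cache_teacher_logits(teacher, batches, path):
    """Run the teacher once, outside any distributed wrapper, and save its logits."""
    teacher.eval()
    logits = [teacher(batch).cpu() for batch in batches]
    torch.save(logits, path)


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student distributions."""
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    kl = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
    return kl * temperature**2


# Toy stand-ins (hypothetical): a real run would use the actual teacher and student.
torch.manual_seed(0)
teacher = torch.nn.Linear(8, 4)
student = torch.nn.Linear(8, 4)
batches = [torch.randn(2, 8) for _ in range(3)]

# Step 1 (offline): cache the teacher logits to disk.
path = os.path.join(tempfile.mkdtemp(), "teacher_logits.pt")
cache_teacher_logits(teacher, batches, path)

# Step 2 (training): only the student is live; teacher logits come from disk.
cached = torch.load(path)
losses = [distillation_loss(student(b), t) for b, t in zip(batches, cached)]
```

This only works if the cached logits stay aligned with the training batches, so the data order (or a per-example key) has to be fixed between the two steps. Full-vocabulary logits can also get large on disk, which is the main cost of this approach.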
FSDP also brings up the same issue.
RuntimeError: 'weight' must be 2-D occurs when I am using DeepSpeed ZeRO-3 for distributed training. Is this an issue with DeepSpeed, or is it an initialization issue? Here is a link to the same issue that I am facing. What am I missing here?
Thanks
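For readers trying to narrow this down, the message itself comes from PyTorch's embedding lookup, which requires a 2-D weight matrix. The sketch below reproduces it without DeepSpeed by passing a 1-D, zero-element tensor as the weight. That this matches what happens in the reported run is an assumption: the idea is that under ZeRO-3 a partitioned parameter that has not been gathered is left as an empty placeholder, which is what an unwrapped model's embedding layer would then see.

```python
import torch
import torch.nn.functional as F

ids = torch.tensor([0, 1])

# Stand-in (assumption) for a partitioned parameter that was never gathered:
# a 1-D tensor with zero elements instead of a [vocab_size, hidden_size] matrix.
placeholder_weight = torch.empty(0)

try:
    F.embedding(ids, placeholder_weight)
    message = None
except RuntimeError as err:
    message = str(err)

print(message)
```

If the traceback in a real run ends in an embedding call, printing the `shape` of that layer's `weight` just before the forward pass is a quick way to check whether it is an empty placeholder.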