NotImplementedError: Cannot copy out of meta tensor; no data! #26510
Comments
Hello sir can you assign the issue to me |
It does not provide me the option to assign anyone. Sorry! |
@mdazfar2 feel free to open a PR and link it to this issue if you'd like to work on it! |
It works without FSDP (i.e. with DDP) |
@LysandreJik Yeah, OK, I will do it now. |
It works with DeepSpeed Stage2 as well. The error only occurs when using FSDP to train. |
Hit the same problem on slurm as well. |
The same problem occurs with llama2 as well. This is not specific to MistralAI.
|
What are possible reasons? I could run my code with 4.33.1. Is it accelerate? |
Maybe cc @muellerzr as well |
Gentle ping @muellerzr @pacman100 |
Hello, using the latest releases of transformers (4.35.0) and Accelerate (0.24.1), I am unable to reproduce the issue.
|
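A quick way to compare a local environment against the versions mentioned above (transformers 4.35.0 and Accelerate 0.24.1, reported as not reproducing the error) is sketched below. This is only an illustration using the standard library; the helper names `release_tuple`, `at_least`, and `check_environment` are hypothetical, and the comparison looks only at the leading numeric part of each version string, so a `.dev0` build is treated like its release.

```python
import re
from importlib import metadata

# Versions reported in this thread as NOT reproducing the error.
REPORTED_OK = {"transformers": "4.35.0", "accelerate": "0.24.1"}


def release_tuple(version):
    """Return the leading numeric components, e.g. '4.34.0.dev0' -> (4, 34, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def at_least(installed, minimum):
    """True if `installed` is at or above `minimum` (numeric part only)."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '4.35' and '4.35.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(required=REPORTED_OK):
    """Map each package name to (installed version or None, meets minimum?)."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, at_least(installed, minimum))
    return report


for package, (installed, ok) in check_environment().items():
    print(f"{package}: installed={installed} meets_reported_version={ok}")
```

With the versions from the original report (transformers 4.34.0.dev0, accelerate 0.23.0), both checks would come out false, which matches the observation that the error was seen on the older versions and not on the newer ones.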
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored. |
The missing name of config parameter `` in the comment above is |
System Info
transformers==4.34.0.dev0
accelerate==0.23.0
torch==2.0.1
cuda==11.7
Who can help?
No response
Reproduction
import transformers
# model_path: path to the Mistral checkpoint (value not given in the report)
model = transformers.MistralForCausalLM.from_pretrained(model_path)
Error:
Traceback (most recent call last):
  File "./trainer.py", line 198, in <module>
    train()
  File "./trainer.py", line 152, in train
    model = transformers.MistralForCausalLM.from_pretrained(
  File "/opt/conda/envs/ptca/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3301, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/opt/conda/envs/ptca/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3689, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/opt/conda/envs/ptca/lib/python3.8/site-packages/transformers/modeling_utils.py", line 741, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/opt/conda/envs/ptca/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 317, in set_module_tensor_to_device
    new_value = value.to(device)
NotImplementedError: Cannot copy out of meta tensor; no data!
Expected behavior
The model loads successfully.
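For context, the final exception in the traceback can be triggered in isolation with plain PyTorch, without transformers, accelerate, or FSDP. The sketch below is only an illustration of what the message means (a tensor on the `meta` device has a shape and dtype but no data to copy); it does not show why the value reaches `value.to(device)` as a meta tensor in the FSDP setup. The helper name `copy_out_of_meta` is hypothetical, and the only assumption is that `torch` is installed.

```python
import torch


def copy_out_of_meta():
    """Try to move a meta tensor to the CPU; return the error message, or None."""
    # A meta tensor carries shape and dtype only; there is no storage behind it.
    t = torch.empty(2, 3, device="meta")
    try:
        t.to("cpu")
    except NotImplementedError as exc:
        return str(exc)
    return None


msg = copy_out_of_meta()
print(msg)

# Allocating fresh (uninitialized) storage with the same shape does work,
# because nothing has to be copied out of the meta tensor.
meta = torch.empty(2, 3, device="meta")
real = torch.empty_like(meta, device="cpu")
print(real.shape, real.device)
```

The second half allocates new memory rather than recovering any values, so real weights would still have to be loaded into it afterwards.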