
[BUG/Help] Dual-GPU inference fails to start; only CPU inference works (`return torch.load(checkpoint_file, map_location="cpu")`) #1494

Open
1 task done
MentalBaka opened this issue Oct 14, 2024 · 0 comments


Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

On Linux, I have CUDA installed, and `torch.cuda.device_count()` returns 2.

When I start api.py, the log shows `return torch.load(checkpoint_file, map_location="cpu")`, so GPU inference is clearly not being used.
The api.py code is as follows:

```python
tokenizer = AutoTokenizer.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()
```


But when I follow an online tutorial and switch to loading the model with `load_model_on_gpus`, an error is raised.
My web_demo.py code is as follows:

```python
from transformers import AutoModel, AutoTokenizer
import gradio as gr
import mdtex2html
import os
from utils import load_model_on_gpus

os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
tokenizer = AutoTokenizer.from_pretrained("ChatGLM-6B", trust_remote_code=True)
model = load_model_on_gpus("ChatGLM-6B", num_gpus=2)
model = model.eval()
```
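One detail worth noting about the script above: `CUDA_VISIBLE_DEVICES` is only honoured if it is set before the CUDA context is first created, i.e. before torch or transformers touches a GPU, so setting it at the very top of the script (before the heavy imports) is the safe pattern. A minimal sketch of that pattern, with a hypothetical `visible_gpu_indices` helper added just to illustrate what the variable exposes:

```python
import os

# Set this before any library initializes CUDA; once a CUDA context
# exists, changing the variable has no effect on device visibility.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

def visible_gpu_indices():
    """Parse CUDA_VISIBLE_DEVICES into a list of logical GPU indices.

    (Hypothetical helper for illustration; the real device count comes
    from torch.cuda.device_count() after torch is imported.)
    """
    value = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    return [int(tok) for tok in value.split(",") if tok.strip()]

print(visible_gpu_indices())  # → [0, 1]
```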


The error raised is:
```
Traceback (most recent call last):
  File "openai_api.py", line 173, in <module>
    model = load_model_on_gpus("ChatGLM-6B", num_gpus=2)
  File "/usr/local/glm/ChatGLM-6B/utils.py", line 50, in load_model_on_gpus
    model = dispatch_model(model, device_map=device_map)
  File "/home/xxx/miniconda3/envs/glm/lib/python3.8/site-packages/accelerate/big_modeling.py", line 352, in dispatch_model
    check_device_map(model, device_map)
  File "/home/xxx/miniconda3/envs/glm/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 1420, in check_device_map
    raise ValueError(
ValueError: The device_map provided does not give any device for the following parameters: transformer.embedding.word_embeddings.weight, …
```
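For context, `load_model_on_gpus` in utils.py builds a `device_map` and hands it to `accelerate.dispatch_model`; the `ValueError` means the map does not cover some parameter names (the traceback shows `transformer.embedding.word_embeddings.weight`, which is the ChatGLM2/3 module layout rather than the ChatGLM-6B v1 layout the helper expects). A minimal sketch of a map that does cover those names — the function name `make_device_map`, the layer count of 28, and the remaining module names are assumptions based on that layout, not confirmed by the source:

```python
def make_device_map(num_gpus: int, num_layers: int = 28) -> dict:
    """Sketch: assign every top-level module of a ChatGLM2/3-style model
    to a GPU so accelerate's check_device_map finds no uncovered params.
    """
    # Pin the small shared modules to GPU 0.
    device_map = {
        "transformer.embedding.word_embeddings": 0,
        "transformer.rotary_pos_emb": 0,
        "transformer.encoder.final_layernorm": 0,
        "transformer.output_layer": 0,
    }
    # Split the encoder layers into contiguous chunks, one chunk per GPU,
    # which keeps cross-device traffic to a single boundary per forward.
    per_gpu = (num_layers + num_gpus - 1) // num_gpus
    for i in range(num_layers):
        device_map[f"transformer.encoder.layers.{i}"] = i // per_gpu
    return device_map

device_map = make_device_map(num_gpus=2)
```

Such a map would then be passed as `dispatch_model(model, device_map=device_map)`, the call utils.py already makes.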

Expected Behavior

No response

Steps To Reproduce

OS: Ubuntu 20.04

```shell
cd /usr/local/glm
conda activate glm3
python api.py
```

Contents of api.py:

```python
tokenizer = AutoTokenizer.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/visualglm-6b", trust_remote_code=True).half().cuda()
model = model.eval()
```

Problem observed:

```
return torch.load(checkpoint_file, map_location="cpu")
```

Contents of web_demo.py:

```python
from transformers import AutoModel, AutoTokenizer
import gradio as gr
import mdtex2html
import os
from utils import load_model_on_gpus

os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
tokenizer = AutoTokenizer.from_pretrained("ChatGLM-6B", trust_remote_code=True)
model = load_model_on_gpus("ChatGLM-6B", num_gpus=2)
model = model.eval()
```
Problem observed:

```
Traceback (most recent call last):
  File "openai_api.py", line 173, in <module>
    model = load_model_on_gpus("ChatGLM-6B", num_gpus=2)
  File "/usr/local/glm/ChatGLM-6B/utils.py", line 50, in load_model_on_gpus
    model = dispatch_model(model, device_map=device_map)
  File "/home/xxx/miniconda3/envs/glm/lib/python3.8/site-packages/accelerate/big_modeling.py", line 352, in dispatch_model
    check_device_map(model, device_map)
  File "/home/xxx/miniconda3/envs/glm/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 1420, in check_device_map
    raise ValueError(
ValueError: The device_map provided does not give any device for the following parameters: transformer.embedding.word_embeddings.weight, …
```

Environment

- OS: Ubuntu 20.04
- Python: 3.8
- Transformers:
- PyTorch:
- CUDA Support: True

Anything else?

No response
