How can I train the model by using Accelerator? #147

Open
huanghaosen110 opened this issue Jun 4, 2024 · 0 comments

The official training code is too slow, so I use Accelerator to speed up training. With mixed_precision="bf16" the forward pass runs fine, but the backward pass always fails with this error:
  File "E:\DeepLearningProject\Anti-DreamBooth-main2\train2.py", line 233, in <module>
    trainUnet()
  File "E:\DeepLearningProject\Anti-DreamBooth-main2\train2.py", line 195, in trainUnet
    accelerator.backward(total_loss)
  File "E:\miniconda\envs\py310\lib\site-packages\accelerate\accelerator.py", line 1853, in backward
    loss.backward(**kwargs)
  File "E:\miniconda\envs\py310\lib\site-packages\torch\_tensor.py", line 522, in backward
    torch.autograd.backward(
  File "E:\miniconda\envs\py310\lib\site-packages\torch\autograd\__init__.py", line 266, in backward
    Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
  File "E:\miniconda\envs\py310\lib\site-packages\torch\autograd\function.py", line 289, in apply
    return user_fn(self, *args)
  File "E:\DeepLearningProject\Anti-DreamBooth-main2\guided_diffusion\nn.py", line 168, in backward
    output_tensors = ctx.run_function(*shallow_copies)
  File "E:\DeepLearningProject\Anti-DreamBooth-main2\guided_diffusion\unet.py", line 304, in _forward
    qkv = self.qkv(self.norm(x))
  File "E:\miniconda\envs\py310\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\miniconda\envs\py310\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\miniconda\envs\py310\lib\site-packages\torch\nn\modules\conv.py", line 310, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "E:\miniconda\envs\py310\lib\site-packages\torch\nn\modules\conv.py", line 306, in _conv_forward
    return F.conv1d(input, weight, bias, self.stride,
RuntimeError: Input type (struct c10::BFloat16) and bias type (float) should be the same
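
The traceback passes through the `backward` of a custom checkpoint function in `guided_diffusion\nn.py`, which re-runs the block's forward during the backward pass. A likely cause, assuming autocast is only active around the model's forward call, is that this re-run happens outside autocast, so the saved activation is still bf16 while the layer's weight and bias are float32. The sketch below shows one possible workaround: re-enter autocast when the forward is replayed. `AutocastCheckpoint`, `checkpoint`, and the `autocast_dtype` argument are hypothetical names for illustration, not the repository's code, and a `Linear` layer on CPU stands in for the attention block's `conv1d`.

```python
import contextlib

import torch
import torch.nn as nn


class AutocastCheckpoint(torch.autograd.Function):
    """Hypothetical checkpoint function that replays the forward under autocast."""

    @staticmethod
    def forward(ctx, run_function, length, autocast_dtype, *args):
        ctx.run_function = run_function
        ctx.autocast_dtype = autocast_dtype  # assumption: passed in by the caller
        ctx.input_tensors = list(args[:length])
        ctx.input_params = list(args[length:])
        # Activations are not stored; they are recomputed in backward.
        with torch.no_grad():
            return ctx.run_function(*ctx.input_tensors)

    @staticmethod
    def backward(ctx, *output_grads):
        inputs = [x.detach().requires_grad_(True) for x in ctx.input_tensors]
        if ctx.autocast_dtype is None:
            amp = contextlib.nullcontext()
        else:
            # Re-enter autocast so bf16 activations and float32 parameters
            # are cast consistently, as they were in the original forward.
            amp = torch.autocast(
                device_type=inputs[0].device.type, dtype=ctx.autocast_dtype
            )
        with torch.enable_grad(), amp:
            shallow_copies = [x.view_as(x) for x in inputs]
            outputs = ctx.run_function(*shallow_copies)
        grads = torch.autograd.grad(
            outputs, inputs + ctx.input_params, output_grads, allow_unused=True
        )
        # One None per non-tensor argument of forward.
        return (None, None, None) + grads


def checkpoint(func, inputs, params, autocast_dtype=None):
    """Run func(*inputs) without storing intermediate activations."""
    args = tuple(inputs) + tuple(params)
    return AutocastCheckpoint.apply(func, len(inputs), autocast_dtype, *args)


torch.manual_seed(0)
layer = nn.Linear(8, 8)  # parameters stay float32
x = torch.randn(4, 8)

with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    h = x.to(torch.bfloat16)  # stands in for a bf16 activation from an earlier layer
    out = checkpoint(
        layer, (h,), tuple(layer.parameters()), autocast_dtype=torch.bfloat16
    )
    loss = out.float().mean()

loss.backward()  # called outside the autocast context, as in the traceback
```

In a real run the dtype passed in would match the `mixed_precision` setting and the tensors would live on the GPU. Turning checkpointing off for the affected blocks, if the model allows it, is another option at the cost of extra memory.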
