Please check that this issue hasn't been reported before.
I searched previous Bug Reports and didn't find any similar reports.
Expected Behavior
I expect DeepSpeed ZeRO-3 and 8-bit LoRA to be compatible and to run without error.
Current behaviour
When loading the model with DeepSpeed ZeRO-3 and 8-bit LoRA enabled, I ran into this error:
RuntimeError: Only Tensors of floating point and complex dtype can require gradients
However, using ZeRO-3 together with 4-bit QLoRA, or doing full fine-tuning with ZeRO-3 enabled, works fine.
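For reference, the same message can be produced in plain PyTorch, without axolotl or DeepSpeed, by asking an integer tensor to track gradients. The sketch below only illustrates what the error means. The int8 tensor is a stand-in for a quantized 8-bit weight and `try_make_trainable` is a hypothetical helper; neither is taken from axolotl, and nothing here confirms where the failure actually happens in the ZeRO-3 loading path.

```python
import torch

# Assumption for illustration: an int8 tensor standing in for a quantized weight.
int8_weight = torch.zeros(4, 4, dtype=torch.int8)


def try_make_trainable(tensor):
    """Wrap `tensor` as a trainable nn.Parameter.

    Returns the RuntimeError message if PyTorch refuses, otherwise None.
    (Hypothetical helper, not part of axolotl or DeepSpeed.)
    """
    try:
        torch.nn.Parameter(tensor, requires_grad=True)
    except RuntimeError as err:
        return str(err)
    return None


# Integer storage cannot require gradients, so this yields an error message.
int8_error = try_make_trainable(int8_weight)

# A floating point tensor of the same shape is accepted.
float_error = try_make_trainable(torch.zeros(4, 4, dtype=torch.float16))

# The same int8 tensor is accepted once gradients are explicitly disabled.
frozen = torch.nn.Parameter(int8_weight, requires_grad=False)

print(int8_error)
```

If the root cause is similar, some code path would be re-creating or re-wrapping the 8-bit base weights with gradients enabled when ZeRO-3 is active, but that is a guess rather than a diagnosis.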
Steps to reproduce
Set up 8-bit LoRA.
Enable DeepSpeed ZeRO-3.
Loading the model then fails with:
RuntimeError: Only Tensors of floating point and complex dtype can require gradients
Possible solution
(Putting on a tinfoil hat) I think it's a bug within the axolotl codebase rather than some deeper issue with DeepSpeed ZeRO-3, seeing as it works with QLoRA.
Which Operating Systems are you using?
Linux
macOS
Windows
Python Version
3.11.10
axolotl branch-commit
main
Acknowledgements
My issue title is concise, descriptive, and in title casing.
I have searched the existing issues to make sure this bug has not been reported yet.
I am using the latest version of axolotl.
I have provided enough information for the maintainers to reproduce and diagnose the issue.