Commit
Fix FSDP gradient reduction with orig params
The `param.grad is not None` check also fixes gradient reduction in the case of parameters not having acquired gradients (as parameters could become empty tensors in FSDP).

Thanks to @ofivite for suggesting that `use_orig_params=True` could be the cause of the issue, which greatly helped with analysis.

Signed-off-by: janEbert <[email protected]>
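
A minimal sketch of the guard described above, since the diff itself is not shown here. It uses plain Python stand-ins rather than PyTorch: the `Param` class and the `collect_grads_for_reduction` and `all_reduce_mean` helpers are hypothetical names made up for illustration and are not taken from the commit or from any library. It assumes a parameter that has become empty on this rank has `grad` set to `None`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Param:
    """Hypothetical stand-in for a parameter; not a real PyTorch class."""
    name: str
    data: List[float]
    grad: Optional[List[float]] = None  # None: no gradient was acquired


def collect_grads_for_reduction(params):
    """Return (name, grad) pairs only for parameters that hold a gradient."""
    return [(p.name, p.grad) for p in params if p.grad is not None]


def all_reduce_mean(per_rank_grads):
    """Toy stand-in for an all-reduce: element-wise mean over ranks."""
    world_size = len(per_rank_grads)
    return [sum(vals) / world_size for vals in zip(*per_rank_grads)]


params = [
    Param("weight", [1.0, 2.0], grad=[0.5, 1.5]),
    # Assumed case: the parameter is empty on this rank and has no gradient.
    Param("bias", [], grad=None),
]

to_reduce = collect_grads_for_reduction(params)
# Pretend two ranks contribute identical gradients.
reduced = {name: all_reduce_mean([grad, grad]) for name, grad in to_reduce}
```

Without the `is not None` filter, the toy reduction would try to iterate over `None` for `bias` and fail, which is the kind of failure the check is meant to avoid.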