Commit
Don't save MoE load-balancing loss tensors if args.moe_loss_weight=0 (#119)
Saving these tensors takes GPU memory, and can also cause a leak if clear_load_balancing_loss() is not called.

Co-authored-by: Michael Gokhman <[email protected]>
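The guard described in this message can be sketched roughly as follows. This is a minimal illustration, not the actual diff: only `args.moe_loss_weight` and `clear_load_balancing_loss()` appear in the commit message, while `Args`, `_saved_losses`, `save_load_balancing_loss`, and `simulate_steps` are hypothetical names, and plain tuples stand in for the GPU tensors.

```python
from dataclasses import dataclass

# Hypothetical module-level buffer standing in for wherever the
# load-balancing loss tensors are accumulated between steps.
_saved_losses = []


@dataclass
class Args:
    # Mirrors args.moe_loss_weight from the commit message.
    moe_loss_weight: float = 0.0


def save_load_balancing_loss(args, loss):
    """Store the loss inputs only when they will actually be used."""
    if args.moe_loss_weight == 0:
        # The loss term is disabled, so nothing would ever consume
        # (or necessarily clear) these entries; skip saving them.
        return
    _saved_losses.append(loss)


def clear_load_balancing_loss():
    """Release everything saved so far (name taken from the commit message)."""
    _saved_losses.clear()


def simulate_steps(args, num_steps, call_clear):
    """Run a few fake training steps and report how many entries remain."""
    for step in range(num_steps):
        # A tuple stands in for the router tensors saved on each step.
        save_load_balancing_loss(args, ("router_stats", step))
        if call_clear:
            clear_load_balancing_loss()
    return len(_saved_losses)
```

In this sketch, with `moe_loss_weight=0` the buffer stays empty even if the caller never clears it, which is the situation the commit message describes as a potential leak. With a nonzero weight the buffer still grows unless `clear_load_balancing_loss()` is called each step.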