Reduce memory by using all_gather_into_tensor (#1968) #907
Job | Run time |
---|---|
3m 50s | |
3m 50s |