Reduce memory by using all_gather_into_tensor
#3277
| Job | Run time |
|---|---|
| | 3m 36s |
| | 5m 17s |
| | 2m 40s |
| | 1m 12s |
| | 1m 18s |
| | 1m 6s |
| | 1m 8s |
| | 3m 58s |
| | 2m 44s |
| | 7m 29s |
| | 3m 38s |
| | 5m 31s |
| | 2m 9s |
| | 1m 55s |
| | 1m 24s |
| | 1m 48s |
| | 1m 5s |
| | 3m 41s |
| | 3m 28s |
| | 12m 52s |
| Total | 1h 7m 59s |