The current code seems to compute the loss on the full global similarity matrix on every GPU. Computing the loss only between each GPU's local features and the gathered global features, as described in openai/CLIP#132, seems more computationally and memory efficient.
Sorry to bother you if I misunderstood the code.
My idea is just like yours. After debugging, I found that during a training epoch every GPU computes the same global loss from the same sim_matrix, instead of each GPU computing its own local loss and then gathering and averaging the results, so there is clear duplicated computation. I also noticed that in the function "train_epoch" there is a useless loss.mean() that does nothing after model.forward(). We only need to compute the local loss, following openai/CLIP#132, and call loss.backward(); gradient synchronization is then handled automatically by DDP (a sketch follows the code reference below).
CLIP4Clip/modules/modeling.py, line 400 (commit 508ffa3)
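For reference, here is a minimal sketch of the local-loss idea from openai/CLIP#132, not the repository's actual code: the function names (gather_features, local_clip_loss) and the assumption that each rank holds video/text features of shape (B, D) are mine. Each rank gathers features from all ranks but computes cross-entropy only over its own local rows, then relies on DDP to average gradients during backward.

```python
import torch
import torch.distributed as dist
import torch.nn.functional as F

def gather_features(feat):
    """All-gather features from every rank; keep the local chunk differentiable."""
    world_size = dist.get_world_size()
    gathered = [torch.zeros_like(feat) for _ in range(world_size)]
    dist.all_gather(gathered, feat)
    # all_gather does not propagate gradients, so splice the local tensor back in
    gathered[dist.get_rank()] = feat
    return torch.cat(gathered, dim=0)

def local_clip_loss(video_feat, text_feat, logit_scale):
    """Contrastive loss computed only for this rank's local samples."""
    rank = dist.get_rank()
    all_video = gather_features(video_feat)   # (B * world_size, D)
    all_text = gather_features(text_feat)     # (B * world_size, D)

    # local-to-global similarities: (B, B * world_size)
    logits_per_video = logit_scale * video_feat @ all_text.t()
    logits_per_text = logit_scale * text_feat @ all_video.t()

    # indices of this rank's samples inside the gathered batch
    batch_size = video_feat.shape[0]
    labels = torch.arange(batch_size, device=video_feat.device) + rank * batch_size

    loss = (F.cross_entropy(logits_per_video, labels) +
            F.cross_entropy(logits_per_text, labels)) / 2
    return loss  # loss.backward() lets DDP all-reduce the gradients
```

With this, each GPU only materializes a (B, B * world_size) similarity matrix instead of the full (B * world_size, B * world_size) one, and no extra loss gathering or averaging is needed beyond DDP's built-in gradient all-reduce.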