For ViT-S training, the batch size per GPU is 64 and the world size is 16, making the effective batch size 1024. Does training become unstable if the effective batch size is increased beyond this, say to 1280?
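For reference, here is a minimal sketch of the arithmetic behind these numbers. It assumes plain data-parallel training with no gradient accumulation. The function name and the two ways of reaching 1280 are illustrative only and are not taken from the repository's configs.

```python
def effective_batch_size(per_gpu_batch: int, world_size: int) -> int:
    """Samples consumed per optimizer step across all data-parallel workers.

    Assumes no gradient accumulation (an assumption, not confirmed by the repo).
    """
    return per_gpu_batch * world_size


# Setup described in the question: 64 per GPU on 16 GPUs.
current = effective_batch_size(64, 16)

# Two hypothetical ways to reach an effective batch size of 1280.
larger_per_gpu = effective_batch_size(80, 16)  # more samples per GPU
more_gpus = effective_batch_size(64, 20)       # more GPUs, same per-GPU batch

print(current, larger_per_gpu, more_gpus)
```

Either route gives the same effective batch size, but they differ in per-GPU memory use, so which one is feasible depends on the hardware.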