Hi @wy1iu @ydwen
Thank you for open-sourcing this repo.
Section 2.4 of the SphereFace2 paper says:

> the gradient computations in SphereFace2 are class-independent and can be performed locally within one GPU. Thus no communication cost is needed.
But when the classifiers $W_i$ are distributed across different GPUs, what happens if a GPU receives a batch that contains only negative features for its local classes? In that case the lines after `one_hot` cannot be executed.
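For concreteness, here is a rough sketch of what I mean (my own simplification, not the exact loss in the paper or in this repo). The function name `shard_loss`, the argument `class_start`, and the scale/margin values are all hypothetical, just for illustration:

```python
import torch
import torch.nn.functional as F

def shard_loss(feats, w_shard, labels, class_start, r=40.0, m=0.4, b=0.0):
    """Hypothetical per-GPU shard of a SphereFace2-style binary loss.

    feats:       (B, d) features gathered on this GPU
    w_shard:     (d, C_local) only the classes owned by this GPU
    labels:      (B,) global class ids
    class_start: first global class id owned by this shard
    """
    C_local = w_shard.shape[1]
    cos = F.normalize(feats, dim=1) @ F.normalize(w_shard, dim=0)  # (B, C_local)

    # If no label in the batch falls inside [class_start, class_start + C_local),
    # one_hot is all zeros on this GPU -- this is exactly the case I am asking about.
    local = labels - class_start
    in_shard = (local >= 0) & (local < C_local)
    one_hot = torch.zeros_like(cos)
    one_hot[in_shard, local[in_shard]] = 1.0

    # Selecting positive/negative terms through the mask (instead of indexing the
    # positive entries after one_hot) keeps the all-negative shard well-defined:
    # it simply degenerates to the sum of the negative binary terms.
    logits = torch.where(one_hot.bool(), r * (cos - m) - b, r * (cos + m) + b)
    sign = 1.0 - 2.0 * one_hot  # -1 for positive pairs, +1 for negatives
    return F.softplus(sign * logits).mean()
```

With the masked form above an all-negative shard is still fine, but if the implementation instead indexes the positive entries after building `one_hot`, that indexing selects nothing on such a GPU, which is the situation I am unsure about.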
Have you tried training on WebFace42M? If so, what is the performance of SphereFace2?
Happy to discuss the implementation details and contribute to this repo :)