
How to train SphereFace2 on >1M identities, e.g. WebFace42M? #14

Open
yiminglin-ai opened this issue Jan 31, 2023 · 1 comment
yiminglin-ai commented Jan 31, 2023

Hi @wy1iu @ydwen
Thank you for open-sourcing this repo.
Section 2.4 of the SphereFace2 paper says:

the gradient computations in SphereFace2 are class-independent and can be performed locally within one GPU. Thus no communication cost is needed.

But when you distribute the classifiers $W_i$ across different GPUs, what happens if a GPU receives a batch containing only negative features? The lines after `one_hot` cannot be executed.

Have you tried training on WebFace42M? What is the performance of SphereFace2?

Happy to discuss the implementation details and contribute to this repo :)

ydwen (Owner) commented Mar 9, 2023

It is not a problem for SF2 to receive a batch of only negative features; the final loss is averaged across all GPUs.

"The lines after `one_hot` cannot be executed."
In this case, we construct an all-zero matrix as the labels.
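A minimal NumPy sketch of this idea (function names are hypothetical, and the paper's margins, scale, and class weighting are omitted for brevity): each GPU holds the classes `[class_start, class_end)`, labels outside that shard simply produce zero rows in the local one-hot matrix, and the binary (one-vs-rest) formulation of SphereFace2 keeps the loss well-defined even when the matrix is all zeros.

```python
import numpy as np

def shard_one_hot(labels, class_start, class_end):
    """One-hot targets for the classes [class_start, class_end) held locally.
    Labels outside the shard yield all-zero rows, so an all-negative batch
    produces exactly the zero matrix mentioned above."""
    num_local = class_end - class_start
    one_hot = np.zeros((len(labels), num_local), dtype=np.float32)
    in_shard = (labels >= class_start) & (labels < class_end)
    one_hot[np.where(in_shard)[0], labels[in_shard] - class_start] = 1.0
    return one_hot

def sf2_binary_loss(logits, one_hot):
    """Simplified SphereFace2-style loss: each local class is an independent
    binary problem, so no cross-GPU softmax normalization is needed.
    targets t in {+1, -1}; logaddexp(0, -t*z) = log(1 + exp(-t*z))."""
    t = 2.0 * one_hot - 1.0
    return np.mean(np.logaddexp(0.0, -t * logits))

# A shard holding classes [0, 4) that receives only negatives (labels 5, 7):
labels = np.array([5, 7])
one_hot = shard_one_hot(labels, 0, 4)   # all-zero matrix, shape (2, 4)
loss = sf2_binary_loss(np.zeros((2, 4), dtype=np.float32), one_hot)
```

In a real model-parallel setup the per-shard losses would then be summed or averaged across GPUs with an all-reduce; no per-sample communication is needed because every term is local to one shard.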

I haven't tried WebFace42M yet, since it requires too much computational resource.
I am looking forward to seeing how SF2 works on this dataset.
