Comparison with UnivNet #1
Comments
@thepowerfuldeez Fre-GAN is better than UnivNet.
Have you tried to train on LJSpeech or your own dataset? How many iterations are needed compared with HiFi-GAN? Do you have checkpoints somewhere?
I tried it on my own dataset; it takes 150k iterations to generate excellent voice, whereas HiFi-GAN usually takes 1M steps for the same quality.
It only takes 2 days to reach 150k iterations.
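A rough back-of-the-envelope check of the figures quoted above (the 1M-step HiFi-GAN baseline and the 2-day wall clock are taken from the comments; the throughput is derived from them, not measured):

```python
# Back-of-the-envelope numbers based on figures quoted in this thread.
fregan_steps = 150_000     # iterations reported for Fre-GAN
hifigan_steps = 1_000_000  # typical HiFi-GAN steps quoted for similar quality
wall_clock_days = 2        # reported training time for the 150k iterations

step_reduction = hifigan_steps / fregan_steps                     # ~6.7x fewer steps
steps_per_sec = fregan_steps / (wall_clock_days * 24 * 3600)      # ~0.87 it/s

print(f"step reduction: {step_reduction:.1f}x")
print(f"implied throughput: {steps_per_sec:.2f} it/s")
```

This implies a bit under one iteration per second on the 3-GPU setup mentioned later in the thread, which is plausible for a GAN vocoder at batch size 16.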
Got it, thanks.
Tried it out. I compared the publicly available universal v1 HiFi-GAN (trained for 2.5M iterations on VCTK) with this one trained for 150k iterations on the new HiFi-TTS dataset (5 times more data). It sounds great, but I think it should be trained a bit more. Maybe 250k will be enough.
Out of curiosity, how many GPUs did you train with, and which ones?
3x 3090s with batch size 16.
Hi! How does this work compare with UnivNet, for which you already implemented code: https://github.com/rishikksh20/UnivNet-pytorch
That paper is a little bit newer, but AFAIK they're more concerned with the model's generalizability to unseen speakers, while this work focuses on overall quality (especially in the high frequencies).
Can you maybe elaborate?