wav output has no sound #1
Comments
Thank you for raising the issue. You have added the correct file paths. The problem is that TTS speech generation from text requires at least 100000 epochs to produce suitable output. It also requires a large audio dataset. You can use Colab Pro or AWS to achieve the results. This is the same issue you are describing! Refer to the comments in the solution -
Also, if anyone gets this working on Colab, do let me know the workaround.
@souvikg544 I tried using over an hour of cleaned data. Training took about 50 minutes, but out.wav still has no speech, just a buzzing sound for about a second. Every text extraction was successful, yielding 483 extracted 10-second clips. TensorBoard would not launch, so I skipped that stage. AudioProcessor from TTS.utils.audio raises an error the first time I run the command but runs normally, with no changes, the second time. Inferencing code below: The model recorded 100 epochs in training on 1 hour of data, so would the suggested 100000 epochs require 1000 hours of audio?
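On the closing question: an epoch is one full pass over the same dataset, so raising the epoch count adds training time rather than requiring more audio. Here is a rough sketch of that arithmetic using the figures quoted above (100 epochs in about 50 minutes). It assumes the time per epoch stays constant on the same hardware, which may not hold in practice, and `estimate_training_minutes` is just an illustrative helper name.

```python
def estimate_training_minutes(epochs_done, minutes_taken, target_epochs):
    """Linearly extrapolate training time from an observed run.

    Assumption: every epoch takes the same time as in the observed run
    (same dataset, same hardware, same batch size).
    """
    minutes_per_epoch = minutes_taken / epochs_done
    return minutes_per_epoch * target_epochs


if __name__ == "__main__":
    # Figures from the comment above: 100 epochs took about 50 minutes.
    total_minutes = estimate_training_minutes(100, 50, 100_000)
    print(f"{total_minutes:.0f} minutes")
    print(f"{total_minutes / 60:.1f} hours")
    print(f"{total_minutes / (60 * 24):.1f} days")
```

Under that assumption, the same one hour of audio would be reused on every pass; only the wall-clock training time grows.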
@souvikg544
I've tried running this part of the script, but it raises a no-argument error for these two:
I manually copied the paths of the Best Model.pth and the config.json under tts_train_dir, and then it runs with no error, but the output wav file has no speech, just a monotone buzzing sound.
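To put numbers on a symptom like this, a small check of the output file's length and peak level can help. The following is a minimal sketch using only the Python standard library; it assumes the file is 16-bit PCM (not confirmed in this thread), and `summarize_pcm16` and `describe_wav` are hypothetical helper names. It cannot tell speech from buzzing; it only reports duration and whether any signal is present.

```python
import array
import sys
import wave


def summarize_pcm16(frames, sample_rate, channels):
    """Return duration and normalized peak for raw 16-bit PCM bytes."""
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    n_frames = len(samples) // channels
    peak = max((abs(s) for s in samples), default=0) / 32768
    return {"duration_s": n_frames / sample_rate, "peak": peak}


def describe_wav(path):
    """Open a WAV file and summarize it (16-bit PCM only in this sketch)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
        return summarize_pcm16(frames, wav.getframerate(), wav.getnchannels())


# Example (the file name is taken from the thread; adjust the path as needed):
# print(describe_wav("out.wav"))
```

A duration far shorter than the input sentence would take to speak is consistent with an undertrained model, though it does not prove that is the cause.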
Also, TensorBoard wouldn't launch, so I skipped that step; this could be related.
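For the manual path copying described above, one possible alternative is to search the training directory for the newest checkpoint and config. This is only a sketch: the file names `best_model.pth` and `config.json` and the directory name `tts_train_dir` are assumptions based on this thread, and `find_latest` is a hypothetical helper, not part of any TTS library.

```python
from pathlib import Path


def find_latest(root, filename):
    """Return the most recently modified file named `filename` under `root`.

    Returns None when no such file exists, so the caller can fail early
    instead of passing an empty argument to the inference step.
    """
    matches = sorted(Path(root).rglob(filename), key=lambda p: p.stat().st_mtime)
    return matches[-1] if matches else None


# Example usage (names are assumptions, adjust to your run):
# model_path = find_latest("tts_train_dir", "best_model.pth")
# config_path = find_latest("tts_train_dir", "config.json")
# if model_path is None or config_path is None:
#     raise FileNotFoundError("no checkpoint or config found; did training finish?")
```

Checking for `None` before inference makes a missing-argument error easier to trace back to a missing file.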