Audio Length Limitation and FlashAttention Warning in Parler TTS #1

Open
suman819 opened this issue Sep 5, 2024 · 0 comments

suman819 commented Sep 5, 2024

I have been working with Parler TTS and cannot generate audio longer than about 20 seconds. I have tried several approaches, including streaming and splitting the text into chunks, but the output is still truncated at around 15-20 seconds.

I also tried splitting the text at punctuation (periods and commas) whenever it would exceed 30 seconds of audio or 600 characters. However, when I combine the resulting audio segments, the voice tone is inconsistent from one segment to the next, even with a specific voice prompt set.
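
For reference, the splitting step described above can be sketched roughly as follows. This is a minimal illustration, not the exact code used: the function name `split_text` and its `max_chars` parameter are hypothetical, and only the 600-character limit and the `.`/`,` break points come from the description. It does not touch Parler TTS itself.

```python
import re


def split_text(text: str, max_chars: int = 600) -> list[str]:
    """Group text into chunks of at most max_chars, breaking after '.' or ','."""
    # Split on whitespace that follows a period or comma, so the
    # punctuation mark stays attached to the piece it ends.
    pieces = [p.strip() for p in re.split(r"(?<=[.,])\s+", text) if p.strip()]

    chunks: list[str] = []
    current = ""
    for piece in pieces:
        candidate = f"{current} {piece}".strip()
        if len(candidate) <= max_chars:
            # The piece still fits in the chunk being built.
            current = candidate
        else:
            if current:
                chunks.append(current)
            # Assumption: a single piece longer than max_chars is kept whole
            # rather than being cut in the middle of a word.
            current = piece
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "Hello world. This is a test, with commas. Done."
    for chunk in split_text(sample, max_chars=20):
        print(repr(chunk))
```

Each chunk would then be synthesized separately and the resulting waveforms concatenated, which is the step where the tone inconsistency shows up.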

Additionally, I received a warning stating that FlashAttention is not installed. Could this be the cause of the issue? I would appreciate any guidance or suggestions on how to handle longer input text effectively.
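
To confirm whether the warning reflects the actual environment, a quick check like the one below shows whether the FlashAttention package can be imported. It is only a sketch: the helper name `flash_attn_available` is hypothetical, and it assumes the package is importable under the module name `flash_attn`. It says nothing about whether the missing package is related to the truncation.

```python
import importlib.util


def flash_attn_available() -> bool:
    """Return True if a module named 'flash_attn' can be found in this environment."""
    # find_spec only looks the module up; it does not import or run it.
    return importlib.util.find_spec("flash_attn") is not None


if __name__ == "__main__":
    print("flash_attn importable:", flash_attn_available())
```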
