Garbage output on multi channel audio and audio above 24khz #9
Comments
The Whisper model itself expects 16 kHz mono audio.
Ah, that makes sense. I assume burn doesn't downsample the sample rate or downmix channels.
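For anyone hitting this before downmixing lands, here is a minimal sketch of averaging interleaved channels down to mono ahead of transcription. It assumes the samples are already decoded to interleaved `f32`; `downmix_to_mono` is a hypothetical helper name, not something from this repository or from burn.

```rust
/// Average interleaved multi-channel samples into a single mono channel.
/// `channels` is the number of interleaved channels (2 for stereo).
/// Hypothetical helper for illustration; not part of this project.
fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    assert!(channels > 0, "channel count must be non-zero");
    interleaved
        // One chunk per frame; a trailing partial frame is dropped.
        .chunks_exact(channels)
        // Equal-weight average of all channels in the frame.
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}
```

Calling `downmix_to_mono(&samples, 2)` on stereo data yields one sample per frame. Equal-weight averaging is only one possible choice; other layouts such as 5.1 may want different channel weights.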
This is partially addressed by 4080a33, but if I get the time I plan to look into resampling and channel downmixing. I do have some work done, but I was using dasp, which has proven itself rather unusable, so I'm looking into different crates. I looked into fon and it seems like it may work, but I don't like that it hasn't been active since Feb '22. Currently looking into other crates.
@Quackdoc have a look at rubato (https://github.com/HEnquist/rubato). It does what you need. I've had no success with the synchronous FFT methods yet, but SincFixedIn, which is used in their main example, works well. Here's how I'm using it - I have a pop at the end but the main downsampling is very good: (I had a feeling the synchronous FFT resampling method might be better for wasm, but I haven't tested it and may have misunderstood what it's designed for, as the output is terribly distorted. Still investigating. Hopefully SincInterpolationType::Linear is good enough for real-time use cases.)
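To make the resampling step concrete without depending on any crate, here is a naive linear-interpolation resampler for mono `f32` samples. This is not rubato's API and not the code from the comment above. `resample_linear` is a hypothetical name, and the sketch applies no anti-aliasing filter, so a sinc-based resampler should sound noticeably better when downsampling.

```rust
/// Resample mono samples from `from_hz` to `to_hz` by linear interpolation.
/// Illustrative stand-in only: no low-pass filtering is done, so
/// downsampling can alias. Hypothetical helper, not from any crate.
fn resample_linear(input: &[f32], from_hz: u32, to_hz: u32) -> Vec<f32> {
    if input.is_empty() || from_hz == 0 || to_hz == 0 {
        return Vec::new();
    }
    // Output length scales with the rate ratio (rounded down).
    let out_len = (input.len() as u64 * to_hz as u64 / from_hz as u64) as usize;
    // Distance between output samples, measured in input samples.
    let step = from_hz as f64 / to_hz as f64;
    let last = input.len() - 1;
    (0..out_len)
        .map(|i| {
            let pos = i as f64 * step;
            let idx = pos.floor() as usize;
            let frac = (pos - idx as f64) as f32;
            // Clamp indices so the final sample is held at the edge.
            let a = input[idx.min(last)];
            let b = input[(idx + 1).min(last)];
            a + (b - a) * frac
        })
        .collect()
}
```

For example, `resample_linear(&mono, 48_000, 16_000)` keeps roughly every third sample, which matches the 16 kHz rate mentioned earlier in the thread.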
It seems the audio decoder is picky about what gets fed into it.
Audio mediainfo
Audio file: https://cdn.discordapp.com/attachments/615105639567589376/1141946730485665893/slap.wav
whisper-ctranslate2:
EDIT: transcoding the audio file using
ffmpeg -i .\slap.wav -ar SAMPLE_RATE -ac 1 slap-edit.wav
seems to make it work. It needs to be both single channel and 41khz or less. At 41khz the audio output was
At 24khz and below it is