
Differences in Model Behavior for 8kHz and 16kHz Audio Inputs #575

Answered by snakers4
tamarabanovac asked this question in Q&A

Hi,

Are there any concrete differences in how the model operates when given audio data at these two sampling rates, or is there just an upsampling to 16kHz happening internally?

For previous model versions there was some difference in quality.
#2 (comment)

There was a chart on old wiki pages, but it's gone now.
For the latest model, quality is roughly the same at both sampling rates.
If I am not mistaken, both the JIT and the ONNX models contain two actual models: one for 8 kHz and one for 16 kHz.
This is also the reason for the problem with ONNX export with opset < 16.
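
To make the "two models in one file" idea concrete, here is a minimal sketch of a wrapper that picks a rate-specific sub-model based on the sampling rate it is given. This is not the actual model code: `DualRateVAD`, `_model_8k` and `_model_16k` are hypothetical names, and the placeholder functions return fixed values only so that it is visible which branch ran.

```python
from typing import Callable, Dict, Sequence

# Hypothetical stand-ins for the two sub-models. The real ones are neural
# networks; these placeholders return fixed "speech probabilities".
def _model_8k(chunk: Sequence[float]) -> float:
    return 0.0

def _model_16k(chunk: Sequence[float]) -> float:
    return 1.0

class DualRateVAD:
    """Illustrative wrapper: one callable object, one sub-model per sampling rate."""

    def __init__(self, models: Dict[int, Callable[[Sequence[float]], float]]):
        self._models = dict(models)

    def __call__(self, chunk: Sequence[float], sr: int) -> float:
        # Dispatch on the sampling rate instead of resampling the input.
        if sr not in self._models:
            supported = sorted(self._models)
            raise ValueError(f"Unsupported sampling rate {sr}; expected one of {supported}")
        return self._models[sr](chunk)

vad = DualRateVAD({8000: _model_8k, 16000: _model_16k})
```

Under this assumption, 8 kHz audio is never upsampled; it goes straight to its own sub-model, and an unsupported rate fails loudly instead of being silently converted.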

Was the model trained on both 16kHz and 8kHz audio data, or is it specifically optimized for one of the rates (such as 16kHz)?

The model was trained by resampling into either…


Answer selected by snakers4