
Is RealtimeSTT able to recognize audio from ffmpeg stream? #137

Answered by KoljaB
qbxdp asked this question in Q&A

You'll need to convert the audio stream from FFmpeg into 16 kHz PCM WAV and then pass it to the feed_audio method. Depending on the actual MP3 format of the chunks, the conversion can be fairly straightforward (plain MP3) or quite complicated (if the MP3 chunks depend on each other).

If the format is simple, the conversion can be done with pydub:

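import io
from pydub import AudioSegment
segment = AudioSegment.from_file(io.BytesIO(chunk), format="mp3")  # chunk: bytes of one MP3 chunk
pcm = segment.set_frame_rate(16000).set_channels(1).set_sample_width(2).raw_data  # 16 kHz mono, assuming 16-bit samples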

Alternatively, you can use an ffmpeg CLI command to do the conversion.
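A minimal sketch of the CLI route, driven from Python. The helper names `build_ffmpeg_pcm_command` and `convert_to_pcm` are hypothetical, and the 16-bit little-endian sample format (`s16le`) is an assumption not stated above. It also assumes ffmpeg is on the PATH and that each input is independently decodable:

```python
import subprocess
from typing import List


def build_ffmpeg_pcm_command(source: str = "pipe:0", sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that decodes `source` to raw mono PCM on stdout."""
    return [
        "ffmpeg",
        "-loglevel", "error",    # keep ffmpeg's own log output quiet
        "-i", source,            # "pipe:0" reads the encoded audio from stdin
        "-f", "s16le",           # raw 16-bit little-endian output (assumed sample format)
        "-acodec", "pcm_s16le",
        "-ac", "1",              # mono
        "-ar", str(sample_rate), # resample to 16 kHz by default
        "pipe:1",                # write the PCM bytes to stdout
    ]


def convert_to_pcm(encoded: bytes) -> bytes:
    """Run ffmpeg once on an in-memory buffer and return the decoded PCM bytes."""
    result = subprocess.run(
        build_ffmpeg_pcm_command(),
        input=encoded,
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError if ffmpeg fails
    )
    return result.stdout
```

For a continuous stream, a long-running `subprocess.Popen` that reads from ffmpeg's stdout would probably fit better than one `subprocess.run` call per buffer.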

The feed_audio method requires 16 kHz mono PCM chunks of 1024 samples, fed in real time (the chunks have to arrive with the correct timing). Demo code:

if __name__ == "__main__":
    import threading
    import pyaudio
    from Re…
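Separately from the truncated demo above, the chunking and timing requirement can be sketched on its own. This is not the original demo code: `split_pcm` and `feed_in_realtime` are hypothetical helpers, the 2 bytes per sample is an assumption (16-bit PCM), and `feed` stands in for whatever accepts the chunks, for example a recorder's feed_audio method.

```python
import time
from typing import Callable, Iterator

SAMPLE_RATE = 16000        # 16 kHz, as required above
SAMPLES_PER_CHUNK = 1024   # chunk size named above
BYTES_PER_SAMPLE = 2       # assumption: 16-bit mono PCM
CHUNK_BYTES = SAMPLES_PER_CHUNK * BYTES_PER_SAMPLE


def split_pcm(pcm: bytes, chunk_bytes: int = CHUNK_BYTES) -> Iterator[bytes]:
    """Yield consecutive slices of `pcm`; the last slice may be shorter."""
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


def feed_in_realtime(
    pcm: bytes,
    feed: Callable[[bytes], None],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Pass each chunk to `feed`, pausing one chunk duration between calls."""
    chunk_seconds = SAMPLES_PER_CHUNK / SAMPLE_RATE  # 0.064 s of audio per chunk
    count = 0
    for chunk in split_pcm(pcm):
        feed(chunk)
        sleep(chunk_seconds)  # crude pacing; ignores time spent inside feed()
        count += 1
    return count
```

Usage would look like `feed_in_realtime(pcm_bytes, recorder.feed_audio)`, where `recorder` is an already configured recorder object. The fixed sleep drifts slightly over long streams, so a clock-based schedule may be preferable there.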

Answer selected by qbxdp