Audio transcription in supervisor route #19

Luisotee · 2024-11-27T08:59:52Z

Audio Transcription Support for Supervisor Route

Overview

Add audio transcription capabilities to the supervisor route using Groq's Whisper V3 Turbo API. This centralizes audio processing in the AI API service, eliminating the need for individual transcription handling in client integrations (simulator, WhatsApp, Telegram).

Technical Changes

- Added support for multiple audio formats:
  - Direct support: mp3, mp4, mpeg, mpga, m4a, wav, webm
  - Conversion support: ogg -> mp3
- Implemented audio file handling with temporary storage
- Integrated Groq's Whisper V3 Turbo for transcription
- Added content type detection and validation
- Centralized error handling for audio processing
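
The format handling listed above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper names (`classify_audio`, `build_ogg_to_mp3_command`, `prepare_for_transcription`) are hypothetical, detection is done by file extension only, and the sketch assumes the `ffmpeg` command-line tool is available on the PATH instead of going through the python-ffmpeg wrapper.

```python
import subprocess
from pathlib import Path

# Formats the PR description lists as directly supported.
DIRECT_FORMATS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
# Formats that need conversion before transcription (source -> target).
CONVERTIBLE_FORMATS = {"ogg": "mp3"}


def classify_audio(filename: str) -> str:
    """Return 'direct' or 'convert' by extension; raise ValueError otherwise."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext in DIRECT_FORMATS:
        return "direct"
    if ext in CONVERTIBLE_FORMATS:
        return "convert"
    raise ValueError(f"Unsupported audio format: {ext or '(none)'}")


def build_ogg_to_mp3_command(src: Path, dst: Path) -> list[str]:
    """Build the ffmpeg invocation; -y overwrites dst, -i names the input."""
    return ["ffmpeg", "-y", "-i", str(src), str(dst)]


def prepare_for_transcription(data: bytes, filename: str, workdir: Path) -> Path:
    """Write the upload into workdir and convert it to mp3 if needed."""
    kind = classify_audio(filename)  # validate before touching the disk
    src = workdir / Path(filename).name
    src.write_bytes(data)
    if kind == "direct":
        return src
    dst = src.with_suffix(".mp3")
    # check=True surfaces ffmpeg failures as CalledProcessError.
    subprocess.run(build_ogg_to_mp3_command(src, dst), check=True, capture_output=True)
    return dst
```

Calling `prepare_for_transcription` inside a `with tempfile.TemporaryDirectory() as d:` block and passing `Path(d)` as `workdir` would keep the storage temporary, since the directory and its files are removed when the block exits.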

Dependencies

- Added python-multipart for form data handling
- Added python-ffmpeg for audio conversion
- Added groq for Whisper API access

Configuration

Requires GROQ_API_KEY environment variable
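
A minimal sketch of how a service might read this variable and call Groq's transcription endpoint. The function names are hypothetical, and the model identifier `whisper-large-v3-turbo` is assumed from Groq's public documentation, not taken from this PR's diff.

```python
import os
from collections.abc import Mapping


def load_groq_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return GROQ_API_KEY, failing early with a clear message if it is unset."""
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GROQ_API_KEY environment variable is required")
    return key


def transcribe(path: str) -> str:
    """Send an audio file to Groq and return the transcribed text."""
    # Imported lazily so the key check above works without the SDK installed.
    from groq import Groq

    client = Groq(api_key=load_groq_api_key())
    with open(path, "rb") as audio:
        result = client.audio.transcriptions.create(
            file=(os.path.basename(path), audio.read()),
            model="whisper-large-v3-turbo",  # assumed model id
        )
    return result.text
```

Checking the key at startup instead of on the first request would make a missing configuration visible before any client sends audio.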

Benefits

- Centralized audio processing
- Consistent transcription quality
- Reduced implementation complexity in clients
- Unified error handling

luandro · 2024-11-27T16:48:22Z

@Luisotee, it's working great, amazing job! Just don't forget to add the packages you use. For example:

uv add ffmpeg python-multipart

This automatically adds them to the pyproject.toml file, so they are picked up on every run of the project. I added them in my commit.

A second comment: the route /api/supervisor/supervisor isn't ideal. This might be a good opportunity to change it to /api/classifier or something else.

Luisotee · 2024-11-28T09:52:08Z

@luandro should be fine now

luandro · 2024-11-28T13:43:33Z

@Luisotee when testing on the docs page, "send empty value" works when set for message, but for some reason when setting empty value for message an error is thrown:

Luisotee added 3 commits November 27, 2024 04:33

audio route

868d2b7

feat: audio transcription

41fc958

lint, optimizations

ad2f818

Luisotee added the feature New feature label Nov 27, 2024

This was linked to issues Nov 27, 2024

Intent classification #1

Open

Integrate text-to-speech #12

Open

Luisotee requested a review from luandro November 27, 2024 09:00

luandro added 2 commits November 27, 2024 13:47

chore: add missing deps for uv project

48851e2

chore: make sure both API params are optional and have a None default value

d875b3a

Luisotee added 2 commits November 28, 2024 06:48

renamed supervisor to classifier

b90ce01

Merge branch 'luisotee/audio-transcription' of https://github.com/digidem/earth-defenders-assistant into luisotee/audio-transcription

7f1ef02

chore: deduplicate route from /classifier/classifier to /classifier/classify

e0853e0
