
# Transcription and QA based on video

Scripts that download a video from a URL, extract its audio, and transcribe it with an LLM audio-to-text model (OpenAI's whisper-large-v3). The transcript is then used as context for an LLM, accessed through the ChatGPT API, to answer questions about the video content.

## Setup

1. Create the conda environment:

   ```bash
   conda create -n video-transcription python=3.12
   ```

2. Activate the conda environment:

   ```bash
   conda activate video-transcription
   ```

3. Install the requirements:

   ```bash
   pip install -r requirements.txt
   ```

4. Run the video download script with the URL of the video as an argument (a minimal sketch of this script follows the list):

   ```bash
   python scripts/video_downloader.py --url <URL>
   ```

5. Run the audio-from-video script with the filepath of the video and the output path of the audio file as arguments (see the sketch below):

   ```bash
   python scripts/audio_from_video.py --filepath download/video --output_path download/audio.wav
   ```

6. Transcribe the audio file using the OpenAI whisper-large-v3 model (see the sketch below):

   ```bash
   python scripts/audio_transcription.py --audio_path download/audio.wav --transcription_path download/transcription.json --timestamps True
   ```

7. Add `OPENAI_API_KEY` to the environment variables:

   ```bash
   export OPENAI_API_KEY=<API_KEY>
   ```

8. Ask questions based on the context of the transcription using the ChatGPT API (see the sketch below):

   ```bash
   python scripts/question_answer.py --transcription_path download/transcription.txt --questions_path download/questions.txt --answers_path download/answers.txt
   ```

9. Alternatively, execute the bash script `transcribe_url.sh` to run the entire pipeline in one step:

   ```bash
   ./transcribe_url.sh <URL>
   ```
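
The scripts themselves are not reproduced in this README. As a rough illustration of the pipeline stages, here is a minimal sketch of what `scripts/video_downloader.py` could look like, assuming `yt-dlp` handles the download; the actual implementation and its dependencies may differ.

```python
# Hypothetical sketch of scripts/video_downloader.py (assumes the yt-dlp package).
import argparse

import yt_dlp


def download_video(url: str, output_path: str = "download/video") -> None:
    """Download the video at `url` to `output_path`."""
    opts = {"outtmpl": output_path, "format": "best"}
    with yt_dlp.YoutubeDL(opts) as ydl:
        ydl.download([url])


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Download a video from a URL.")
    parser.add_argument("--url", required=True, help="URL of the video to download")
    args = parser.parse_args()
    download_video(args.url)
```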
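Similarly, `scripts/audio_from_video.py` might extract the audio track by shelling out to `ffmpeg` (assumed to be on the PATH). Whisper expects 16 kHz mono audio, which the flags below produce.

```python
# Hypothetical sketch of scripts/audio_from_video.py (assumes ffmpeg is installed).
import argparse
import subprocess


def extract_audio(filepath: str, output_path: str) -> None:
    """Extract the audio track as 16 kHz mono WAV, a common input format for Whisper."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", filepath, "-vn", "-ac", "1", "-ar", "16000", output_path],
        check=True,  # raise if ffmpeg fails
    )


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Extract audio from a video file.")
    parser.add_argument("--filepath", required=True, help="Path to the input video")
    parser.add_argument("--output_path", required=True, help="Path for the output WAV file")
    args = parser.parse_args()
    extract_audio(args.filepath, args.output_path)
```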
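For the transcription step, one way to run whisper-large-v3 locally is through the Hugging Face `transformers` ASR pipeline; whether the repository's script uses this or another Whisper runtime is an assumption here.

```python
# Hypothetical sketch of scripts/audio_transcription.py (assumes the transformers
# implementation of openai/whisper-large-v3).
import argparse
import json

from transformers import pipeline


def transcribe(audio_path: str, transcription_path: str, timestamps: bool) -> None:
    """Transcribe an audio file and write the result (text, plus chunks if timestamped) as JSON."""
    asr = pipeline("automatic-speech-recognition", model="openai/whisper-large-v3")
    # chunk_length_s lets the pipeline process audio longer than Whisper's 30 s window
    result = asr(audio_path, return_timestamps=timestamps, chunk_length_s=30)
    with open(transcription_path, "w") as f:
        json.dump(result, f, indent=2)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Transcribe audio with Whisper.")
    parser.add_argument("--audio_path", required=True)
    parser.add_argument("--transcription_path", required=True)
    parser.add_argument("--timestamps", type=lambda s: s.lower() == "true", default=False,
                        help="Pass True to include per-segment timestamps")
    args = parser.parse_args()
    transcribe(args.audio_path, args.transcription_path, args.timestamps)
```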
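Finally, the question-answering step could pass the transcript and each question to the ChatGPT API. The sketch below assumes the `openai` Python client (v1+), one question per line in the questions file, and a `gpt-4o-mini` model; the model name in particular is a placeholder.

```python
# Hypothetical sketch of scripts/question_answer.py (assumes the openai Python
# client v1+ and that OPENAI_API_KEY is set in the environment).
import argparse

from openai import OpenAI


def answer_questions(transcription_path: str, questions_path: str, answers_path: str) -> None:
    """Answer each question in the questions file using the transcript as context."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(transcription_path) as f:
        context = f.read()
    with open(questions_path) as f:
        questions = [line.strip() for line in f if line.strip()]

    answers = []
    for question in questions:
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model; the real script may use another
            messages=[
                {"role": "system",
                 "content": "Answer using only the transcript below.\n\n" + context},
                {"role": "user", "content": question},
            ],
        )
        answers.append(response.choices[0].message.content)

    with open(answers_path, "w") as f:
        f.write("\n".join(answers))


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Answer questions from a transcript.")
    parser.add_argument("--transcription_path", required=True)
    parser.add_argument("--questions_path", required=True)
    parser.add_argument("--answers_path", required=True)
    args = parser.parse_args()
    answer_questions(args.transcription_path, args.questions_path, args.answers_path)
```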