
# Transcription and QA based on video

Scripts that download a video from a URL, extract its audio, and transcribe it with an LLM audio-to-text model (OpenAI's whisper-large-v3). The transcript is then used as context for an LLM, accessed through the ChatGPT API, to answer questions about the video content.

## Setup

1. Create the conda environment:

   ```bash
   conda create -n video-transcription python=3.12
   ```

2. Activate the conda environment:

   ```bash
   conda activate video-transcription
   ```

3. Install the requirements:

   ```bash
   pip install -r requirements.txt
   ```

4. Run the video download script with the URL of the video as an argument (a minimal sketch of this script follows the list):

   ```bash
   python scripts/video_downloader.py --url <URL>
   ```

5. Run the audio-from-video script with the filepath of the video and the output path of the audio file as arguments (see the sketch below):

   ```bash
   python scripts/audio_from_video.py --filepath download/video --output_path download/audio.wav
   ```

6. Transcribe the audio file using the OpenAI whisper-large-v3 model (see the sketch below):

   ```bash
   python scripts/audio_transcription.py --audio_path download/audio.wav --transcription_path download/transcription.json --timestamps True
   ```

7. Add `OPENAI_API_KEY` to the environment variables:

   ```bash
   export OPENAI_API_KEY=<API_KEY>
   ```

8. Ask questions based on the context of the transcription using the ChatGPT API (see the sketch below):

   ```bash
   python scripts/question_answer.py --transcription_path download/transcription.txt --questions_path download/questions.txt --answers_path download/answers.txt
   ```

9. Alternatively, execute the bash script `transcribe_url.sh` to run the entire pipeline in one step:

   ```bash
   ./transcribe_url.sh <URL>
   ```
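
The scripts themselves are not reproduced in this README. As a rough illustration of the pipeline stages, here is a minimal sketch of what `scripts/video_downloader.py` could look like, assuming `yt-dlp` handles the download; the actual implementation and its dependencies may differ.

```python
# Hypothetical sketch of scripts/video_downloader.py (assumes the yt-dlp package).
import argparse

import yt_dlp


def download_video(url: str, output_path: str = "download/video") -> None:
    """Download the video at `url` to `output_path`."""
    opts = {"outtmpl": output_path, "format": "best"}
    with yt_dlp.YoutubeDL(opts) as ydl:
        ydl.download([url])


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Download a video from a URL.")
    parser.add_argument("--url", required=True, help="URL of the video to download")
    args = parser.parse_args()
    download_video(args.url)
```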
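Similarly, `scripts/audio_from_video.py` might extract the audio track by shelling out to `ffmpeg` (assumed to be on the PATH). Whisper expects 16 kHz mono audio, which the flags below produce.

```python
# Hypothetical sketch of scripts/audio_from_video.py (assumes ffmpeg is installed).
import argparse
import subprocess


def extract_audio(filepath: str, output_path: str) -> None:
    """Extract the audio track as 16 kHz mono WAV, a common input format for Whisper."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", filepath, "-vn", "-ac", "1", "-ar", "16000", output_path],
        check=True,  # raise if ffmpeg fails
    )


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Extract audio from a video file.")
    parser.add_argument("--filepath", required=True, help="Path to the input video")
    parser.add_argument("--output_path", required=True, help="Path for the output WAV file")
    args = parser.parse_args()
    extract_audio(args.filepath, args.output_path)
```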
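For the transcription step, one way to run whisper-large-v3 locally is through the Hugging Face `transformers` ASR pipeline; whether the repository's script uses this or another Whisper runtime is an assumption here.

```python
# Hypothetical sketch of scripts/audio_transcription.py (assumes the transformers
# implementation of openai/whisper-large-v3).
import argparse
import json

from transformers import pipeline


def transcribe(audio_path: str, transcription_path: str, timestamps: bool) -> None:
    """Transcribe an audio file and write the result (text, plus chunks if timestamped) as JSON."""
    asr = pipeline("automatic-speech-recognition", model="openai/whisper-large-v3")
    # chunk_length_s lets the pipeline process audio longer than Whisper's 30 s window
    result = asr(audio_path, return_timestamps=timestamps, chunk_length_s=30)
    with open(transcription_path, "w") as f:
        json.dump(result, f, indent=2)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Transcribe audio with Whisper.")
    parser.add_argument("--audio_path", required=True)
    parser.add_argument("--transcription_path", required=True)
    parser.add_argument("--timestamps", type=lambda s: s.lower() == "true", default=False,
                        help="Pass True to include per-segment timestamps")
    args = parser.parse_args()
    transcribe(args.audio_path, args.transcription_path, args.timestamps)
```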
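Finally, the question-answering step could pass the transcript and each question to the ChatGPT API. The sketch below assumes the `openai` Python client (v1+), one question per line in the questions file, and a `gpt-4o-mini` model; the model name in particular is a placeholder.

```python
# Hypothetical sketch of scripts/question_answer.py (assumes the openai Python
# client v1+ and that OPENAI_API_KEY is set in the environment).
import argparse

from openai import OpenAI


def answer_questions(transcription_path: str, questions_path: str, answers_path: str) -> None:
    """Answer each question in the questions file using the transcript as context."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(transcription_path) as f:
        context = f.read()
    with open(questions_path) as f:
        questions = [line.strip() for line in f if line.strip()]

    answers = []
    for question in questions:
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model; the real script may use another
            messages=[
                {"role": "system",
                 "content": "Answer using only the transcript below.\n\n" + context},
                {"role": "user", "content": question},
            ],
        )
        answers.append(response.choices[0].message.content)

    with open(answers_path, "w") as f:
        f.write("\n".join(answers))


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Answer questions from a transcript.")
    parser.add_argument("--transcription_path", required=True)
    parser.add_argument("--questions_path", required=True)
    parser.add_argument("--answers_path", required=True)
    args = parser.parse_args()
    answer_questions(args.transcription_path, args.questions_path, args.answers_path)
```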