Podalize: Podcast Transcription and Analysis

This GitHub repository contains a Streamlit app that transcribes podcasts and other audio/video content and performs text analysis on the transcript. The app uses OpenAI's Whisper for transcription and pyannote.audio for speaker diarization. It accepts YouTube URLs, audio URLs, and MP3 files, and users can optionally enter speaker names manually. The app outputs the spoken time, a word cloud per speaker, and a transcript of the audio; the results can be downloaded as a PDF file.
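To illustrate how diarization output can be combined with a transcript, here is a minimal sketch in plain Python. It is not the code used in this repository: the function names `spoken_time` and `assign_speakers`, the tuple layouts, and the sample data are all assumptions made for illustration. It assumes diarization yields `(start, end, speaker)` turns and transcription yields `(start, end, text)` segments, with times in seconds.

```python
from collections import defaultdict


def spoken_time(turns):
    """Sum the speaking time in seconds for each speaker label.

    `turns` is an iterable of (start, end, speaker) tuples (hypothetical layout).
    """
    totals = defaultdict(float)
    for start, end, speaker in turns:
        totals[speaker] += max(0.0, end - start)
    return dict(totals)


def assign_speakers(segments, turns):
    """Label each transcript segment with the speaker whose turns overlap it most.

    `segments` is an iterable of (start, end, text) tuples (hypothetical layout).
    Segments that overlap no turn are labeled "UNKNOWN".
    """
    labeled = []
    for seg_start, seg_end, text in segments:
        overlap = defaultdict(float)
        for t_start, t_end, speaker in turns:
            shared = min(seg_end, t_end) - max(seg_start, t_start)
            if shared > 0:
                overlap[speaker] += shared
        best = max(overlap, key=overlap.get) if overlap else "UNKNOWN"
        labeled.append((best, text))
    return labeled


# Made-up sample data, standing in for real diarization and transcription output.
turns = [
    (0.0, 4.0, "SPEAKER_00"),
    (4.0, 9.5, "SPEAKER_01"),
    (9.5, 12.0, "SPEAKER_00"),
]
segments = [
    (0.2, 3.8, "Welcome to the show."),
    (4.1, 9.0, "Thanks for having me."),
    (9.6, 11.5, "Let's get started."),
]

for speaker, seconds in spoken_time(turns).items():
    print(f"{speaker}: {seconds:.1f} s")
for speaker, text in assign_speakers(segments, turns):
    print(f"[{speaker}] {text}")
```

Choosing the speaker by largest overlap is one simple option; it tolerates small timing mismatches between the two models, but it will mislabel a segment in which two people talk over each other.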

Sample episode

How to install

Note: This code was only tested on Ubuntu 20.04.5 LTS.

Install Anaconda
Clone/download this repo to your local machine.
Get a pyannote.audio access token by following the pyannote.audio instructions.
Launch the Anaconda prompt and navigate to the repo on your local machine.
Create a conda environment

$ conda create -n podalize python=3.9

Activate the conda environment

$ conda activate podalize

Install packages

$ pip install -r requirements.txt

Run the Streamlit app

$ streamlit run podalize_app.py

Tips

You may need to install ffmpeg; follow the instructions in the Whisper README: https://github.com/openai/whisper
To download audio from YouTube, you need to install yt-dlp: https://github.com/yt-dlp/yt-dlp
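A quick way to confirm that both tools are reachable before launching the app is to look them up on your `PATH`. The following is a small sketch, not part of this repository; the helper name `missing_tools` is hypothetical, and it assumes the executables are named `ffmpeg` and `yt-dlp`.

```python
import shutil


def missing_tools(tools=("ffmpeg", "yt-dlp")):
    """Return the names of the given command-line tools not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("ffmpeg and yt-dlp are both available.")
```

Run it inside the activated `podalize` environment, since a pip-installed `yt-dlp` is only on the `PATH` of the environment it was installed into.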

Usage

Either upload a .mp3 file or provide a YouTube/Podcast URL for transcription and analysis.
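The three kinds of input call for different handling: a YouTube link needs a downloader, a direct audio URL can be fetched as is, and a local MP3 can be read directly. The sketch below shows one way such a dispatch could look. It is an assumption for illustration, not the app's actual logic; the function name `classify_source`, the returned labels, and the host list are all hypothetical.

```python
from urllib.parse import urlparse

# Hostnames treated as YouTube in this sketch (an assumption, not an exhaustive list).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def classify_source(source):
    """Classify an input string as 'youtube', 'audio_url', or 'mp3_file'."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        return "youtube" if host in YOUTUBE_HOSTS else "audio_url"
    if source.lower().endswith(".mp3"):
        return "mp3_file"
    raise ValueError(f"Unsupported input: {source!r}")


for example in (
    "https://www.youtube.com/watch?v=abc123",
    "https://example.com/episodes/ep1.mp3",
    "episode.mp3",
):
    print(example, "->", classify_source(example))
```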

References

Contributions Welcome

TODO

Run the app on Windows and macOS
Dockerize


mave5/podalize
