- What categories of podcasts are popular?
- What makes a 5-star podcast? What makes a 1-star podcast?
- How do reviewer tendencies vary?
- Working with a SQLite Database
- Using Python-based Data Analysis Packages such as Pandas
- Data Visualization
- Hypothesis Testing and Statistical Inference
- Making a Dashboard Using Looker Studio
This project delves into a dataset containing 2 million reviews on 100k podcasts, available on Kaggle (Podcast Reviews Dataset). The analysis encompasses statistical examination of podcast ratings across various categories, employing both traditional statistical methods and resampling techniques like bootstrap and permutation procedures. Assumptions such as normality and independence of samples are rigorously assessed.
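As a rough illustration of the SQLite-to-pandas step, the sketch below counts reviews per star rating with a SQL query and loads the result into a DataFrame. The table name `reviews`, its columns, and the helper `load_rating_counts` are assumptions made for this example, and a toy in-memory database stands in for the real Kaggle file.

```python
import sqlite3

import pandas as pd


def load_rating_counts(conn):
    """Return the number of reviews per star rating as a DataFrame.

    Assumes a table named `reviews` with an integer `rating` column
    (an assumption for this sketch, not confirmed by the project).
    """
    query = """
        SELECT rating, COUNT(*) AS n_reviews
        FROM reviews
        GROUP BY rating
        ORDER BY rating
    """
    return pd.read_sql_query(query, conn)


# Toy in-memory database with made-up rows, standing in for the real file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (podcast_id TEXT, rating INTEGER, content TEXT)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?)",
    [
        ("a", 5, "Great"),
        ("a", 5, "Love it"),
        ("b", 1, "Not for me"),
        ("b", 3, "Fine"),
    ],
)
conn.commit()

rating_counts = load_rating_counts(conn)
print(rating_counts)
conn.close()
```

Against the real dataset, the connection would instead point at the downloaded database file, for example `sqlite3.connect("path/to/database.sqlite")` (path hypothetical).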
- Analyze podcast ratings statistically across different categories using traditional statistics as well as bootstrap and permutation procedures.
- Assess statistical assumptions such as normality and independence of samples.
- Compare the length of 1-star vs 5-star reviews of comedy podcasts.
- Compare paired ratings from reviewers who rated the same two podcasts, using a non-parametric test.
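The resampling procedures named in the objectives can be sketched as follows. This is a minimal example, not the notebook's implementation: the function names, the number of resamples, and the two small rating samples are all made up for illustration. It computes a percentile bootstrap confidence interval for the difference in mean rating between two categories and a two-sided permutation p-value for the same difference.

```python
import numpy as np


def bootstrap_mean_diff_ci(a, b, rng, n_resamples=2000, alpha=0.05):
    """Percentile bootstrap confidence interval for mean(a) - mean(b)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diffs = np.empty(n_resamples)
    for i in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        resampled_a = rng.choice(a, size=a.size, replace=True)
        resampled_b = rng.choice(b, size=b.size, replace=True)
        diffs[i] = resampled_a.mean() - resampled_b.mean()
    low, high = np.quantile(diffs, [alpha / 2, 1 - alpha / 2])
    return low, high


def permutation_p_value(a, b, rng, n_permutations=2000):
    """Two-sided permutation p-value for the difference in means."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the group labels by permuting the pooled ratings.
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[: a.size].mean() - shuffled[a.size :].mean())
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up ratings for two hypothetical categories.
business = [5, 5, 5, 4, 5, 5, 3, 5, 5, 5]
news = [5, 1, 3, 5, 2, 5, 1, 4, 3, 5]

rng = np.random.default_rng(seed=0)
ci_low, ci_high = bootstrap_mean_diff_ci(business, news, rng)
p_value = permutation_p_value(business, news, rng)
print(f"95% CI for mean difference: ({ci_low:.2f}, {ci_high:.2f})")
print(f"Permutation p-value: {p_value:.4f}")
```

Neither procedure assumes normally distributed ratings, which is the main reason to use them alongside traditional tests on heavily skewed star ratings.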
- 5-star reviews are the most common for all categories.
- Most podcasts are rated similarly (5 stars).
- Ratings are not normally distributed.
- Negatively reviewed podcast categories (e.g., News-Government) receive more 3-star reviews than positively reviewed podcast categories (e.g., Business).
- 1-star reviews are slightly longer than 5-star reviews.
- People who rate two podcasts do so similarly.
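Two of the findings above, the non-normality of ratings and the similarity of paired ratings, rest on tests that can be sketched with SciPy. The choice of the Shapiro-Wilk test and the Wilcoxon signed-rank test here is an assumption for illustration (the notebook may use different tests), and all data below is made up.

```python
from scipy import stats

# Made-up ratings, heavily skewed toward 5 stars.
ratings = [5] * 40 + [4] * 3 + [3] * 1 + [1] * 6

# Shapiro-Wilk test: a small p-value is evidence against normality.
shapiro_stat, shapiro_p = stats.shapiro(ratings)
print(f"Shapiro-Wilk: W={shapiro_stat:.3f}, p={shapiro_p:.3g}")

# Made-up paired ratings: each position is one reviewer who rated
# both podcast A and podcast B.
podcast_a = [5, 4, 5, 3, 5, 2, 4, 5, 1, 5]
podcast_b = [4, 5, 5, 4, 3, 3, 5, 4, 2, 4]

# Wilcoxon signed-rank test: a non-parametric test for paired samples.
# A large p-value means no evidence that the paired ratings differ.
wilcoxon_result = stats.wilcoxon(podcast_a, podcast_b)
print(
    f"Wilcoxon signed-rank: statistic={wilcoxon_result.statistic:.1f}, "
    f"p={wilcoxon_result.pvalue:.3g}"
)
```

With small samples containing ties, SciPy may emit a warning about the approximation used for the Wilcoxon p-value; the real dataset is large enough that this is less of a concern.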
- python>=3.10.13
- ipython
- jupyter
- matplotlib
- numpy
- pandas
- pyarrow
- scipy
- seaborn
- statsmodels
See requirements.txt for the full list of dependencies and the versions used in the development environment.
- podcast_reviews.ipynb: Jupyter Notebook containing the analysis code
- podcast_utils.py: Python script with utility functions
- requirements.txt: File containing the project dependencies
- LICENSE: MIT
- Clone the repository
- Create a virtual environment and install the dependencies listed in requirements.txt, e.g.:
python3 -m venv podcast_reviews/
# activate the venv and install all requirements provided
source podcast_reviews/bin/activate
pip install -r requirements.txt
- Open the Jupyter notebook file, podcast_reviews.ipynb, in your Jupyter environment and step through the cells to see the analysis.