jasminelo2020/shazam2.0

shazam2.0

Music Genre Classifier Project for Data Science Student Society @ UCSD

Introduction

Shazam is a service that identifies a song from a short audio clip. Our project aims to create a similar product that determines the genre of a song, Shazam style. We used the Million Song Dataset and various Python packages to build a CNN model that classifies the genre of raw audio data. This GitHub repository contains the code we wrote while building our models and our deliverable, a song genre classifier. Make sure to check out our poster for an overview of our project!

Dataset

The Million Song Dataset (MSD) is a collection of audio features and metadata for 1 million songs published before 2012. Some audio features were extracted using the Echo Nest API, which was provided by Echo Nest, now part of Spotify. Since MSD does not come with genre labels, we augmented it with the Tagtraum MSD Genre Labels Dataset. Many songs have no corresponding label in Tagtraum, so we lose around four-fifths of the dataset, and the combined dataset has around 200k rows. MSD has many interesting metrics, including danceability, loudness, and song hotness, but for our project we are interested in segments, pitch, and timbre.
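The label join described above can be illustrated with a small sketch. This is not the code from this repo: the function name `attach_genre_labels`, the record layout, and the toy track IDs are all assumptions made for illustration, and it uses only the Python standard library. It keeps a track only when a genre label exists for its ID, which is why most of the dataset drops out.

```python
def attach_genre_labels(tracks, genre_by_track_id):
    """Inner-join track records with genre labels on their track ID.

    Tracks without a matching label are dropped, mirroring the loss
    of unlabeled songs described above. (Hypothetical helper.)
    """
    labeled = []
    for track in tracks:
        genre = genre_by_track_id.get(track["track_id"])
        if genre is not None:
            # Copy the record and add the label instead of mutating the input.
            labeled.append({**track, "genre": genre})
    return labeled


# Toy stand-ins for the real data; the IDs and titles are made up.
tracks = [
    {"track_id": "TR0001", "title": "song a"},
    {"track_id": "TR0002", "title": "song b"},
    {"track_id": "TR0003", "title": "song c"},
    {"track_id": "TR0004", "title": "song d"},
    {"track_id": "TR0005", "title": "song e"},
]
genre_by_track_id = {"TR0002": "Rock"}

labeled = attach_genre_labels(tracks, genre_by_track_id)
retained = len(labeled) / len(tracks)
print(f"kept {len(labeled)} of {len(tracks)} tracks ({retained:.0%})")
```

With real data, the same inner join would more likely be done with a dataframe library; the plain-dictionary version is only meant to make the filtering step explicit.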

Final Deliverable

Our deliverable is a genre classifier tool built with Streamlit. The website prompts you to record a snippet of a song; the model then classifies its genre and shows the other metrics it calculated to arrive at that classification.

About This Repo

This repo contains the code that our group used to create the genre classifier. It walks you through the exploratory data analysis (EDA), data preprocessing, model training, and how we created our deliverable with Streamlit. Each section has some Python code, alongside a markdown file that walks you through the steps.

Acknowledgments

  • Thanks to Dan Ellis for the massive help with timbre and pitch. Dan's GitHub
  • Thanks to DS3 for providing us with guidance and financial resources to use AWS.
  • Thanks to UCSD IT Services for allowing us to utilize computing resources for data processing and model training.

Contact

So Hirota

Sean Furhman

Jasmine Lo
