Machine Learning group project with KNN, SVM, Random Forest and XGBoost in Python. Completed as part of the Data Science & Data Analytics Certificate Program at UCSC Extension.
Latest dataset available from: https://www.kaggle.com/sobhanmoosavi/us-accidents
The States 21basic folder contains the files needed to draw the map of California. Keep every file in that folder: the notebook references only one of them, but all of them are required.
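One likely reason the whole folder is needed, offered as an assumption rather than something this README confirms: if the map data is an ESRI shapefile, the layer is split across sibling files that share a base name, and a reader that opens the `.shp` file also expects the `.shx` and `.dbf` files next to it. The sketch below uses only the standard library to report which of those siblings are missing. The function name `missing_companions` and the suffix list are illustrative, not taken from the notebook.

```python
from pathlib import Path

# Assumption: the map data is an ESRI shapefile. The format stores one layer
# in several sibling files with the same base name; these three are mandatory.
REQUIRED_SUFFIXES = (".shp", ".shx", ".dbf")


def missing_companions(shp_path):
    """Return the names of required sibling files that are absent on disk."""
    shp_path = Path(shp_path)
    return [
        shp_path.with_suffix(suffix).name
        for suffix in REQUIRED_SUFFIXES
        if not shp_path.with_suffix(suffix).exists()
    ]
```

Calling `missing_companions` with the path of the file the notebook opens returns an empty list when the required siblings are all present, which makes for a quick check after cloning or copying the repository.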
Project Co-authors: Chien-Yu Huang, Sameer Sainani
Dataset authors:
- Moosavi, Sobhan, Mohammad Hossein Samavatian, Srinivasan Parthasarathy, and Rajiv Ramnath. "A Countrywide Traffic Accident Dataset." 2019.
- Moosavi, Sobhan, Mohammad Hossein Samavatian, Srinivasan Parthasarathy, Radu Teodorescu, and Rajiv Ramnath. "Accident Risk Prediction based on Heterogeneous Sparse Data: New Dataset and Insights." In Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM, 2019.