Original Project Proposal

Heidi Landenberger edited this page Mar 18, 2018 · 1 revision

Project Details

Data Sources:

Libraries:

  • Pandas
  • Seaborn
  • Matplotlib
  • Plotly
  • Numpy
  • scikit-learn
  • os

Data Import & Cleaning: Charles!

Statistics

For all stats questions, we will say results are statistically significant at:

  • 95% confidence level (p-value < 0.05)

Questions:

  1. Is there a correlation between weather and accident severity/casualties? (Pratham)
  2. Is there a correlation between time of day and accidents? (Dolly)
  3. Are there more or fewer casualties depending on road type (taking into account urban vs. rural)? (Dolly)
  4. Is there a correlation between accident count and day of the week? (Heidi)
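
As a rough illustration of how Question 4 could be checked against the significance threshold, the sketch below runs a chi-square goodness-of-fit test on accident counts per weekday. This is one possible approach, not necessarily the method the team will use. The counts are made-up placeholders (not project data), the names `observed` and `chi_square_uniform` are hypothetical, and the test assumes a significance level of 0.05 (the level that 95% confidence implies) with the standard critical value for 6 degrees of freedom.

```python
# Hypothetical accident counts per weekday (illustrative placeholders, not project data).
observed = {
    "Mon": 130, "Tue": 128, "Wed": 135, "Thu": 140,
    "Fri": 170, "Sat": 120, "Sun": 97,
}

ALPHA = 0.05  # assumed significance level (95% confidence)
# Chi-square critical value for 7 - 1 = 6 degrees of freedom at ALPHA = 0.05.
CHI2_CRITICAL_DF6 = 12.592


def chi_square_uniform(counts):
    """Chi-square statistic against the null hypothesis that
    accidents are spread evenly across all days."""
    total = sum(counts.values())
    expected = total / len(counts)  # same expected count for every day
    return sum((obs - expected) ** 2 / expected for obs in counts.values())


statistic = chi_square_uniform(observed)
# Reject the "evenly spread" hypothesis when the statistic exceeds the critical value.
significant = statistic > CHI2_CRITICAL_DF6

print(f"chi-square = {statistic:.2f}, significant at alpha={ALPHA}: {significant}")
```

With the real dataset, the dictionary would be replaced by the weekday counts from the cleaned data. A library routine that returns an exact p-value could stand in for the hard-coded critical value.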

Features:

  • Basic visualizations (Heidi)
    • map viz
  • Create a predictive model using Naive Bayes relating weather conditions and the likelihood of an accident. (Pratham)
    • Predict likelihood of getting into an accident based on weather conditions
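
A minimal sketch of what the Naive Bayes feature might look like with scikit-learn, assuming a table with one categorical weather column and a binary accident label. The column names (`weather`, `accident`) and the toy rows are hypothetical placeholders, not the project's data, and `CategoricalNB` is only one of the Naive Bayes variants that could be used here.

```python
import pandas as pd
from sklearn.naive_bayes import CategoricalNB
from sklearn.preprocessing import OrdinalEncoder

# Toy data (assumption): 1 = accident occurred, 0 = no accident.
df = pd.DataFrame({
    "weather": ["clear"] * 6 + ["rain"] * 6,
    "accident": [1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0],
})

# CategoricalNB expects non-negative integer category codes,
# so the weather strings are encoded as integers first.
encoder = OrdinalEncoder(dtype=int)
X = encoder.fit_transform(df[["weather"]])
y = df["accident"]

# alpha=1.0 applies Laplace smoothing to the per-class category counts.
model = CategoricalNB(alpha=1.0)
model.fit(X, y)


def accident_probability(weather):
    """Estimated P(accident | weather) under the fitted model."""
    encoded = encoder.transform(pd.DataFrame({"weather": [weather]}))
    proba = model.predict_proba(encoded)[0]
    # Look up the probability column that corresponds to class 1 (accident).
    return proba[list(model.classes_).index(1)]


p_rain = accident_probability("rain")
p_clear = accident_probability("clear")
print(f"P(accident | rain)  = {p_rain:.3f}")
print(f"P(accident | clear) = {p_clear:.3f}")
```

Note that a model like this needs examples of both outcomes. If the dataset contains only recorded accidents, the label would have to be something else, such as severity.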

Roadmap (Due Sun. March 18th)

  1. EOD Sun March 11 (Ambitious)/EOD Mon March 12 (Final): Have cleaned dataset uploaded to GitHub
  2. By Start of Class Thurs March 15: Have at least one visualization done per question
  3. By Start of Class Sat March 17: Have everything done
  4. By 6PM Sun March 18: Review and turn it in