This repo contains data downloaded from a variety of sources, used in an attempt to classify San Francisco crime incidents by type based on the location and time of day at which each crime occurred.
- Police incidents recorded between Jan 2016 and April 2017 (category, description, location, police district, date, time)
  - Assault, shoplifting, theft from cars, drugs, vandalism, robbery, vehicle theft, theft of property
- Geospatial data of Zip code boundaries
- Zillow house price data
- BART and Caltrain station locations
- Police station locations
- Medical marijuana dispensary locations
- Health care facility locations
- Homeless shelter locations

Features derived for each incident:
- Zip code in which each incident occurred
- Average house price, median income level and population density in each zip code
- Distance to Union Square
- Distance to nearest police station, train station, medical marijuana dispensary, healthcare facility, homeless shelter
- Total number of nearby medical marijuana dispensaries, train stations, health care facilities, homeless shelters
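
The distance and count features listed above could be computed along the following lines. This is a minimal sketch, not the repo's actual code: it assumes great-circle (haversine) distance in kilometres, and the function names, the 1 km "nearby" radius, and the sample coordinates are all hypothetical choices made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_distance_km(incident, places):
    """Distance from an incident to the closest place in a list of (lat, lon)."""
    return min(haversine_km(*incident, *place) for place in places)


def count_within_km(incident, places, radius_km):
    """Number of places within radius_km of the incident."""
    return sum(haversine_km(*incident, *place) <= radius_km for place in places)


# Approximate, illustrative coordinates only; not taken from the repo's data.
UNION_SQUARE = (37.7880, -122.4075)
stations = [(37.7844, -122.4080), (37.7765, -122.3943)]  # hypothetical station points
incident = (37.7793, -122.4193)                          # hypothetical incident location

features = {
    "dist_union_square_km": haversine_km(*incident, *UNION_SQUARE),
    "dist_nearest_station_km": nearest_distance_km(incident, stations),
    "stations_within_1km": count_within_km(incident, stations, radius_km=1.0),  # assumed radius
}
print(features)
```

The same two helpers would apply unchanged to police stations, dispensaries, health care facilities, and homeless shelters, given a list of coordinates for each.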

Classification results by crime category:

Classification | Precision | Recall | F1 |
---|---|---|---|
Assault | 0.42 | 0.48 | 0.45 |
Shoplifting | 0.50 | 0.47 | 0.48 |
Theft from auto | 0.57 | 0.71 | 0.63 |
Drugs/narcotic | 0.55 | 0.50 | 0.52 |
Vandalism | 0.21 | 0.14 | 0.17 |
Robbery | 0.15 | 0.09 | 0.11 |
Vehicle theft | 0.33 | 0.25 | 0.28 |
Theft of property | 0.31 | 0.22 | 0.26 |
Avg / total | 0.43 | 0.46 | 0.44 |
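
As a reading aid for the table above, F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below recomputes it from the per-category precision and recall values and checks that each agrees with the reported F1 to within rounding. The helper name `f1_score` is just an illustrative choice here, and the "Avg / total" row is left out because an averaged F1 need not equal the F1 of the averaged precision and recall.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above
reported = {
    "Assault": (0.42, 0.48, 0.45),
    "Shoplifting": (0.50, 0.47, 0.48),
    "Theft from auto": (0.57, 0.71, 0.63),
    "Drugs/narcotic": (0.55, 0.50, 0.52),
    "Vandalism": (0.21, 0.14, 0.17),
    "Robbery": (0.15, 0.09, 0.11),
    "Vehicle theft": (0.33, 0.25, 0.28),
    "Theft of property": (0.31, 0.22, 0.26),
}

for name, (p, r, f1) in reported.items():
    computed = f1_score(p, r)
    # Tolerance allows for the table's precision and recall being rounded to 2 decimals
    assert abs(computed - f1) < 0.01, name
    print(f"{name:<18} F1 = {computed:.2f}")
```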

The repo also contains a Flask app that serves a D3 visualization of the data.