Project_Feedback_Anuwat #1
base: master
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,14 +5,14 @@ subtitle: Visualization and Classification for Larceny, Assault and Harassment | |
output: html_document | ||
--- | ||
|
||
# Introduction | ||
Crime is a social issue that, like a disease, tends to spread in spatial clusters. We are always looking for ways to minimize and prevent the occurrence of crime. If we could predict where crime is likely to occur, police could deploy law enforcement to potentially dangerous areas more efficiently. Crime has usually been assumed to occur at random, and researchers have studied it with behavioral and social methods. However, with the development of data analysis and technology, we can now analyze it more quantitatively. | ||
|
||
For example, PredPol is a program developed by researchers from the University of California, Los Angeles (UCLA). With the help of the Los Angeles Police Department, they collected about 13 million cases spanning 80 years and used just two variables, when and where, to build models that predict where a crime could happen on each day. This shows how strongly the environment influences human choices. Another paper, written by Dr. Irina Matijosaitiene, revealed the effect of land use on crime type classification and prediction. | ||
|
||
Classification models in effect calculate the probability that a given crime type will happen at a given time and place, so this project focuses on classification models. I will also use visualization to give the audience an intuitive feel for the relationship between the occurrence of crime and time and location. | ||
|
||
# Materials and methods | ||
I will use crime data from 2015-2017 in Manhattan, New York City, to build classification models for the three most common crime types in the study area: larceny, assault, and harassment. The main features in the models are time and location. Specifically, time refers to the exact time and the day of the week, and location refers to land use. | ||
|
||
* Dataset Sources | ||
|
@@ -201,12 +201,12 @@ Still Working on it... | |
|
||
# Results | ||
|
||
## Top ten most committed crime types | ||
Review comment: Awesome results! I like them a lot. The table would be more attractive and user-friendly if you used spec_color, spec_font_size, or full_width.
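The styling suggested in this comment could look roughly like the sketch below. It assumes the kableExtra package is installed. The data frame `top10_demo`, its column names, and its counts are hypothetical stand-ins, since the real columns of `top10_Crime_MAN` are not visible in this diff.

```r
library(knitr)
library(kableExtra)

# Hypothetical stand-in for top10_Crime_MAN: names and counts are made up.
top10_demo <- data.frame(
  CrimeType = c("Type A", "Type B", "Type C"),
  amount    = c(300, 200, 100)
)

# Keep the raw counts so both scales are computed from numbers, not HTML.
counts <- top10_demo$amount

# Scale text color and font size with the count so larger values stand out.
top10_demo$amount <- cell_spec(
  counts,
  format    = "html",
  color     = spec_color(counts, end = 0.9),
  font_size = spec_font_size(counts)
)

# escape = FALSE keeps the HTML produced by cell_spec intact;
# full_width = FALSE stops the table from stretching across the page.
styled_table <- kable_styling(
  kable(top10_demo, format = "html", escape = FALSE),
  full_width = FALSE
)
```

In the report itself, the same calls would replace the plain `kable(...)` call inside the existing chunk, applied to the real table.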
||
```{r echo=FALSE} | ||
kable(top10_Crime_MAN[1:10,]) | ||
``` | ||
|
||
## The Preference on Time of Top Three Committed Crime Types | ||
Review comment: Great job! This graph gives me useful information about the relationship between time and the number of the top three committed crime types. It would be better to add the period of the dataset (2015-2017) to the title and units to the x- and y-axes, which would make the graph easier to understand. In addition, please describe the meaning of NA in a footnote to help readers.
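One possible way to apply these suggestions with ggplot2 is sketched below. The data frame `time_demo` is a hypothetical stand-in for `time_top3`, with made-up intervals and counts. The axis wording and the explanation of NA in the caption are assumptions that the author would need to confirm against the dataset.

```r
library(ggplot2)

# Hypothetical stand-in for time_top3: intervals and counts are made up.
time_demo <- data.frame(
  TimeInterval = rep(c("00:00-05:59", "06:00-11:59", NA), times = 2),
  CrimeType    = rep(c("LARCENY", "ASSAULT"), each = 3),
  amount       = c(30, 50, 5, 20, 25, 2)
)

plot_title <- "Top Three Crime Types by Time of Day (2015-2017)"

# Assumed meaning of NA; replace with the dataset's actual definition.
na_note <- "NA: time of occurrence not recorded (assumption)"

p <- ggplot(time_demo, aes(x = TimeInterval, y = amount)) +
  geom_point(aes(color = CrimeType)) +
  labs(
    title   = plot_title,
    x       = "Time of day (interval)",
    y       = "Number of reported crimes",
    caption = na_note   # rendered as a footnote under the plot
  ) +
  theme(axis.text.x = element_text(angle = 60, hjust = 1))
```

The same `labs()` call, with a day-of-week axis label, would cover the equivalent request for the day-of-week plot.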
||
```{r echo=FALSE} | ||
ggplot(time_top3,aes(x = TimeInterval, y= amount,group=1))+ | ||
geom_point(aes(color = CrimeType))+ | ||
|
@@ -215,7 +215,7 @@ ggplot(time_top3,aes(x = TimeInterval, y= amount,group=1))+ | |
theme(legend.position = "none",axis.text.x = element_text(angle = 60, hjust = 1)) | ||
``` | ||
|
||
## The Preference on Day of Week of Top Three Committed Crime Types | ||
Review comment: Awesome job! It would be better to add the period of the dataset (2015-2017) to the title and units to the x-axis label, which would make the graph easier to understand.
||
```{r echo=FALSE} | ||
ggplot(dw_top3,aes(x = DayofWeek, y= amount, group = 1))+ | ||
geom_point(aes(color = CrimeType))+ | ||
|
@@ -228,4 +228,4 @@ ggplot(dw_top3,aes(x = DayofWeek, y= amount, group = 1))+ | |
|
||
What have you learned? Are there any broader implications? | ||
|
||
# References | ||
Review comment: Please add references.
Review comment: The introduction is very clear about what you are doing. It would be more scientific if you added an objective or a hypothesis for your study at the end of the introduction.
There are some typos, such as "visulazation" and "occurance"; please check the text again.