-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathindex.Rmd
318 lines (231 loc) · 14.6 KB
/
index.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
---
title: "Opioid-Related Deaths in New York State"
author: "Anuwat Pengput"
subtitle: Spatial Epidemiological Analysis and Prediction of Opioid-Related Deaths
in New York State
output:
html_document:
code_folding: hide
df_print: paged
---
# Introduction
<font size="3">Opioid analgesics are pain relievers derived from opium or have an opium-like activity. There are no better drugs than opioids for treating severe pain and suffering, however, opioids are the main drugs associated with overdose deaths (Ballantyne J.C., 2012). Opioid prescription rates have increased almost threefold in association with an increase of opioid related overdoses and deaths in the last 15 years. New York has been greatly impacted by the opioid epidemic. The rate of deaths related to any opioid in New York increased by 210% from 2010 to 2016 (Stopka et al., 2019). The opioid overdose death rate in the overall state was 18 deaths per 100,000 residents, which was higher than 18 states in the United States. This project aimed to investigate the distribution of opioid related-deaths rate, and predict the prevalence of opioid-related deaths across New York State using spatial epidemiological analysis.</font>
# Materials and methods
<font size="3"> The opioid-related death dataset was downloaded from vital Statistics: Opioid-Related Deaths by County in the Health Data New York Website. The opioid-related deaths include heroin and opioid analgesics mortalities (New York State Department of Health, 2019). Socio-economic and census data were directly downloaded from the American Community Survey in the United States Census Bureau (United State Census Bureau, 2019). Two datasets were joined into a data frame for preparing and analyzing data. The trend of opioid deaths in each county was presented from 2003 -2017 by a line plot. The prevalence rates were calculated by the overall number of opioid overdose deaths divided by the estimated population. The prevalence rates were presented across New York State using an interactive map. Moreover, crude rates of opioid deaths were calculated by the number of opioid deaths in each county in 2017 divided by the population of each county in 2017. The death rates were presented by an interactive map. Lastly, a random forest method was used to predict the prevalence rates of opioid poisoning deaths in New York State.</font>
## Download Packages and Libraries for the Project
```{r setup, message=FALSE}
library(tidyverse)
library(tidycensus)
# install.packages("mapview")
library(mapview)
library(tidyr)
# install.packages("plotly")
library(plotly)
# install.packages("plyr")
library(plyr)
library(DT)
library(kableExtra)
library(sf)
library(ranger)
library(ggplot2)
knitr::opts_chunk$set(cache=TRUE) # cache the results for quick compiling
```
## Download and clean all required data
### 1. Download opioid-related deaths in New York 2003-2017
```{r wrangling data, warning=FALSE, results = 'hide'}
vital <- read.csv('data/vital.csv')
ny_area <- read.csv('data/area.csv')
v17 <- load_variables(2017, "acs5", cache = TRUE)
#ny_area
#View(v17)
```
### 2. Download socio-economic data year 2017 from American Community Survey
```{r , warning=FALSE, results = 'hide'}
# Download socio-economic data year 2017 from ACS
NY <- get_acs(geography = "county",
variables = c(medincome = "B19013_001",
urban ="B08016_002",
rural = "B08016_003",
divorce = "B06008_004",
male = "B01001_002",
female = "B01001_026",
white = "B02001_002",
african = "B02001_003",
amindian = "B02001_004",
asian = "B02001_005",
hawaii = "B02001_006",
other = "B02001_007",
livealone = "B09021_002",
pop = "B01003_001"),
state = "NY", geometry = TRUE, cache_table=T)
NY_nomoe <- NY %>%
select(-moe)
NY2017_wide <- spread(NY_nomoe, variable, estimate)
#View(NY2017_wide)
# remove "new york" in county names
ny_2017_clean <- NY2017_wide %>%
separate(NAME, c("County"))
#head(ny_2017_clean)
## Change the names of counties
ny_2017_clean[45,"County"] <- "St Lawrence"
ny_2017_clean[31,"County"] <- "New York"
vital_2017 <- vital %>%
filter(Year == 2017)
vital_ny_2017 <- ny_2017_clean %>%
left_join(vital_2017, by = "County")
#View(vital_ny_2017)
## Change values in ny_area dataframe
ny_area$Areas[ny_area$Areas==1] <- "Urban"
ny_area$Areas[ny_area$Areas==2] <- "Rural"
ny_area$Regions[ny_area$Regions==1] <- "Central"
ny_area$Regions[ny_area$Regions==2] <- "East"
ny_area$Regions[ny_area$Regions==3] <- "Long Island"
ny_area$Regions[ny_area$Regions==4] <- "West"
options(tigris_use_cache = TRUE)
```
### 3. Download estimated decennial population in 2010\ (United State Census Bureau, 2010)
```{r , warning=FALSE, results = 'hide'}
## get estmate population 2010
ny_pop_estimate_2010 <- get_decennial(geography = "county", variables = "P001001",
state = "NY", geometry = ,
summary_var = "P001001", cache_table=T)
# View(ny_pop_estimate_2010)
ny_pop_estimate_2010 <- ny_pop_estimate_2010 %>%
separate(NAME, c("County"))
#View(ny_pop_estimate_2010)
## Change the names of counties
ny_pop_estimate_2010[28,"County"] <- "St Lawrence"
ny_pop_estimate_2010[14,"County"] <- "New York"
#View(ny_pop_estimate_2010)
## join population 10 years estimation to main dataset
vital_ny_2017 <- vital_ny_2017 %>%
left_join(ny_pop_estimate_2010, by = "County")
#View(vital_ny_2017)
```
### 4. Cumulative number of opioid deaths between 2003-2017\ for calculate the prevalence of opioid-related deaths
```{r , warning=FALSE, results = 'hide'}
# Cummuative all opioid deaths between 2003-2017
vital_overall <- spread(vital, Year, Opioid.Poisoning.Deaths)
vital_overall <- vital_overall %>%
mutate(total = vital_overall$`2003` +
vital_overall$`2004` +
vital_overall$`2005` +
vital_overall$`2006` +
vital_overall$`2007` +
vital_overall$`2008` +
vital_overall$`2009` +
vital_overall$`2010` +
vital_overall$`2011` +
vital_overall$`2012` +
vital_overall$`2013` +
vital_overall$`2014` +
vital_overall$`2015` +
vital_overall$`2016` +
vital_overall$`2017`)
#vital_overall
```
### 5. Calculate and create new columns of the prevalences and death rates of opioid-related deaths
```{r , warning=FALSE, results = 'hide'}
vital_ny_overall <- vital_ny_2017 %>%
left_join(vital_overall, by = "County") %>%
mutate(prevalence = (total/value)*100000) %>%
mutate(crude2017 = (Opioid.Poisoning.Deaths/pop)*100000)
#View(vital_ny_overall)
```
### 6. Prepare top 10 the highest prevalence and crude rate of opioid-related deaths
```{r , warning=FALSE}
## Prepare top 10 the highest prevalence of opioid-related deaths
prev_top10 <- vital_ny_overall %>%
st_set_geometry(NULL) %>%
left_join(ny_area, by = "County") %>%
select(County, Regions, Areas, prevalence ) %>%
arrange(desc(prevalence))
## Prepare top 10 the crude rates of opioid-related deaths
crude_top10 <- vital_ny_overall %>%
st_set_geometry(NULL) %>%
left_join(ny_area, by = "County") %>%
select(County, Regions, Areas, crude2017) %>%
arrange(desc(crude2017))
```
### 7. Create plot of Annual Number of Opioid Related Deaths in New York 2003-2017
```{r , warning=FALSE}
## Create plot of Annual Number of Opioid Related Deaths in New York 2003-2017
g <- ggplot (data = vital) +
geom_line(aes(x = Year, y = Opioid.Poisoning.Deaths, group = County, col = County)) +
geom_point(aes(x = Year, y = Opioid.Poisoning.Deaths, group = County, col = County, size = Opioid.Poisoning.Deaths)) +
geom_smooth(aes(x = Year, y = Opioid.Poisoning.Deaths), col = "black") +
theme_bw() +
theme(legend.title = element_blank())+
labs (title = "Annual Number of Opioid Related Deaths in New York 2003-2017", x = "Year", y= "Number of Opioid Related Deaths")
```
### 8. Prediction of Opioid Deaths using a random forest method
```{r , warning=FALSE}
predict_deaths <- vital_ny_overall %>%
left_join(ny_area, by = "County") %>%
select(prevalence, male, female, white, divorce, african, amindian, asian, hawaii, other, livealone, pop, total, crude2017, Regions, Areas) %>%
st_set_geometry(NULL)
train.idx <- sample(nrow(predict_deaths), 2/3 * nrow(predict_deaths))
vital.train <- predict_deaths[train.idx, ]
vital.test <- predict_deaths[-train.idx, ]
rf_vital <- ranger(prevalence ~ ., data = vital.train, write.forest = TRUE)
print(rf_vital)
pred.vital <- predict(rf_vital, data = vital.test)
```
# Results
<font size="3"> The number of opioid-related deaths dramatically increased in every county from 2003 to 2017. The plot illustrated that Suffolk County had the highest opioid-related deaths in New York State with 425 deaths in 2017. Kings County was more likely to have an increase in the number of opioid deaths over the last 5 years. In addition, a linear trend shows that opioid deaths generally increased throughout New York State.</font>
## Figure 1. Trend of opioid-related deaths between 2003 - 2017 across New York State
```{r , fig.height=6, fig.width=8, warning=FALSE}
ggplotly (g, tooltip = c("County", "Opioid.Poisoning.Deaths", "Year"))
```
<font size="3"> The interactive map shows the distrbution of prevalence rates of opioid-realted deaths by county in New York State. Sullivan is the highest prevalence with 254 deaths per 100,000 residents and Greene is the second highest prevalence with 193 deaths per 100,000 residents. Schuyler is the lowest county of opioid poisioining deaths with 32 deaths per 100,000 residents (Figure 2, Table 1). </font>
## Figure 2. Distribution Map of The Prevalence Rates\ of Opioid-Related Deaths per 100,000 residents
```{r, warning=FALSE}
mapviewOptions(legend.pos = "topright")
mapviewOptions(leafletWidth = 800)
mapview(vital_ny_overall, zcol = "prevalence", legend = TRUE, alpha = 0.5, layer.name = c("Prevalences of Opioid-related Deaths "))
```
## Table 1. Top 10 Counties of The Prevalence Rates\ of Opioid-Related Deaths per 100,000 population
```{r , warning=FALSE}
datatable(prev_top10,
colnames = c('County', 'Region', 'Area', 'Prevalence Rate'),
caption = 'Table 1: The Prevalence Rate of Opioid-Related'
) %>%
formatRound('prevalence', digits = 2) %>%
formatStyle('prevalence', color = '#c23b22', fontWeight = 'bold')
```
<font size="3"> In figure 3, the interactive map illustrates the distribution of opiod-related deaths across New York State in 2017. Lewis County had the highest death rate with about 49 deaths per 100,000 residents. Sullivan had the second highest death rate with 38 deaths per 100,000 residents. Columbia had the third highest death rate with 31 deaths per 100,000 residents. Moreover, Hamiltion and Schuyler had no opioid-related deaths in 2017 (Figure 3, Table 2). </font>
## Figure 3. Distribution Map of The Crude Rates\ of Opioid-Related Deaths per 100,000 residents in 2017
```{r echo=TRUE}
mapviewOptions(legend.pos = "bottomright")
mapviewOptions(leafletWidth = 800)
mapview(vital_ny_overall, zcol = "crude2017", legend = TRUE, alpha = 0.5, layer.name = c("Opioid Death Rates in 2017"))
```
## Table 2. Top 10 Counties of The Crude Rates\ of Opioid-Related Deaths per 100,000 residents in 2017
```{r , warning=FALSE}
datatable(crude_top10,
colnames = c('County', 'Region', 'Area', 'Death Rates in 2017'),
caption = 'Table 2: Death Rates of Opioid-Related in 2017'
) %>%
formatRound('crude2017', digits = 2) %>%
formatStyle('crude2017', color = '#c23b22', fontWeight = 'bold')
```
<font size="3"> This plot shows the prediction of prevalence rates of opioid-related deaths in New York City. The project used 15 independent variables to predict prevalence rates including: male, female, divorce, white, African American, American Indian, Asian, Hawaiian, other, live alone, population, overall number of opioid overdose deaths between 2003 - 2017, crude rates in 2017, regions, and areas of the county. The model shows that most prevalence rates that are less than 100 deaths per 100,000 for a population are more likely to increase. On the other hand, prevalence rates that are higher than a hundred deaths per 100,000 are more likely to decrease dramatically. When the prevalence rates are higher, the prediciton rates decline. </font>
## Figure 4.Prediction of Opioid-Related Death Prevalence\ in New York Using Random Forest
```{r echo=TRUE}
# plot(x = vital.test$prevalence, y = pred.vital$predictions)
# table(x = vital.test$prevalence, y = pred.vital$predictions)
gpredict <- ggplot() +
geom_point(aes(vital.test$prevalence, y = pred.vital$predictions), col = "red") +
theme_bw() +
geom_abline() +
labs(title = "Prediction of Opioid-Related Death Prevalence in New York", x = "Prevalence of Opioid-related Deaths (Test)", y = "Prediction of Opioid-related Deaths")
ggplotly(gpredict)
```
# Conclusions
Opioid-related deaths tend to increase every year across New York State. The study found that opioid deaths increased in almost all counties. The highest death rates were identified in the Central and Eastern New York regions. Schuyler County and Hamilton County had the lowest death rates among all counties. Additionally, rural areas had a higher risk of opioid-related death compared to urban areas. Health professionals should focus on these regions to prevent and control the opioid crisis in these areas.
# References
Ballantyne JC. Opioids and Other Analgesics. In: Verster JC, Brady K, Galanter M, Conrod P, eds. Drug Abuse and Addiction in Medical Illness: Causes, Consequences and Treatment. New York, NY: Springer New York, 2012:241-50.
Office of Quality and Patient Safety. (2019). Vital Statistics: Opioid-Related Deaths by County: Beginning 2003. Vital Statistics. Retrieved from https://health.data.ny.gov/Health/Vital-Statistics-Opioid-Related-Deaths-by-County-B/sn5m-dv52
Stopka, T. J., Amaravadi, H., Kaplan, A. R., Hoh, R., Bernson, D., Chui, K. K. H., . . . Rose, A. J. (2019). Opioid overdose deaths and potentially inappropriate opioid prescribing practices (PIP): A spatial epidemiological study. International Journal of Drug Policy, 68, 37-45. doi:https://doi.org/10.1016/j.drugpo.2019.03.024
United State Census Bureau. (2019). American Community Survey (ACS). Retrieved from https://www.census.gov/programs-surveys/acs
U.S. Census Bureau. (2010). Decennial Census Datasets. Retrieved from https://www.census.gov/programs-surveys/decennial-census/decade.2010.html