-
Notifications
You must be signed in to change notification settings - Fork 2
/
Forecasting 3-5 - ARMA Forecasting.Rmd
198 lines (135 loc) · 4.3 KB
/
Forecasting 3-5 - ARMA Forecasting.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
---
output:
xaringan::moon_reader:
css: "my-theme.css"
lib_dir: libs
nature:
highlightStyle: github
highlightLines: true
---
layout: true
.hheader[<a href="index.html">`r fontawesome::fa("home", fill = "steelblue")`</a>]
---
```{r setup, include=FALSE, message=FALSE}
options(htmltools.dir.version = FALSE, servr.daemon = TRUE)
knitr::opts_chunk$set(fig.height=5, fig.align="center")
library(huxtable)
```
class: center, middle, inverse
# ARIMA Models
## Forecasting
.futnote[Eli Holmes, UW SAFS]
.citation[[email protected]]
---
```{r load_data, echo=FALSE, message=FALSE, warning=FALSE}
library(ggplot2)
library(gridExtra)
library(reshape2)
library(tseries)
library(forecast)
```
```{r load_data2, message=FALSE, warning=FALSE, echo=FALSE}
load("landings.RData")
landings$log.metric.tons = log(landings$metric.tons)
landings = subset(landings, Year <= 1987)
anchovy = subset(landings, Species=="Anchovy")$log.metric.tons
sardine = subset(landings, Species=="Sardine")$log.metric.tons
```
## Forecasting
The basic idea of forecasting with an ARIMA model is the same as forecasting with a time-varying regressiion model.
We estimate a model and the parameters for that model. For example, let's say we want to forecast with ARIMA(2,1,0) model:
$$y_t = \beta_1 y_{t-1} + \beta_2 y_{t-2} + e_t$$
where $y_t$ is the first difference of our anchovy data.
---
Let's estimate the $\beta$'s:
```{r}
fit <- Arima(anchovy, order=c(2,1,0))
coef(fit)
```
So we will forecast with this model:
$$y_t = -0.3348 y_{t-1} - 0.1454 y_{t-2} + e_t$$
So to get our forecast for 1988, we do this
$$(y_{1988}-y_{1987}) = -0.3348 (y_{1987}-y_{1986}) - 0.1454 (y_{1986}-y_{1985})$$
$$y_{1988} = y_{1987}-0.3348 (y_{1987}-y_{1986}) - 0.1454 (y_{1986}-y_{1985})$$
---
$$y_{1988} = y_{1987}-0.3348 (y_{1987}-y_{1986}) - 0.1454 (y_{1986}-y_{1985})$$
Here is R code to do that:
```{r}
anchovy[24]+coef(fit)[1]*(anchovy[24]-anchovy[23])+
coef(fit)[2]*(anchovy[23]-anchovy[22])
```
Thankfully, `forecast()` automates this calculation for us.
```{r}
forecast(fit, h=1)
```
---
## Forecasting
We can forecast from a fit in R using the `forecast()` function. `h` is the number of time steps forward to forecast. The upper and lower prediction intervals are shown.
```{r}
fit <- auto.arima(sardine, test="adf")
fr <- forecast(fit, h=5)
fr
```
---
We can plot our forecast with prediction intervals. Here is the sardine forecast:
```{r}
plot(fr, xlab="Year")
```
---
# Repeat for anchovy
```{r}
fit <- auto.arima(anchovy)
fr <- forecast(fit, h=5)
plot(fr)
```
---
# What happens if I have missing values?
```{r}
anchovy.miss <- anchovy
anchovy.miss[10:14] <- NA
fit <- auto.arima(anchovy.miss)
fr <- forecast(fit, h=5)
plot(fr)
```
---
# Repeat for Chub Mackerel
```{r}
dat <- subset(landings, Species=="Chub.mackerel")$log.metric.tons
fit <- auto.arima(dat)
fr <- forecast(fit, h=5)
plot(fr, ylim=c(6,10))
```
---
## The "Naive" forecast
The "naive" forecast is simply the last value observed. If we want to prediction landings in 2019, the naive forecast would be the landings in 2018. This is a difficult forecast to beat! It has the advantage of having no parameters.
In forecast, we can fit this model with the `naive()` function. Note this is the same as the `rwf()` function.
---
```{r}
fit.naive <- naive(anchovy)
fr.naive <- forecast(fit.naive, h=5)
plot(fr.naive)
```
---
## The "Naive" forecast with drift
The "naive" forecast is equivalent to a random walk with no drift. So this
$$x_t = x_{t-1} + e_t$$
As you saw with the anchovy fit, it doesn't allow an upward trend. Let's make it a little more flexible by add `drift`. This means we estimate one term, the trend.
$$x_t = \mu + x_{t-1} + e_t$$
---
```{r}
fit.rwf <- rwf(anchovy, drift=TRUE)
fr.rwf <- forecast(fit.rwf, h=5)
plot(fr.rwf)
```
---
## The "mean" forecast
The "mean" forecast is simply the mean of the data. If we want to prediction landings in 2019, the mean forecast would be the average of all our data. This is a poor forecast typically. It uses no information about the most recent values.
In forecast, we can fit this model with the `Arima()` function and `order=c(0,0,0)`. This will fit this model:
$$x_t = e_t$$
where $e_t \sim N(\mu, \sigma)$.
---
```{r}
fit.mean <- Arima(anchovy, order=c(0,0,0))
fr.mean <- forecast(fit.mean, h=5)
plot(fr.mean)
```