# Introduction to Time Series Regression and Forecasting {#ittsraf}
```{r, warning = FALSE, message=FALSE, echo=F, purl=F}
library(dynlm)
library(stargazer)
library(scales)
library(readxl)
library(urca)
```
Time series data is data collected for a single entity over time. This is fundamentally different from cross-section data, which is data on multiple entities at the same point in time. Time series data allows the estimation of the effect on $Y$ of a change in $X$ *over time*. This is what econometricians call a *dynamic causal effect*. Let us go back to the application to cigarette consumption of Chapter \@ref(ivr), where we were interested in estimating the effect on cigarette demand of a price increase caused by an increase in the general sales tax. One might use time series data to assess the causal effect of a tax increase on smoking both initially and in subsequent periods.
Another application of time series data is forecasting. For example, weather services use time series data to predict tomorrow's temperature using, inter alia, today's temperature and temperatures of the past. To motivate an economic example, central banks are interested in forecasting next month's unemployment rate.
The remaining chapters of the book deal with econometric techniques for the analysis of time series data and their application to forecasting and the estimation of dynamic causal effects. This chapter covers the basic concepts presented in Chapter 14 of the book, explains how to visualize time series data and demonstrates how to estimate simple autoregressive models, where the regressors are past values of the dependent variable or other variables. In this context we also discuss the concept of stationarity, an important property with far-reaching consequences.
Most empirical applications in this chapter are concerned with forecasting and use data on U.S. macroeconomic indicators or financial time series like Gross Domestic Product (GDP), the unemployment rate or excess stock returns.
The following packages and their dependencies are needed for reproduction of the code chunks presented throughout this chapter:
+ `r ttcode("AER")` [@R-AER]
+ `r ttcode("dynlm")` [@R-dynlm]
+ `r ttcode("forecast")` [@R-forecast]
+ `r ttcode("readxl")` [@R-readxl]
+ `r ttcode("stargazer")` [@R-stargazer]
+ `r ttcode("scales")` [@R-scales]
+ `r ttcode("quantmod")` [@R-quantmod]
+ `r ttcode("urca")` [@R-urca]
Please verify that the following code chunk runs on your machine without any errors.
```{r, warning=FALSE, message=FALSE, eval=FALSE}
library(AER)
library(dynlm)
library(forecast)
library(readxl)
library(stargazer)
library(scales)
library(quantmod)
library(urca)
```
## Using Regression Models for Forecasting
What is the difference between estimating models for the assessment of causal effects and models for forecasting? Consider again the simple example of estimating the causal effect of the student-teacher ratio on test scores introduced in Chapter \@ref(lrwor).
```{r, warning = FALSE, message=FALSE}
library(AER)
data(CASchools)
CASchools$STR <- CASchools$students/CASchools$teachers
CASchools$score <- (CASchools$read + CASchools$math)/2
mod <- lm(score ~ STR, data = CASchools)
mod
```
As has been stressed in Chapter \@ref(rmwmr), the estimate of the coefficient on the student-teacher ratio does not have a causal interpretation due to omitted variable bias. However, in terms of deciding which school to send her child to, it might nevertheless be appealing for a parent to use `r ttcode("mod")` for forecasting test scores in school districts where no public data on scores are available.
As an example, assume that the average class in a district has $25$ students. The prediction will not be perfect, but the following one-liner might nevertheless help the parent decide.
```{r}
predict(mod, newdata = data.frame("STR" = 25))
```
In a time series context, the parent could use data on present and past years' test scores to forecast next year's test scores --- a typical application for an autoregressive model.
## Time Series Data and Serial Correlation {#tsdasc}
GDP is commonly defined as the value of goods and services produced over a given time period. The data set `r ttcode("us_macro_quarterly.xlsx")` is provided by the authors and can be downloaded [here](http://wps.pearsoned.co.uk/wps/media/objects/16103/16489878/data3eu/us_macro_quarterly.xlsx). It provides quarterly data on U.S. real (i.e. inflation adjusted) GDP from 1947 to 2004.
As before, a good starting point is to plot the data. The package `r ttcode("quantmod")` provides some convenient functions for plotting and computing with time series data. We also load the package `r ttcode("readxl")` to read the data into `r ttcode("R")`.
```{r, warning=FALSE, message=FALSE}
# attach the package 'quantmod'
library(quantmod)
```
We begin by importing the data set.
```{r}
# load US macroeconomic data
USMacroSWQ <- read_xlsx("Data/us_macro_quarterly.xlsx",
sheet = 1,
col_types = c("text", rep("numeric", 9)))
# format date column
USMacroSWQ$...1 <- as.yearqtr(USMacroSWQ$...1, format = "%Y:0%q")
# adjust column names
colnames(USMacroSWQ) <- c("Date", "GDPC96", "JAPAN_IP", "PCECTPI",
"GS10", "GS1", "TB3MS", "UNRATE", "EXUSUK", "CPIAUCSL")
```
The first column of `r ttcode("us_macro_quarterly.xlsx")` contains text and the remaining ones are numeric. Using `r ttcode('col_types = c("text", rep("numeric", 9))')` we tell `r ttcode("read_xlsx()")` to take this into account when importing the data.
It is useful to work with time-series objects that keep track of the frequency of the data and are extensible. In what follows we will use objects of the class `r ttcode("xts")`, see `?xts`. Since the data in `r ttcode("USMacroSWQ")` are in quarterly frequency we convert the first column to `r ttcode("yearqtr")` format before generating the `r ttcode("xts")` object `r ttcode("GDP")`.
```{r}
# GDP series as xts object
GDP <- xts(USMacroSWQ$GDPC96, USMacroSWQ$Date)["1960::2013"]
# GDP growth series as xts object
GDPGrowth <- xts(400 * log(GDP/lag(GDP)))
```
The following code chunks reproduce Figure 14.1 of the book.
```{r, fig.align='center'}
# reproduce Figure 14.1 (a) of the book
plot(log(as.zoo(GDP)),
col = "steelblue",
lwd = 2,
ylab = "Logarithm",
xlab = "Date",
main = "U.S. Quarterly Real GDP")
```
```{r, fig.align='center'}
# reproduce Figure 14.1 (b) of the book
plot(as.zoo(GDPGrowth),
col = "steelblue",
lwd = 2,
ylab = "Logarithm",
xlab = "Date",
main = "U.S. Real GDP Growth Rates")
```
### Notation, Lags, Differences, Logarithms and Growth Rates {-}
For observations of a variable $Y$ recorded over time, $Y_t$ denotes the value observed at time $t$. The period between two sequential observations $Y_t$ and $Y_{t-1}$ is a unit of time: hours, days, weeks, months, quarters, years etc. Key Concept 14.1 introduces the essential terminology and notation for time series data we use in the subsequent sections.
```{r, eval = my_output == "html", results='asis', echo=F, purl=F}
cat('
<div class = "keyconcept" id="KC14.1">
<h3 class = "right"> Key Concept 14.1 </h3>
<h3 class = "left"> Lags, First Differences, Logarithms and Growth Rates </h3>
- Previous values of a time series are called *lags*. The first lag of $Y_t$ is $Y_{t-1}$. The $j^{th}$ lag of $Y_t$ is $Y_{t-j}$. In <tt>R</tt>, lags of univariate or multivariate time series objects are conveniently computed by <tt>lag()</tt>, see <tt>?lag</tt>.
- Sometimes we work with a differenced series. The first difference of a series is $\\Delta Y_{t} = Y_t - Y_{t-1}$, the difference between periods $t$ and $t-1$. If <tt>Y</tt> is a time series, the series of first differences is computed as <tt>diff(Y)</tt>.
- It may be convenient to work with the first difference in logarithms of a series. We denote this by $\\Delta \\log(Y_t) = \\log(Y_t) - \\log(Y_{t-1})$. For a time series <tt>Y</tt>, this is obtained using <tt>log(Y/lag(Y))</tt>.
- $100 \\Delta \\log (Y_t)$ is an approximation for the percentage change between $Y_t$ and $Y_{t-1}$.
</div>
')
```
```{r, eval = my_output == "latex", results='asis', echo=F, purl=F}
cat('\\begin{keyconcepts}[Lags, First Differences, Logarithms and Growth Rates]{14.1}
\\begin{itemize}
\\item Previous values of a time series are called \\textit{lags}. The first lag of $Y_t$ is $Y_{t-1}$. The $j^{th}$ lag of $Y_t$ is $Y_{t-j}$. In \\texttt{R}, lags of univariate or multivariate time series objects are conveniently computed by \\texttt{lag()}, see \\texttt{?lag}.
\\item Sometimes we work with differenced series. The first difference of a series is $\\Delta Y_{t} = Y_t - Y_{t-1}$, the difference between periods $t$ and $t-1$. If \\texttt{Y} is a time series, the series of first differences is computed as \\texttt{diff(Y)}.
\\item It may be convenient to work with the first difference in logarithms of a series. We denote this by $\\Delta \\log(Y_t) = \\log(Y_t) - \\log(Y_{t-1})$. For a time series \\texttt{Y}, this is obtained using \\texttt{log(Y/lag(Y))}.
\\item $100 \\Delta \\log (Y_t)$ is an approximation for the percentage change between $Y_t$ and $Y_{t-1}$.
\\end{itemize}
\\end{keyconcepts}
')
```
The definitions made in Key Concept 14.1 are useful because of two properties that are common to many economic time series:
- Exponential growth: some economic series grow approximately exponentially such that their logarithm is approximately linear.
- The standard deviation of many economic time series is approximately proportional to their level. Therefore, the standard deviation of the logarithm of such a series is approximately constant.
Furthermore, growth rates of macroeconomic series are commonly reported, which is why $\log$-differences are often used.
Table 14.1 of the book presents the quarterly U.S. GDP time series, its logarithm, the annualized growth rate and the first lag of the annualized growth rate series for the period 2012:Q1 - 2013:Q1. The following simple function can be used to compute these quantities for a quarterly time series `r ttcode("series")`.
```{r}
# compute logarithms, annual growth rates and 1st lag of growth rates
quants <- function(series) {
s <- series
return(
data.frame("Level" = s,
"Logarithm" = log(s),
"AnnualGrowthRate" = 400 * log(s / lag(s)),
"1stLagAnnualGrowthRate" = lag(400 * log(s / lag(s))))
)
}
```
The annual growth rate is computed using the approximation $$\text{Annual Growth of } Y_t = 400 \cdot \Delta\log(Y_t)$$ since $100\cdot\Delta\log(Y_t)$ is an approximation of the quarterly percentage change, see Key Concept 14.1.
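As a quick numerical check of this approximation (not part of the book's code), consider a quarterly growth rate of $1\%$ and compare the log approximation of the annualized rate with the exact annualized rate.
```{r}
# log approximation to the annualized rate for 1% quarterly growth
400 * log(1.01)
# exact annualized growth rate for 1% quarterly growth
100 * (1.01^4 - 1)
```
Both computations yield roughly $4\%$, so the approximation error is small for quarterly growth rates of this magnitude.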
We call `r ttcode("quants()")` on observations for the period 2011:Q3 - 2013:Q1.
```{r}
# obtain a data.frame with level, logarithm, annual growth rate and its 1st lag of GDP
quants(GDP["2011-07::2013-01"])
```
#### Autocorrelation {-}
Observations of a time series are typically correlated. This type of correlation is called *autocorrelation* or *serial correlation*. Key Concept 14.2 summarizes the concepts of population autocovariance and population autocorrelation and shows how to compute their sample equivalents.
```{r, eval = my_output == "html", results='asis', echo=F, purl=F}
cat('
<div class = "keyconcept" id="KC14.2">
<h3 class = "right"> Key Concept 14.2 </h3>
<h3 class = "left"> Autocorrelation and Autocovariance </h3>
The covariance between $Y_t$ and its $j^{th}$ lag, $Y_{t-j}$, is called the $j^{th}$ *autocovariance* of the series $Y_t$. The $j^{th}$ *autocorrelation coefficient*, also called the *serial correlation coefficient*, measures the correlation between $Y_t$ and $Y_{t-j}$.
We thus have
\\begin{align*}
j^{th} \\text{autocovariance} =& \\, Cov(Y_t,Y_{t-j}), \\\\
j^{th} \\text{autocorrelation} = \\rho_j =& \\, \\rho_{Y_t,Y_{t-j}} = \\frac{Cov(Y_t,Y_{t-j})}{\\sqrt{Var(Y_t)Var(Y_{t-j})}}.
\\end{align*}
Population autocovariance and population autocorrelation can be estimated by $\\widehat{Cov(Y_t,Y_{t-j})}$, the sample autocovariance, and $\\widehat{\\rho}_j$, the sample autocorrelation:
\\begin{align*}
\\widehat{Cov(Y_t,Y_{t-j})} =& \\, \\frac{1}{T} \\sum_{t=j+1}^T (Y_t - \\overline{Y}_{j+1:T})(Y_{t-j} - \\overline{Y}_{1:T-j}), \\\\
\\widehat{\\rho}_j =& \\, \\frac{\\widehat{Cov(Y_t,Y_{t-j})}}{\\widehat{Var(Y_t)}}
\\end{align*}
$\\overline{Y}_{j+1:T}$ denotes the average of $Y_{j+1}, Y_{j+2}, \\dots, Y_T$.
In <tt>R</tt> the function <tt>acf()</tt> from the package <tt>stats</tt> computes the sample autocovariance or the sample autocorrelation function.
</div>
')
```
```{r, eval = my_output == "latex", results='asis', echo=F, purl=F}
cat('\\begin{keyconcepts}[Autocorrelation and Autocovariance]{14.2}
The covariance between $Y_t$ and its $j^{th}$ lag, $Y_{t-j}$, is called the $j^{th}$ \\textit{autocovariance} of the series $Y_t$. The $j^{th}$ \\textit{autocorrelation coefficient}, also called the \\textit{serial correlation coefficient}, measures the correlation between $Y_t$ and $Y_{t-j}$. We thus have
\\begin{align*}
j^{th} \\text{autocovariance} =& \\, Cov(Y_t,Y_{t-j}), \\\\
j^{th} \\text{autocorrelation} = \\rho_j =& \\, \\rho_{Y_t,Y_{t-j}} = \\frac{Cov(Y_t,Y_{t-j})}{\\sqrt{Var(Y_t)Var(Y_{t-j})}}.
\\end{align*}
Population autocovariance and population autocorrelation can be estimated by $\\widehat{Cov(Y_t,Y_{t-j})}$, the sample autocovariance, and $\\widehat{\\rho}_j$, the sample autocorrelation:
\\begin{align*}
\\widehat{Cov(Y_t,Y_{t-j})} =& \\, \\frac{1}{T} \\sum_{t=j+1}^T (Y_t - \\overline{Y}_{j+1:T})(Y_{t-j} - \\overline{Y}_{1:T-j}), \\\\
\\widehat{\\rho}_j =& \\, \\frac{\\widehat{Cov(Y_t,Y_{t-j})}}{\\widehat{Var(Y_t)}}.
\\end{align*}\\vspace{0.5cm}
$\\overline{Y}_{j+1:T}$ denotes the average of $Y_{j+1}, Y_{j+2}, \\dots, Y_T$.\\newline
In \\texttt{R} the function \\texttt{acf()} from the package \\texttt{stats} computes the sample autocovariance or the sample autocorrelation function.
\\end{keyconcepts}
')
```
Using `r ttcode("acf()")` it is straightforward to compute the first four sample autocorrelations of the series `r ttcode("GDPGrowth")`.
```{r}
acf(na.omit(GDPGrowth), lag.max = 4, plot = F)
```
This is evidence that there is mild positive autocorrelation in the growth of GDP: if GDP grows faster than average in one period, there is a tendency for it to grow faster than average in the following periods.
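The first sample autocorrelation reported by `r ttcode("acf()")` can be verified using the formulas from Key Concept 14.2. The following sketch computes $\widehat{\rho}_1$ "by hand"; since `r ttcode("acf()")` demeans the series with the full-sample mean instead of the subsample means $\overline{Y}_{2:T}$ and $\overline{Y}_{1:T-1}$, the two results agree only up to a small numerical difference.
```{r}
# compute the first sample autocorrelation following Key Concept 14.2
gr <- as.numeric(na.omit(GDPGrowth))
TT <- length(gr)
# sample autocovariance at lag 1, using the subsample means
acov_1 <- sum((gr[-1] - mean(gr[-1])) * (gr[-TT] - mean(gr[-TT]))) / TT
# sample variance with the same 1/T normalization
avar_0 <- sum((gr - mean(gr))^2) / TT
# first sample autocorrelation
acov_1 / avar_0
```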
#### Other Examples of Economic Time Series {-}
Figure 14.2 of the book presents four plots: the U.S. unemployment rate, the U.S. Dollar / British Pound exchange rate, the logarithm of the Japanese industrial production index as well as daily changes in the Wilshire 5000 stock price index, a financial time series. The next code chunk reproduces the plots of the three macroeconomic series and adds percentage changes in the daily values of the New York Stock Exchange Composite index as a fourth one (the data set `r ttcode("NYSESW")` comes with the `r ttcode("AER")` package).
```{r}
# define series as xts objects
USUnemp <- xts(USMacroSWQ$UNRATE, USMacroSWQ$Date)["1960::2013"]
DollarPoundFX <- xts(USMacroSWQ$EXUSUK, USMacroSWQ$Date)["1960::2013"]
JPIndProd <- xts(log(USMacroSWQ$JAPAN_IP), USMacroSWQ$Date)["1960::2013"]
# attach NYSESW data
data("NYSESW")
NYSESW <- xts(Delt(NYSESW))
```
```{r, fig.align='center'}
# divide plotting area into 2x2 matrix
par(mfrow = c(2, 2))
# plot the series
plot(as.zoo(USUnemp),
col = "steelblue",
lwd = 2,
ylab = "Percent",
xlab = "Date",
main = "US Unemployment Rate",
cex.main = 1)
plot(as.zoo(DollarPoundFX),
col = "steelblue",
lwd = 2,
ylab = "Dollar per pound",
xlab = "Date",
main = "U.S. Dollar / B. Pound Exchange Rate",
cex.main = 1)
plot(as.zoo(JPIndProd),
col = "steelblue",
lwd = 2,
ylab = "Logarithm",
xlab = "Date",
main = "Japanese Industrial Production",
cex.main = 1)
plot(as.zoo(NYSESW),
col = "steelblue",
lwd = 2,
ylab = "Percent per Day",
xlab = "Date",
main = "New York Stock Exchange Composite Index",
cex.main = 1)
```
The series show quite different characteristics. The unemployment rate increases during recessions and declines during economic recoveries and growth. The Dollar/Pound exchange rate shows a deterministic pattern until the end of the Bretton Woods system. Japan's industrial production exhibits an upward trend with decreasing growth. Daily changes in the New York Stock Exchange composite index seem to fluctuate randomly around the zero line. The sample autocorrelations support this conjecture.
```{r}
# compute sample autocorrelation for the NYSESW series
acf(na.omit(NYSESW), plot = F, lag.max = 10)
```
The first 10 sample autocorrelation coefficients are very close to zero. The default plot generated by `acf()` provides further evidence.
```{r, fig.align='center'}
# plot sample autocorrelation for the NYSESW series
acf(na.omit(NYSESW), main = "Sample Autocorrelation for NYSESW Data")
```
The blue dashed bands represent values beyond which the autocorrelations are significantly different from zero at the $5\%$ level. Even when the true autocorrelations are zero, we should expect a few exceedances --- recall the definition of a type I error from Key Concept 3.5.
For most lags we see that the sample autocorrelation does not exceed the bands and there are only a few cases that lie marginally beyond the limits.
Furthermore, the `r ttcode("NYSESW")` series exhibits what econometricians call *volatility clustering*: there are periods of high and periods of low variance. This is common for many financial time series.
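One way to visualize volatility clustering is to plot a rolling estimate of the standard deviation of the series. The following sketch (not part of the book's code) uses `r ttcode("rollapply()")` from the `r ttcode("zoo")` package, which is loaded together with `r ttcode("quantmod")`; the window length of $60$ trading days is an arbitrary choice.
```{r, fig.align='center'}
# rolling 60-day standard deviation of daily changes in the NYSESW index
NYSESW_sd <- rollapply(na.omit(NYSESW), width = 60, FUN = sd)
plot(as.zoo(NYSESW_sd),
     col = "steelblue",
     lwd = 2,
     xlab = "Date",
     ylab = "Standard Deviation",
     main = "Rolling 60-Day Standard Deviation of NYSESW Changes")
```
Periods where the rolling standard deviation is elevated correspond to the clusters of large daily changes visible in the series itself.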
## Autoregressions
Autoregressive models are heavily used in economic forecasting. An autoregressive model relates a time series variable to its past values. This section discusses the basic ideas of autoregressive models, shows how they are estimated and presents an application to forecasting GDP growth using `r ttcode("R")`.
#### The First-Order Autoregressive Model {-}
It is intuitive that the immediate past of a variable should have power to predict its near future. The simplest autoregressive model uses only the most recent outcome of the time series observed to predict future values. For a time series $Y_t$ such a model is called a first-order autoregressive model, often abbreviated AR(1), where the 1 indicates that the order of autoregression is one:
\begin{align*}
Y_t = \beta_0 + \beta_1 Y_{t-1} + u_t
\end{align*}
is the AR(1) population model of a time series $Y_t$.
For the GDP growth series, an autoregressive model of order one uses only the information on GDP growth observed in the last quarter to predict a future growth rate. The first-order autoregression model of GDP growth can be estimated by computing OLS estimates in the regression of $GDPGR_t$ on $GDPGR_{t-1}$,
\begin{align}
\widehat{GDPGR}_t = \hat\beta_0 + \hat\beta_1 GDPGR_{t-1}. (\#eq:GDPGRAR1)
\end{align}
Following the book we use data from 1962 to 2012 to estimate \@ref(eq:GDPGRAR1). This is easily done with the function `r ttcode("ar.ols()")` from the package `r ttcode("stats")`.
```{r}
# subset data
GDPGRSub <- GDPGrowth["1962::2012"]
# estimate the model
ar.ols(GDPGRSub,
order.max = 1,
demean = F,
intercept = T)
```
We can check that the computations done by `r ttcode("ar.ols()")` are the same as those done by `r ttcode("lm()")`.
```{r}
# length of data set
N <- length(GDPGRSub)
GDPGR_level <- as.numeric(GDPGRSub[-1])
GDPGR_lags <- as.numeric(GDPGRSub[-N])
# estimate the model
armod <- lm(GDPGR_level ~ GDPGR_lags)
armod
```
As usual, we may use `r ttcode("coeftest()")` to obtain a robust summary on the estimated regression coefficients.
```{r}
# robust summary
coeftest(armod, vcov. = vcovHC, type = "HC1")
```
Thus the estimated model is
\begin{align}
\widehat{GDPGR}_t = \underset{(0.351)}{1.995} + \underset{(0.076)}{0.338} GDPGR_{t-1} (\#eq:gdpgrar1).
\end{align}
We omit the first observation, $GDPGR_{1962 \ Q1}$, from the vector of the dependent variable since its first lag, $GDPGR_{1962 \ Q1 - 1} = GDPGR_{1961 \ Q4}$, is not included in the sample. Similarly, the last observation, $GDPGR_{2012 \ Q4}$, is excluded from the predictor vector since the data does not include $GDPGR_{2012 \ Q4 + 1} = GDPGR_{2013 \ Q1}$. Put differently, when estimating the model, one observation is lost because of the time series structure of the data.
#### Forecasts and Forecast Errors {-}
Suppose $Y_t$ follows an AR(1) model with an intercept and that you have an OLS estimate of the model on the basis of observations for $T$ periods. Then you may use the AR(1) model to obtain $\widehat{Y}_{T+1\vert T}$, a forecast for $Y_{T+1}$ using data up to period $T$ where
\begin{align*}
\widehat{Y}_{T+1\vert T} = \hat{\beta}_0 + \hat{\beta}_1 Y_T.
\end{align*}
The forecast error is
\begin{align*}
\text{Forecast error} = Y_{T+1} - \widehat{Y}_{T+1\vert T}.
\end{align*}
#### Forecasts and Predicted Values {-}
Forecasted values of $Y_t$ are *not* what we refer to as *OLS predicted values* of $Y_t$. Also, the forecast error is *not* an OLS residual. Forecasts and forecast errors are obtained using *out-of-sample* values while predicted values and residuals are computed for *in-sample* values that were actually observed and used in estimating the model.
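To make this distinction concrete for the AR($1$) model of GDP growth estimated above, the following minimal sketch (assuming `r ttcode("armod")` is still in memory) extracts the in-sample predicted values and residuals; an out-of-sample forecast for 2013:Q1 is computed below.
```{r}
# in-sample OLS predicted values and residuals of the AR(1) model
head(cbind("predicted" = fitted(armod), "residual" = residuals(armod)), 3)
```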
The root mean squared forecast error (RMSFE) measures the typical size of the forecast error and is defined as
\begin{align*}
RMSFE = \sqrt{E\left[\left(Y_{T+1} - \widehat{Y}_{T+1\vert T}\right)^2\right]}.
\end{align*}
The $RMSFE$ is composed of the future errors $u_t$ and the error made when estimating the coefficients. When the sample size is large, the former may be much larger than the latter so that $RMSFE \approx \sqrt{Var(u_t)}$, which can be estimated by the standard error of the regression.
#### Application to GDP Growth {-}
Using \@ref(eq:gdpgrar1), the estimated AR(1) model of GDP growth, we perform the forecast for GDP growth for 2013:Q1 (remember that the model was estimated using data for periods 1962:Q1 - 2012:Q4, so 2013:Q1 is an out-of-sample period). Plugging $GDPGR_{2012:Q4} \approx 0.15$ into \@ref(eq:gdpgrar1),
\begin{align*}
\widehat{GDPGR}_{2013:Q1} = 1.995 + 0.338 \cdot 0.15 \approx 2.046.
\end{align*}
The function `r ttcode("forecast()")` from the `r ttcode("forecast")` package has some useful features for forecasting time series data.
```{r, message=FALSE}
library(forecast)
# assign GDP growth rate in 2012:Q4
new <- data.frame("GDPGR_lags" = GDPGR_level[N-1])
# forecast GDP growth rate in 2013:Q1
forecast(armod, newdata = new)
```
Using `r ttcode("forecast()")` produces the same point forecast of about 2.0, along with $80\%$ and $95\%$ forecast intervals, see Section \@ref(apatadlm). We conclude that our AR(1) model forecasts GDP growth to be $2\%$ in 2013:Q1.
How accurate is this forecast? First, the forecast error is quite large: $GDPGR_{2013:Q1} \approx 1.1\%$ while our forecast is $2\%$.
Second, calling `summary(armod)` shows that the model explains little of the variation in the growth rate of GDP and that the $SER$ is about $3.16$. Leaving aside forecast uncertainty due to the estimation of the model coefficients $\beta_0$ and $\beta_1$, the $RMSFE$ must be at least $3.16\%$, the estimate of the standard deviation of the errors. We conclude that this forecast is pretty inaccurate.
```{r}
# compute the forecast error (actual growth minus forecast)
GDPGrowth["2013"][1] - forecast(armod, newdata = new)$mean
# R^2
summary(armod)$r.squared
# SER
summary(armod)$sigma
```
### Autoregressive Models of Order $p$ {-}
For forecasting GDP growth, the AR($1$) model \@ref(eq:gdpgrar1) disregards any information in the past of the series that is more distant than one period. An AR($p$) model incorporates the information of $p$ lags of the series. The idea is explained in Key Concept 14.3.
```{r, eval = my_output == "html", results='asis', echo=F, purl=F}
cat('
<div class = "keyconcept" id="KC14.3">
<h3 class = "right"> Key Concept 14.3 </h3>
<h3 class = "left"> Autoregressions </h3>
<p>
An AR($p$) model assumes that a time series $Y_t$ can be modeled by a linear function of the first $p$ of its lagged values.
\\begin{align*}
Y_t = \\beta_0 + \\beta_1 Y_{t-1} + \\beta_2 Y_{t-2} + \\dots + \\beta_p Y_{t-p} + u_t
\\end{align*}
is an autoregressive model of order $p$ where $E(u_t\\vert Y_{t-1}, Y_{t-2}, \\dots,Y_{t-p})=0$.
</p>
</div>
')
```
```{r, eval = my_output == "latex", results='asis', echo=F, purl=F}
cat('\\begin{keyconcepts}[Autoregressions]{14.3}
An AR($p$) model assumes that a time series $Y_t$ can be modeled by a linear function of the first $p$ of its lagged values.
\\begin{align*}
Y_t = \\beta_0 + \\beta_1 Y_{t-1} + \\beta_2 Y_{t-2} + \\dots + \\beta_p Y_{t-p} + u_t
\\end{align*}
is an autoregressive model of order $p$ where $E(u_t\\vert Y_{t-1}, Y_{t-2}, \\dots,Y_{t-p})=0$.
\\end{keyconcepts}
')
```
Following the book, we estimate an AR($2$) model of the GDP growth series from 1962:Q1 to 2012:Q4.
```{r}
# estimate the AR(2) model
GDPGR_AR2 <- dynlm(ts(GDPGR_level) ~ L(ts(GDPGR_level)) + L(ts(GDPGR_level), 2))
coeftest(GDPGR_AR2, vcov. = sandwich)
```
The estimation yields
\begin{align}
\widehat{GDPGR}_t = \underset{(0.40)}{1.63} + \underset{(0.08)}{0.28} GDPGR_{t-1} + \underset{(0.08)}{0.18} GDPGR_{t-2}. (\#eq:GDPGRAR2)
\end{align}
We see that the coefficient on the second lag is significantly different from zero. The fit improves slightly: $\bar{R}^2$ grows from $0.11$ for the AR($1$) model to about $0.14$, and the $SER$ falls to $3.13$.
```{r}
# R^2
summary(GDPGR_AR2)$r.squared
# SER
summary(GDPGR_AR2)$sigma
```
We may use the AR($2$) model to obtain a forecast for GDP growth in 2013:Q1 in the same manner as for the AR(1) model.
```{r}
# AR(2) forecast of GDP growth in 2013:Q1
forecast <- c("2013:Q1" = coef(GDPGR_AR2) %*% c(1, GDPGR_level[N-1], GDPGR_level[N-2]))
```
This leads to a forecast error of roughly $-1\%$.
```{r}
# compute AR(2) forecast error
GDPGrowth["2013"][1] - forecast
```
## Can You Beat the Market? (Part I) {#cybtmpi}
The theory of efficient capital markets states that stock prices embody all currently available information. If this hypothesis holds, it should not be possible to estimate a useful model for forecasting future stock returns using publicly available information on past returns (this is also referred to as the weak-form efficiency hypothesis): if it were possible to forecast the market, traders would be able to arbitrage, e.g., by relying on an AR($2$) model, they would exploit information that is not already priced in, which would push prices until the expected return is zero.
This idea is presented in the box *Can You Beat the Market? (Part I)* on p. 582 of the book. This section reproduces the estimation results.
We start by importing monthly data from 1931:1 to 2002:12 on excess returns of a broad-based index of stock prices, the CRSP value-weighted index. The data are provided by the authors of the book as an Excel sheet which can be downloaded [here](http://wps.aw.com/wps/media/objects/11422/11696965/data3eu/Stock_Returns_1931_2002.xlsx).
```{r}
# read in data on stock returns
SReturns <- read_xlsx("Data/Stock_Returns_1931_2002.xlsx",
sheet = 1,
col_types = "numeric")
```
We continue by converting the data to an object of class `r ttcode("ts")`.
```{r}
# convert to ts object
StockReturns <- ts(SReturns[, 3:4],
start = c(1931, 1),
end = c(2002, 12),
frequency = 12)
```
Next, we estimate AR($1$), AR($2$) and AR($4$) models of excess returns for the time period 1960:1 to 2002:12.
```{r}
# estimate AR models:
# AR(1)
SR_AR1 <- dynlm(ExReturn ~ L(ExReturn),
data = StockReturns, start = c(1960, 1), end = c(2002, 12))
# AR(2)
SR_AR2 <- dynlm(ExReturn ~ L(ExReturn) + L(ExReturn, 2),
data = StockReturns, start = c(1960, 1), end = c(2002, 12))
# AR(4)
SR_AR4 <- dynlm(ExReturn ~ L(ExReturn, 1:4),
data = StockReturns, start = c(1960, 1), end = c(2002, 12))
```
After computing robust standard errors, we gather the results in a table generated by `r ttcode("stargazer()")`.
```{r}
# compute robust standard errors
rob_se <- list(sqrt(diag(sandwich(SR_AR1))),
sqrt(diag(sandwich(SR_AR2))),
sqrt(diag(sandwich(SR_AR4))))
```
```{r, message=F, warning=F, results='asis', eval=F}
# generate table using 'stargazer()'
stargazer(SR_AR1, SR_AR2, SR_AR4,
title = "Autoregressive Models of Monthly Excess Stock Returns",
header = FALSE,
model.numbers = F,
omit.table.layout = "n",
digits = 3,
column.labels = c("AR(1)", "AR(2)", "AR(4)"),
dep.var.caption = "Dependent Variable: Excess Returns on the CSRP Value-Weighted Index",
dep.var.labels.include = FALSE,
covariate.labels = c("$excess return_{t-1}$", "$excess return_{t-2}$",
"$excess return_{t-3}$", "$excess return_{t-4}$",
"Intercept"),
se = rob_se,
omit.stat = "rsq")
```
<!--html_preserve-->
```{r, message=F, warning=F, results='asis', echo=F, purl=F, eval=my_output=="html"}
stargazer(SR_AR1, SR_AR2, SR_AR4,
header = FALSE,
type = "html",
model.numbers = F,
omit.table.layout = "n",
digits = 3,
column.labels = c("AR(1)", "AR(2)", "AR(4)"),
dep.var.caption = "Dependent Variable: Excess returns on the CSRP Value-Weighted Index",
dep.var.labels.include = FALSE,
covariate.labels = c("$excess return_{t-1}$", "$excess return_{t-2}$", "$excess return_{t-3}$", "$excess return_{t-4}$", "Intercept"),
se = rob_se,
omit.stat = c("rsq")
)
stargazer_html_title("Autoregressive Models of Monthly Excess Stock Returns", "amomesr")
```
<!--/html_preserve-->
```{r, message=F, warning=F, results='asis', echo=F, purl=F, eval=my_output=="latex"}
stargazer(SR_AR1, SR_AR2, SR_AR4,
title = "\\label{tab:amomesr} Autoregressive Models of Monthly Excess Stock Returns",
header = FALSE,
type = "latex",
model.numbers = F,
omit.table.layout = "n",
digits = 3,
column.labels = c("AR(1)", "AR(2)", "AR(4)"),
dep.var.caption = "Dependent Variable: Excess returns on the CSRP Value-Weighted Index",
dep.var.labels.include = FALSE,
covariate.labels = c("$excess return_{t-1}$", "$excess return_{t-2}$", "$excess return_{t-3}$", "$excess return_{t-4}$", "Intercept"),
se = rob_se,
omit.stat = c("rsq")
)
```
The results are consistent with the hypothesis of efficient financial markets: there are no statistically significant coefficients in any of the estimated models and the hypotheses that all coefficients are zero cannot be rejected. $\bar{R}^2$ is almost zero in all models and even negative for the AR($4$) model. This suggests that none of the models are useful for forecasting stock returns.
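The joint hypothesis can also be tested explicitly. As an illustration that is not part of the book's code, the following sketch performs a heteroskedasticity-robust $F$-test of the null that all lag coefficients in the AR($2$) model are zero, using `r ttcode("matchCoefs()")` from the `r ttcode("car")` package (loaded with `r ttcode("AER")`) to select the relevant coefficients.
```{r}
# robust F-test: all lag coefficients in the AR(2) model are zero
linearHypothesis(SR_AR2,
                 matchCoefs(SR_AR2, "ExReturn"),
                 vcov. = sandwich)
```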
## Additional Predictors and The ADL Model {#apatadlm}
Instead of only using the dependent variable's lags as predictors, an autoregressive distributed lag (ADL) model also uses lags of other variables for forecasting. The general ADL model is summarized in Key Concept 14.4:
```{r, eval = my_output == "html", results='asis', echo=F, purl=F}
cat('
<div class = "keyconcept" id="KC14.4">
<h3 class = "right"> Key Concept 14.4 </h3>
<h3 class = "left"> The Autoregressive Distributed Lag Model </h3>
<p>
An ADL($p$,$q$) model assumes that a time series $Y_t$ can be represented by a linear function of $p$ of its lagged values and $q$ lags of another time series $X_t$:
\\begin{align*}
Y_t =& \\, \\beta_0 + \\beta_1 Y_{t-1} + \\beta_2 Y_{t-2} + \\dots + \\beta_p Y_{t-p} \\\\
&+ \\, \\delta_1 X_{t-1} + \\delta_2 X_{t-2} + \\dots + \\delta_q X_{t-q} + u_t
\\end{align*}
is an *autoregressive distributed lag model* with $p$ lags of $Y_t$ and $q$ lags of $X_t$ where $$E(u_t\\vert Y_{t-1}, Y_{t-2}, \\dots, X_{t-1}, X_{t-2}, \\dots)=0.$$
</p>
</div>
')
```
```{r, eval = my_output == "latex", results='asis', echo=F, purl=F}
cat('\\begin{keyconcepts}[The Autoregressive Distributed Lag Model]{14.4}
An ADL($p$,$q$) model assumes that a time series $Y_t$ can be represented by a linear function of $p$ of its lagged values and $q$ lags of another time series $X_t$:
\\begin{align*}
Y_t =& \\, \\beta_0 + \\beta_1 Y_{t-1} + \\beta_2 Y_{t-2} + \\dots + \\beta_p Y_{t-p} \\\\
&+ \\, \\delta_1 X_{t-1} + \\delta_2 X_{t-2} + \\dots + \\delta_q X_{t-q} + u_t
\\end{align*}
is an \\textit{autoregressive distributed lag model} with $p$ lags of $Y_t$ and $q$ lags of $X_t$ where $$E(u_t\\vert Y_{t-1}, Y_{t-2}, \\dots, X_{t-1}, X_{t-2}, \\dots)=0.$$
\\end{keyconcepts}
')
```
#### Forecasting GDP Growth Using the Term Spread {-}
Interest rates on long-term and short-term Treasury debt are closely linked to macroeconomic conditions. While both types of interest rates have the same long-run tendencies, they behave quite differently in the short run.
The difference between interest rates on two bonds with different maturities is called the *term spread*.
The following code chunks reproduce Figure 14.3 of the book, which displays interest rates on 10-year U.S. Treasury bonds and 3-month U.S. Treasury bills from 1960 to 2012.
```{r}
# 3-month Treasury bill interest rate
TB3MS <- xts(USMacroSWQ$TB3MS, USMacroSWQ$Date)["1960::2012"]
# 10-year Treasury bond interest rate
TB10YS <- xts(USMacroSWQ$GS10, USMacroSWQ$Date)["1960::2012"]
# term spread
TSpread <- TB10YS - TB3MS
```
```{r, fig.align='center'}
# reproduce Figure 14.3 (a) of the book
plot(merge(as.zoo(TB3MS), as.zoo(TB10YS)),
plot.type = "single",
col = c("darkred", "steelblue"),
lwd = 2,
xlab = "Date",
ylab = "Percent per annum",
main = "Interest Rates")
# define a function that transforms years to class 'yearqtr'
YToYQTR <- function(years) {
return(
sort(as.yearqtr(sapply(years, paste, c("Q1", "Q2", "Q3", "Q4"))))
)
}
# recessions
recessions <- YToYQTR(c(1961:1962, 1970, 1974:1975, 1980:1982, 1990:1991, 2001, 2007:2008))
# add color shading for recessions
xblocks(time(as.zoo(TB3MS)),
c(time(TB3MS) %in% recessions),
col = alpha("steelblue", alpha = 0.3))
# add a legend
legend("topright",
legend = c("TB3MS", "TB10YS"),
col = c("darkred", "steelblue"),
lwd = c(2, 2))
# reproduce Figure 14.3 (b) of the book
plot(as.zoo(TSpread),
col = "steelblue",
lwd = 2,
xlab = "Date",
ylab = "Percent per annum",
main = "Term Spread")
# add color shading for recessions
xblocks(time(as.zoo(TB3MS)),
c(time(TB3MS) %in% recessions),
col = alpha("steelblue", alpha = 0.3))
```
Before recessions, the gap between interest rates on long-term bonds and short-term bills narrows; the term spread declines drastically towards zero, or even becomes negative, in times of economic stress. This information might be used to improve forecasts of future GDP growth.
We check this by estimating an ADL($2$, $1$) model and an ADL($2$, $2$) model of the GDP growth rate using lags of GDP growth and lags of the term spread as regressors. We then use both models for forecasting GDP growth in 2013:Q1.
```{r}
# convert growth and spread series to ts objects
GDPGrowth_ts <- ts(GDPGrowth,
start = c(1960, 1),
end = c(2013, 4),
frequency = 4)
TSpread_ts <- ts(TSpread,
start = c(1960, 1),
end = c(2012, 4),
frequency = 4)
# join both ts objects
ADLdata <- ts.union(GDPGrowth_ts, TSpread_ts)
```
```{r}
# estimate the ADL(2,1) model of GDP growth
GDPGR_ADL21 <- dynlm(GDPGrowth_ts ~ L(GDPGrowth_ts) + L(GDPGrowth_ts, 2) + L(TSpread_ts),
start = c(1962, 1), end = c(2012, 4))
coeftest(GDPGR_ADL21, vcov. = sandwich)
```
The estimated equation of the ADL($2$, $1$) model is
\begin{align}
\widehat{GDPGR}_t = \underset{(0.49)}{0.96} + \underset{(0.08)}{0.26} GDPGR_{t-1} + \underset{(0.08)}{0.19} GDPGR_{t-2} + \underset{(0.18)}{0.44} TSpread_{t-1} (\#eq:gdpgradl21)
\end{align}
All coefficients are significant at the $5\%$ level.
```{r}
# 2012:Q3 / 2012:Q4 data on GDP growth and term spread
subset <- window(ADLdata, c(2012, 3), c(2012, 4))
# ADL(2,1) GDP growth forecast for 2013:Q1
ADL21_forecast <- coef(GDPGR_ADL21) %*% c(1, subset[2, 1], subset[1, 1], subset[2, 2])
ADL21_forecast
# compute the forecast error
window(GDPGrowth_ts, c(2013, 1), c(2013, 1)) - ADL21_forecast
```
Model \@ref(eq:gdpgradl21) predicts the GDP growth in 2013:Q1 to be $2.24\%$ which leads to a forecast error of $-1.10\%$.
We estimate the ADL($2$,$2$) specification to see whether additional information on the past term spread improves the forecast.
```{r}
# estimate the ADL(2,2) model of GDP growth
GDPGR_ADL22 <- dynlm(GDPGrowth_ts ~ L(GDPGrowth_ts) + L(GDPGrowth_ts, 2)
+ L(TSpread_ts) + L(TSpread_ts, 2),
start = c(1962, 1), end = c(2012, 4))
coeftest(GDPGR_ADL22, vcov. = sandwich)
```
We obtain
\begin{align}
\begin{split}
\widehat{GDPGR}_t =& \underset{(0.47)}{0.98} + \underset{(0.08)}{0.24} GDPGR_{t-1} \\
& + \underset{(0.08)}{0.18} GDPGR_{t-2} -\underset{(0.42)}{0.14} TSpread_{t-1} + \underset{(0.43)}{0.66} TSpread_{t-2}.
\end{split} (\#eq:gdpgradl22)
\end{align}
The coefficients on both lags of the term spread are not significant at the $10\%$ level.
```{r}
# ADL(2,2) GDP growth forecast for 2013:Q1
ADL22_forecast <- coef(GDPGR_ADL22) %*% c(1, subset[2, 1], subset[1, 1], subset[2, 2], subset[1, 2])
ADL22_forecast
# compute the forecast error
window(GDPGrowth_ts, c(2013, 1), c(2013, 1)) - ADL22_forecast
```
The ADL($2$,$2$) forecast of GDP growth in 2013:Q1 is $2.27\%$, which implies a forecast error of $-1.14\%$.
Do the ADL models \@ref(eq:gdpgradl21) and \@ref(eq:gdpgradl22) improve upon the simple AR($2$) model \@ref(eq:GDPGRAR2)? The answer is yes: while $SER$ and $\bar{R}^2$ improve only slightly, an $F$-test on the term spread coefficients in \@ref(eq:gdpgradl22) provides evidence that the model does better in explaining GDP growth than the AR($2$) model, as the hypothesis that both coefficients are zero can be rejected at the $5\%$ level.
```{r}
# compare adj. R2
c("Adj.R2 AR(2)" = summary(GDPGR_AR2)$r.squared,
"Adj.R2 ADL(2,1)" = summary(GDPGR_ADL21)$r.squared,
"Adj.R2 ADL(2,2)" = summary(GDPGR_ADL22)$r.squared)
# compare SER
c("SER AR(2)" = summary(GDPGR_AR2)$sigma,
"SER ADL(2,1)" = summary(GDPGR_ADL21)$sigma,
"SER ADL(2,2)" = summary(GDPGR_ADL22)$sigma)
# F-test on coefficients of term spread
linearHypothesis(GDPGR_ADL22,
c("L(TSpread_ts)=0", "L(TSpread_ts, 2)=0"),
vcov. = sandwich)
```
#### Stationarity {-}
In general, forecasts can be improved by using multiple predictors --- just as in cross-sectional regression. When constructing time series models one should take into account whether the variables are *stationary* or *nonstationary*. Key Concept 14.5 explains what stationarity is.
```{r, eval = my_output == "html", results='asis', echo=F, purl=F}
cat('
<div class = "keyconcept" id="KC14.5">
<h3 class = "right"> Key Concept 14.5 </h3>
<h3 class = "left"> Stationarity </h3>
A time series $Y_t$ is stationary if its probability distribution is time independent, that is, the joint distribution of $Y_{s+1}, Y_{s+2},\\dots,Y_{s+T}$ does not change as $s$ is varied, regardless of $T$.
Similarly, two time series $X_t$ and $Y_t$ are *jointly stationary* if the joint distribution of $(X_{s+1},Y_{s+1}, X_{s+2},Y_{s+2} \\dots, X_{s+T},Y_{s+T})$ does not depend on $s$, regardless of $T$.
Stationarity makes it easier to learn about the characteristics of past data.
</div>
')
```
```{r, eval = my_output == "latex", results='asis', echo=F, purl=F}
cat('\\begin{keyconcepts}[Stationarity]{14.5}
A time series $Y_t$ is stationary if its probability distribution is time independent, that is, the joint distribution of $Y_{s+1}, Y_{s+2},\\dots,Y_{s+T}$ does not change as $s$ is varied, regardless of $T$.\\newline
Similarly, two time series $X_t$ and $Y_t$ are \\textit{jointly stationary} if the joint distribution of $(X_{s+1},Y_{s+1}, X_{s+2},Y_{s+2} \\dots, X_{s+T}, Y_{s+T})$ does not depend on $s$, regardless of $T$.\\newline
In a probabilistic sense, stationarity means that information about how a time series evolves in the future is inherent to its past. If this is not the case, we cannot use the past of a series as a reliable guideline for its future.\\vspace{0.5cm}
Stationarity makes it easier to learn about the characteristics of past data.
\\end{keyconcepts}
')
```
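To build intuition, the following simulation (not taken from the book) contrasts a stationary AR($1$) process, which fluctuates around a constant level, with a random walk, a leading example of a nonstationary process whose variance grows with $t$.
```{r, fig.align='center'}
# simulate a stationary AR(1) process and a nonstationary random walk
set.seed(42)
AR1 <- arima.sim(list(order = c(1, 0, 0), ar = 0.5), n = 200)
RW <- ts(cumsum(rnorm(200)))
# plot both series side by side
par(mfrow = c(1, 2))
plot(AR1, col = "steelblue", lwd = 2,
     main = "Stationary AR(1) Process", ylab = "Value")
plot(RW, col = "darkred", lwd = 2,
     main = "Random Walk", ylab = "Value")
```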
#### Time Series Regression with Multiple Predictors {-}
The concept of stationarity is a key assumption in the general time series regression model with multiple predictors. Key Concept 14.6 lays out this model and its assumptions.
```{r, eval = my_output == "html", results='asis', echo=F, purl=F}
cat('
<div class = "keyconcept" id="KC14.6">
<h3 class = "right"> Key Concept 14.6 </h3>
<h3 class = "left"> Time Series Regression with Multiple Predictors </h3>
The general time series regression model extends the ADL model such that multiple regressors and their lags are included. It uses $p$ lags of the dependent variable and $q_l$ lags of $l$ additional predictors where $l=1,\\dots,k$:
\\begin{equation}
\\begin{aligned}
Y_t =& \\beta_0 + \\beta_1 Y_{t-1} + \\beta_2 Y_{t-2} + \\dots + \\beta_{p} Y_{t-p} \\\\
&+ \\delta_{11} X_{1,t-1} + \\delta_{12} X_{1,t-2} + \\dots + \\delta_{1q} X_{1,t-q} \\\\
&+ \\dots \\\\
&+ \\delta_{k1} X_{k,t-1} + \\delta_{k2} X_{k,t-2} + \\dots + \\delta_{kq} X_{k,t-q} \\\\
&+ u_t
\\end{aligned}
\\end{equation}
For estimation we make the following assumptions:
1. The error term $u_t$ has conditional mean zero given all regressors and their lags: $$E(u_t\\vert Y_{t-1}, Y_{t-2}, \\dots, X_{1,t-1}, X_{1,t-2} \\dots, X_{k,t-1}, X_{k,t-2}, \\dots) = 0.$$ This assumption is an extension of the conditional mean zero assumption used for AR and ADL models and guarantees that the general time series regression model stated above gives the best forecast of $Y_t$ given its lags, the additional regressors $X_{1,t},\\dots,X_{k,t}$ and their lags.
2. The i.i.d. assumption for cross-sectional data is not (entirely) meaningful for time series data. We replace it by the following assumption which consists of two parts:
(a) The $(Y_{t}, X_{1,t}, \\dots, X_{k,t})$ have a stationary distribution (the "identically distributed" part of the i.i.d. assumption for cross-sectional data). If this does not hold, forecasts *may* be biased and inference *can* be strongly misleading.
(b) $(Y_{t}, X_{1,t}, \\dots, X_{k,t})$ and $(Y_{t-j}, X_{1,t-j}, \\dots, X_{k,t-j})$ become independent as $j$ gets large (the "independently distributed" part of the i.i.d. assumption for cross-sectional data). This assumption is also called *weak dependence*. It ensures that the WLLN and the CLT hold in large samples.
3. Large outliers are unlikely: $E(X_{1,t}^4), E(X_{2,t}^4), \\dots, E(X_{k,t}^4)$ and $E(Y_t^4)$ have nonzero, finite fourth moments.
4. No perfect multicollinearity.
</div>
')
```
```{r, eval = my_output == "latex", results='asis', echo=F, purl=F}
cat('\\begin{keyconcepts}[Time Series Regression with Multiple Predictors]{14.6}
The general time series regression model extends the ADL model such that multiple regressors and their lags are included. It uses $p$ lags of the dependent variable and $q_l$ lags of $l$ additional predictors where $l=1,\\dots,k$:
\\begin{equation}
\\begin{aligned}
Y_t =& \\, \\beta_0 + \\beta_1 Y_{t-1} + \\beta_2 Y_{t-2} + \\dots + \\beta_{p} Y_{t-p} \\\\
+& \\, \\delta_{11} X_{1,t-1} + \\delta_{12} X_{1,t-2} + \\dots + \\delta_{1q} X_{1,t-q} \\\\
+& \\, \\dots \\\\
+& \\, \\delta_{k1} X_{k,t-1} + \\delta_{k2} X_{k,t-2} + \\dots + \\delta_{kq} X_{k,t-q} \\\\
+& \\, u_t
\\end{aligned}
\\end{equation}
For estimation we make the following assumptions:\\newline
\\begin{enumerate}
\\item The error term $u_t$ has conditional mean zero given all regressors and their lags: $$E(u_t\\vert Y_{t-1}, Y_{t-2}, \\dots, X_{1,t-1}, X_{1,t-2} \\dots, X_{k,t-1}, X_{k,t-2}, \\dots) = 0.$$ This assumption is an extension of the conditional mean zero assumption used for AR and ADL models and guarantees that the general time series regression model stated above gives the best forecast of $Y_t$ given its lags, the additional regressors $X_{1,t},\\dots,X_{k,t}$ and their lags.
\\item The i.i.d. assumption for cross-sectional data is not (entirely) meaningful for time series data. We replace it by the following assumption which consists of two parts:\\newline
\\begin{itemize}
\\item[(a)] The $(Y_{t}, X_{1,t}, \\dots, X_{k,t})$ have a stationary distribution (the "identically distributed" part of the i.i.d. assumption for cross-sectional data). If this does not hold, forecasts may be biased and inference can be strongly misleading.
\\item[(b)] $(Y_{t}, X_{1,t}, \\dots, X_{k,t})$ and $(Y_{t-j}, X_{1,t-j}, \\dots, X_{k,t-j})$ become independent as $j$ gets large (the "independently distributed" part of the i.i.d. assumption for cross-sectional data). This assumption is also called \\textit{weak dependence}. It ensures that the WLLN and the CLT hold in large samples.
\\end{itemize}
\\item Large outliers are unlikely: $E(X_{1,t}^4), E(X_{2,t}^4), \\dots, E(X_{k,t}^4)$ and $E(Y_t^4)$ have nonzero, finite fourth moments.
\\item No perfect multicollinearity.
\\end{enumerate}
\\end{keyconcepts}
')
```
Since many economic time series appear to be nonstationary, the second assumption of Key Concept 14.6 is a crucial one in applied macroeconomics and finance, which is why statistical tests for stationarity or nonstationarity have been developed. Chapters \@ref(llsuic) and \@ref(nit) are devoted to this topic.
#### Statistical inference and the Granger causality test {-}
If $X$ is a useful predictor of $Y$ then, in a regression of $Y_t$ on its own lags and lags of $X_t$, not all of the coefficients on the lags of $X_t$ are zero. This concept is called *Granger causality* and is an interesting hypothesis to test. Key Concept 14.7 summarizes the idea.
```{r, eval = my_output == "html", results='asis', echo=F, purl=F}
cat('
<div class = "keyconcept" id="KC14.7">
<h3 class = "right"> Key Concept 14.7 </h3>
<h3 class = "left"> Granger Causality Tests </h3>
The Granger causality test @granger1969 is an $F$ test of the null hypothesis that *all* lags of a variable $X$ included in a time series regression model do not have predictive power for $Y_t$. The Granger causality test does not test whether $X$ actually *causes* $Y$ but whether the included lags are informative in terms of predicting $Y$.
</div>
')
```
```{r, eval = my_output == "latex", results='asis', echo=F, purl=F}
cat('\\begin{keyconcepts}[Granger Causality Tests]{14.7}
The Granger causality test \\citep{granger1969} is an $F$ test of the null hypothesis that \\textit{all} lags of a variable $X$ included in a time series regression model do not have predictive power for $Y_t$. The Granger causality test does not test whether $X$ actually \\textit{causes} $Y$ but whether the included lags are informative in terms of predicting $Y$.
\\end{keyconcepts}
')
```
We have already performed a Granger causality test on the coefficients of term spread in \@ref(eq:gdpgradl22), the ADL($2$,$2$) model of GDP growth and concluded that at least one of the first two lags of term spread has predictive power for GDP growth.
### Forecast Uncertainty and Forecast Intervals {-}
In general, it is good practice to report a measure of the uncertainty when presenting results that are affected by it. Uncertainty is of particular interest when forecasting a time series. For example, consider a simple ADL$(1,1)$ model
\begin{align*}
Y_t = \beta_0 + \beta_1 Y_{t-1} + \delta_1 X_{t-1} + u_t
\end{align*}
where $u_t$ is a homoskedastic error term. The forecast error is
\begin{align*}
Y_{T+1} - \widehat{Y}_{T+1\vert T} = u_{T+1} - \left[(\widehat{\beta}_0 - \beta_0) + (\widehat{\beta}_1 - \beta_1) Y_T + (\widehat{\delta_1} - \delta_1) X_T \right].
\end{align*}
The mean squared forecast error (MSFE) and the RMSFE are
\begin{align*}
MSFE =& \, E\left[(Y_{T+1} - \widehat{Y}_{T+1\vert T})^2 \right] \\
=& \, \sigma_u^2 + Var\left[ (\widehat{\beta}_0 - \beta_0) + (\widehat{\beta}_1 - \beta_1) Y_T + (\widehat{\delta_1} - \delta_1) X_T \right], \\
RMSFE =& \, \sqrt{\sigma_u^2 + Var\left[ (\widehat{\beta}_0 - \beta_0) + (\widehat{\beta}_1 - \beta_1) Y_T + (\widehat{\delta_1} - \delta_1) X_T \right]}.
\end{align*}
A $95\%$ forecast interval is an interval that covers the true value of $Y_{T+1}$ in $95\%$ of repeated applications. There is a major difference between computing a confidence interval and a forecast interval: when computing a confidence interval of a point estimate we use large sample approximations that are justified by the CLT and thus are valid for a large range of error term distributions. For the computation of a forecast interval of $Y_{T+1}$, however, we must make an additional assumption about the distribution of $u_{T+1}$, the error term in period $T+1$. Assuming that $u_{T+1}$ is normally distributed, one can construct a $95\%$ *forecast interval* for $Y_{T+1}$ using $SE(Y_{T+1} - \widehat{Y}_{T+1\vert T})$, an estimate of the RMSFE:
\begin{align*}
\widehat{Y}_{T+1\vert T} \pm 1.96 \cdot SE(Y_{T+1} - \widehat{Y}_{T+1\vert T})
\end{align*}
Of course, the computation gets more complicated when the error term is heteroskedastic or if we are interested in computing a forecast interval for $T+s, s>1$.
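As an illustration, the following sketch computes an approximate $95\%$ forecast interval for the AR($1$) forecast of GDP growth in 2013:Q1 obtained earlier (assuming `r ttcode("armod")`, `r ttcode("GDPGR_level")` and `r ttcode("N")` are still in memory). The $SER$ serves as an estimate of the $RMSFE$, so the contribution of the estimation error is ignored.
```{r}
# approximate 95% forecast interval for GDP growth in 2013:Q1,
# ignoring uncertainty due to coefficient estimation
point_fc <- coef(armod) %*% c(1, GDPGR_level[N - 1])
SER <- summary(armod)$sigma
c("lower" = point_fc - 1.96 * SER,
  "forecast" = point_fc,
  "upper" = point_fc + 1.96 * SER)
```
The interval is wide, mirroring the earlier conclusion that AR($1$) forecasts of GDP growth are rather inaccurate.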
In some applications it is useful to report multiple forecast intervals for subsequent periods, see the box *The River of Blood* on p. 592 of the book. These can be visualized in a so-called fan chart. We will not replicate the fan chart presented in Figure 14.2 of the book because the underlying model is far more complex than the simple AR and ADL models treated here. Instead, in the example below we use simulated time series data and estimate an AR($2$) model which is then used for forecasting the subsequent $25$ future outcomes of the series.
```{r, fig.align='center'}
# set seed
set.seed(1234)
# simulate the time series
Y <- arima.sim(list(order = c(2, 0, 0), ar = c(0.2, 0.2)), n = 200)
# estimate an AR(2) model using 'arima()', see ?arima
model <- arima(Y, order = c(2, 0, 0))
# compute points forecasts and prediction intervals for the next 25 periods
fc <- forecast(model, h = 25, level = seq(5, 99, 10))
# plot a fan chart
plot(fc,
main = "Forecast Fan Chart for AR(2) Model of Simulated Data",
showgap = F,
fcol = "red",
flty = 2)
```