diff --git a/DESCRIPTION b/DESCRIPTION
index c4742abc..87f56278 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -2,5 +2,7 @@ Package: placeholder
 Title: Does not matter.
 Version: 0.0.1
 Imports: bookdown
-Suggests: MARSS, forecast, Hmisc, xtable
+Suggests: MARSS, forecast, Hmisc, xtable,
+    coda, rjags, ggplot2, R2jags, ggmap,
+    atsalibrary
 Remotes: rstudio/bookdown
Binary files differ: docs/Applied_Time_Series_Analysis.pdf and the regenerated figure PNGs under docs/Applied_Time_Series_Analysis_files/figure-html/.
diff --git a/docs/chap-basicmat.html b/docs/chap-basicmat.html
index d803c0d7..18cb8fd0 100644
--- a/docs/chap-basicmat.html
+++ b/docs/chap-basicmat.html
@@ -1,13 +1,12 @@
We will use the catch landings from Greek waters (greeklandings
) and the Chinook landings (chinook
) in Washington data sets for this chapter. These datasets are in the atsalibrary package on GitHub. Install using the devtools package.
library(devtools)
-devtools::install_github("nwfsc-timeseries/atsalibrary")
Load the data.
-data(greeklandings, package = "atsalibrary")
-landings <- greeklandings
-# Use the monthly data
-data(chinook, package = "atsalibrary")
-chinook <- chinook.month
data(greeklandings, package="atsalibrary")
+landings <- greeklandings
+# Use the monthly data
+data(chinook, package="atsalibrary")
+chinook <- chinook.month
Ensure you have the necessary packages.
-library(ggplot2)
-library(gridExtra)
-library(reshape2)
-library(tseries)
-library(urca)
-library(forecast)
Dynamic linear models (DLMs) are a type of linear regression model, wherein the parameters are treated as time-varying rather than static. DLMs are used commonly in econometrics, but have received less attention in the ecological literature (cf. Lamon, Carpenter, and Stow 1998; Scheuerell and Williams 2005). Our treatment of DLMs is rather cursory—we direct the reader to excellent textbooks by Pole, West, and Harrison (1994) and Petris, Petrone, and Campagnoli (2009) for more in-depth treatments of DLMs. The former focuses on Bayesian estimation whereas the latter addresses both likelihood-based and Bayesian estimation methods.
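To make the idea concrete, here is a small illustrative sketch (not the chapter's example) of a DLM written as a MARSS model: a regression whose intercept and slope are allowed to evolve as random walks. The simulated data, dimensions, and object names are assumptions for illustration only.
library(MARSS)
set.seed(42)
TT <- 40
covar <- rnorm(TT)                          # an example covariate
yy <- 2 + 0.5 * covar + rnorm(TT, 0, 0.2)   # simulated observations
ZZ <- array(NA, c(1, 2, TT))                # time-varying design matrix, one slice per time step
ZZ[1, 1, ] <- 1                             # intercept at time t
ZZ[1, 2, ] <- covar                         # covariate value at time t
mod.list.dlm <- list(B = diag(2), U = matrix(0, 2, 1), Q = "diagonal and unequal",
                     Z = ZZ, A = matrix(0), R = matrix("r"))
fit.dlm <- MARSS(matrix(yy, nrow = 1), model = mod.list.dlm)
The two hidden states are the time-varying intercept and slope; Q controls how quickly they are allowed to drift.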
A script with all the R code in the chapter can be downloaded here. The Rmd for this chapter can be downloaded here.
Most of the data used in the chapter are from the MARSS package. Install the package, if needed, and load:
-library(MARSS)
The problem set uses an additional data set on spawners and recruits (KvichakSockeye
) in the atsalibrary
package.
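If you want to look at that data set now, it loads the same way as the others (shown for illustration):
data(KvichakSockeye, package = "atsalibrary")
head(KvichakSockeye)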
In this lab, we will work through using Bayesian methods to estimate parameters in time series models. There are a variety of software tools to do time series analysis using Bayesian methods. A number of R packages are listed on the CRAN Task View: Time Series Analysis.
Software to implement more complicated models is also available, and many of you are probably familiar with these options (AD Model Builder and Template Model Builder, WinBUGS, OpenBUGS, JAGS, Stan, to name a few). In this chapter, we will show you how to write state-space models in JAGS and fit these models.
After updating to the latest version of R, install JAGS for your operating platform using the instructions here. Click on JAGS, then the most recent folder, then the platform of your machine. You will also need the coda, rjags and R2jags packages.
-library(coda)
-library(rjags)
-library(R2jags)
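As a preview of what this looks like, here is a minimal sketch (not one of the chapter's models) of a univariate state-space random walk observed with error, written as a JAGS model string and fit with R2jags::jags(). The simulated data, priors, and node names are illustrative assumptions.
library(R2jags)
set.seed(123)
TT <- 50
xtrue <- cumsum(rnorm(TT, 0, 0.5))        # simulated hidden random walk
y <- xtrue + rnorm(TT, 0, 1)              # observations with error
model.txt <- "
model {
  x[1] ~ dnorm(0, 0.01)                   # vague prior on the initial state
  for(t in 2:N) { x[t] ~ dnorm(x[t-1], tau.q) }   # process equation
  for(t in 1:N) { y[t] ~ dnorm(x[t], tau.r) }     # observation equation
  tau.q ~ dgamma(0.001, 0.001)
  tau.r ~ dgamma(0.001, 0.001)
}"
model.file <- tempfile(fileext = ".txt")
writeLines(model.txt, model.file)
fit.jags <- jags(data = list(y = y, N = TT),
                 parameters.to.save = c("tau.q", "tau.r", "x"),
                 model.file = model.file, n.chains = 3,
                 n.iter = 5000, n.burnin = 1000)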
This chapter uses the stats, MARSS and datasets packages. Install those packages, if needed, and load:
-library(stats)
-library(MARSS)
-library(datasets)
We will work with the stackloss
dataset available in the datasets package. The dataset consists of 21 observations on the efficiency of a plant that produces nitric acid as a function of three explanatory variables: air flow, water temperature and acid concentration. We are going to use just the first 4 datapoints so that it is easier to write the matrices, but the concepts extend to as many datapoints as you have.
data(stackloss, package="datasets")
-dat = stackloss[1:4,] #subsetted first 4 rows
-dat
Air.Flow Water.Temp Acid.Conc. stack.loss
1 80 27 89 42
2 80 27 88 37
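As a sketch of why writing the problem in matrix form is useful (this particular calculation is illustrative, not necessarily the chapter's), the least-squares estimates for a regression of stack loss on air flow, written as \(\mathbf{y} = \mathbf{Z}\mathbf{x} + \mathbf{e}\), can be computed directly from these rows:
dat <- stackloss[1:4, ]                      # same 4 rows as above
y <- matrix(dat$stack.loss, ncol = 1)        # response vector
Z <- cbind(1, dat$Air.Flow)                  # design matrix: intercept + air flow
x.hat <- solve(t(Z) %*% Z) %*% t(Z) %*% y    # (Z'Z)^{-1} Z'y
x.hat
coef(lm(stack.loss ~ Air.Flow, data = dat))  # same numbers from lm()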
@@ -465,24 +491,24 @@ Data and packages
All the data used in the chapter are in the MARSS package. For most examples, we will use the MARSS()
function to fit models via maximum likelihood. We also show how to fit a Bayesian model using JAGS and Stan. For these sections you will need the R2jags, coda and rstan packages. To run the JAGS code, you will also need JAGS installed. See Chapter 12 for more details on JAGS and Chapter 13 for more details on Stan.
library(MARSS)
-library(R2jags)
-library(coda)
-library(rstan)
For the chapter examples, we will use the green and bluegreen algae in the Lake Washington plankton data set, along with the covariates in that data set. This is a 32-year time series (1962-1994) of monthly plankton counts (cells per mL) from Lake Washington, Washington, USA, with the covariates total phosphorus and pH. lakeWAplanktonTrans
is a transformed version of the raw data used for teaching purposes. Zeros have been replaced with NAs (missing). The logged (natural log) raw plankton counts have been standardized to a mean of zero and variance of 1 (so logged and then z-scored). Temperature, TP and pH were also z-scored but not logged (so z-score of the untransformed values for these covariates). The single missing temperature value was replaced with -1 and the single missing TP value was replaced with -0.3.
We will use the 10 years of data from 1965-1974 (Figure 8.1), a decade with particularly high green and bluegreen algae levels.
-data(lakeWAplankton, package="MARSS")
-# lakeWA
-fulldat = lakeWAplanktonTrans
-years = fulldat[,"Year"]>=1965 & fulldat[,"Year"]<1975
-dat = t(fulldat[years,c("Greens", "Bluegreens")])
-covariates = t(fulldat[years,c("Temp", "TP")])
data(lakeWAplankton, package="MARSS")
+# lakeWA
+fulldat = lakeWAplanktonTrans
+years = fulldat[,"Year"]>=1965 & fulldat[,"Year"]<1975
+dat = t(fulldat[years,c("Greens", "Bluegreens")])
+covariates = t(fulldat[years,c("Temp", "TP")])
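As a sketch of where the chapter is headed (the actual model sets vary), covariates can enter the state equation through \(\mathbf{C}\) and \(\mathbf{c}\); for example, using the dat and covariates objects just created:
mod.list.cov <- list(B = "diagonal and unequal", U = "zero",
                     Q = "diagonal and unequal", Z = "identity",
                     A = "zero", R = "diagonal and equal",
                     C = "unconstrained", c = covariates)  # covariate effects on the states
fit.cov <- MARSS(dat, model = mod.list.cov)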
This chapter will use a SNOTEL dataset. These are data on snow water equivalent at locations throughout the state of Washington. The data are in the atsalibrary package.
-data(snotel, package = "atsalibrary")
The main packages used in this chapter are MARSS and forecast.
-library(MARSS)
-library(forecast)
-library(ggplot2)
-library(ggmap)
-library(broom)
You will need the atsar package we have written for fitting state-space time series models with Stan. This is hosted on GitHub at nwfsc-timeseries/atsar. Install using the devtools package.
-library(devtools)
-devtools::install_github("nwfsc-timeseries/atsar")
In addition, you will need the rstan, datasets, parallel and loo packages. After installing, if needed, load the packages:
-library(atsar)
-library(rstan)
-library(loo)
Once you have Stan and rstan installed, set these options so that compiled models are saved to disk and the chains run in parallel across your cores:
-rstan_options(auto_write = TRUE)
-options(mc.cores = parallel::detectCores())
For this lab, we will use the airquality data set on air quality in New York from the datasets package. Load the data and create a couple of new variables for future use.
-data(airquality, package="datasets")
-Wind <- airquality$Wind # wind speed
-Temp <- airquality$Temp # air temperature
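As a minimal sketch of the Stan workflow (written directly with rstan rather than the atsar wrappers; the model and variable names here are illustrative only), we could regress air temperature on wind speed:
library(rstan)
stan_code <- "
data {
  int<lower=0> N;
  vector[N] wind;
  vector[N] temp;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  temp ~ normal(alpha + beta * wind, sigma);   // simple linear regression
}"
fit.lm <- stan(model_code = stan_code,
               data = list(N = length(Wind), wind = Wind, temp = Temp),
               chains = 3, iter = 2000)
print(fit.lm, pars = c("alpha", "beta", "sigma"))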
This chapter uses the stats package, which is often loaded by default when you start R, the MARSS package and the forecast package. The problems use a dataset in the datasets package. After installing the packages, if needed, load:
-library(stats)
-library(MARSS)
-library(forecast)
-library(datasets)
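As a quick illustration of the kind of workflow these packages support (not the chapter's data), the forecast package can select an ARIMA model and forecast from it:
library(forecast)
fit.arima <- auto.arima(WWWusage)   # WWWusage is a ts object in the datasets package
fc <- forecast(fit.arima, h = 10)   # 10-step-ahead forecast
plot(fc)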
The chapter uses data sets which are in the atsalibrary package. If needed, install using the devtools package.
-library(devtools)
-devtools::install_github("nwfsc-timeseries/atsalibrary")
The main one is a time series of the atmospheric concentration of CO\(_2\) collected at the Mauna Loa Observatory in Hawai’i (MLCO2
). The second is Northern Hemisphere land and ocean temperature anomalies from NOAA (NHTemp
). The problems use a data set on hourly phytoplankton counts (hourlyphyto
). Use ?MLCO2
, ?NHTemp
and ?hourlyphyto
for information on these datasets.
Load the data.
-data(NHTemp, package = "atsalibrary")
-Temp <- NHTemp
-data(MLCO2, package = "atsalibrary")
-CO2 <- MLCO2
-data(hourlyphyto, package = "atsalibrary")
-pDat <- hourlyphyto
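For a quick look at the sort of structure such series contain, a classical decomposition separates trend and seasonal components; the built-in co2 series is used here purely for illustration since its structure as a monthly ts object is known:
dd <- decompose(co2)   # co2 is a monthly ts object in the datasets package
plot(dd)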
All the data used in the chapter are in the MARSS package. The other required packages are stats (normally loaded by default when starting R), datasets and forecast. Install the packages, if needed, and load:
-library(stats)
-library(MARSS)
-library(forecast)
-library(datasets)
To run the JAGS code example (optional), you will also need JAGS installed and the R2jags, rjags and coda R packages. To run the Stan code example (optional), you will need the rstan package.
Let’s see an example using the Washington SNOTEL data. The data we will use are the snow water equivalent percent of normal. This represents the snow water equivalent compared to the average value for that site on the same day. We will look at a subset of sites in the Central Cascades in our snotel
dataset (Figure 11.1).
load("snotel.RData")
y <- snotelmeta
-# Just use a subset
-y = y[which(y$Longitude < -121.4), ]
-y = y[which(y$Longitude > -122.5), ]
-y = y[which(y$Latitude < 47.5), ]
-y = y[which(y$Latitude > 46.5), ]
y <- snotelmeta
+# Just use a subset
+y = y[which(y$Longitude < -121.4),]
+y = y[which(y$Longitude > -122.5),]
+y = y[which(y$Latitude < 47.5),]
+y = y[which(y$Latitude > 46.5),]
For the first analysis, we are just going to look at February Snow Water Equivalent (SWE). Our subset of stations is y$Station.Id
. There are many missing years among some of our stations (Figure 11.2).
swe.feb <- snotel
-swe.feb <- swe.feb[swe.feb$Station.Id %in% y$Station.Id & swe.feb$Month ==
- "Feb", ]
-p <- ggplot(swe.feb, aes(x = Date, y = SWE)) + geom_line()
-p + facet_wrap(~Station)
swe.feb <- snotel
+swe.feb <- swe.feb[swe.feb$Station.Id %in% y$Station.Id & swe.feb$Month=="Feb",]
+p <- ggplot(swe.feb, aes(x=Date, y=SWE)) + geom_line()
+p + facet_wrap(~Station)
Imagine that for our study we need an estimate of SWE for all sites. We will use the information from the sites with full data to estimate the missing SWE for other sites. We will use a MARSS model so that all the available data inform those estimates.
@@ -494,36 +519,35 @@ 11.2.1 Estimate Feb SWE using AR(
\[\begin{equation}
\begin{bmatrix}
x_1 \\ x_2 \\ \dots \\ x_{15}
\end{bmatrix}_t =
\begin{bmatrix}
b&0&\dots&0 \\
0&b&\dots&0 \\
\dots&\dots&\dots&\dots \\
0&0&\dots&b
\end{bmatrix}
\begin{bmatrix}
x_1 \\ x_2 \\ \dots \\ x_{15}
\end{bmatrix}_{t-1} +
\begin{bmatrix}
w_1 \\ w_2 \\ \dots \\ w_{15}
\end{bmatrix}_t \\
\begin{bmatrix}
y_1 \\ y_2 \\ \dots \\ y_{15}
\end{bmatrix}_t =
\begin{bmatrix}
x_1 \\ x_2 \\ \dots \\ x_{15}
\end{bmatrix}_t +
\begin{bmatrix}
a_1 \\ a_2 \\ \dots \\ a_{15}
\end{bmatrix} +
\begin{bmatrix}
v_1 \\ v_2 \\ \dots \\ v_{15}
\end{bmatrix}_t
\tag{11.5}
+\end{equation}\]
We will use an unconstrained variance-covariance structure for \(\mathbf{w}\) and assume that the \(\mathbf{v}\) are independent and identically distributed with a very small variance (SNOTEL instrument variability). The \(a_i\) determine the level of the \(x_i\).
We need our data to be in rows. We will use reshape2::acast().
dat.feb <- reshape2::acast(swe.feb, Station ~ Year, value.var = "SWE")
We set up the model for MARSS so that it is the same as (11.5). We will fix the measurement error to be small; we could use 0 but the fitting is more stable if we use a small variance instead. When estimating \(\mathbf{B}\), setting the initial value to be at \(t=1\) instead of \(t=0\) works better.
-ns <- length(unique(swe.feb$Station))
-B <- "diagonal and equal"
-Q <- "unconstrained"
-R <- diag(0.01, ns)
-U <- "zero"
-A <- "unequal"
-x0 <- "unequal"
-mod.list.ar1 = list(B = B, Q = Q, R = R, U = U, x0 = x0, A = A,
- tinitx = 1)
ns <- length(unique(swe.feb$Station))
+B <- "diagonal and equal"
+Q <- "unconstrained"
+R <- diag(0.01,ns)
+U <- "zero"
+A <- "unequal"
+x0 <- "unequal"
+mod.list.ar1 = list(B=B, Q=Q, R=R, U=U, x0=x0, A=A, tinitx=1)
Now we can fit a MARSS model and get estimates of the missing SWEs. Convergence is slow. We set \(\mathbf{a}\) equal to the mean of the time series to speed convergence.
-library(MARSS)
-m <- apply(dat.feb, 1, mean, na.rm = TRUE)
-fit.ar1 <- MARSS(dat.feb, model = mod.list.ar1, control = list(maxit = 5000),
- inits = list(A = matrix(m, ns, 1)))
+library(MARSS)
+m <- apply(dat.feb, 1, mean, na.rm=TRUE)
+fit.ar1 <- MARSS(dat.feb, model=mod.list.ar1, control=list(maxit=5000), inits=list(A=matrix(m,ns,1)))
The \(b\) estimate is 0.4494841.
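As a sketch (assuming the fitted object fit.ar1 above), you can pull that estimate out of the fit with coef(); type="matrix" returns the full estimated parameter matrices.
# the shared b is the diagonal of the estimated B matrix
coef(fit.ar1, type = "matrix")$B[1, 1]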
Let’s plot the estimated SWEs for the missing years (Figure 11.3). These estimates use all the information about the correlation with other sites and the correlation with the prior and subsequent years. We will use the augment() function from the broom package to get the estimated 95% confidence intervals for the estimated states. Notice that for some sites the CIs are narrow in early years because these sites are highly correlated with sites for which there are data. For other sites, the uncertainty is high in early years because the sites with data in those years are not highly correlated.
fit <- fit.ar1
-d <- augment(fit, interval = "confidence")
-d$Year <- d$t + 1980
-d$Station <- d$.rownames
-p <- ggplot(data = d) + geom_line(aes(Year, .fitted)) + geom_ribbon(aes(x = Year,
- ymin = .conf.low, ymax = .conf.up), linetype = 2, alpha = 0.5)
-p <- p + geom_point(data = swe.feb, mapping = aes(x = Year, y = SWE))
-p + facet_wrap(~Station) + xlab("") + ylab("SWE (demeaned)")
fit <- fit.ar1
+d <- augment(fit, interval="confidence")
+d$Year <- d$t + 1980
+d$Station <- d$.rownames
+p <- ggplot(data = d) +
+ geom_line(aes(Year, .fitted)) +
+ geom_ribbon(aes(x=Year, ymin=.conf.low, ymax=.conf.up), linetype=2, alpha=0.5)
+p <- p + geom_point(data=swe.feb, mapping = aes(x=Year, y=SWE))
+p + facet_wrap(~Station) + xlab("") + ylab("SWE (demeaned)")
The state residuals have a tendency for negative autocorrelation at lag-1 (Figure 11.4).
-fit <- fit.ar1
-par(mfrow = c(4, 4), mar = c(2, 2, 1, 1))
-apply(residuals(fit)$state.residuals[, 1:30], 1, acf)
fit <- fit.ar1
+par(mfrow=c(4,4),mar=c(2,2,1,1))
+apply(residuals(fit)$state.residuals[,1:30], 1, acf)
Another approach is to treat the February data as temporally uncorrelated. The two longest time series (Paradise and Olallie Meadows) show minimal autocorrelation so we might decide to just use the correlation across stations for our estimates. In this case, the state of the missing SWE values at time \(t\) is the expected value conditioned on all the stations with data at time \(t\) given the estimated variance-covariance matrix \(\mathbf{Q}\).
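To make that conditional expectation concrete, here is a small sketch (a hypothetical helper, not code from this chapter) of the multivariate normal formula it refers to, with a the station levels and Q the estimated covariance.
# conditional mean of the missing entries in y given the observed entries
cond_exp_missing <- function(y, a, Q) {
  obs <- !is.na(y)
  a[!obs] + Q[!obs, obs, drop = FALSE] %*%
    solve(Q[obs, obs, drop = FALSE]) %*% (y[obs] - a[obs])
}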
-We could set this model up as
+We could set this model up as
\[\begin{equation}
\begin{bmatrix}
y_1 \\ y_2 \\ \dots \\ y_{15}
@@ -568,9 +592,9 @@ 11.2.2 Estimate Feb SWE using onl
\zeta_{15,1}&\zeta_{15,2}&\dots&\sigma_{15}
\end{bmatrix}
\tag{11.6}
-\end{equation}\]
+\end{equation}\]
However the EM algorithm used by MARSS()
runs into numerical issues. Instead we will set the model up as follows. Allowing a hidden state observed with small error makes the estimation more stable.
\[\begin{equation}
\begin{bmatrix}
x_1 \\ x_2 \\ \dots \\ x_{15}
\end{bmatrix}_t =
@@ -605,30 +629,29 @@ 11.2.2 Estimate Feb SWE using onl
\end{bmatrix}
\tag{11.7}
\end{equation}\]
-
Again \(\mathbf{a}\) is the mean level in the time series. Note that the expected value of \(\mathbf{x}\) is zero if there are no data, so \(E(\mathbf{x}_0)=0\).
-ns <- length(unique(swe.feb$Station))
-B <- "zero"
-Q <- "unconstrained"
-R <- diag(0.01, ns)
-U <- "zero"
-A <- "unequal"
-x0 <- "zero"
-mod.list.corr = list(B = B, Q = Q, R = R, U = U, x0 = x0, A = A,
- tinitx = 0)
ns <- length(unique(swe.feb$Station))
+B <- "zero"
+Q <- "unconstrained"
+R <- diag(0.01,ns)
+U <- "zero"
+A <- "unequal"
+x0 <- "zero"
+mod.list.corr = list(B=B, Q=Q, R=R, U=U, x0=x0, A=A, tinitx=0)
Now we can fit a MARSS model and get estimates of the missing SWEs. Convergence is slow. We set \(\mathbf{a}\) equal to the mean of the time series to speed convergence.
-m <- apply(dat.feb, 1, mean, na.rm = TRUE)
-fit.corr <- MARSS(dat.feb, model = mod.list.corr, control = list(maxit = 5000),
- inits = list(A = matrix(m, ns, 1)))
m <- apply(dat.feb, 1, mean, na.rm=TRUE)
+fit.corr <- MARSS(dat.feb, model=mod.list.corr, control=list(maxit=5000), inits=list(A=matrix(m,ns,1)))
The estimated SWEs for the missing years use only the information about the correlation with other sites.
-fit <- fit.corr
-d <- broom::augment(fit, interval = "confidence")
-d$Year <- d$t + 1980
-d$Station <- d$.rownames
-p <- ggplot(data = d) + geom_line(aes(Year, .fitted)) + geom_ribbon(aes(x = Year,
- ymin = .conf.low, ymax = .conf.up), linetype = 2, alpha = 0.5)
-p <- p + geom_point(data = swe.feb, mapping = aes(x = Year, y = SWE))
-p + facet_wrap(~Station) + xlab("") + ylab("SWE (demeaned)")
fit <- fit.corr
+d <- broom::augment(fit, interval="confidence")
+d$Year <- d$t + 1980
+d$Station <- d$.rownames
+p <- ggplot(data = d) +
+ geom_line(aes(Year, .fitted)) +
+ geom_ribbon(aes(x=Year, ymin=.conf.low, ymax=.conf.up), linetype=2, alpha=0.5)
+p <- p + geom_point(data=swe.feb, mapping = aes(x=Year, y=SWE))
+p + facet_wrap(~Station) + xlab("") + ylab("SWE (demeaned)")
The state and model residuals have no tendency towards negative autocorrelation now that we removed the autoregressive component from the process (\(x\)) model.
-fit <- fit.corr
-par(mfrow = c(4, 4), mar = c(2, 2, 1, 1))
-apply(residuals(fit)$state.residuals[, 1:30], 1, acf)
-mtext("State Residuals ACF", outer = TRUE, side = 3)
fit <- fit.corr
+par(mfrow=c(4,4),mar=c(2,2,1,1))
+apply(residuals(fit)$state.residuals[,1:30], 1, acf)
+mtext("State Residuals ACF", outer=TRUE, side=3)
fit <- fit.corr
-par(mfrow = c(4, 4), mar = c(2, 2, 1, 1))
-apply(residuals(fit)$model.residuals[, 1:30], 1, acf)
-mtext("Model Residuals ACF", outer = TRUE, side = 3)
The model is set up as follows:
-ns <- dim(dat.feb)[1]
-B <- matrix(list(0), 2, 2)
-B[1, 1] <- "b1"
-B[2, 2] <- "b2"
-Q <- diag(1, 2)
-R <- "diagonal and unequal"
-U <- "zero"
-x0 <- "zero"
-Z <- matrix(list(0), ns, 2)
-Z[1:(ns * 2)] <- c(paste0("z1", 1:ns), paste0("z2", 1:ns))
-Z[1, 2] <- 0
-A <- "unequal"
-mod.list.dfa = list(B = B, Z = Z, Q = Q, R = R, U = U, A = A,
- x0 = x0)
ns <- dim(dat.feb)[1]
+B <- matrix(list(0),2,2)
+B[1,1] <- "b1"; B[2,2] <- "b2"
+Q <- diag(1,2)
+R <- "diagonal and unequal"
+U <- "zero"
+x0 <- "zero"
+Z <- matrix(list(0),ns,2)
+Z[1:(ns*2)] <- c(paste0("z1",1:ns),paste0("z2",1:ns))
+Z[1,2] <- 0
+A <- "unequal"
+mod.list.dfa = list(B=B, Z=Z, Q=Q, R=R, U=U, A=A, x0=x0)
Now we can fit a MARSS model and get estimates of the missing SWEs. We pass in the initial value for \(\mathbf{a}\) as the mean level so it fits easier.
-library(MARSS)
-m <- apply(dat.feb, 1, mean, na.rm = TRUE)
-fit.dfa <- MARSS(dat.feb, model = mod.list.dfa, control = list(maxit = 1000),
- inits = list(A = matrix(m, ns, 1)))
The state residuals are uncorrelated.
-fit <- fit.dfa
-par(mfrow = c(1, 2), mar = c(2, 2, 1, 1))
-apply(residuals(fit)$state.residuals[, 1:30, drop = FALSE], 1,
- acf)
fit <- fit.dfa
+par(mfrow=c(1,2),mar=c(2,2,1,1))
+apply(residuals(fit)$state.residuals[,1:30,drop=FALSE], 1, acf)
As are the model residuals:
-par(mfrow = c(4, 4), mar = c(2, 2, 1, 1))
-apply(residuals(fit)$model.residual, 1, function(x) {
- acf(na.omit(x))
-})
2019-03-28
+2019-11-26
The book uses a number of R packages and a variety of fisheries data sets. The packages and data sets can be installed by installing our atsalibrary package which is hosted on GitHub:
-library(devtools)
-devtools::install_github("nwfsc-timeseries/atsalibrary")
Holmes, E. E., M. D. Scheuerell, and E. J. Ward. Applied time series analysis for fisheries and environmental data. NOAA Fisheries, Northwest Fisheries Science Center, 2725 Montlake Blvd E., Seattle, WA 98112. Contacts eli.holmes@noaa.gov, eric.ward@noaa.gov, and mark.scheuerell@noaa.gov
+Holmes, E. E., M. D. Scheuerell, and E. J. Ward. Applied time series analysis for fisheries and environmental data. NOAA Fisheries, Northwest Fisheries Science Center, 2725 Montlake Blvd E., Seattle, WA 98112. Contacts eli.holmes@noaa.gov, eric.ward@noaa.gov, and mark.scheuerell@noaa.gov
+When we look at all months, we see that SWE is highly seasonal. Note October and November are missing for all years.
-swe.yr <- snotel
-swe.yr <- swe.yr[swe.yr$Station.Id %in% y$Station.Id, ]
-swe.yr$Station <- droplevels(swe.yr$Station)
swe.yr <- snotel
+swe.yr <- swe.yr[swe.yr$Station.Id %in% y$Station.Id,]
+swe.yr$Station <- droplevels(swe.yr$Station)
Set up the data matrix of monthly SNOTEL data:
-dat.yr <- snotel
-dat.yr <- dat.yr[dat.yr$Station.Id %in% y$Station.Id, ]
-dat.yr$Station <- droplevels(dat.yr$Station)
-dat.yr$Month <- factor(dat.yr$Month, level = month.abb)
-dat.yr <- reshape2::acast(dat.yr, Station ~ Year + Month, value.var = "SWE")
dat.yr <- snotel
+dat.yr <- dat.yr[dat.yr$Station.Id %in% y$Station.Id,]
+dat.yr$Station <- droplevels(dat.yr$Station)
+dat.yr$Month <- factor(dat.yr$Month, level=month.abb)
+dat.yr <- reshape2::acast(dat.yr, Station ~ Year+Month, value.var="SWE")
We will model the seasonal differences using a periodic model. The covariates are
-period <- 12
-TT <- dim(dat.yr)[2]
-cos.t <- cos(2 * pi * seq(TT)/period)
-sin.t <- sin(2 * pi * seq(TT)/period)
-c.seas <- rbind(cos.t, sin.t)
period <- 12
+TT <- dim(dat.yr)[2]
+cos.t <- cos(2 * pi * seq(TT) / period)
+sin.t <- sin(2 * pi * seq(TT) / period)
+c.seas <- rbind(cos.t,sin.t)
We will create a state for the seasonal cycle and each station will have a scaled effect of that seasonal cycle. The observations will have the seasonal effect plus a mean, and the residuals (observation - season - mean) will be allowed to correlate across stations.
-ns <- dim(dat.yr)[1]
-B <- "zero"
-Q <- matrix(1)
-R <- "unconstrained"
-U <- "zero"
-x0 <- "zero"
-Z <- matrix(paste0("z", 1:ns), ns, 1)
-A <- "unequal"
-mod.list.dfa = list(B = B, Z = Z, Q = Q, R = R, U = U, A = A,
- x0 = x0)
-C <- matrix(c("c1", "c2"), 1, 2)
-c <- c.seas
-mod.list.seas <- list(B = B, U = U, Q = Q, A = A, R = R, Z = Z,
- C = C, c = c, x0 = x0, tinitx = 0)
ns <- dim(dat.yr)[1]
+B <- "zero"
+Q <- matrix(1)
+R <- "unconstrained"
+U <- "zero"
+x0 <- "zero"
+Z <- matrix(paste0("z",1:ns),ns,1)
+A <- "unequal"
+mod.list.dfa = list(B=B, Z=Z, Q=Q, R=R, U=U, A=A, x0=x0)
+C <- matrix(c("c1","c2"),1,2)
+c <- c.seas
+mod.list.seas <- list(B=B, U=U, Q=Q, A=A, R=R, Z=Z, C=C, c=c, x0=x0, tinitx=0)
Now we can fit the model:
-m <- apply(dat.yr, 1, mean, na.rm = TRUE)
-fit.seas <- MARSS(dat.yr, model = mod.list.seas, control = list(maxit = 500),
- inits = list(A = matrix(m, ns, 1)))
m <- apply(dat.yr, 1, mean, na.rm=TRUE)
+fit.seas <- MARSS(dat.yr, model=mod.list.seas, control=list(maxit=500), inits=list(A=matrix(m,ns,1)))
Figure () shows the seasonal estimate plus mean for each station. This is \(z_i x_i + a_i\).
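A sketch (assuming the fitted object fit.seas above) of computing \(z_i x_i + a_i\) for every station and month; coef() with type="matrix" gives the estimated \(\mathbf{Z}\) and \(\mathbf{a}\).
pars <- coef(fit.seas, type = "matrix")
# one row per station, one column per year-month
seas.est <- pars$Z %*% fit.seas$states + matrix(pars$A, ns, dim(dat.yr)[2])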
The estimated SWE at each station is \(E(y_i|y_{1:T})\). This is the estimate conditioned on all the data and includes the seasonal component plus the information from the data from other stations. Because we estimated an \(\mathbf{R}\) matrix with covariance, stations with data at time \(t\) help inform the value of stations without data at time \(t\). Only years up to 1990 are shown, but the model is fit to all years.
-# this is the estimate of y conditioned on all the data
-dd <- MARSShatyt(fit.seas)$ytT
-rownames(dd) <- rownames(dat.yr)
-colnames(dd) <- colnames(dat.yr)
-ddd <- reshape2::melt(dd)
-ddd$Var2 <- factor(ddd$Var2, levels = paste0(rep(1981:2013, each = 12),
- "_", month.abb))
-colnames(ddd) <- c("Station", "Year_Month", "SWE")
-ddd <- ddd[order(ddd$Station, ddd$Year_Month), ]
-ddd$Date <- swe.yr$Date
-ddd$Year <- swe.yr$Year
-ddd <- subset(ddd, Year < 1990)
-p <- ggplot(data = ddd) + geom_line(aes(x = Date, y = SWE))
-p <- p + geom_point(data = subset(swe.yr, Year < 1990), mapping = aes(x = Date,
- y = SWE))
-p + facet_wrap(~Station) + xlab("") + ylab("SWE")
#this is the estimate of y conditioned on all the data
+dd <- MARSShatyt(fit.seas)$ytT
+rownames(dd) <- rownames(dat.yr)
+colnames(dd) <- colnames(dat.yr)
+ddd <- reshape2::melt(dd)
+ddd$Var2 <- factor(ddd$Var2, levels=paste0(rep(1981:2013,each=12),"_", month.abb))
+colnames(ddd) <- c("Station", "Year_Month", "SWE")
+ddd <- ddd[order(ddd$Station, ddd$Year_Month),]
+ddd$Date <- swe.yr$Date
+ddd$Year <- swe.yr$Year
+ddd <- subset(ddd, Year<1990)
+p <- ggplot(data = ddd) +
+ geom_line(aes(x=Date, y=SWE))
+p <- p + geom_point(data=subset(swe.yr, Year<1990),
+ mapping = aes(x=Date, y=SWE))
+p + facet_wrap(~Station) + xlab("") + ylab("SWE")
Warning: Removed 1250 rows containing missing values (geom_point).
@@ -530,24 +553,24 @@ Jorgensen, Jeff C., Eric J. Ward, Mark D. Scheuerell, and Richard W. Zabel. 2016. “Assessing Spatial Covariance Among Time Series of Abundance.” Journal Article. Ecology and Evolution 6: 2472–85. doi:10.1002/ece3.2031.
+Jorgensen, Jeff C., Eric J. Ward, Mark D. Scheuerell, and Richard W. Zabel. 2016. “Assessing Spatial Covariance Among Time Series of Abundance.” Journal Article. Ecology and Evolution 6: 2472–85. https://doi.org/10.1002/ece3.2031.
Lamon, E.C. III, S.R. Carpenter, and C.A. Stow. 1998. “Forecasting Pcb Concentrations in Lake Michigan Salmonids: A Dynamic Linear Model Approach.” Ecological Applications 8: 659–68.
+Lamon, E. C. III, S. R. Carpenter, and C. A. Stow. 1998. “Forecasting Pcb Concentrations in Lake Michigan Salmonids: A Dynamic Linear Model Approach.” Ecological Applications 8: 659–68.
Lisi, Peter J., Daniel E. Schindler, Timothy J. Cline, Mark D. Scheuerell, and Patrick B. Walsh. 2015. “Watershed Geomorphology and Snowmelt Control Stream Thermal Sensitivity to Air Temperature.” Journal Article. Geophysical Research Letters 42 (9): 3380–8. doi:10.1002/2015gl064083.
+Lisi, Peter J., Daniel E. Schindler, Timothy J. Cline, Mark D. Scheuerell, and Patrick B. Walsh. 2015. “Watershed Geomorphology and Snowmelt Control Stream Thermal Sensitivity to Air Temperature.” Journal Article. Geophysical Research Letters 42 (9): 3380–8. https://doi.org/10.1002/2015gl064083.
Ohlberger, J., Mark D. Scheuerell, and Daniel E. Schindler. 2016. “Population coherence and environmental impacts across spatial scales: a case study of Chinook salmon.” Journal Article. Ecosphere 7: e01333. doi:10.1002/ecs2.1333.
+Ohlberger, J., Mark D. Scheuerell, and Daniel E. Schindler. 2016. “Population coherence and environmental impacts across spatial scales: a case study of Chinook salmon.” Journal Article. Ecosphere 7: e01333. https://doi.org/10.1002/ecs2.1333.
Petris, Giovanni, Sonia Petrone, and Patrizia Campagnoli. 2009. Dynamic Linear Models with R. Use R! London: Springer.
@@ -458,7 +484,7 @@Scheuerell, Mark D., and John G. Williams. 2005. “Forecasting Climate Induced Changes in the Survival of Snake River Spring/Summer Chinook Salmon (Oncorhynchus Tshawytscha).” Fisheries Oceanography 14 (6): 448–57.
Stachura, Megan M., Nathan J. Mantua, and Mark D. Scheuerell. 2014. “Oceanographic influences on patterns in North Pacific salmon abundance.” Journal Article. Canadian Journal of Fisheries and Aquatic Sciences 71 (2): 226–35. doi:10.1139/cjfas-2013-0367.
+Stachura, Megan M., Nathan J. Mantua, and Mark D. Scheuerell. 2014. “Oceanographic influences on patterns in North Pacific salmon abundance.” Journal Article. Canadian Journal of Fisheries and Aquatic Sciences 71 (2): 226–35. https://doi.org/10.1139/cjfas-2013-0367.
Zuur, A. F., R. J. Fryer, I. T. Jolliffe, R. Dekker, and J. J. Beukema. 2003. “Estimating Common Trends in Multivariate Time Series Using Dynamic Factor Analysis.” Environmetrics 14 (7): 665–85.
@@ -476,24 +502,24 @@ Create a \(3 \times 4\) matrix, meaning 3 rows and 4 columns, that is all 1s:
-matrix(1, 3, 4)
[,1] [,2] [,3] [,4]
[1,] 1 1 1 1
[2,] 1 1 1 1
[3,] 1 1 1 1
Create a \(3 \times 4\) matrix filled in with the numbers 1 to 12 by column (default) and by row:
-matrix(1:12, 3, 4)
[,1] [,2] [,3] [,4]
[1,] 1 4 7 10
[2,] 2 5 8 11
[3,] 3 6 9 12
-matrix(1:12, 3, 4, byrow=TRUE)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 5 6 7 8
[3,] 9 10 11 12
Create a matrix with one column:
-matrix(1:4, ncol=1)
[,1]
[1,] 1
[2,] 2
[3,] 3
[4,] 4
Create a matrix with one row:
-matrix(1:4, nrow=1)
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
Check the dimensions of a matrix
-A=matrix(1:6, 2,3)
-A
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
-dim(A)
[1] 2 3
Get the number of rows in a matrix:
-dim(A)[1]
[1] 2
-nrow(A)
[1] 2
Create a 3D matrix (called an array):
-A=array(1:6, dim=c(2,3,2))
-A
, , 1
[,1] [,2] [,3]
@@ -489,25 +515,25 @@ 1.1 Creating matrices in R
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
-dim(A)
[1] 2 3 2
Check if an object is a matrix. A data frame is not a matrix. A vector is not a matrix.
-A=matrix(1:4, 1, 4)
-A
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
-class(A)
[1] "matrix"
-B=data.frame(A)
-B
X1 X2 X3 X4
1 1 2 3 4
-class(B)
[1] "data.frame"
-C=1:4
-C
[1] 1 2 3 4
-class(C)
[1] "integer"
A diagonal matrix is one that is square, meaning number of rows equals number of columns, and it has 0s on the off-diagonal and non-zeros on the diagonal. In R, you form a diagonal matrix with the diag()
function:
diag(1,3) #put 1 on diagonal of 3x3 matrix
[,1] [,2] [,3]
[1,] 1 0 0
[2,] 0 1 0
[3,] 0 0 1
-diag(2, 3) #put 2 on diagonal of 3x3 matrix
[,1] [,2] [,3]
[1,] 2 0 0
[2,] 0 2 0
[3,] 0 0 2
-diag(1:4) #put 1 to 4 on diagonal of 4x4 matrix
[,1] [,2] [,3] [,4]
[1,] 1 0 0 0
[2,] 0 2 0 0
[3,] 0 0 3 0
[4,] 0 0 0 4
The diag()
function can also be used to replace elements on the diagonal of a matrix:
A=matrix(3, 3, 3)
-diag(A)=1
-A
[,1] [,2] [,3]
[1,] 1 3 3
[2,] 3 1 3
[3,] 3 3 1
-A=matrix(3, 3, 3)
-diag(A)=1:3
-A
[,1] [,2] [,3]
[1,] 1 3 3
[2,] 3 2 3
[3,] 3 3 3
-A=matrix(3, 3, 4)
-diag(A[1:3,2:4])=1
-A
[,1] [,2] [,3] [,4]
[1,] 3 1 3 3
[2,] 3 3 1 3
[3,] 3 3 3 1
The diag()
function is also used to get the diagonal of a matrix.
A=matrix(1:9, 3, 3)
-diag(A)
[1] 1 5 9
The identity matrix is a special kind of diagonal matrix with 1s on the diagonal. It is denoted \(\mathbf{I}\). \(\mathbf{I}_3\) would mean a \(3 \times 3\) identity matrix. An identity matrix has the property that \(\mathbf{A}\mathbf{I}=\mathbf{A}\) and \(\mathbf{I}\mathbf{A}=\mathbf{A}\), so it is like a 1.
-A=matrix(1:9, 3, 3)
-I=diag(3) #shortcut for 3x3 identity matrix
-A%*%I
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
@@ -497,24 +523,24 @@ 1.5 Diagonal matrices and identit
The inverse of a matrix is denoted \(\mathbf{A}^{-1}\). You can think of the inverse of a matrix like \(1/a\): \(1/a \times a = 1\) and \(\mathbf{A}^{-1}\mathbf{A} = \mathbf{A}\mathbf{A}^{-1} = \mathbf{I}\). The inverse of a matrix does not always exist; for one, it has to be square. We’ll be using inverses for variance-covariance matrices, and by definition (of a variance-covariance matrix) the inverses of those exist. In R, there are a couple of common ways to take the inverse of a variance-covariance matrix (or something with the same properties). solve() is probably the most common:
A=diag(3,3)+matrix(1,3,3)
-invA=solve(A)
-invA%*%A
[,1] [,2] [,3]
[1,] 1.000000e+00 -6.938894e-18 0
[2,] 2.081668e-17 1.000000e+00 0
[3,] 0.000000e+00 0.000000e+00 1
-A%*%invA
[,1] [,2] [,3]
[1,] 1.000000e+00 -6.938894e-18 0
[2,] 2.081668e-17 1.000000e+00 0
[3,] 0.000000e+00 0.000000e+00 1
Another option is to use chol2inv()
which uses a Cholesky decomposition:
A=diag(3,3)+matrix(1,3,3)
-invA=chol2inv(chol(A))
-invA%*%A
[,1] [,2] [,3]
[1,] 1.000000e+00 6.938894e-17 0.000000e+00
[2,] 2.081668e-17 1.000000e+00 -2.775558e-17
[3,] -5.551115e-17 0.000000e+00 1.000000e+00
-A%*%invA
[,1] [,2] [,3]
[1,] 1.000000e+00 2.081668e-17 -5.551115e-17
[2,] 6.938894e-17 1.000000e+00 0.000000e+00
@@ -474,24 +500,24 @@ 1.6 Taking the inverse of a squar
You will need to be very solid in matrix multiplication for the course. If you haven’t done it in a while, google `matrix multiplication youtube’ and you will find lots of 5-minute videos to remind you.
-In R, you use the %*%
operation to do matrix multiplication. When you do matrix multiplication, the columns of the matrix on the left must equal the rows of the matrix on the right. The result is a matrix that has the number of rows of the matrix on the left and number of columns of the matrix on the right. \[(n \times m)(m \times p) = (n \times p)\]
A=matrix(1:6, 2, 3) #2 rows, 3 columns
-B=matrix(1:6, 3, 2) #3 rows, 2 columns
-A%*%B #this works
In R, you use the %*%
operation to do matrix multiplication. When you do matrix multiplication, the columns of the matrix on the left must equal the rows of the matrix on the right. The result is a matrix that has the number of rows of the matrix on the left and number of columns of the matrix on the right.
+\[(n \times m)(m \times p) = (n \times p)\]
[,1] [,2]
[1,] 22 49
[2,] 28 64
-B%*%A #this works
[,1] [,2] [,3]
[1,] 9 19 29
[2,] 12 26 40
[3,] 15 33 51
-try(B%*%B) #this doesn't
Error in B %*% B : non-conformable arguments
To add two matrices use +
. The matrices have to have the same dimensions.
A+A #works
[,1] [,2] [,3]
[1,] 2 6 10
[2,] 4 8 12
-A+t(B) #works
[,1] [,2] [,3]
[1,] 2 5 8
[2,] 6 9 12
-try(A+B) #does not work since A has 2 rows and B has 3
Error in A + B : non-conformable arrays
The transpose of a matrix is denoted \(\mathbf{A}^\top\) or \(\mathbf{A}^\prime\). To transpose a matrix in R, you use t()
.
A=matrix(1:6, 2, 3) #2 rows, 3 columns
-t(A) #is the transpose of A
[,1] [,2]
[1,] 1 2
[2,] 3 4
[3,] 5 6
-try(A%*%A) #this won't work
Error in A %*% A : non-conformable arguments
-A%*%t(A) #this will
[,1] [,2]
[1,] 35 44
[2,] 44 56
@@ -485,24 +512,24 @@ Replace 1 element.
-A=matrix(1, 3, 3)
-A[1,1]=2
-A
[,1] [,2] [,3]
[1,] 2 1 1
[2,] 1 1 1
[3,] 1 1 1
Replace a row with all 1s or a string of values
-A=matrix(1, 3, 3)
-A[1,]=2
-A
[,1] [,2] [,3]
[1,] 2 2 2
[2,] 1 1 1
[3,] 1 1 1
-A[1,]=1:3
-A
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 1 1 1
[3,] 1 1 1
Replace a group of elements. This often does not work as one expects, so be sure to look at your matrix after trying something like this. Here I want to replace elements (1,3) and (3,1) with 2, but it didn’t work as I wanted.
-A=matrix(1, 3, 3)
-A[c(1,3),c(3,1)]=2
-A
[,1] [,2] [,3]
[1,] 2 1 2
[2,] 1 1 1
[3,] 2 1 2
How do I replace elements (1,1) and (3,3) with 2 then? It’s tedious; you have to set them one by one. If you have a lot of elements to replace, you might want to use a for loop (a sketch follows the example below).
-A=matrix(1, 3, 3)
-A[1,3]=2
-A[3,1]=2
-A
[,1] [,2] [,3]
[1,] 1 1 2
[2,] 1 1 1
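Here is the for-loop sketch mentioned above; the list of positions is made up for illustration.
A <- matrix(1, 3, 3)
# each entry is a (row, column) pair to set to 2
positions <- list(c(1, 1), c(3, 3))
for (p in positions) A[p[1], p[2]] <- 2
A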
@@ -485,24 +511,24 @@ 1.4 Replacing elements in a matri
To subset a matrix, we use [ ]
:
A=matrix(1:9, 3, 3) #3 rows, 3 columns
-#get the first and second rows of A
-#it's a 2x3 matrix
-A[1:2,]
A=matrix(1:9, 3, 3) #3 rows, 3 columns
+#get the first and second rows of A
+#it's a 2x3 matrix
+A[1:2,]
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
-#get the top 2 rows and left 2 columns
-A[1:2,1:2]
[,1] [,2]
[1,] 1 4
[2,] 2 5
-#What does this do?
-A[c(1,3),c(1,3)]
[,1] [,2]
[1,] 1 7
[2,] 3 9
-#This?
-A[c(1,2,1),c(2,3)]
[,1] [,2]
[1,] 4 7
[2,] 5 8
[3,] 4 7
If you have used matlab, you know you can say something like A[1,end]
to denote the element of a matrix in row 1 and the last column. R does not have `end’. To do the same in R, you do something like:
A=matrix(1:9, 3, 3)
-A[1,ncol(A)]
[1] 7
-#or
-A[1,dim(A)[2]]
[1] 7
Warning: R will create vectors from subsetting matrices!
One of the really bad things that R does with matrices is to create a vector if you happen to subset a matrix down to 1 row or 1 column. Look at this:
-A=matrix(1:9, 3, 3)
-#take the first 2 rows
-B=A[1:2,]
-#everything is ok
-dim(B)
[1] 2 3
-class(B)
[1] "matrix"
-#take the first row
-B=A[1,]
-#oh no! It should be a 1x3 matrix but it is not.
-dim(B)
NULL
-#It is not even a matrix any more
-class(B)
[1] "integer"
-#and what happens if we take the transpose?
-#Oh no, it's a 1x3 matrix not a 3x1 (transpose of 1x3)
-t(B)
#and what happens if we take the transpose?
+#Oh no, it's a 1x3 matrix not a 3x1 (transpose of 1x3)
+t(B)
[,1] [,2] [,3]
[1,] 1 4 7
-#A%*%B should fail because A is (3x3) and B is (1x3)
-A%*%B
[,1]
[1,] 66
[2,] 78
[3,] 90
-#It works? That is horrible!
This will create hard-to-find bugs in your code because you will look at B=A[1,] and everything looks fine. Why is R saying it is not a matrix? To stop R from doing this, use drop=FALSE.
B=A[1,,drop=FALSE]
-#Now it is a matrix as it should be
-dim(B)
[1] 1 3
-class(B)
[1] "matrix"
-#this fails as it should (alerting you to a problem!)
-try(A%*%B)
Error in A %*% B : non-conformable arguments
The Dickey-Fuller test is testing if \(\phi=0\) in this model of the data: \[y_t = \alpha + \beta t + \phi y_{t-1} + e_t\] which is written as \[\Delta y_t = y_t-y_{t-1}= \alpha + \beta t + \gamma y_{t-1} + e_t\] where \(y_t\) is your data. It is written this way so we can do a linear regression of \(\Delta y_t\) against \(t\) and \(y_{t-1}\) and test if \(\gamma\) is different from 0. If \(\gamma=0\), then we have a random walk process. If not and \(-1<1+\gamma<1\), then we have a stationary process.
+The Dickey-Fuller test is testing if \(\phi=0\) in this model of the data:
+\[y_t = \alpha + \beta t + \phi y_{t-1} + e_t\]
+which is written as
+\[\Delta y_t = y_t-y_{t-1}= \alpha + \beta t + \gamma y_{t-1} + e_t\]
+where \(y_t\) is your data. It is written this way so we can do a linear regression of \(\Delta y_t\) against \(t\) and \(y_{t-1}\) and test if \(\gamma\) is different from 0. If \(\gamma=0\), then we have a random walk process. If not and \(-1<1+\gamma<1\), then we have a stationary process.
The Augmented Dickey-Fuller test allows for higher-order autoregressive processes by including \(\Delta y_{t-p}\) in the model. But our test is still if \(\gamma = 0\). \[\Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \delta_1 \Delta y_{t-1} + \delta_2 \Delta y_{t-2} + \dots\]
+The Augmented Dickey-Fuller test allows for higher-order autoregressive processes by including \(\Delta y_{t-p}\) in the model. But our test is still if \(\gamma = 0\).
+\[\Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \delta_1 \Delta y_{t-1} + \delta_2 \Delta y_{t-2} + \dots\]
The null hypothesis for both tests is that the data are non-stationary. We want to REJECT the null hypothesis for this test, so we want a p-value of less than 0.05 (or smaller).
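To make the regression concrete, here is a hand-rolled sketch (not the tseries implementation; all names are made up). Keep in mind that the t-statistic on the lagged level follows the non-standard Dickey-Fuller distribution, so the p-value printed by lm() is not the one the test reports.
TT <- 100
y <- rnorm(TT)               # white noise, so gamma should be well below 0
dy <- diff(y)                # Delta y_t
ylag <- y[-TT]               # y_{t-1}
tt <- 1:(TT - 1)             # time trend
summary(lm(dy ~ tt + ylag))  # the ylag coefficient is gamma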
5.3.3 ADF test using adf.test()
The adf.test() function from the tseries package will do an Augmented Dickey-Fuller test (Dickey-Fuller if we set the lags equal to 0) with a trend and an intercept. Use ?adf.test to read about this function. The function is
adf.test(x, alternative = c("stationary", "explosive"),
k = trunc((length(x)-1)^(1/3)))
-x
are your data. alternative="stationary"
means that \(-2<\gamma<0\) (\(-1<\phi<1\)) and alternative="explosive"
means that it is outside these bounds. k is the number of \(\delta\) lags. For a Dickey-Fuller test, with only up to AR(1) time dependency in our stationary process, we set k=0 so we have no \(\delta\)’s in our test. Being able to control the lags in our test allows us to avoid a stationarity test that is too complex to be supported by our data.
+x
are your data. alternative="stationary"
means that \(-2<\gamma<0\) (\(-1<\phi<1\)) and alternative="explosive"
means that it is outside these bounds. k is the number of \(\delta\) lags. For a Dickey-Fuller test, with only up to AR(1) time dependency in our stationary process, we set k=0 so we have no \(\delta\)’s in our test. Being able to control the lags in our test allows us to avoid a stationarity test that is too complex to be supported by our data.
5.3.3.1 Test on white noise
Let’s start by doing the test on data that we know are stationary, white noise. We will use an Augmented Dickey-Fuller test where we use the default number of lags (amount of time-dependency) in our test. For a time series of length 100, this is 4.
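A quick check of that default, using the lag formula from the adf.test() call shown above:
trunc((100 - 1)^(1/3))  # default k for a series of length 100; gives 4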
-TT <- 100
-wn <- rnorm(TT) # white noise
-tseries::adf.test(wn)
+
Warning in tseries::adf.test(wn): p-value smaller than printed p-value
Augmented Dickey-Fuller Test
@@ -464,9 +495,8 @@ 5.3.3.1 Test on white noise
alternative hypothesis: stationary
The null hypothesis is rejected.
Try a Dickey-Fuller test. This is testing with an alternative hypothesis of AR(1) stationarity, versus the alternative hypothesis of AR(4) stationarity when we used the default k.
-tseries::adf.test(wn, k=0)
-Warning in tseries::adf.test(wn, k = 0): p-value smaller than printed p-
-value
+
+Warning in tseries::adf.test(wn, k = 0): p-value smaller than printed p-value
Augmented Dickey-Fuller Test
@@ -478,9 +508,9 @@ 5.3.3.1 Test on white noise
5.3.3.2 Test on white noise with trend
Try the test on white noise with a trend and intercept.
-intercept <- 1
-wnt <- wn + 1:TT + intercept
-tseries::adf.test(wnt)
+
Warning in tseries::adf.test(wnt): p-value smaller than printed p-value
Augmented Dickey-Fuller Test
@@ -493,8 +523,8 @@ 5.3.3.2 Test on white noise with
5.3.3.3 Test on random walk
Let’s try the test on a random walk (nonstationary).
-rw <- cumsum(rnorm(TT))
-tseries::adf.test(rw)
+
Augmented Dickey-Fuller Test
@@ -503,7 +533,7 @@ 5.3.3.3 Test on random walk
alternative hypothesis: stationary
The null hypothesis is NOT rejected as the p-value is greater than 0.05.
Try a Dickey-Fuller test.
-tseries::adf.test(rw, k=0)
+
Augmented Dickey-Fuller Test
@@ -514,7 +544,7 @@ 5.3.3.3 Test on random walk
5.3.3.4 Test the anchovy data
-tseries::adf.test(anchovyts)
+
Augmented Dickey-Fuller Test
@@ -532,15 +562,16 @@ 5.3.4 ADF test using ur.df(
The ur.df()
function allows us to specify whether to test stationarity around a zero-mean with no trend, around a non-zero mean with no trend, or around a trend with an intercept. This can be useful when we know that our data have no trend, for example if you have removed the trend already. ur.df()
allows us to specify the lags or select them using model selection.
5.3.4.1 Test on white noise
-Let’s first do the test on data we know is stationary, white noise. We have to choose the type
and lags
. If you have no particular reason to not include an intercept and trend, then use type="trend"
. This allows both intercept and trend. When might you have a particular reason not to use "trend"? When you have removed the trend and/or intercept.
+Let’s first do the test on data we know is stationary, white noise. We have to choose the type
and lags
. If you have no particular reason to not include an intercept and trend, then use type="trend"
. This allows both intercept and trend. When might you have a particular reason not to use "trend"? When you have removed the trend and/or intercept.
Next you need to chose the lags
. We will use lags=0
to do the Dickey-Fuller test. Note the number of lags you can test will depend on the amount of data that you have. adf.test()
used a default of trunc((length(x)-1)^(1/3))
for the lags, but ur.df()
requires that you pass in a value or use a fixed default of 1.
lags=0
is fitting this model to the data. You are testing if the effect for z.lag.1
is 0.
-z.diff = gamma * z.lag.1 + intercept + trend * tt
z.diff
means \(\Delta y_t\) and z.lag.1
is \(y_{t-1}\).
+z.diff = gamma * z.lag.1 + intercept + trend * tt
+z.diff
means \(\Delta y_t\) and z.lag.1
is \(y_{t-1}\).
When you use summary()
for the output from ur.df()
, you will see the estimated values for \(\gamma\) (denoted z.lag.1
), intercept and trend. If you see ***
or **
on the coefficients list for z.lag.1
, it indicates that the effect of z.lag.1
is significantly different from 0 and this supports the assumption of stationarity.
The intercept
and tt
estimates indicate whether there is a non-zero level (intercept) or linear trend (tt).
-wn <- rnorm(TT)
-test <- urca::ur.df(wn, type="trend", lags=0)
-summary(test)
+
###############################################
# Augmented Dickey-Fuller Test Unit Root Test #
@@ -596,24 +627,24 @@ 5.3.4.2 When you might want to us
@@ -435,23 +461,23 @@
5.9 Check residuals
We can do a test of autocorrelation of the residuals with Box.test()
with fitdf
adjusted for the number of parameters estimated in the fit. In our case, MA(1) and drift parameters.
-res <- resid(fit)
-Box.test(res, type="Ljung-Box", lag=12, fitdf=2)
+
Box-Ljung test
data: res
X-squared = 5.1609, df = 10, p-value = 0.8802
checkresiduals()
in the forecast package will automate this test and show some standard diagnostics plots.
-forecast::checkresiduals(fit)
+
Ljung-Box test
data: Residuals from ARIMA(0,1,1) with drift
-Q* = 1.0902, df = 3.2, p-value = 0.8087
+Q* = 1.0902, df = 3, p-value = 0.7794
-Model df: 2. Total lags used: 5.2
+Model df: 2. Total lags used: 5
@@ -464,24 +490,24 @@ 5.9 Check residuals
@@ -437,7 +463,7 @@ 5.8 Estimating the ARMA orders
We will use the auto.arima()
function in forecast. This function will estimate the level of differencing needed to make our data stationary and estimate the AR and MA orders using AICc (or BIC if we choose).
5.8.1 Example: model selection for AR(2) data
-forecast::auto.arima(ar2)
+
Series: ar2
ARIMA(2,0,2) with non-zero mean
@@ -449,33 +475,33 @@ 5.8.1 Example: model selection fo
sigma^2 estimated as 0.9848: log likelihood=-1409.57
AIC=2831.15 AICc=2831.23 BIC=2860.59
Works with missing data too, though it might not estimate something very close to the true model form.
-forecast::auto.arima(ar2miss)
+
Series: ar2miss
ARIMA(0,1,0)
-sigma^2 estimated as 0.6135: log likelihood=-86.43
-AIC=174.86 AICc=174.9 BIC=177.42
+sigma^2 estimated as 0.5383: log likelihood=-82.07
+AIC=166.15 AICc=166.19 BIC=168.72
5.8.2 Fitting to 100 simulated data sets
Let’s fit to 100 simulated data sets and see how often the true (generating) model form is selected.
-save.fits <- rep(NA,100)
-for(i in 1:100){
- a2 <- arima.sim(n=100, model=list(ar=c(.8,.1)))
- fit <- auto.arima(a2, seasonal=FALSE, max.d=0, max.q=0)
- save.fits[i] <- paste0(fit$arma[1], "-", fit$arma[2])
-}
-table(save.fits)
+save.fits <- rep(NA,100)
+for(i in 1:100){
+ a2 <- arima.sim(n=100, model=list(ar=c(.8,.1)))
+ fit <- auto.arima(a2, seasonal=FALSE, max.d=0, max.q=0)
+ save.fits[i] <- paste0(fit$arma[1], "-", fit$arma[2])
+}
+table(save.fits)
save.fits
-1-0 2-0 3-0 4-0
- 73 23 2 2
-auto.arima()
uses AICc for selection by default. You can change that to AIC or BIC using ic="aic"
or ic="bic"
.
+1-0 2-0 3-0
+ 71 22 7
+auto.arima()
uses AICc for selection by default. You can change that to AIC or BIC using ic="aic"
or ic="bic"
.
Repeat the simulation using AIC and BIC to see how the choice of the information criteria affects the model that is selected.
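A sketch of that repeat using BIC (swap in ic="aic" for AIC):
save.fits.bic <- rep(NA, 100)
for(i in 1:100){
  a2 <- arima.sim(n=100, model=list(ar=c(.8, .1)))
  fit <- forecast::auto.arima(a2, seasonal=FALSE, max.d=0, max.q=0, ic="bic")
  save.fits.bic[i] <- paste0(fit$arma[1], "-", fit$arma[2])
}
table(save.fits.bic)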
5.8.3 Trace=TRUE
We can set Trace=TRUE
to see what models auto.arima()
fit.
-forecast::auto.arima(ar2, trace=TRUE)
+
Fitting models using approximations to speed things up...
@@ -513,7 +539,7 @@ 5.8.3 Trace=TRUE
5.8.4 stepwise=FALSE
We can set stepwise=FALSE
to use an exhaustive search. The model may be different than the result from the non-exhaustive search.
-forecast::auto.arima(ar2, trace=TRUE, stepwise=FALSE)
+
Fitting models using approximations to speed things up...
@@ -579,8 +605,8 @@ 5.8.4 stepwise=FALSE
5.8.5 Fit to the anchovy data
-fit <- auto.arima(anchovyts)
-fit
+
Series: anchovyts
ARIMA(0,1,1) with drift
@@ -609,24 +635,24 @@ 5.8.5 Fit to the anchovy data
@@ -439,14 +465,14 @@ 5.7 Estimating ARMA parameters
5.7.1 AR(2) data
Simulate AR(2) data and add a mean level so that the data are not mean 0.
\[x_t = 0.8 x_{t-1} + 0.1 x_{t-2} + e_t\\y_t = x_t + m\]
-m <- 1
-ar2 <- arima.sim(n=1000, model=list(ar=c(.8,.1))) + m
+
To see info on arima.sim()
, type ?arima.sim
.
5.7.2 Fit with Arima()
Fit an ARMA(2) with level to the data.
-forecast::Arima(ar2, order=c(2,0,0), include.constant=TRUE)
+
Series: ar2
ARIMA(2,0,0) with non-zero mean
@@ -458,10 +484,13 @@ 5.7.2 Fit with Arima()
Note, the model being fit by Arima()
is not this model
-\[y_t = m + 0.8 y_{t-1} + 0.1 y_{t-2} + e_t\] It is this model:
-\[(y_t - m) = 0.8 (y_{t-1}-m) + 0.1 (y_{t-2}-m)+ e_t\] or as written above: \[x_t = 0.8 x_{t-1} + 0.1 x_{t-2} + e_t\\y_t = x_t + m\]
+\[y_t = m + 0.8 y_{t-1} + 0.1 y_{t-2} + e_t\]
+It is this model:
+\[(y_t - m) = 0.8 (y_{t-1}-m) + 0.1 (y_{t-2}-m)+ e_t\]
+or as written above:
+\[x_t = 0.8 x_{t-1} + 0.1 x_{t-2} + e_t\\y_t = x_t + m\]
We could also use arima()
to fit to the data.
-arima(ar2, order=c(2,0,0), include.mean=TRUE)
+
Warning in arima(ar2, order = c(2, 0, 0), include.mean = TRUE): possible
convergence problem: optim gave code = 1
@@ -479,8 +508,8 @@ 5.7.2 Fit with Arima()
5.7.3 AR(1) simulated data
-ar1 <- arima.sim(n=100, model=list(ar=c(.8)))+m
-forecast::Arima(ar1, order=c(1,0,0), include.constant=TRUE)
+ar1 <- arima.sim(n=100, model=list(ar=c(.8)))+m
+forecast::Arima(ar1, order=c(1,0,0), include.constant=TRUE)
Series: ar1
ARIMA(1,0,0) with non-zero mean
@@ -494,9 +523,10 @@ 5.7.3 AR(1) simulated data
5.7.4 ARMA(1,2) simulated data
-Simulate ARMA(1,2) \[x_t = 0.8 x_{t-1} + e_t + 0.8 e_{t-1} + 0.2 e_{t-2}\]
-arma12 = arima.sim(n=100, model=list(ar=c(0.8), ma=c(0.8, 0.2)))+m
-forecast::Arima(arma12, order=c(1,0,2), include.constant=TRUE)
+Simulate ARMA(1,2)
+\[x_t = 0.8 x_{t-1} + e_t + 0.8 e_{t-1} + 0.2 e_{t-2}\]
+arma12 = arima.sim(n=100, model=list(ar=c(0.8), ma=c(0.8, 0.2)))+m
+forecast::Arima(arma12, order=c(1,0,2), include.constant=TRUE)
Series: arma12
ARIMA(1,0,2) with non-zero mean
@@ -512,29 +542,30 @@ 5.7.4 ARMA(1,2) simulated data
5.7.5 These functions work for data with missing values
Create some AR(2) data and then add missing values (NA).
-ar2miss <- arima.sim(n=100, model=list(ar=c(.8,.1)))
-ar2miss[sample(100,50)] <- NA
-plot(ar2miss, type="l")
-title("many missing values")
+ar2miss <- arima.sim(n=100, model=list(ar=c(.8,.1)))
+ar2miss[sample(100,50)] <- NA
+plot(ar2miss, type="l")
+title("many missing values")
Fit
-fit <- forecast::Arima(ar2miss, order=c(2,0,0))
-fit
+
Series: ar2miss
ARIMA(2,0,0) with non-zero mean
Coefficients:
ar1 ar2 mean
- 0.8989 -0.0612 -0.1872
-s.e. 0.1817 0.1775 0.6477
+ 1.0625 -0.2203 -0.0586
+s.e. 0.1555 0.1618 0.6061
-sigma^2 estimated as 0.6102: log likelihood=-84.65
-AIC=177.3 AICc=177.74 BIC=187.6
+sigma^2 estimated as 0.9679: log likelihood=-79.86
+AIC=167.72 AICc=168.15 BIC=178.06
Note fitted()
does not return the expected value at time \(t\). It is the expected value of \(y_t\) given the data up to time \(t-1\).
-plot(ar2miss, type="l")
-title("many missing values")
-lines(fitted(fit), col="blue")
- It is easy enough to get the expected value of \(y_t\) for all the missing values but we’ll learn to do that when we learn the MARSS package and can apply the Kalman Smoother in that package.
+
+
+It is easy enough to get the expected value of \(y_t\) for all the missing values but we’ll learn to do that when we learn the MARSS package and can apply the Kalman Smoother in that package.
@@ -548,24 +579,24 @@ 5.7.5 These functions work for da
@@ -435,9 +461,9 @@
5.12 Forecast using a seasonal model
Forecasting works the same using the forecast()
function.
-fr <- forecast::forecast(fit, h=12)
-plot(fr)
-points(testdat)
+
@@ -452,24 +478,24 @@ 5.12 Forecast using a seasonal mo
@@ -435,8 +461,8 @@
5.10 Forecast from a fitted ARIMA model
We can create a forecast from our anchovy ARIMA model using forecast()
. The shading is the 80% and 95% prediction intervals.
-fr <- forecast::forecast(fit, h=10)
-plot(fr)
+
@@ -450,24 +476,24 @@ 5.10 Forecast from a fitted ARIMA
@@ -455,24 +481,24 @@ 5.1 Box-Jenkins method
The null hypothesis for the KPSS test is that the data are stationary. For this test, we do NOT want to reject the null hypothesis. In other words, we want the p-value to be greater than 0.05 not less than 0.05.
Let’s try the KPSS test on white noise with a trend. The default is a null hypothesis with no trend. We will change this to null="Trend".
tseries::kpss.test(wnt, null="Trend")
Warning in tseries::kpss.test(wnt, null = "Trend"): p-value greater than
-printed p-value
Warning in tseries::kpss.test(wnt, null = "Trend"): p-value greater than printed
+p-value
KPSS Test for Trend Stationarity
@@ -448,9 +474,9 @@ 5.4.1 Test on simulated data
KPSS Trend = 0.045579, Truncation lag parameter = 4, p-value = 0.1
The p-value is greater than 0.05. The null hypothesis of stationarity around a trend is not rejected.
Let’s try the KPSS test on white noise with a trend but let’s use the default of stationary with no trend.
-tseries::kpss.test(wnt, null="Level")
Warning in tseries::kpss.test(wnt, null = "Level"): p-value smaller than
-printed p-value
+
+Warning in tseries::kpss.test(wnt, null = "Level"): p-value smaller than printed
+p-value
KPSS Test for Level Stationarity
@@ -461,13 +487,12 @@ 5.4.1 Test on simulated data
5.4.2 Test the anchovy data
Let’s try the anchovy data.
-kpss.test(anchovyts, null="Trend")
+
KPSS Test for Trend Stationarity
data: anchovyts
-KPSS Trend = 0.14779, Truncation lag parameter = 2, p-value =
-0.04851
+KPSS Trend = 0.14779, Truncation lag parameter = 2, p-value = 0.04851
The null is rejected (p-value less than 0.05). Again stationarity is not supported.
The anchovy data have failed both tests for stationarity, the Augmented Dickey-Fuller test and the KPSS test. How do we fix this? The approach in the Box-Jenkins method is to use differencing.
Let’s see how this works with random walk data. A random walk is non-stationary but the difference is white noise so is stationary:
\[x_t - x_{t-1} = e_t, e_t \sim N(0,\sigma)\]
-adf.test(diff(rw))
Augmented Dickey-Fuller Test
data: diff(rw)
Dickey-Fuller = -3.8711, Lag order = 4, p-value = 0.01834
alternative hypothesis: stationary
-kpss.test(diff(rw))
Warning in kpss.test(diff(rw)): p-value greater than printed p-value
KPSS Test for Level Stationarity
@@ -453,15 +479,15 @@ 5.5 Dealing with non-stationarity
KPSS Level = 0.30489, Truncation lag parameter = 3, p-value = 0.1
If we difference random walk data, the null is rejected for the ADF test and not rejected for the KPSS test. This is what we want.
Let’s try a single difference with the anchovy data. A single difference means dat(t)-dat(t-1)
. We get this using diff(anchovyts)
.
diff1dat <- diff(anchovyts)
-adf.test(diff1dat)
Augmented Dickey-Fuller Test
data: diff1dat
Dickey-Fuller = -3.2718, Lag order = 2, p-value = 0.09558
alternative hypothesis: stationary
-kpss.test(diff1dat)
Warning in kpss.test(diff1dat): p-value greater than printed p-value
KPSS Test for Level Stationarity
@@ -469,8 +495,8 @@ 5.5 Dealing with non-stationarity
data: diff1dat
KPSS Level = 0.089671, Truncation lag parameter = 2, p-value = 0.1
If a first difference were not enough, we would try a second difference which is the difference of a first difference.
-diff2dat <- diff(diff1dat)
-adf.test(diff2dat)
Warning in adf.test(diff2dat): p-value smaller than printed p-value
Augmented Dickey-Fuller Test
@@ -480,9 +506,9 @@ 5.5 Dealing with non-stationarity
alternative hypothesis: stationary
The null hypothesis of a random walk is now rejected so you might think that a 2nd difference is needed for the anchovy data. However the actual problem is that the default for adf.test()
includes a trend but we removed the trend with our first difference. Thus we included an unneeded trend parameter in our test. Our data are not that long and this affects the result.
Let’s repeat without the trend and we’ll see that the null hypothesis is rejected. The number of lags is set to be what would be used by adf.test()
. See ?adf.test
.
k <- trunc((length(diff1dat)-1)^(1/3))
-test <- urca::ur.df(diff1dat, type="drift", lags=k)
-summary(test)
k <- trunc((length(diff1dat)-1)^(1/3))
+test <- urca::ur.df(diff1dat, type="drift", lags=k)
+summary(test)
###############################################
# Augmented Dickey-Fuller Test Unit Root Test #
@@ -521,9 +547,9 @@ 5.5 Dealing with non-stationarity
5.5.1 ndiffs()
As an alternative to trying many different differences and remembering to include or not include the trend or level, you can use the ndiffs()
function in the forecast package. This automates finding the number of differences needed.
-forecast::ndiffs(anchovyts, test="kpss")
+
[1] 1
-forecast::ndiffs(anchovyts, test="adf")
+
[1] 1
One difference is required to pass both the ADF and KPSS stationarity tests.
@@ -539,24 +565,24 @@ 5.5.1 ndiffs()
For these problems, use the catch landings from Greek waters (greeklandings) and the Chinook landings (chinook) from Washington. Load the data as follows:
data(greeklandings, package = "atsalibrary")
-landings <- greeklandings
-data(chinook, package = "atsalibrary")
-chinook <- chinook.month
data(greeklandings, package="atsalibrary")
+landings <- greeklandings
+data(chinook, package="atsalibrary")
+chinook <- chinook.month
datdf <- subset(landings, Species == "Sardine")
-dat <- ts(datdf$log.metric.tons, start = 1964)
-dat <- window(dat, start = 1964, end = 1987)
datdf <- subset(landings, Species=="Sardine")
+dat <- ts(datdf$log.metric.tons, start=1964)
+dat <- window(dat, start=1964, end=1987)
a. Do a Dickey-Fuller (DF) test using `ur.df()` and `adf.test()`. You have to set the lags. What does the result tell you?
a. Do an Augmented Dickey-Fuller (ADF) test using `ur.df()`. How did you choose to set the lags? How is the ADF test different than the DF test?
b. Do a KPSS test using `kpss.test()`. What does the result tell you?
auto.arima() with trace=TRUE.
forecast::auto.arima(anchovy, trace = TRUE)
a. Fit each of the models listed using `Arima()` and show that you can produce the same AICc value that is shown in the trace table.
b. What models are within $\Delta$AIC of 2? What is different about these models?
datdf <- subset(landings, Species == "Anchovy")
-dat <- ts(datdf$log.metric.tons, start = 1964)
-dat64.87 <- window(dat, start = 1964, end = 1987)
datdf <- subset(landings, Species=="Anchovy")
+dat <- ts(datdf$log.metric.tons, start=1964)
+dat64.87 <- window(dat, start=1964, end=1987)
a. Plot the time series for the two time periods. For the `kpss.test()`, which null is appropriate, "Level" or "Trend"?
a. Do the conclusions regarding stationarity and the amount of differencing needed change depending on which time period you analyze? For both time periods, use `adf.test()` with default values and `kpss.test()` with null="Trend".
c. Fit each time period using `auto.arima()`. Do the selected models change? What do the coefficients mean? Coefficients means the mean and drifts terms and the AR and MA terms.
@@ -512,24 +538,24 @@ 5.13 Problems
The Chinook data are monthly and start in January 1990. To make this into a ts object do
-chinookts <- ts(chinook$log.metric.tons, start=c(1990,1),
- frequency=12)
start
is the year and month and frequency is the number of months in the year.
Use ?ts
to see more examples of how to set up ts objects.
auto.arima() for seasonal ts
auto.arima() will recognize that our data have a season and fit a seasonal ARIMA model to our data by default. Let’s define the training data up to 1998 and use 1999 as the test data.
traindat <- window(chinookts, c(1990,10), c(1998,12))
-testdat <- window(chinookts, c(1999,1), c(1999,12))
-fit <- forecast::auto.arima(traindat)
-fit
traindat <- window(chinookts, c(1990,10), c(1998,12))
+testdat <- window(chinookts, c(1999,1), c(1999,12))
+fit <- forecast::auto.arima(traindat)
+fit
Series: traindat
ARIMA(1,0,0)(0,1,0)[12] with drift
@@ -475,24 +501,24 @@ 5.11.2 auto.arima()
We will start by looking at white noise and a stationary AR(1) process from simulated data. White noise is simply a string of random numbers drawn from a Normal distribution. rnorm() will return random numbers drawn from a Normal distribution. Use ?rnorm to understand what the function requires.
TT <- 100
-y <- rnorm(TT, mean=0, sd=1) # 100 random numbers
-op <- par(mfrow=c(1,2))
-plot(y, type="l")
-acf(y)
TT <- 100
+y <- rnorm(TT, mean=0, sd=1) # 100 random numbers
+op <- par(mfrow=c(1,2))
+plot(y, type="l")
+acf(y)
par(op)
Here we use ggplot()
to plot 10 white noise time series.
dat <- data.frame(t=1:TT, y=y)
-p1 <- ggplot(dat, aes(x=t, y=y)) + geom_line() +
- ggtitle("1 white noise time series") + xlab("") + ylab("value")
-ys <- matrix(rnorm(TT*10),TT,10)
-ys <- data.frame(ys)
-ys$id = 1:TT
-
-ys2 <- melt(ys, id.var="id")
-p2 <- ggplot(ys2, aes(x=id,y=value,group=variable)) +
- geom_line() + xlab("") + ylab("value") +
- ggtitle("10 white noise processes")
-grid.arrange(p1, p2, ncol = 1)
dat <- data.frame(t=1:TT, y=y)
+p1 <- ggplot(dat, aes(x=t, y=y)) + geom_line() +
+ ggtitle("1 white noise time series") + xlab("") + ylab("value")
+ys <- matrix(rnorm(TT*10),TT,10)
+ys <- data.frame(ys)
+ys$id = 1:TT
+
+ys2 <- melt(ys, id.var="id")
+p2 <- ggplot(ys2, aes(x=id,y=value,group=variable)) +
+ geom_line() + xlab("") + ylab("value") +
+ ggtitle("10 white noise processes")
+grid.arrange(p1, p2, ncol = 1)
These are stationary because the variance and mean (level) do not change with time.
An AR(1) process is also stationary.
-theta <- 0.8
-nsim <- 10
-ar1 <- arima.sim(TT, model=list(ar=theta))
-plot(ar1)
We can use ggplot to plot 10 AR(1) time series, but we need to change the data to a data frame.
-dat <- data.frame(t=1:TT, y=ar1)
-p1 <- ggplot(dat, aes(x=t, y=y)) + geom_line() +
- ggtitle("AR-1") + xlab("") + ylab("value")
-ys <- matrix(0,TT,nsim)
-for(i in 1:nsim) ys[,i] <- as.vector(arima.sim(TT, model=list(ar=theta)))
-ys <- data.frame(ys)
-ys$id <- 1:TT
-
-ys2 <- melt(ys, id.var="id")
-p2 <- ggplot(ys2, aes(x=id,y=value,group=variable)) +
- geom_line() + xlab("") + ylab("value") +
- ggtitle("The variance of an AR-1 process is steady")
-grid.arrange(p1, p2, ncol = 1)
dat <- data.frame(t=1:TT, y=ar1)
+p1 <- ggplot(dat, aes(x=t, y=y)) + geom_line() +
+ ggtitle("AR-1") + xlab("") + ylab("value")
+ys <- matrix(0,TT,nsim)
+for(i in 1:nsim) ys[,i] <- as.vector(arima.sim(TT, model=list(ar=theta)))
+ys <- data.frame(ys)
+ys$id <- 1:TT
+
+ys2 <- melt(ys, id.var="id")
+p2 <- ggplot(ys2, aes(x=id,y=value,group=variable)) +
+ geom_line() + xlab("") + ylab("value") +
+ ggtitle("The variance of an AR-1 process is steady")
+grid.arrange(p1, p2, ncol = 1)
Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.
Fluctuating around a linear trend is a very common type of stationarity used in ARMA modeling and forecasting. This is just a stationary process, like white noise or AR(1), around a linear trend up or down.
-intercept <- .5
-trend <- 0.1
-sd <- 0.5
-TT <- 20
-wn <- rnorm(TT, sd=sd) #white noise
-wni <- wn+intercept #white noise with intercept
-wnti <- wn + trend*(1:TT) + intercept
intercept <- .5
+trend <- 0.1
+sd <- 0.5
+TT <- 20
+wn <- rnorm(TT, sd=sd) #white noise
+wni <- wn+intercept #white noise with intercept
+wnti <- wn + trend*(1:TT) + intercept
See how the white noise with trend is just the white noise overlaid on a linear trend.
-op <- par(mfrow=c(1,3))
-plot(wn, type="l")
-plot(trend*1:TT)
-plot(wnti, type="l")
par(op)
We can make a similar plot with ggplot.
-dat <- data.frame(t=1:TT, wn=wn, wni=wni, wnti=wnti)
-p1 <- ggplot(dat, aes(x=t, y=wn)) + geom_line() + ggtitle("White noise")
-p2 <- ggplot(dat, aes(x=t, y=wni)) + geom_line() + ggtitle("with non-zero mean")
-p3 <- ggplot(dat, aes(x=t, y=wnti)) + geom_line() + ggtitle("with linear trend")
-grid.arrange(p1, p2, p3, ncol = 3)
dat <- data.frame(t=1:TT, wn=wn, wni=wni, wnti=wnti)
+p1 <- ggplot(dat, aes(x=t, y=wn)) + geom_line() + ggtitle("White noise")
+p2 <- ggplot(dat, aes(x=t, y=wni)) + geom_line() + ggtitle("with non-zero mean")
+p3 <- ggplot(dat, aes(x=t, y=wnti)) + geom_line() + ggtitle("with linear trend")
+grid.arrange(p1, p2, p3, ncol = 3)
We can make a similar plot with AR(1) data. Ignore the warnings about not knowing how to pick the scale.
-beta1 <- 0.8
-ar1 <- arima.sim(TT, model=list(ar=beta1), sd=sd)
-ar1i <- ar1 + intercept
-ar1ti <- ar1 + trend*(1:TT) + intercept
-dat <- data.frame(t=1:TT, ar1=ar1, ar1i=ar1i, ar1ti=ar1ti)
-p4 <- ggplot(dat, aes(x=t, y=ar1)) + geom_line() + ggtitle("AR1")
-p5 <- ggplot(dat, aes(x=t, y=ar1i)) + geom_line() + ggtitle("with non-zero mean")
-p6 <- ggplot(dat, aes(x=t, y=ar1ti)) + geom_line() + ggtitle("with linear trend")
-
-grid.arrange(p4, p5, p6, ncol = 3)
beta1 <- 0.8
+ar1 <- arima.sim(TT, model=list(ar=beta1), sd=sd)
+ar1i <- ar1 + intercept
+ar1ti <- ar1 + trend*(1:TT) + intercept
+dat <- data.frame(t=1:TT, ar1=ar1, ar1i=ar1i, ar1ti=ar1ti)
+p4 <- ggplot(dat, aes(x=t, y=ar1)) + geom_line() + ggtitle("AR1")
+p5 <- ggplot(dat, aes(x=t, y=ar1i)) + geom_line() + ggtitle("with non-zero mean")
+p6 <- ggplot(dat, aes(x=t, y=ar1ti)) + geom_line() + ggtitle("with linear trend")
+
+grid.arrange(p4, p5, p6, ncol = 3)
Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.
Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.
Don't know how to automatically pick scale for object of type ts. Defaulting to continuous.
@@ -527,10 +553,10 @@ We will look at the anchovy data. Notice the two ==
in the subset call not one =
. We will use the Greek data before 1989 for the lab.
anchovy <- subset(landings, Species=="Anchovy" & Year <= 1989)$log.metric.tons
-anchovyts <- ts(anchovy, start=1964)
anchovy <- subset(landings, Species=="Anchovy" & Year <= 1989)$log.metric.tons
+anchovyts <- ts(anchovy, start=1964)
Plot the data.
-plot(anchovyts, ylab="log catch")
Questions to ask.
Using these constraints, the observation equation for the DFA model above becomes
-\[\begin{equation}
+\[\begin{equation}
\begin{bmatrix}
y_{1} \\
y_{2} \\
@@ -473,9 +499,9 @@ 10.3 Constraining a DFA model
+\end{equation}\]
and the process equation becomes
-\[\begin{equation}
+\[\begin{equation}
\begin{bmatrix}
x_{1} \\
x_{2} \\
@@ -494,9 +520,9 @@ 10.3 Constraining a DFA model
+\end{equation}\]
The distribution of the observation errors would stay the same, such that
-\[\begin{equation}
+\[\begin{equation}
\begin{bmatrix}
v_{1} \\
v_{2} \\
@@ -519,9 +545,9 @@ 10.3 Constraining a DFA model
+\end{equation}\]
but the distribution of the process errors would become
-\[\begin{equation}
+\[\begin{equation}
\begin{bmatrix}
w_{1} \\
w_{2} \\
@@ -538,7 +564,7 @@ 10.3 Constraining a DFA model
+\end{equation}\]
It is standard to add covariates to the analysis so that one removes known important drivers. The DFA with covariates is written:
-\[\begin{equation}
+\[\begin{equation}
\begin{gathered}
\mathbf{y}_t = \mathbf{Z}\mathbf{x}_t+\mathbf{a}+\mathbf{D}\mathbf{d}_t+\mathbf{v}_t \text{ where } \mathbf{v}_t \sim \text{MVN}(0,\mathbf{R}) \\
\mathbf{x}_t = \mathbf{x}_{t-1}+\mathbf{w}_t \text{ where } \mathbf{w}_t \sim \text{MVN}(0,\mathbf{Q})
\end{gathered}
\tag{10.12}
-\end{equation}\]
where the \(q \times 1\) vector \(\mathbf{d}_t\) contains the covariate(s) at time \(t\), and the \(n \times q\) matrix \(\mathbf{D}\) contains the effect(s) of the covariate(s) on the observations. Using form = "dfa"
and covariates=<covariate name(s)>
, we can easily add covariates to our DFA, but this means that the covariates are input, not data, and there can be no missing values (see Chapter 6 in the MARSS User Guide for how to include covariates with missing values).
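A minimal sketch of such a fit (assuming the MARSS package, a z-scored data matrix dat, and a covariate matrix covar with no missing values; the names are illustrative, not from this chapter):
library(MARSS)
mod.list <- list(m = 2, R = "diagonal and equal")
fit.cov <- MARSS(dat, model = mod.list, form = "dfa", z.score = TRUE,
                 covariates = covar)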
Here are plots of the three hidden processes (left column) and the loadings for each of the phytoplankton groups (right column).
ylbl <- phytoplankton
w_ts <- seq(dim(dat)[2])
layout(matrix(c(1,2,3,4,5,6),mm,2),widths=c(2,1))
## par(mfcol=c(mm,2), mai=c(0.5,0.5,0.5,0.1), omi=c(0,0,0,0))
par(mai=c(0.5,0.5,0.5,0.1), omi=c(0,0,0,0))
## plot the processes
for(i in 1:mm) {
  ylm <- c(-1,1)*max(abs(proc_rot[i,]))
  ## set up plot area
  plot(w_ts,proc_rot[i,], type="n", bty="L",
       ylim=ylm, xlab="", ylab="", xaxt="n")
  ## draw zero-line
  abline(h=0, col="gray")
  ## plot trend line
  lines(w_ts,proc_rot[i,], lwd=2)
  ## add panel labels
  mtext(paste("State",i), side=3, line=0.5)
  axis(1,12*(0:dim(dat_1980)[2])+1,yr_frst+0:dim(dat_1980)[2])
}
## plot the loadings
minZ <- 0
ylm <- c(-1,1)*max(abs(Z_rot))
for(i in 1:mm) {
  plot(c(1:N_ts)[abs(Z_rot[,i])>minZ], as.vector(Z_rot[abs(Z_rot[,i])>minZ,i]), type="h",
       lwd=2, xlab="", ylab="", xaxt="n", ylim=ylm, xlim=c(0.5,N_ts+0.5), col=clr)
  for(j in 1:N_ts) {
    if(Z_rot[j,i] > minZ) {text(j, -0.03, ylbl[j], srt=90, adj=1, cex=1.2, col=clr[j])}
    if(Z_rot[j,i] < -minZ) {text(j, 0.03, ylbl[j], srt=90, adj=0, cex=1.2, col=clr[j])}
    abline(h=0, lwd=1.5, col="gray")
  }
  mtext(paste("Factor loadings on state",i),side=3,line=0.5)
}
It looks like there are strong seasonal cycles in the data, but there is some indication of a phase difference between some of the groups. We can use ccf()
to investigate further.
par(mai=c(0.9,0.9,0.1,0.1))
ccf(proc_rot[1,],proc_rot[2,], lag.max = 12, main="")
The general idea is that the observations \(\mathbf{y}\) are modeled as a linear combination of hidden processes \(\mathbf{x}\) and factor loadings \(\mathbf{Z}\) plus some offsets \(\mathbf{a}\). Imagine a case where we had a data set with five observed time series (\(n=5\)) and we want to fit a model with three hidden processes (\(m=3\)). If we write out our DFA model in MARSS matrix form, the observation equation would look like
\[\begin{equation}
\begin{bmatrix}
 y_{1} \\ y_{2} \\ y_{3} \\ y_{4} \\ y_{5} \end{bmatrix}_t =
\begin{bmatrix}
 z_{11}&z_{12}&z_{13} \\
 z_{21}&z_{22}&z_{23} \\
 z_{31}&z_{32}&z_{33} \\
 z_{41}&z_{42}&z_{43} \\
 z_{51}&z_{52}&z_{53}\end{bmatrix}
\begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \end{bmatrix}_t +
\begin{bmatrix} a_{1} \\ a_{2} \\ a_{3} \\ a_{4} \\ a_{5} \end{bmatrix} +
\begin{bmatrix} v_{1} \\ v_{2} \\ v_{3} \\ v_{4} \\ v_{5} \end{bmatrix}_t.
\tag{10.2}
\end{equation}\]
and the process model would look like
\[\begin{equation}
\begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \end{bmatrix}_t =
\begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \end{bmatrix}_{t-1} +
\begin{bmatrix} w_{1} \\ w_{2} \\ w_{3} \end{bmatrix}_t
\tag{10.3}
\end{equation}\]
The observation errors would be
\[\begin{equation}
\begin{bmatrix} v_{1} \\ v_{2} \\ v_{3} \\ v_{4} \\ v_{5} \end{bmatrix}_t \sim \text{MVN}
\begin{pmatrix}
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix},
\begin{bmatrix}
 r_{11}&r_{12}&r_{13}&r_{14}&r_{15} \\
 r_{12}&r_{22}&r_{23}&r_{24}&r_{25} \\
 r_{13}&r_{23}&r_{33}&r_{34}&r_{35} \\
 r_{14}&r_{24}&r_{34}&r_{44}&r_{45} \\
 r_{15}&r_{25}&r_{35}&r_{45}&r_{55}\end{bmatrix}
\end{pmatrix}
\tag{10.4}
\end{equation}\]
And the process errors would be
\[\begin{equation}
\begin{bmatrix} w_{1} \\ w_{2} \\ w_{3} \end{bmatrix}_t \sim \text{MVN}
\begin{pmatrix}
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},
\begin{bmatrix}
 q_{11}&q_{12}&q_{13} \\
 q_{12}&q_{22}&q_{23} \\
 q_{13}&q_{23}&q_{33}\end{bmatrix}
\end{pmatrix}.
\tag{10.5}
\end{equation}\]
Here we will fit the DFA model above where we have N_ts observed time series and we want 3 hidden states. Now we need to set up the observation model for MARSS. Here are the vectors and matrices for our first model where each nutrient follows its own process. Recall that we will need to set the elements in the upper right corner of \(\mathbf{Z}\) to 0. We will assume that the observation errors have different variances and they are independent of one another.
## 'ZZ' is loadings matrix
Z_vals <- list("z11",    0,    0,
               "z21","z22",    0,
               "z31","z32","z33",
               "z41","z42","z43",
               "z51","z52","z53")
ZZ <- matrix(Z_vals, nrow=N_ts, ncol=3, byrow=TRUE)
ZZ
[,1] [,2] [,3]
[1,] "z11" 0 0
[2,] "z21" "z22" 0
[3,] "z31" "z32" "z33"
[4,] "z41" "z42" "z43"
[5,] "z51" "z52" "z53"
## 'aa' is the offset/scaling
aa <- "zero"
## 'DD' and 'd' are for covariates
DD <- "zero"  # matrix(0,mm,1)
dd <- "zero"  # matrix(0,1,wk_last)
## 'RR' is var-cov matrix for obs errors
RR <- "diagonal and unequal"
We need to specify the explicit form for all of the vectors and matrices in the full form of the MARSS model we defined in Sec 3.1. Note that we do not have to specify anything for the states \((\mathbf{x})\) – those are elements that MARSS
will identify and estimate itself based on our definitions of the other vectors and matrices.
## number of processes
mm <- 3
## 'BB' is identity: 1's along the diagonal & 0's elsewhere
BB <- "identity"  # diag(mm)
## 'uu' is a column vector of 0's
uu <- "zero"  # matrix(0,mm,1)
## 'CC' and 'cc' are for covariates
CC <- "zero"  # matrix(0,mm,1)
cc <- "zero"  # matrix(0,1,wk_last)
## 'QQ' is identity
QQ <- "identity"  # diag(mm)
We also need to create the list with the model specification, the list of initial values for the states (MARSS will pick its own otherwise), and the list of control parameters that we will pass to the MARSS() function.
## list with specifications for model vectors/matrices
mod_list <- list(Z=ZZ, A=aa, D=DD, d=dd, R=RR,
                 B=BB, U=uu, C=CC, c=cc, Q=QQ)
## list with model inits
init_list <- list(x0 = matrix(rep(0, mm), mm, 1))
## list with model control parameters
con_list <- list(maxit = 3000, allow.degen = TRUE)
Now we can fit the model.
## fit MARSS
dfa_1 <- MARSS(y = dat, model = mod_list, inits = init_list,
               control = con_list)
Success! abstol and log-log tests passed at 246 iterations.
Alert: conv.test.slope.tol is 0.5.
Test with smaller values (<0.1) to ensure convergence.
DFA is conceptually different than what we have been doing in the previous applications. Here we are trying to explain temporal variation in a set of \(n\) observed time series using linear combinations of a set of \(m\) hidden random walks, where \(m << n\). A DFA model is a type of MARSS model with the following structure:
\[\begin{equation}
\begin{gathered}
\mathbf{y}_t = \mathbf{Z}\mathbf{x}_t+\mathbf{a}+\mathbf{v}_t \text{ where } \mathbf{v}_t \sim \text{MVN}(0,\mathbf{R}) \\
\mathbf{x}_t = \mathbf{x}_{t-1}+\mathbf{w}_t \text{ where } \mathbf{w}_t \sim \text{MVN}(0,\mathbf{Q})
\end{gathered}
\tag{10.1}
\end{equation}\]
This equation should look rather familiar as it is exactly the same form we used for estimating varying numbers of processes from a set of observations in Lesson II. The difference with DFA is that rather than fixing the elements within \(\mathbf{Z}\) at 1 or 0 to indicate whether an observation does or does not correspond to a trend, we will instead estimate them as “loadings” on each of the states/processes.
For this exercise, we will use the Lake Washington phytoplankton data contained in the MARSS package. Let’s begin by reading in the monthly values for all of the data, including metabolism, chemistry, and climate.
## load the data (there are 3 datasets contained here)
data(lakeWAplankton, package="MARSS")
## we want lakeWAplanktonTrans, which has been transformed
## so the 0s are replaced with NAs and the data z-scored
all_dat <- lakeWAplanktonTrans
## use only the 10 years from 1980-1989
yr_frst <- 1980
yr_last <- 1989
plank_dat <- all_dat[all_dat[,"Year"]>=yr_frst & all_dat[,"Year"]<=yr_last,]
## create vector of phytoplankton group names
phytoplankton <- c("Cryptomonas", "Diatoms", "Greens",
                   "Unicells", "Other.algae")
## get only the phytoplankton
dat_1980 <- plank_dat[,phytoplankton]
Next, we transpose the data matrix and calculate the number of time series and their length.
## transpose data so time goes across columns
dat_1980 <- t(dat_1980)
## get number of time series
N_ts <- dim(dat_1980)[1]
## get length of time series
TT <- dim(dat_1980)[2]
It will be easier to estimate the real parameters of interest if we de-mean the data, so let’s do that.
y_bar <- apply(dat_1980, 1, mean, na.rm=TRUE)
dat <- dat_1980 - y_bar
rownames(dat) <- rownames(dat_1980)
Here are time series plots of all five phytoplankton functional groups.
spp <- rownames(dat_1980)
clr <- c("brown","blue","darkgreen","darkred","purple")
cnt <- 1
par(mfrow=c(N_ts,1), mai=c(0.5,0.7,0.1,0.1), omi=c(0,0,0,0))
for(i in spp){
  plot(dat[i,], xlab="", ylab="Abundance index", bty="L", xaxt="n", pch=16, col=clr[cnt], type="b")
  axis(1,12*(0:dim(dat_1980)[2])+1,yr_frst+0:dim(dat_1980)[2])
  title(i)
  cnt <- cnt + 1
}
The Lake Washington dataset has two environmental covariates that we might expect to have effects on phytoplankton growth, and hence, abundance: temperature (Temp) and total phosphorous (TP). We need the covariate inputs to have the same number of time steps as the variate data, and thus we limit the covariate data to the years 1980-1989 also.
temp <- t(plank_dat[, "Temp", drop = FALSE])
TP <- t(plank_dat[, "TP", drop = FALSE])
We will now fit three different models that each add covariate effects (i.e., Temp, TP, Temp and TP) to our existing model above where \(m\) = 3 and \(\mathbf{R}\) is "diagonal and unequal".
mod_list <- list(m = 3, R = "diagonal and unequal")
dfa_temp <- MARSS(dat, model = mod_list, form = "dfa", z.score = FALSE,
                  control = con_list, covariates = temp)
dfa_TP <- MARSS(dat, model = mod_list, form = "dfa", z.score = FALSE,
                control = con_list, covariates = TP)
dfa_both <- MARSS(dat, model = mod_list, form = "dfa", z.score = FALSE,
                  control = con_list, covariates = rbind(temp, TP))
Next we can compare whether the addition of the covariates improves the model fit.
print(cbind(model = c("no covars", "Temp", "TP", "Temp & TP"),
            AICc = round(c(dfa_1$AICc, dfa_temp$AICc, dfa_TP$AICc, dfa_both$AICc))),
      quote = FALSE)
model AICc
[1,] no covars 1427
[2,] Temp 1356
[4,] Temp & TP 1362
This suggests that adding temperature or phosphorus to the model, either alone or in combination with one another, does seem to improve overall model fit. If we were truly interested in assessing the “best” model structure that includes covariates, however, we should examine all combinations of 1-4 trends and different structures for \(\mathbf{R}\).
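A rough sketch of what that kind of search could look like (assuming the dat and con_list objects defined above, and noting that fitting all of these models can take a while):
## sketch only: compare DFA models with 1-4 trends and two R structures
mod_tbl <- expand.grid(m = 1:4,
                       R = c("diagonal and unequal", "unconstrained"),
                       stringsAsFactors = FALSE)
mod_tbl$AICc <- NA
for(i in 1:nrow(mod_tbl)) {
  fit <- MARSS(dat, model = list(m = mod_tbl$m[i], R = mod_tbl$R[i]),
               form = "dfa", z.score = FALSE, control = con_list, silent = TRUE)
  mod_tbl$AICc[i] <- fit$AICc
}
## models sorted from most to least supported
mod_tbl[order(mod_tbl$AICc), ]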
Now let’s try to fit a model with a dummy variable for season, and see how that does.
cos_t <- cos(2 * pi * seq(TT) / 12)
sin_t <- sin(2 * pi * seq(TT) / 12)
dd <- rbind(cos_t, sin_t)
dfa_seas <- MARSS(dat_1980, model = mod_list, form = "dfa", z.score = TRUE,
                  control = con_list, covariates = dd)
Success! abstol and log-log tests passed at 384 iterations.
Alert: conv.test.slope.tol is 0.5.
Test with smaller values (<0.1) to ensure convergence.
Standard errors have not been calculated.
Use MARSSparamCIs to compute CIs and bias estimates.
dfa_seas$AICc
[1] 1484.355
The model with a dummy seasonal factor does much better than the covariate models, but still not as well as the model with only 3 trends. The model fits for the seasonal effects model are shown below.
## get model fits & CI's
mod_fit <- get_DFA_fits(dfa_seas, dd = dd)
## plot the fits
ylbl <- phytoplankton
par(mfrow=c(N_ts,1), mai=c(0.5,0.7,0.1,0.1), omi=c(0,0,0,0))
for(i in 1:N_ts) {
  up <- mod_fit$up[i,]
  mn <- mod_fit$ex[i,]
  lo <- mod_fit$lo[i,]
  plot(w_ts, mn, xlab="", ylab=ylbl[i], xaxt="n", type="n", cex.lab=1.2,
       ylim=c(min(lo),max(up)))
  axis(1,12*(0:dim(dat_1980)[2])+1,yr_frst+0:dim(dat_1980)[2])
  points(w_ts, dat[i,], pch=16, col=clr[i])
  lines(w_ts, up, col="darkgray")
  lines(w_ts, mn, col="black", lwd=2)
  lines(w_ts, lo, col="darkgray")
}
We can plot the fits for our DFA model along with the data. The following function will return the fitted values ± (1-\(\alpha\))% confidence intervals.
get_DFA_fits <- function(MLEobj, dd = NULL, alpha = 0.05) {
  ## empty list for results
  fits <- list()
  ## extra stuff for var() calcs
  Ey <- MARSS:::MARSShatyt(MLEobj)
  ## model params
  ZZ <- coef(MLEobj, type = "matrix")$Z
  ## number of obs ts
  nn <- dim(Ey$ytT)[1]
  ## number of time steps
  TT <- dim(Ey$ytT)[2]
  ## get the inverse of the rotation matrix
  H_inv <- varimax(ZZ)$rotmat
  ## check for covars
  if(!is.null(dd)) {
    DD <- coef(MLEobj, type = "matrix")$D
    ## model expectation
    fits$ex <- ZZ %*% H_inv %*% MLEobj$states + DD %*% dd
  } else {
    ## model expectation
    fits$ex <- ZZ %*% H_inv %*% MLEobj$states
  }
  ## Var in model fits
  VtT <- MARSSkfss(MLEobj)$VtT
  VV <- NULL
  for(tt in 1:TT) {
    RZVZ <- coef(MLEobj, type = "matrix")$R - ZZ %*% VtT[,,tt] %*% t(ZZ)
    SS <- Ey$yxtT[,,tt] - Ey$ytT[,tt,drop = FALSE] %*% t(MLEobj$states[,tt,drop = FALSE])
    VV <- cbind(VV, diag(RZVZ + SS %*% t(ZZ) + ZZ %*% t(SS)))
  }
  SE <- sqrt(VV)
  ## upper & lower (1-alpha)% CI
  fits$up <- qnorm(1 - alpha/2) * SE + fits$ex
  fits$lo <- qnorm(alpha/2) * SE + fits$ex
  return(fits)
}
Here are time series of the five phytoplankton groups (points) with the mean of the DFA fits (black line) and the 95% confidence intervals (gray lines).
## get model fits & CI's
mod_fit <- get_DFA_fits(dfa_1)
## plot the fits
ylbl <- phytoplankton
par(mfrow=c(N_ts,1), mai=c(0.5,0.7,0.1,0.1), omi=c(0,0,0,0))
for(i in 1:N_ts) {
  up <- mod_fit$up[i,]
  mn <- mod_fit$ex[i,]
  lo <- mod_fit$lo[i,]
  plot(w_ts, mn, xlab="", ylab=ylbl[i], xaxt="n", type="n", cex.lab=1.2,
       ylim=c(min(lo),max(up)))
  axis(1,12*(0:dim(dat_1980)[2])+1,yr_frst+0:dim(dat_1980)[2])
  points(w_ts, dat[i,], pch=16, col=clr[i])
  lines(w_ts, up, col="darkgray")
  lines(w_ts, mn, col="black", lwd=2)
  lines(w_ts, lo, col="darkgray")
}
For your homework this week, we will continue to investigate common trends in the Lake Washington plankton data.
Fit other DFA models to the phytoplankton data with varying numbers of trends from 1-4 (we fit a 3-trend model above). Do not include any covariates in these models. Using R="diagonal and unequal" for the observation errors, which of the DFA models has the most support from the data?
Plot the model states and loadings as in Section 10.9. Describe the general patterns in the states and the ways the different taxa load onto those trends.
Also plot the model fits as in Section 10.10. Do they look reasonable? Are there any particular problems or outliers?
How does the best model from Question 1 compare to a DFA model with the same number of trends, but with R="unconstrained"?
Plot the model states and loadings as in Section 10.9. Describe the general patterns in the states and the ways the different taxa load onto those trends.
Also plot the model fits as in Section 10.10. Do they look reasonable? Are there any particular problems or outliers?
Fit a DFA model that includes temperature as a covariate and 3 trends (as in Section 10.12), but with R="unconstrained". How does this model compare to the model with R="diagonal and unequal"? How does it compare to the model in Question 2?
Plot the model states and loadings as in Section 10.9. Describe the general patterns in the states and the ways the different taxa load onto those trends.
Also plot the model fits as in Section 10.10. Do they look reasonable? Are there any particular problems or outliers?
Before proceeding further, we need to address the constraints we placed on the DFA model in Sec 2.2. In particular, we arbitrarily constrained \(\mathbf{Z}\) in such a way to choose only one of these solutions, but fortunately the different solutions are equivalent, and they can be related to each other by a rotation matrix \(\mathbf{H}\). Let \(\mathbf{H}\) be any \(m \times m\) non-singular matrix. The following are then equivalent DFA models:
\[\begin{equation}
\begin{gathered}
\mathbf{y}_t = \mathbf{Z}\mathbf{x}_t+\mathbf{a}+\mathbf{v}_t \\
\mathbf{x}_t = \mathbf{x}_{t-1}+\mathbf{w}_t
\end{gathered}
\tag{10.10}
\end{equation}\]
and
\[\begin{equation}
\begin{gathered}
\mathbf{y}_t = \mathbf{Z}\mathbf{H}^{-1}\mathbf{x}_t+\mathbf{a}+\mathbf{v}_t \\
\mathbf{H}\mathbf{x}_t = \mathbf{H}\mathbf{x}_{t-1}+\mathbf{H}\mathbf{w}_t.
\end{gathered}
\tag{10.11}
\end{equation}\]
There are many ways of doing factor rotations, but a common method is the “varimax” rotation, which seeks a rotation matrix \(\mathbf{H}\) that creates the largest difference between the loadings in \(\mathbf{Z}\). For example, imagine that row 3 in our estimated \(\mathbf{Z}\) matrix was (0.2, 0.2, 0.2). That would mean that green algae were a mixture of equal parts of processes 1, 2, and 3. If instead row 3 was (0.8, 0.1, 0.05), this would make our interpretation of the model fits easier because we could say that green algae followed the first process most closely. The varimax rotation would find the \(\mathbf{H}\) matrix that makes the rows in \(\mathbf{Z}\) more like (0.8, 0.1, 0.05) and less like (0.2, 0.2, 0.2).
The varimax rotation is easy to compute because R has a built in function for this: varimax()
. Interestingly, the function returns the inverse of \(\mathbf{H}\), which we need anyway.
## get the estimated ZZ
Z_est <- coef(dfa_1, type = "matrix")$Z
## get the inverse of the rotation matrix
H_inv <- varimax(Z_est)$rotmat
We can now rotate both \(\mathbf{Z}\) and \(\mathbf{x}\).
## rotate factor loadings
Z_rot <- Z_est %*% H_inv
## rotate processes
proc_rot <- solve(H_inv) %*% dfa_1$states
Here we will use the MARSS package to do Dynamic Factor Analysis (DFA), which allows us to look for a set of common underlying processes among a relatively large set of time series (Zuur et al. 2003). There have been a number of recent applications of DFA to ecological questions surrounding Pacific salmon (Stachura, Mantua, and Scheuerell 2014; Jorgensen et al. 2016; Ohlberger, Scheuerell, and Schindler 2016) and stream temperatures (Lisi et al. 2015). For a more in-depth treatment of potential applications of MARSS models for DFA, see Chapter 9 in the MARSS User’s Guide.
A script with all the R code in the chapter can be downloaded here. The Rmd for this chapter can be downloaded here.
All the data used in the chapter are in the MARSS package. Install the package, if needed, and load to run the code in the chapter.
library(MARSS)
In the literature on state-space models, the set of \(e_t\) are commonly referred to as “innovations”. MARSS()
calculates the innovations as part of the Kalman filter algorithm—they are stored as Innov
in the list produced by the MARSSkfss()
function.
# forecast errors
innov <- kf.out$Innov
Let’s see if our innovations meet the model assumptions. Beginning with (1), we can use a Q-Q plot to see whether the innovations are normally distributed with a mean of zero. We’ll use the qqnorm()
function to plot the quantiles of the innovations on the \(y\)-axis versus the theoretical quantiles from a Normal distribution on the \(x\)-axis. If the 2 distributions are similar, the points should fall on the line defined by \(y = x\).
# Q-Q plot of innovations
qqnorm(t(innov), main="", pch=16, col="blue")
# add y=x line for easier interpretation
qqline(t(innov))
The Q-Q plot (Figure 9.5) indicates that the innovations appear to be more-or-less normally distributed (i.e., most points fall on the line). Furthermore, it looks like the mean of the innovations is about 0, but we should use a more reliable test than simple visual inspection. We can formally test whether the mean of the innovations is significantly different from 0 by using a one-sample \(t\)-test based on a null hypothesis of \(\,\text{E}(e_t)=0\). To do so, we will use the function t.test() and base our inference on a significance value of \(\alpha = 0.05\).
# p-value for t-test of H0: E(innov) = 0
t.test(t(innov), mu=0)$p.value
[1] 0.4840901
The \(p\)-value \(>>\) 0.05 so we cannot reject the null hypothesis that \(\,\text{E}(e_t)=0\).
Moving on to assumption (2), we can use the sample autocorrelation function (ACF) to examine whether the innovations covary with a time-lagged version of themselves. Using the acf()
function, we can compute and plot the correlations of \(e_t\) and \(e_{t-k}\) for various values of \(k\). Assumption (2) will be met if none of the correlation coefficients exceed the 95% confidence intervals defined by \(\pm \, z_{0.975} / \sqrt{n}\).
# plot ACF of innovations
acf(t(innov), lag.max=10)
The ACF plot (Figure 9.6) shows no significant autocorrelation in the innovations at lags 1–10, so it looks like both of our model assumptions have indeed been met.
Now let’s go ahead and analyze the DLM specified in Equations (9.20)–(9.22). We begin by loading the data set (which is in the MARSS package). The data set has 3 columns for 1) the year the salmon smolts migrated to the ocean (year
), 2) logit-transformed survival (logit.s
), and 3) the coastal upwelling index for April (CUI.apr
). There are 42 years of data (1964–2005).
# load the data
data(SalmonSurvCUI, package="MARSS")
# get time indices
years <- SalmonSurvCUI[,1]
# number of years of data
TT <- length(years)
# get response variable: logit(survival)
dat <- matrix(SalmonSurvCUI[,2], nrow=1)
As we have seen in other case studies, standardizing our covariate(s) to have zero-mean and unit-variance can be helpful in model fitting and interpretation. In this case, it’s a good idea because the variance of CUI.apr
is orders of magnitude greater than logit.s
.
# get predictor variable
CUI <- SalmonSurvCUI[,3]
## z-score the CUI
CUI.z <- matrix((CUI - mean(CUI))/sqrt(var(CUI)), nrow=1)
# number of regr params (slope + intercept)
m <- dim(CUI.z)[1] + 1
Plots of logit-transformed survival and the \(z\)-scored April upwelling index are shown in Figure 9.1.
9.7 Fitting with MARSS()
Next, we need to set up the appropriate matrices and vectors for MARSS. Let’s begin with those for the process equation because they are straightforward.
# for process eqn
B <- diag(m)                      ## 2x2; Identity
U <- matrix(0, nrow=m, ncol=1)    ## 2x1; both elements = 0
Q <- matrix(list(0), m, m)        ## 2x2; all 0 for now
diag(Q) <- c("q.alpha","q.beta")  ## 2x2; diag = (q1,q2)
Defining the correct form for the observation model is a little more tricky, however, because of how we model the effect(s) of predictor variables. In a DLM, we need to use \(\mathbf{Z}_t\) (instead of \(\mathbf{d}_t\)) as the matrix of predictor variables that affect \(\mathbf{y}_t\), and we use \(\mathbf{x}_t\) (instead of \(\mathbf{D}_t\)) as the regression parameters. Therefore, we need to set \(\mathbf{Z}_t\) equal to an \(n\times m\times T\) array, where \(n\) is the number of response variables (= 1; \(y_t\) is univariate), \(m\) is the number of regression parameters (= intercept + slope = 2), and \(T\) is the length of the time series (= 42).
# for observation eqn
Z <- array(NA, c(1,m,TT))  ## NxMxT; empty for now
Z[1,1,] <- rep(1,TT)       ## Nx1; 1's for intercept
Z[1,2,] <- CUI.z           ## Nx1; predictor variable
A <- matrix(0)             ## 1x1; scalar = 0
R <- matrix("r")           ## 1x1; scalar = r
Lastly, we need to define our lists of initial starting values and model matrices/vectors.
# only need starting values for regr parameters
inits.list <- list(x0=matrix(c(0, 0), nrow=m))
# list of model matrices & vectors
mod.list <- list(B=B, U=U, Q=Q, Z=Z, A=A, R=R)
And now we can fit our DLM with MARSS.
# fit univariate DLM
dlm1 <- MARSS(dat, inits=inits.list, model=mod.list)
Success! abstol and log-log tests passed at 115 iterations.
Alert: conv.test.slope.tol is 0.5.
Test with smaller values (<0.1) to ensure convergence.
Notice that the MARSS output does not list any estimates of the regression parameters themselves. Why not? Remember that in a DLM the matrix of states \((\mathbf{x})\) contains the estimates of the regression parameters \((\boldsymbol{\theta})\). Therefore, we need to look in dlm1$states
for the MLEs of the regression parameters, and in dlm1$states.se
for their standard errors.
Time series of the estimated intercept and slope are shown in Figure 9.2. It appears as though the intercept is much more dynamic than the slope, as indicated by a much larger estimate of process variance for the former (Q.q1). In fact, although the effect of April upwelling appears to be increasing over time, it doesn’t really become important as a predictor variable until about 1990 when the approximate 95% confidence interval for the slope no longer overlaps zero.
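A minimal sketch of such a plot, assuming the dlm1, years, and m objects defined above:
## sketch: estimated regression parameters with approximate 95% CIs
par(mfrow = c(m, 1), mai = c(0.8, 0.8, 0.2, 0.2))
par_names <- c("intercept (alpha)", "slope (beta)")
for(i in 1:m) {
  mn <- dlm1$states[i, ]     ## MLE of the regression parameter
  se <- dlm1$states.se[i, ]  ## its standard error
  plot(years, mn, type = "n", xlab = "Year", ylab = par_names[i],
       ylim = c(min(mn - 2*se), max(mn + 2*se)))
  lines(years, mn, lwd = 2)
  lines(years, mn + 2*se, lty = 2)
  lines(years, mn - 2*se, lty = 2)
}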
9.8 Forecasting
9.8.1 Estimate of the regression parameters
For step 1, we want to compute the distribution of the regression parameters at time \(t\) conditioned on the data up to time \(t-1\), also known as the one-step ahead forecasts of the regression parameters. Let’s denote \(\boldsymbol{\theta}_{t-1}\) conditioned on \(y_{1:t-1}\) as \(\boldsymbol{\theta}_{t-1|t-1}\) and denote \(\boldsymbol{\theta}_{t}\) conditioned on \(y_{1:t-1}\) as \(\boldsymbol{\theta}_{t|t-1}\). We will start by defining the distribution of \(\boldsymbol{\theta}_{t|t}\) as follows
\[\begin{equation}
\tag{9.24}
\boldsymbol{\theta}_{t|t} \sim \text{MVN}(\boldsymbol{\pi}_t,\boldsymbol{\Lambda}_t) \end{equation}\]
where \(\boldsymbol{\pi}_t = \text{E}(\boldsymbol{\theta}_{t|t})\) and \(\mathbf{\Lambda}_t = \text{Var}(\boldsymbol{\theta}_{t|t})\).
Now we can compute the distribution of \(\boldsymbol{\theta}_{t}\) conditioned on \(y_{1:t-1}\) using the process equation for \(\boldsymbol{\theta}\):
\[\begin{equation}
\boldsymbol{\theta}_{t} = \mathbf{G}_t \boldsymbol{\theta}_{t-1} + \mathbf{w}_t ~ \text{with} ~ \mathbf{w}_t \sim \text{MVN}(\mathbf{0}, \mathbf{Q}) \\
\end{equation}\]
\[\begin{equation}
\tag{9.27}
\boldsymbol{\theta}_{t|t-1} \sim \text{MVN}(\mathbf{G}_t \boldsymbol{\pi}_{t-1}, \mathbf{G}_t \mathbf{\Lambda}_{t-1} \mathbf{G}_t^{\top} + \mathbf{Q})
\end{equation}\]
9.8.2 Prediction of the response variable \(y_t\)
For step 2, we make the prediction of \(y_{t}\) given the predictor variables at time \(t\) and the estimate of the regression parameters at time \(t\). This is called the one-step ahead prediction for the observation at time \(t\). We will denote the prediction of \(y\) as \(\hat{y}\) and we want to compute its distribution (mean and variance). We do this using the equation for \(y_t\) but substituting the expected value of \(\boldsymbol{\theta}_{t|t-1}\) for \(\boldsymbol{\theta}_t\).
\[\begin{equation}
\tag{9.28}
\hat{y}_{t|t-1} = \mathbf{F}^{\top}_{t} \text{E}(\boldsymbol{\theta}_{t|t-1}) + e_{t} ~ \text{with} ~ e_{t} \sim \text{N}(0, r) \\
\end{equation}\]
\[\begin{align}
\tag{9.30}
\text{Var}(\hat{y}_{t|t-1}) &= \mathbf{F}^{\top}_{t} \text{Var}(\boldsymbol{\theta}_{t|t-1}) \mathbf{F}_{t} + r \\
&= \mathbf{F}^{\top}_{t} (\mathbf{G}_t \mathbf{\Lambda}_{t-1} \mathbf{G}_t^{\top} + \mathbf{Q}) \mathbf{F}_{t} + r
\end{align}\]
9.8.3 Computing the prediction
9.8.4 Forecasting salmon survival
Scheuerell and Williams (2005) were interested in how well upwelling could be used to actually forecast expected survival of salmon, so let’s look at how well our model does in that context. To do so, we need the predictive distribution for the survival at time \(t\) given the upwelling at time \(t\) and the predicted regression parameters at \(t\).
In the salmon survival DLM, the \(\mathbf{G}_t\) matrix is the identity matrix, thus the mean and variance of the one-step ahead predictive distribution for the observation at time \(t\) reduces to (from Equations (9.29) and (9.30))
\[\begin{equation}
\tag{9.31}
\text{E}(\hat{y}_{t|t-1}) = \mathbf{F}^{\top}_{t} \text{E}(\boldsymbol{\theta}_{t|t-1}) \\
\text{Var}(\hat{y}_{t|t-1}) = \mathbf{F}^{\top}_{t} \text{Var}(\boldsymbol{\theta}_{t|t-1}) \mathbf{F}_{t} + \hat{r}
\end{equation}\]
where
\[
\mathbf{F}_{t}=\begin{bmatrix}1 \\ f_{t}\end{bmatrix}
\]
and \(f_{t}\) is the upwelling index at \(t+1\). \(\hat{r}\) is the estimated observation variance from our model fit.
9.8.5 Forecasting using MARSS
Working from Equation (9.31), we can compute the expected value of the forecast at time \(t\) and its variance using the Kalman filter. For the expectation, we need \(\mathbf{F}_{t}^\top\text{E}(\boldsymbol{\theta}_{t|t-1})\). \(\mathbf{F}_t^\top\) is called \(\mathbf{Z}_t\) in MARSS notation. The one-step ahead forecasts of the regression parameters at time \(t\), the \(\text{E}(\boldsymbol{\theta}_{t|t-1})\), are calculated as part of the Kalman filter algorithm—they are termed \(\tilde{x}_t^{t-1}\) in MARSS notation and stored as xtt1 in the list produced by the MARSSkfss() Kalman filter function.
Using the Z
defined in 9.6, we compute the mean forecast as follows:
# get list of Kalman filter output
kf.out <- MARSSkfss(dlm1)
## forecasts of regr parameters; 2xT matrix
eta <- kf.out$xtt1
## ts of E(forecasts)
fore.mean <- vector()
for(t in 1:TT) {
  fore.mean[t] <- Z[,,t] %*% eta[,t,drop=FALSE]
}
For the variance of the forecasts, we need \(\mathbf{F}^{\top}_{t} \text{Var}(\boldsymbol{\theta}_{t|t-1}) \mathbf{F}_{t} + \hat{r}\). As with the mean, \(\mathbf{F}^\top_t \equiv \mathbf{Z}_t\). The variances of the one-step ahead forecasts of the regression parameters at time \(t\), \(\text{Var}(\boldsymbol{\theta}_{t|t-1})\), are also calculated as part of the Kalman filter algorithm—they are stored as Vtt1 in the list produced by the MARSSkfss() function. Lastly, the observation variance \(\hat{r}\) was estimated when we fit the DLM to the data using MARSS() and can be extracted from the dlm1 fit.
Putting this together, we can compute the forecast variance:
# variance of regr parameters; 1x2xT array
Phi <- kf.out$Vtt1
## obs variance; 1x1 matrix
R.est <- coef(dlm1, type="matrix")$R
## ts of Var(forecasts)
fore.var <- vector()
for(t in 1:TT) {
  tZ <- matrix(Z[,,t],m,1)  ## transpose of Z
  fore.var[t] <- Z[,,t] %*% Phi[,,t] %*% tZ + R.est
}
Plots of the model mean forecasts with their estimated uncertainty are shown in Figure 9.3. Nearly all of the observed values fell within the approximate prediction interval. Notice that we have a forecasted value for the first year of the time series (1964), which may seem at odds with our notion of forecasting at time \(t\) based on data available only through time \(t-1\). In this case, however, MARSS is actually estimating the states at \(t=0\) (\(\boldsymbol{\theta}_0\)), which allows us to compute a forecast for the first time point.
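A minimal sketch of a forecast plot along the lines of Figure 9.3, assuming the fore.mean, fore.var, years, and dat objects defined above:
## sketch: mean forecasts with approximate prediction intervals
fore.se <- sqrt(fore.var)
ylims <- c(min(fore.mean - 2*fore.se), max(fore.mean + 2*fore.se))
plot(years, t(dat), type = "p", pch = 16, col = "blue",
     xlab = "Year of ocean entry", ylab = "logit(survival)", ylim = ylims)
lines(years, fore.mean, lwd = 2)              ## mean forecast
lines(years, fore.mean + 2*fore.se, lty = 2)  ## approx upper prediction limit
lines(years, fore.mean - 2*fore.se, lty = 2)  ## approx lower prediction limit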
Although our model forecasts look reasonable in logit-space, it is worthwhile to examine how well they look when the survival data and forecasts are back-transformed onto the interval [0,1] (Figure 9.4). In that case, the accuracy does not seem to be affected, but the precision appears much worse, especially during the early and late portions of the time series when survival is changing rapidly.
9.10 Homework discussion and data
For the homework this week we will use a DLM to examine some of the time-varying properties of the spawner-recruit relationship for Pacific salmon. Much work has been done on this topic, particularly by Randall Peterman and his students and post-docs at Simon Fraser University. To do so, researchers commonly use a Ricker model because of its relatively simple form, such that the number of recruits (offspring) born in year \(t\) (\(R_t\)) from the number of spawners (parents) (\(S_t\)) is
\[\begin{equation}
\tag{9.32}
R_t = a S_t e^{-b S_t + v_t}.
\end{equation}\]
The parameter \(a\) determines the maximum reproductive rate in the absence of any density-dependent effects (the slope of the curve at the origin), \(b\) is the strength of density dependence, and \(v_t \sim N(0,\sigma)\). In practice, the model is typically log-transformed so as to make it linear with respect to the predictor variable \(S_t\), such that
\[\begin{align}
\tag{9.33}
\text{log}(R_t) &= \text{log}(a) + \text{log}(S_t) -b S_t + v_t \\
\text{log}(R_t) - \text{log}(S_t) &= \text{log}(a) -b S_t + v_t \\
\text{log}(R_t/S_t) &= \text{log}(a) - b S_t + v_t.
\end{align}\]
Substituting \(y_t = \text{log}(R_t/S_t)\), \(x_t = S_t\), and \(\alpha = \text{log}(a)\) yields a simple linear regression model with intercept \(\alpha\) and slope \(b\).
Unfortunately, however, residuals from this simple model typically show high-autocorrelation due to common environmental conditions that affect overlapping generations. Therefore, to correct for this and allow for an index of stock productivity that controls for any density-dependent effects, the model may be re-written as
\[\begin{align}
\tag{9.34}
\text{log}(R_t/S_t) &= \alpha_t - b S_t + v_t, \\
\alpha_t &= \alpha_{t-1} + w_t,
\end{align}\]
and \(w_t \sim N(0,q)\). By treating the brood-year specific productivity as a random walk, we allow it to vary, but in an autocorrelated manner so that consecutive years are not independent from one another.
More recently, interest has grown in using covariates (\(e.g.\), sea-surface temperature) to explain the interannual variability in productivity. In that case, we can write the model as
\[\begin{equation}
\tag{9.35}
\text{log}(R_t/S_t) = \alpha + \delta_t X_t - b S_t + v_t.
\end{equation}\]
In this case we are estimating some base-level productivity (\(\alpha\)) plus the time-varying effect of some covariate \(X_t\) (\(\delta_t\)).
9.10.1 Spawner-recruit data
The data come from a large public database begun by Ransom Myers many years ago. If you are interested, you can find lots of time series of spawning-stock, recruitment, and harvest for a variety of fishes around the globe. Here is the website: https://www.ramlegacy.org/
For this exercise, we will use spawner-recruit data for sockeye salmon (Oncorhynchus nerka) from the Kvichak River in SW Alaska that span the years 1952-1989. In addition, we’ll examine the potential effects of the Pacific Decadal Oscillation (PDO) during the salmon’s first year in the ocean, which is widely believed to be a “bottleneck” to survival.
These data are in the atsalibrary package on GitHub. If needed, install using the devtools package.
library(devtools)
devtools::install_github("nwfsc-timeseries/atsalibrary")
Load the data.
data(KvichakSockeye, package = "atsalibrary")
SRdata <- KvichakSockeye
The data are a dataframe with columns for brood year (brood.yr
), number of spawners (Sp
), number of recruits (Rec
) and PDO at year \(t-2\) (PDO.t2
) and \(t-3\) (PDO.t3
).
# head of data file
head(SRdata)
brood.yr Sp Rec PDO.t2 PDO.t3
1 1952 5970 17310 -0.61 -0.61
2 1953 320 520 -1.48 -2.66
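As an illustrative sketch (not part of the original lab), the static, time-invariant form of Equation (9.33) could be fit by ordinary least squares using the columns above:
## sketch: static Ricker regression, log(R_t/S_t) = log(a) - b*S_t + v_t
yy <- log(SRdata$Rec / SRdata$Sp)  ## y_t = log(R_t / S_t)
xx <- SRdata$Sp                    ## x_t = S_t
fit_static <- lm(yy ~ xx)
coef(fit_static)                   ## (Intercept) = log(a); slope = -b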
9.1 Overview
We begin our description of DLMs with a static regression model, wherein the \(i^{th}\) observation (response variable) is a linear function of an intercept, predictor variable(s), and a random error term. For example, if we had one predictor variable (\(f\)), we could write the model as
\[\begin{equation}
\tag{9.1}
y_i = \alpha + \beta f_i + v_i,
\end{equation}\]
where the \(\alpha\) is the intercept, \(\beta\) is the regression slope, \(f_i\) is the predictor variable matched to the \(i^{th}\) observation (\(y_i\)), and \(v_i \sim \text{N}(0,r)\). It is important to note here that there is no implicit ordering of the index \(i\). That is, we could shuffle any/all of the \((y_i, f_i)\) pairs in our dataset with no effect on our ability to estimate the model parameters.
We can write Equation (9.1) using matrix notation, as
\[\begin{align}
\tag{9.2}
y_i &= \begin{bmatrix}1&f_i\end{bmatrix}
\begin{bmatrix}\alpha\\ \beta\end{bmatrix} + v_i \nonumber\\
&= \mathbf{F}_i^{\top}\boldsymbol{\theta} + v_i,
\end{align}\]
where \(\mathbf{F}_i^{\top} = \begin{bmatrix}1&f_i\end{bmatrix}\) and \(\boldsymbol{\theta} = \begin{bmatrix}\alpha\\ \beta\end{bmatrix}\).
In a DLM, however, the regression parameters are dynamic in that they “evolve” over time. For a single observation at time \(t\), we can write
\[\begin{equation}
\tag{9.3}
y_t = \mathbf{F}_{t}^{\top}\boldsymbol{\theta}_t + v_t,
\end{equation}\]
where \(\mathbf{F}_t\) is a column vector of predictor variables (covariates) at time \(t\), \(\boldsymbol{\theta}_t\) is a column vector of regression parameters at time \(t\) and \(v_{t}\sim\,\text{N}(0,r)\). This formulation presents two features that distinguish it from Equation (9.2). First, the observed data are explicitly time ordered (i.e., \(\mathbf{y}=\lbrace{y_1,y_2,y_3,\dots,y_T}\rbrace\)), which means we expect them to contain implicit information. Second, the relationship between the observed datum and the predictor variables are unique at every time \(t\) (i.e., \(\boldsymbol{\theta}=\lbrace{\boldsymbol{\theta}_1,\boldsymbol{\theta}_2,\boldsymbol{\theta}_3,\dots,\boldsymbol{\theta}_T}\rbrace\)).
However, closer examination of Equation (9.3) reveals an apparent complication for parameter estimation. With only one datum at each time step \(t\), we could, at best, estimate only one regression parameter, and even then, the 1:1 correspondence between data and parameters would preclude any estimation of parameter uncertainty. To address this shortcoming, we return to the time ordering of model parameters. Rather than assume the regression parameters are independent from one time step to another, we instead model them as an autoregressive process where
\[\begin{equation}
\tag{9.4}
\boldsymbol{\theta}_t = \mathbf{G}_t\boldsymbol{\theta}_{t-1} + \mathbf{w}_t,
\end{equation}\]
\(\mathbf{G}_t\) is the parameter “evolution” matrix, and \(\mathbf{w}_t\) is a vector of process errors, such that \(\mathbf{w}_t \sim \,\text{MVN}(\mathbf{0},\mathbf{Q})\). The elements of \(\mathbf{G}_t\) may be known and fixed a priori, or unknown and estimated from the data. Although we could allow \(\mathbf{G}_t\) to be time-varying, we will typically assume that it is time invariant or assume \(\mathbf{G}_t\) is an \(m \times m\) identity matrix \(\mathbf{I}_m\).
The idea is that the evolution matrix \(\mathbf{G}_t\) deterministically maps the parameter space from one time step to the next, so the parameters at time \(t\) are temporally related to those before and after. However, the process is corrupted by stochastic error, which amounts to a degradation of information over time. If the diagonal elements of \(\mathbf{Q}\) are relatively large, then the parameters can vary widely from \(t\) to \(t+1\). If \(\mathbf{Q} = \mathbf{0}\), then \(\boldsymbol{\theta}_1=\boldsymbol{\theta}_2=\boldsymbol{\theta}_T\) and we are back to the static model in Equation (9.1).
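To make the idea concrete, here is a small illustrative simulation of Equations (9.3) and (9.4) with \(\mathbf{G}_t = \mathbf{I}\); all object names below are made up for this sketch.
## sketch: simulate a DLM with a time-varying intercept and slope
set.seed(123)
nT <- 50
ff <- rnorm(nT)             ## one predictor variable
qq <- c(0.02, 0.01)         ## process variances for (alpha, beta)
rr <- 0.1                   ## observation variance
theta <- matrix(NA, 2, nT)  ## time-varying (alpha_t, beta_t)
theta[, 1] <- c(1, 0.5)
for(t in 2:nT) {
  theta[, t] <- theta[, t-1] + rnorm(2, 0, sqrt(qq))
}
yy <- rep(NA, nT)
for(t in 1:nT) {
  yy[t] <- c(1, ff[t]) %*% theta[, t] + rnorm(1, 0, sqrt(rr))
}
## with qq = c(0, 0) the thetas stay constant and this collapses to Equation (9.1)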
9.11 Problems
Begin by fitting a reduced form of Equation (9.34) that includes only a time-varying level (\(\alpha_t\)) and observation error (\(v_t\)). That is,
\[\begin{align*}
\text{log}(R_t) &= \alpha_t + \text{log}(S_t) + v_t \\
\text{log}(R_t/S_t) &= \alpha_t + v_t
\end{align*}\]
This model assumes no density-dependent survival in that the number of recruits is an ascending function of spawners. Plot the ts of \(\alpha_t\) and note the AICc for this model. Also plot appropriate model diagnostics.
Fit the full model specified by Equation (9.34). For this model, obtain the time series of \(\alpha_t\), which is an estimate of the stock productivity in the absence of density-dependent effects. How do these estimates of productivity compare to those from the previous question? Plot the ts of \(\alpha_t\) and note the AICc for this model. Also plot appropriate model diagnostics. (\(Hint\): If you don’t want a parameter to vary with time, what does that say about its process variance?)
9.4 Stochastic regression model
The stochastic level models in Section 9.3 do not have predictor variables (covariates). Let’s add one predictor variable \(f_t\) and write a simple DLM where the intercept \(\alpha\) and slope \(\beta\) are stochastic. We will specify that \(\alpha\) and \(\beta\) evolve according to a simple random walk. Normally \(x\) is used for the predictor variables in a regression model, but we will avoid that since we are using \(x\) for the state equation in a state-space model. This model is
\[\begin{equation}
\tag{9.10}
y_t = \alpha_t + \beta_t f_t + v_t \\
\alpha_t = \alpha_{t-1} + w_{\alpha,t} \\
\beta_t = \beta_{t-1} + w_{\beta,t}
\end{equation}\]
Written in matrix form, this is
\[\begin{equation}
\tag{9.11}
y_t = \begin{bmatrix} 1 & f_t \end{bmatrix}
\begin{bmatrix} \alpha \\ \beta \end{bmatrix}_t + v_t \\
\begin{bmatrix} \alpha \\ \beta \end{bmatrix}_t =
\begin{bmatrix} \alpha \\ \beta \end{bmatrix}_{t-1} +
\begin{bmatrix}
w_{\alpha} \\
w_{\beta}
\end{bmatrix}_t
\end{equation}\]
Equation (9.11) is a MARSS model:
\[\begin{equation}
y_t = \mathbf{Z}\mathbf{x}_t + v_t \\
\mathbf{x}_t = \mathbf{x}_{t-1} + \mathbf{w}_t
\end{equation}\]
where \(\mathbf{x}=\begin{bmatrix}\alpha \\ \beta\end{bmatrix}\) and \(\mathbf{Z}=\begin{bmatrix}1&f_t\end{bmatrix}\).
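As a sketch of how this model might be set up for MARSS() (assuming the response is in a 1 x TT matrix yt and the covariate series is in a vector ft; both names are placeholders), the time-varying \(\mathbf{Z}\) is passed as a 1 x 2 x TT array:
library(MARSS)
TT <- length(ft)                       # ft is an assumed covariate vector
Z <- array(NA, c(1, 2, TT))            # Z[, , t] = [1  f_t]
Z[1, 1, ] <- 1
Z[1, 2, ] <- ft
mod_list <- list(B = diag(2), U = "zero", Q = "diagonal and unequal",
                 Z = Z, A = "zero", R = matrix("r"))
fit_dlm <- MARSS(yt, model = mod_list)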
9.6 Analysis of salmon survival
Let’s see an example of a DLM used to analyze real data from the literature. Scheuerell and Williams (2005) used a DLM to examine the relationship between marine survival of Chinook salmon and an index of ocean upwelling strength along the west coast of the USA. Upwelling brings cool, nutrient-rich waters from the deep ocean to shallower coastal areas. Scheuerell & Williams hypothesized that stronger upwelling in April should create better growing conditions for phytoplankton, which would then translate into more zooplankton. In turn, juvenile salmon (“smolts”) entering the ocean in May and June should find better foraging opportunities. Thus, for smolts entering the ocean in year \(t\),
\[\begin{equation}
\tag{9.20}
survival_t = \alpha_t + \beta_t f_t + v_t \text{ with } v_{t}\sim\,\text{N}(0,r),
\end{equation}\]
and \(f_t\) is the coastal upwelling index (cubic meters of seawater per second per 100 m of coastline) for the month of April in year \(t\).
Both the intercept and slope are time varying, so
\[\begin{align}
\tag{9.21}
\alpha_t &= \alpha_{t-1} + w_{\alpha,t} \text{ with } w_{\alpha,t} \sim \,\text{N}(0,q_{\alpha})\\
\beta_t &= \beta_{t-1} + w_{\beta,t} \text{ with } w_{\beta,t} \sim \,\text{N}(0,q_{\beta}).
\end{align}\]
The full DLM is then
\[\begin{equation}
\tag{9.22}
y_t = \mathbf{F}_t^{\top}\boldsymbol{\theta}_t + v_t \text{ with } v_t\sim\,\text{N}(0,r)\\
\boldsymbol{\theta}_t = \mathbf{G}_t\boldsymbol{\theta}_{t-1} + \mathbf{w}_t \text{ with } \mathbf{w}_t \sim \,\text{MVN}(\mathbf{0},\mathbf{Q})\\
\boldsymbol{\theta}_0 = \boldsymbol{\pi}_0.
\end{equation}\]
Equation (9.22) is equivalent to our standard MARSS model:
\[\begin{equation}
\tag{9.23}
\mathbf{y}_t = \mathbf{Z}_t\mathbf{x}_t + \mathbf{a} + \mathbf{v}_t \text{ with } \mathbf{v}_t \sim \,\text{MVN}(0,\mathbf{R}_t)\\
\mathbf{x}_t = \mathbf{B}_t\mathbf{x}_{t-1} + \mathbf{u} + \mathbf{w}_t \text{ with } \mathbf{w}_t \sim \,\text{MVN}(0,\mathbf{Q}_t)\\
\mathbf{x}_0 = \boldsymbol{\pi}
\end{equation}\]
where \(\mathbf{x}_t = \boldsymbol{\theta}_t\), \(\mathbf{B}_t = \mathbf{G}_t\), \(\mathbf{y}_t = y_t\) (i.e., \(\mathbf{y}_t\) is 1 \(\times\) 1), \(\mathbf{Z}_t = \mathbf{F}_t^{\top}\), \(\mathbf{a} = \mathbf{u} = \mathbf{0}\), and \(\mathbf{R}_t = r\) (i.e., \(\mathbf{R}_t\) is 1 \(\times\) 1).
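Once a DLM like this has been fit with MARSS(), the time series of regression parameters and their standard errors can be pulled from the fitted object. A sketch, assuming the fitted object is called fit_dlm (a placeholder name):
theta_hat <- fit_dlm$states         # row 1 = alpha_t, row 2 = beta_t
theta_se  <- fit_dlm$states.se      # matching standard errors
par(mfrow = c(2, 1))
for (i in 1:2) {
  plot(theta_hat[i, ], type = "l", xlab = "t",
       ylab = c("alpha_t", "beta_t")[i])
  lines(theta_hat[i, ] + 2 * theta_se[i, ], lty = 2)
  lines(theta_hat[i, ] - 2 * theta_se[i, ], lty = 2)
}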
9.3 Stochastic level models
The simplest DLM is a stochastic level model, where the level is a random walk without drift, and this level is observed with error. We will write it first using regression notation, where the intercept is \(\alpha\), and then in MARSS notation. In the latter, \(\alpha_t=x_t\).
\[\begin{equation}
\tag{9.6}
y_t = \alpha_t + e_t \\
\alpha_t = \alpha_{t-1} + w_t \\
\Downarrow \\
y_t = x_t + v_t \\
x_t = x_{t-1} + w_t
\end{equation}\]
Using this model, we can model the Nile River level and fit the model using MARSS().
data(Nile, package="datasets")
mod_list <- list(B = "identity", U = "zero", Q = matrix("q"),
                 Z = "identity", A = matrix("a"), R = matrix("r"))
fit <- MARSS(matrix(Nile, nrow = 1), mod_list)
Success! abstol and log-log tests passed at 82 iterations.
Alert: conv.test.slope.tol is 0.5.
Test with smaller values (<0.1) to ensure convergence.
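To see the estimated stochastic level against the data, one option (a sketch using the fit object above) is:
plot(as.vector(Nile), pch = 16, col = "grey50",
     xlab = "Year index", ylab = "Nile River flow")
lines(fit$states[1, ], lwd = 2)    # smoothed estimate of the stochastic level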
9.3.1 Stochastic level with drift
We can add a drift term to the level model to allow the level to tend upward or downward with a deterministic rate \(\eta\). This is a random walk with bias.
\[\begin{equation}
\tag{9.7}
y_t = \alpha_t + e_t \\
\alpha_t = \alpha_{t-1} + \eta + w_t \\
\Downarrow \\
y_t = x_t + v_t \\
x_t = x_{t-1} + u + w_t
\end{equation}\]
We can also allow the drift term \(\eta\) to evolve over time along with the level. In this case, \(\eta\) is modeled as a random walk along with \(\alpha\). This model is
\[\begin{equation}
\tag{9.8}
y_t = \alpha_t + e_t \\
\alpha_t = \alpha_{t-1} + \eta_{t-1} + w_{\alpha,t} \\
\eta_t = \eta_{t-1} + w_{\eta,t}
\end{equation}\]
Written in matrix form, this is
\[\begin{equation}
\tag{9.9}
y_t = \begin{bmatrix}1&0\end{bmatrix}
\begin{bmatrix} \alpha \\ \eta \end{bmatrix}_t + v_t \\
\begin{bmatrix} \alpha \\ \eta \end{bmatrix}_t =
\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} \alpha \\ \eta \end{bmatrix}_{t-1} +
\begin{bmatrix}
w_{\alpha} \\
w_{\eta}
\end{bmatrix}_t
\end{equation}\]
Equation (9.9) is a MARSS model.
\[\begin{equation}
y_t = \mathbf{Z}\mathbf{x}_t + v_t \\
\mathbf{x}_t = \mathbf{B}\mathbf{x}_{t-1} + \mathbf{w}_t
\end{equation}\]
where \(\mathbf{B}=\begin{bmatrix} 1 & 1 \\ 0 & 1\end{bmatrix}\), \(\mathbf{x}=\begin{bmatrix}\alpha \\ \eta\end{bmatrix}\) and \(\mathbf{Z}=\begin{bmatrix}1&0\end{bmatrix}\).
See Section 6.2 for more discussion of stochastic level models and the section on the StructTS() function to see how to fit this model with StructTS() in the stats package.
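A sketch of fitting the stochastic level with stochastic drift (Equations (9.8) and (9.9)) to the Nile data with MARSS(), using the \(\mathbf{B}\) and \(\mathbf{Z}\) given above; the object names are placeholders:
B <- matrix(c(1, 0, 1, 1), 2, 2)       # [1 1; 0 1] (column-major fill)
Z <- matrix(c(1, 0), 1, 2)             # observe the level only
mod_drift <- list(B = B, U = "zero", Q = "diagonal and unequal",
                  Z = Z, A = matrix("a"), R = matrix("r"))
fit_drift <- MARSS(matrix(Nile, nrow = 1), mod_drift)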
9.2 DLM in state-space form
A DLM is a state-space model and can be written in MARSS form:
\[\begin{equation}
\tag{9.5}
y_t = \mathbf{F}^{\top}_t \boldsymbol{\theta}_t + e_t \\
\boldsymbol{\theta}_t = \mathbf{G}_t\boldsymbol{\theta}_{t-1} + \mathbf{w}_t \\
\Downarrow \\
y_t = \mathbf{Z}_t \mathbf{x}_t + v_t \\
\mathbf{x}_t = \mathbf{B} \mathbf{x}_{t-1} + \mathbf{w}_t
\end{equation}\]
Note that DLMs include predictor variables (covariates) in the observation equation much differently than other forms of MARSS models. In a DLM, \(\mathbf{Z}\) is a matrix of predictor variables and \(\mathbf{x}_t\) are the time-evolving regression parameters.
\[\begin{equation}
y_t = \boxed{\mathbf{Z}_t \mathbf{x}_t} + v_t.
\end{equation}\]
In many other MARSS models, \(\mathbf{d}_t\) is a time-varying column vector of covariates and \(\mathbf{D}\) is the matrix of covariate-effect parameters.
\[\begin{equation}
y_t = \mathbf{Z}_t \mathbf{x}_t + \boxed{\mathbf{D} \mathbf{d}_t} +v_t.
\end{equation}\]
9.5 DLM with seasonal effect
Let’s add a simple fixed quarter effect to the regression model:
\[\begin{equation}
\tag{9.12}
y_t = \alpha_t + \beta_t x_t + \gamma_{qtr} + e_t \\
\gamma_{qtr} =
\begin{cases}
\gamma_{1} & \text{if } qtr = 1 \\
\gamma_{2} & \text{if } qtr = 2 \\
\gamma_{3} & \text{if } qtr = 3 \\
\gamma_{4} & \text{if } qtr = 4
\end{cases}
\end{equation}\]
We can write Equation (9.12) in matrix form. In our model for \(\gamma\), we will set the variance to 0 so that the \(\gamma\) does not change with time.
\[\begin{equation}
\tag{9.13}
y_t =
\begin{bmatrix}
1 & x_t & 1
\end{bmatrix}
\begin{bmatrix}
\alpha_t \\ \beta_t \\ \gamma_{qtr}
\end{bmatrix} + v_t \\
\begin{bmatrix}
\alpha_t \\ \beta_t \\ \gamma_{qtr}
\end{bmatrix} =
\begin{bmatrix}
\alpha_{t-1} \\ \beta_{t-1} \\ \gamma_{qtr}
\end{bmatrix} +
\begin{bmatrix}
w_{\alpha,t} \\ w_{\beta,t} \\ 0
\end{bmatrix} \\
\Downarrow \\
y_t = \mathbf{Z}_{t}\mathbf{x}_{t}+v_t \\
\mathbf{x}_{t} = \mathbf{x}_{t-1}+\mathbf{w}_{t}
\end{equation}\]
How do we select the right quarterly effect? Let’s separate out the quarterly effects and add them to \(\mathbf{x}\). We could then select the right \(\gamma\) using 0s and 1s in the \(\mathbf{Z}_t\) matrix. For example, if \(t\) is in quarter 1, our model would be
\[\begin{equation}
\tag{9.14}
y_t =
\begin{bmatrix}
1 & x_t & 1 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix}
\alpha_t \\ \beta_t \\ \gamma_1 \\ \gamma_2 \\ \gamma_3 \\ \gamma_4
\end{bmatrix} \\
\end{equation}\]
While if \(t\) is in quarter 2, the model is
\[\begin{equation}
\tag{9.15}
y_t =
\begin{bmatrix}
1 & x_t & 0 & 1 & 0 & 0
\end{bmatrix}
\begin{bmatrix}
\alpha_t \\ \beta_t \\ \gamma_1 \\ \gamma_2 \\ \gamma_3 \\ \gamma_4
\end{bmatrix} \\
\end{equation}\]
This would work, but we would have to have a different \(\mathbf{Z}_t\) matrix and it might get cumbersome to keep track of the 0s and 1s. If we wanted the \(\gamma\) to evolve with time, we might need to do this. However, if the \(\gamma\) are fixed, i.e. the quarterly effect does not change over time, a less cumbersome approach is possible.
We could instead keep the \(\mathbf{Z}_t\) matrix the same, but reorder the \(\gamma_i\) within \(\mathbf{x}\). If \(t\) is in quarter 1,
\[\begin{equation}
\tag{9.16}
y_t =
\begin{bmatrix}
1 & x_t & 1 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix}
\alpha_t \\ \beta_t \\ \gamma_1 \\ \gamma_2 \\ \gamma_3 \\ \gamma_4
\end{bmatrix} \\
\end{equation}\]
While if \(t\) is in quarter 2,
\[\begin{equation}
\tag{9.17}
y_t =
\begin{bmatrix}
1 & x_t & 1 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix}
\alpha_t \\ \beta_t \\ \gamma_2 \\ \gamma_3 \\ \gamma_4 \\ \gamma_1
\end{bmatrix} \\
\end{equation}\]
We can use a non-diagonal \(\mathbf{G}\) to shift the correct quarter effect within \(\mathbf{x}\).
\[
\mathbf{G} =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 & 0
\end{bmatrix}
\]
With this \(\mathbf{G}\), the \(\gamma\) rotate within \(\mathbf{x}\) with each time step. If \(t\) is in quarter 1, then \(t+1\) is in quarter 2, and we want \(\gamma_2\) to be in the 3rd row.
\[\begin{equation}
\tag{9.18}
\begin{bmatrix} \alpha \\ \beta \\ \gamma_2 \\ \gamma_3 \\ \gamma_4 \\ \gamma_1
\end{bmatrix}_{t+1} =
\mathbf{G}
\begin{bmatrix} \alpha \\ \beta \\ \gamma_1 \\ \gamma_2 \\ \gamma_3 \\ \gamma_4
\end{bmatrix}_{t} +
\begin{bmatrix}
w_{\alpha} \\ w_{\beta} \\ 0 \\0 \\ 0 \\ 0
\end{bmatrix}_{t}
\end{equation}\]
At \(t+2\), we are in quarter 3 and \(\gamma_3\) will be in row 3.
\[\begin{equation}
\tag{9.19}
\begin{bmatrix} \alpha \\ \beta \\ \gamma_3 \\ \gamma_4 \\ \gamma_1 \\ \gamma_2
\end{bmatrix}_{t+2} =
\mathbf{G}
\begin{bmatrix} \alpha \\ \beta \\ \gamma_2 \\ \gamma_3 \\ \gamma_4 \\ \gamma_1
\end{bmatrix}_{t+1} +
\begin{bmatrix}
w_{\alpha} \\ w_{\beta} \\ 0 \\0 \\ 0 \\ 0
\end{bmatrix}_{t}
\end{equation}\]
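A sketch of building this 6 x 6 evolution matrix in R: an identity block for \(\alpha\) and \(\beta\) plus a 4 x 4 shift that rotates the quarter effects.
G <- matrix(0, 6, 6)
G[1:2, 1:2] <- diag(2)              # alpha and beta map to themselves
G[3:6, 3:6] <- rbind(c(0, 1, 0, 0), # rotates (g1, g2, g3, g4) to (g2, g3, g4, g1)
                     c(0, 0, 1, 0),
                     c(0, 0, 0, 1),
                     c(1, 0, 0, 0))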
A variation of the random walk model described previously is the autoregressive time series model of order 1, AR(1). This model introduces a coefficient, which we will call \(\phi\). The parameter \(\phi\) controls the degree to which the random walk reverts to the mean—when \(\phi = 1\), the model is identical to the random walk, but at smaller values, the model will revert back to the mean (which in this case is zero). Also, \(\phi\) can take on negative values, which we will discuss more in future lectures. The math to describe the AR(1) time series model is:
\[\begin{equation}
\tag{12.6}
E[y_t] = \phi * y_{t-1} + e_{t-1}
\end{equation}\]
The JAGS model and R script to run the AR(1) model are below:
# 4. AR(1) MODEL WITH AN ESTIMATED AR COEFFICIENT
# We're introducing a new AR coefficient 'phi', so the model is
# y[t] ~ N(mu + phi*y[t-1], sigma^2)

model.loc=("ar1_intercept.txt")
jagsscript = cat("
model {
   mu ~ dnorm(0, 0.01);
   tau.pro ~ dgamma(0.001,0.001);
   sd.pro <- 1/sqrt(tau.pro);
   phi ~ dnorm(0, 1);

   predY[1] <- Y[1];
   for(i in 2:N) {
      predY[i] <- mu + phi * Y[i-1];
      Y[i] ~ dnorm(predY[i], tau.pro);
   }
}
",file=model.loc)

jags.data = list("Y"=Wind,"N"=N)
jags.params=c("sd.pro","predY","mu","phi")
mod_ar1_intercept = jags(jags.data, parameters.to.save=jags.params,
   model.file=model.loc, n.chains = 3, n.burnin=5000, n.thin=1,
   n.iter=10000, DIC=TRUE)
For the data for this lab, we will use a dataset on air quality in New York. We will load the data and create a couple of new variables for future use. For the majority of our models, we are going to treat wind speed as the response variable for our time series models.
data(airquality, package="datasets")
Wind = airquality$Wind # wind speed
Temp = airquality$Temp # air temperature
N = dim(airquality)[1] # number of data points
There are a number of different approaches to using Bayesian time series models to perform forecasting. One approach might be to fit a model, and use those posterior distributions to forecast as a secondary step (say within R). A more streamlined approach is to do this within the JAGS code itself. We can take advantage of the fact that JAGS allows you to include NAs in the response variable (but never in the predictors). Let’s use the same Wind dataset, and the univariate state-space model described above to forecast three time steps into the future. We can do this by including 3 more NAs in the dataset, and incrementing the variable N by 3.
jags.data = list("Y"=c(Wind,NA,NA,NA),"N"=(N+3))
jags.params=c("sd.q","sd.r","predY","mu")
model.loc=("ss_model.txt")
mod_ss_forecast = jags(jags.data, parameters.to.save=jags.params,
   model.file=model.loc, n.chains = 3, n.burnin=5000, n.thin=1,
   n.iter=10000, DIC=TRUE)
We can inspect the fitted model object, and see that predY contains the 3 new predictions for the forecasts from this model.
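For example, a sketch of pulling the posterior medians and 95% intervals for the three forecasted time steps out of the R2jags object (the forecasts sit in the last three columns of predY):
pred <- mod_ss_forecast$BUGSoutput$sims.list$predY   # draws x (N+3) matrix
fc_cols <- (ncol(pred) - 2):ncol(pred)               # the 3 forecast steps
apply(pred[, fc_cols], 2, quantile, probs = c(0.025, 0.5, 0.975))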
We will start with the simplest time series model possible: linear regression with only an intercept, so that the predicted values of all observations are the same. There are several ways we can write this equation. First, the predicted values can be written as \(E[{y}_{t}] = \mu\). Assuming that the residuals are normally distributed, the model linking our predictions to observed data is written as
\[\begin{equation}
y_t = \mu + e_t, e_t \sim \,\text{N}(0,\sigma^2)
\tag{12.1}
\end{equation}\]
Equivalently, we can write the model in terms of the distribution of the data:
\[\begin{equation}
y \sim \,\text{N}(E[y_t],\sigma^2)
\tag{12.2}
\end{equation}\]
Remember that in linear regression models, the residual error is interpreted as independent and identically distributed observation error.
To run the JAGS model, we will need to start by writing the model in JAGS notation. For our linear regression model, one way to construct the model is
# 1. LINEAR REGRESSION with no covariates
# no covariates, so intercept only. The parameters are
# mean 'mu' and precision/variance parameter 'tau.obs'

model.loc="lm_intercept.txt" # name of the txt file
jagsscript = cat("
model {
   # priors on parameters
   mu ~ dnorm(0, 0.01); # mean = 0, sd = 1/sqrt(0.01)
   tau.obs ~ dgamma(0.001,0.001); # This is inverse gamma
   sd.obs <- 1/sqrt(tau.obs); # sd is treated as derived parameter

   for(i in 1:N) {
      Y[i] ~ dnorm(mu, tau.obs);
   }
}
",file=model.loc)
A couple things to notice: JAGS is not vectorized so we need to use for loops (instead of matrix multiplication), and the dnorm notation means that we assume the value on the left is normally distributed around a particular mean with a particular precision (the precision is 1 over the variance).
The model can briefly be summarized as follows: there are 2 parameters in the model (the mean and variance of the observation error). JAGS is a bit funny in that instead of giving a normal distribution the standard deviation or variance, you pass in the precision (1/variance), so our prior on \(\mu\) is pretty vague. The precision receives a gamma prior, which is equivalent to the variance receiving an inverse gamma prior (fairly common for standard Bayesian regression models). We will treat the standard deviation as derived (if we know the variance or precision, which we are estimating, we automatically know the standard deviation). Finally, we write a model for the data \(y_t\) (Y[i]). Again we use the dnorm distribution to say that the data are normally distributed (equivalent to our likelihood).
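As a quick check of what the precision of 0.01 implies on the standard deviation scale:
1 / sqrt(0.01)    # the prior on mu has a standard deviation of 10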
The function from the R2jags package that we actually use to run the model is jags(). There is a parallel version of the function called jags.parallel(), which is useful for larger, more complex models. The details of both can be found with ?jags or ?jags.parallel.
To actually run the model, we need to create several new objects, representing (1) a list of data that we will pass to JAGS, (2) a vector of parameters that we want to monitor in JAGS and have returned back to R, and (3) the name of our text file that contains the JAGS model we wrote above. With those three things, we can call the jags() function.
jags.data = list("Y"=Wind,"N"=N) # named list of inputs
jags.params=c("sd.obs","mu") # parameters to be monitored
mod_lm_intercept = jags(jags.data, parameters.to.save=jags.params,
   model.file=model.loc, n.chains = 3, n.burnin=5000,
   n.thin=1, n.iter=10000, DIC=TRUE)
module glm loaded
Notice that the jags() function contains a number of other important arguments. In general, larger is better for all arguments: we want to run multiple MCMC chains (maybe 3 or more), and have a burn-in of at least 5000. The total number of samples after the burn-in period is n.iter-n.burnin, which in this case is 5000 samples per chain. Because we are doing this with 3 MCMC chains, and the thinning rate equals 1 (meaning we are saving every sample), we will retain a total of 15,000 posterior samples for each parameter.
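The bookkeeping can be checked directly with the settings used above:
n.chains <- 3; n.iter <- 10000; n.burnin <- 5000; n.thin <- 1
n.chains * (n.iter - n.burnin) / n.thin    # 15000 retained posterior samples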
The saved object storing our model diagnostics can be accessed directly, and includes some useful summary output.
mod_lm_intercept
Inference for Bugs model at "lm_intercept.txt", fit using jags,
3 chains, each with 10000 iterations (first 5000 discarded)
n.sims = 15000 iterations saved
          mu.vect sd.vect    2.5%     25%     50%     75%   97.5%  Rhat n.eff
mu          9.947   0.285   9.390   9.756   9.946  10.137  10.506 1.001  6600
sd.obs      3.541   0.206   3.162   3.398   3.533   3.673   3.972 1.001 15000
deviance  820.557   1.995 818.594 819.126 819.937 821.348 825.964 1.001 14000
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
DIC info (using the rule, pD = var(deviance)/2)
pD = 2.0 and DIC = 822.5
DIC is an estimate of expected predictive error (lower deviance is better).
The last 2 columns in the summary contain Rhat (which we want to be close to 1.0) and n.eff (the effective sample size of each set of posterior draws). To examine the output more closely, we can pull all of the results directly into R,
attach.jags(mod_lm_intercept)
The following object is masked _by_ .GlobalEnv:
mu
Attaching the R2jags object allows us to work with the named parameters directly in R. For example, we could make a histogram of the posterior distributions of the parameters mu and sd.obs with the following code,
# Now we can make plots of posterior values
par(mfrow = c(2,1))
hist(mu,40,col="grey",xlab="Mean",main="")
hist(sd.obs,40,col="grey",xlab=expression(sigma[obs]),main="")
Finally, we can run some useful diagnostics from the coda package on this model output. We have written a small function to make the creation of mcmc lists (an argument required for many of the diagnostics). The function is:
createMcmcList = function(jagsmodel) {
  McmcArray = as.array(jagsmodel$BUGSoutput$sims.array)
  McmcList = vector("list",length=dim(McmcArray)[2])
  for(i in 1:length(McmcList)) McmcList[[i]] = as.mcmc(McmcArray[,i,])
  McmcList = mcmc.list(McmcList)
  return(McmcList)
}
Creating the MCMC list preserves the random samples generated from each chain and allows you to extract the samples for a given parameter (such as \(\mu\)) from any chain you want. To extract \(\mu\) from the first chain, for example, you could use the following code. Because createMcmcList() returns a list of mcmc objects, we can summarize and plot these directly. Figure 12.2 shows the plot from plot(myList[[1]]).
myList = createMcmcList(mod_lm_intercept)
summary(myList[[1]])
Iterations = 1:5000
Thinning interval = 1
Number of chains = 1
Sample size per chain = 5000

1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
deviance 820.589 2.0090 0.028412     0.028412
mu         9.945 0.2870 0.004059     0.004163
sd.obs     3.539 0.2075 0.002934     0.002934
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
deviance 818.597 819.115 819.913 821.300 826.194
mu         9.387   9.761   9.953  10.148  10.515
sd.obs     3.178   3.401   3.528   3.669   3.974
plot(myList[[1]])
For more quantitative diagnostics of MCMC convergence, we can rely on the coda package in R. There are several useful statistics available, including the Gelman-Rubin diagnostic (for one or several chains), autocorrelation diagnostics (similar to the ACF you calculated above), the Geweke diagnostic, and Heidelberger-Welch test of stationarity.
# Run the majority of the diagnostics that CODA() offers
library(coda)
gelmanDiags = gelman.diag(createMcmcList(mod_lm_intercept),multivariate=F)
autocorDiags = autocorr.diag(createMcmcList(mod_lm_intercept))
gewekeDiags = geweke.diag(createMcmcList(mod_lm_intercept))
heidelDiags = heidel.diag(createMcmcList(mod_lm_intercept))
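Each of these returns an object that can be printed or inspected. For example, the Gelman-Rubin point estimates and upper confidence limits are in the psrf element:
gelmanDiags$psrf    # potential scale reduction factors; values near 1 suggest convergence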
Increase the MCMC burn-in for the model in question 1 to a value that you think is reasonable. After the model has converged, calculate the Gelman-Rubin diagnostic for the fitted model object.
Compare the results of the plotModelOutput() function for the intercept only model from section 12.2. You will need to add “predY” to your JAGS model and to the list of parameters to monitor, and re-run the model.
Modify the random walk model without drift from section 12.4 to a random walk model with drift. The equation for this model is
\[\begin{equation*}
E[y_t] = y_{t-1} + \mu + e_{t-1}
\end{equation*}\]
where \(\mu\) is interpreted as the average daily trend in wind speed. What might be a reasonable prior on \(\mu\)?
Plot the posterior distribution of \(\phi\) for the AR(1) model in section 12.5. Can this parameter be well estimated for this dataset?
Plot the posteriors for the process and observation variances (not standard deviation) for the univariate state-space model in section 12.6. Which is larger for this dataset?
Add the effect of temperature to the AR(1) model in section 12.5. Plot the posterior for beta and compare to the posterior for beta from the model in section 12.6.1.
Plot the fitted values from the model in section 12.7, including the forecasts, with the 95% credible intervals for each data point.
The following is a dataset from the Upper Skagit River (Puget Sound, 1952-2005) on salmon spawners and recruits:
Spawners = c(2662,1806,1707,1339,1686,2220,3121,5028,9263,4567,1850,3353,2836,3961,4624,3262,3898,3039,5966,5931,7346,4911,3116,3185,5590,2485,2987,3829,4921,2348,1932,3151,2306,1686,4584,2635,2339,1454,3705,1510,1331,942,884,666,1521,409,2388,1043,3262,2606,4866,1161,3070,3320)
Recruits = c(12741,15618,23675,37710,62260,32725,8659,28101,17054,29885,33047,20059,35192,11006,48154,35829,46231,32405,20782,21340,58392,21553,27528,28246,35163,15419,16276,32946,11075,16909,22359,8022,16445,2912,17642,2929,7554,3047,3488,577,4511,1478,3283,1633,8536,7019,3947,2789,4606,3545,4421,1289,6416,3647)
logRS = log(Recruits/Spawners)
Fit the following Ricker model to these data, using the linear form of the model with normally distributed errors:
\[\begin{equation*}
log(R_t/S_t) = a + b \times S_t + e_t,\text{ where } e_t \sim \,\text{N}(0,\sigma^2)
\end{equation*}\]
You will recognize that this form is exactly the same as linear regression, with independent errors (very similar to the intercept only model of Wind we fit in section 12.2).
Within the constraints of the Ricker model, think about other ways you might want to treat the errors. The basic model described above has independent errors that are not correlated in time. Approaches to analyzing this dataset might involve
In our first model, the errors were independent in time. We are going to modify this to model autocorrelated errors. Autocorrelated errors are widely used in ecology and other fields – for a greater discussion, see Morris and Doak (2002) Quantitative Conservation Biology. To make the deviations autocorrelated, we start by defining the deviation in the first time step, \(e_1 = y_1 - \mu\). The expectation of \(y_t\) in each time step is then written as
\[\begin{equation}
\tag{12.3}
E[y_t] = \mu + \phi * e_{t-1}
\end{equation}\]
In addition to affecting the expectation, the correlation parameter \(\phi\) also affects the variance of the errors, so that
\[\begin{equation}
\tag{12.4}
\sigma^2 = \psi^2\left( 1-\phi^2 \right)
\end{equation}\]
Like in our first model, we assume that the data follow a normal likelihood (or equivalently that the residuals are normally distributed), \({y_t} = E[{y_t}] + {e}_{t}\), or \({y_t} \sim \,\text{N}(E[y_t], \sigma^2)\). Thus, it is possible to express the subsequent deviations as \(e_t = y_t - E[y_t]\), or equivalently as \(e_t = y_t - \mu -\phi \times e_{t-1}\). The JAGS script for this model is:
# 2. LINEAR REGRESSION WITH AUTOCORRELATED ERRORS
# no covariates, so intercept only.

model.loc=("lmcor_intercept.txt")
jagsscript = cat("
model {
   # priors on parameters
   mu ~ dnorm(0, 0.01);
   tau.obs ~ dgamma(0.001,0.001);
   sd.obs <- 1/sqrt(tau.obs);
   phi ~ dunif(-1,1);
   tau.cor <- tau.obs / (1-phi*phi); # Var = sigma2 * (1-rho^2)

   epsilon[1] <- Y[1] - mu;
   predY[1] <- mu; # initial value
   for(i in 2:N) {
      predY[i] <- mu + phi * epsilon[i-1];
      Y[i] ~ dnorm(predY[i], tau.cor);
      epsilon[i] <- (Y[i] - mu) - phi*epsilon[i-1];
   }
}
",file=model.loc)
Notice several subtle changes from the simpler first model: (1) we are estimating the autocorrelation parameter \(\phi\), which is assigned a Uniform(-1, 1) prior, (2) we model the residual variance as a function of the autocorrelation, and (3) we allow the autocorrelation to affect the predicted values predY. One other change we can make is to add predY to the list of parameters we want returned to R.
jags.data = list("Y"=Wind,"N"=N)
jags.params=c("sd.obs","predY","mu","phi")
mod_lmcor_intercept = jags(jags.data, parameters.to.save=jags.params,
   model.file=model.loc, n.chains = 3, n.burnin=5000,
   n.thin=1, n.iter=10000, DIC=TRUE)
For some models, we may be interested in examining the posterior fits to data. You can make this plot yourself, but we have also put together a simple function whose arguments are one of our fitted models and the raw data. The function is:
plotModelOutput = function(jagsmodel, Y) {
  # attach the model
  attach.jags(jagsmodel)
  x = seq(1,length(Y))
  summaryPredictions = cbind(apply(predY,2,quantile,0.025), apply(predY,2,mean),
    apply(predY,2,quantile,0.975))
  plot(Y, col="white", ylim=c(min(c(Y,summaryPredictions)),max(c(Y,summaryPredictions))),
    xlab="", ylab="95% CIs of predictions and data",
    main=paste("JAGS results:", jagsmodel$model.file))
  polygon(c(x,rev(x)), c(summaryPredictions[,1], rev(summaryPredictions[,3])),
    col="grey70", border=NA)
  lines(summaryPredictions[,2])
  points(Y)
}
We can use the function to plot the predicted posterior mean with 95% CIs, as well as the raw data. For example, try
plotModelOutput(mod_lmcor_intercept, Wind)
The following object is masked _by_ .GlobalEnv:
All of the previous three models can be interpreted as observation error models. Switching gears, we can alternatively model error in the state of nature, creating process error models. A simple process error model that many of you may have seen before is the random walk model. In this model, the assumption is that the true state of nature (or latent states) are measured perfectly. Thus, all uncertainty is originating from process variation (for ecological problems, this is often interpreted as environmental variation). For this simple model, we will assume that our process of interest (in this case, daily wind speed) exhibits no daily trend, but behaves as a random walk.
\[\begin{equation}
\tag{12.5}
E[y_t] = y_{t-1} + e_{t-1}
\end{equation}\]
And the \(e_t \sim \,\text{N}(0, \sigma^2)\). Remember back to the autocorrelated model (or MA(1) models) that we assumed that the errors \(e_t\) followed a random walk. In contrast, the AR(1) model assumes that the errors are independent, but that the state of nature follows a random walk. The JAGS random walk model and R script to run it are below:
# 3. AR(1) MODEL WITH NO ESTIMATED AR COEFFICIENT = RANDOM WALK
# no covariates. The model is y[t] ~ Normal(y[t-1], sigma^2);
# we will call the precision tau.pro
# Note too that we have to define predY[1]
model.loc=("rw_intercept.txt")
jagsscript = cat("
model {
   mu ~ dnorm(0, 0.01);
   tau.pro ~ dgamma(0.001,0.001);
   sd.pro <- 1/sqrt(tau.pro);

   predY[1] <- mu; # initial value
   for(i in 2:N) {
      predY[i] <- Y[i-1];
      Y[i] ~ dnorm(predY[i], tau.pro);
   }
}
",file=model.loc)

jags.data = list("Y"=Wind,"N"=N)
jags.params=c("sd.pro","predY","mu")
mod_rw_intercept = jags(jags.data, parameters.to.save=jags.params, model.file=model.loc,
   n.chains = 3, n.burnin=5000, n.thin=1, n.iter=10000, DIC=TRUE)
At this point, we have fit models with observation or process error, but we have not tried to estimate both simultaneously. We will do so here, and introduce some new notation to describe the process model and observation model. We use the notation \(x_t\) to denote the latent state or state of nature (which is unobserved) at time \(t\) and \(y_t\) to denote the observed data. For introductory purposes, we will make the process model autoregressive (similar to our AR(1) model),
\[\begin{equation}
\tag{12.7}
x_t = \phi * x_{t-1} + e_{t-1}; \; e_{t-1} \sim \,\text{N}(0,q)
\end{equation}\]
For the process model, there are a number of ways to parameterize the first state (\(x_1\)), and we will talk about this more in the class. For the sake of this model, we will place a vague weakly informative prior on \(x_1\): \(x_1 \sim \,\text{N}(0, 0.01)\). Second, we need to construct an observation model linking the estimated unseen states of nature \(x_t\) to the data \(y_t\). For simplicity, we will assume that the observation errors are independent and identically distributed, with no observation component. Mathematically, this model is
\[\begin{equation}
\tag{12.8}
y_t \sim \,\text{N}(x_t, r)
\end{equation}\]
In the two above models, \(q\) is the process variance and \(r\) is the observation error variance. The JAGS code will use the standard deviation (square root) of these. The code to produce and fit this model is below:
# 5. MAKE THE SS MODEL a univariate random walk
# no covariates.

model.loc=("ss_model.txt")
jagsscript = cat("
model {
   # priors on parameters
   mu ~ dnorm(0, 0.01);
   tau.pro ~ dgamma(0.001,0.001);
   sd.q <- 1/sqrt(tau.pro);
   tau.obs ~ dgamma(0.001,0.001);
   sd.r <- 1/sqrt(tau.obs);
   phi ~ dnorm(0,1);

   X[1] <- mu;
   predY[1] <- X[1];
   Y[1] ~ dnorm(X[1], tau.obs);

   for(i in 2:N) {
      predX[i] <- phi*X[i-1];
      X[i] ~ dnorm(predX[i],tau.pro); # Process variation
      predY[i] <- X[i];
      Y[i] ~ dnorm(X[i], tau.obs); # Observation variation
   }
}
",file=model.loc)

jags.data = list("Y"=Wind,"N"=N)
jags.params=c("sd.q","sd.r","predY","mu")
mod_ss = jags(jags.data, parameters.to.save=jags.params, model.file=model.loc, n.chains = 3,
   n.burnin=5000, n.thin=1, n.iter=10000, DIC=TRUE)
Returning to the first example of regression with the intercept only, we will introduce Temp as the covariate explaining our response variable Wind. Note that to include the covariate, we (1) modify the JAGS script to include a new coefficient—in this case beta, (2) update the predictive equation to include the effects of the new covariate, and (3) include the new covariate in our named data list.
# 6. Include some covariates in a linear regression
# Use temperature as a predictor of wind

model.loc=("lm.txt")
jagsscript = cat("
model {
   mu ~ dnorm(0, 0.01);
   beta ~ dnorm(0,0.01);
   tau.obs ~ dgamma(0.001,0.001);
   sd.obs <- 1/sqrt(tau.obs);

   for(i in 1:N) {
      predY[i] <- mu + C[i]*beta;
      Y[i] ~ dnorm(predY[i], tau.obs);
   }
}
",file=model.loc)

jags.data = list("Y"=Wind,"N"=N,"C"=Temp)
jags.params=c("sd.obs","predY","mu","beta")
mod_lm = jags(jags.data, parameters.to.save=jags.params,
   model.file=model.loc, n.chains = 3, n.burnin=5000,
   n.thin=1, n.iter=10000, DIC=TRUE)
Let’s fit the same model as in Section 7.9 with Stan using the rstan package. If you have not already, you will need to install the rstan package. This package depends on a number of other packages which should install automatically when you install rstan.
First we write the model. We could write this to a file (recommended), but for this example, we write as a character object. Though the syntax is different from the JAGS code, it has many similarities. Note that Stan does not allow missing values in the data, thus we need to pass in only the non-missing values along with the row and column indices of those values. The latter is so we can match them to the appropriate state (\(x\)) values.
scode <- "
data {
  int<lower=0> TT; // length of ts
  int<lower=0> N; // num of ts; rows of y
  int<lower=0> n_pos; // number of non-NA values in y
  int<lower=0> col_indx_pos[n_pos]; // col index of non-NA vals
  int<lower=0> row_indx_pos[n_pos]; // row index of non-NA vals
  vector[n_pos] y;
}
parameters {
  vector[N] x0; // initial states
  real u;
  vector[N] pro_dev[TT]; // refed as pro_dev[TT,N]
  real<lower=0> sd_q;
  real<lower=0> sd_r[N]; // obs variances are different
}
transformed parameters {
  vector[N] x[TT]; // refed as x[TT,N]
  for(i in 1:N){
    x[1,i] = x0[i] + u + pro_dev[1,i];
    for(t in 2:TT) {
      x[t,i] = x[t-1,i] + u + pro_dev[t,i];
    }
  }
}
model {
  sd_q ~ cauchy(0,5);
  for(i in 1:N){
    x0[i] ~ normal(y[i],10); // assume no missing y[1]
    sd_r[i] ~ cauchy(0,5);
    for(t in 1:TT){
      pro_dev[t,i] ~ normal(0, sd_q);
    }
  }
  u ~ normal(0,2);
  for(i in 1:n_pos){
    y[i] ~ normal(x[col_indx_pos[i], row_indx_pos[i]], sd_r[row_indx_pos[i]]);
  }
}
generated quantities {
  vector[n_pos] log_lik;
  for (n in 1:n_pos) log_lik[n] = normal_lpdf(y[n] | x[col_indx_pos[n], row_indx_pos[n]], sd_r[row_indx_pos[n]]);
}
"
Then we call stan() and pass in the data, the names of the parameters we wish to have returned, and information on the number of chains, samples (iter), and thinning. The output is verbose (hidden here) and may have some warnings.
ypos <- Y[!is.na(Y)]
n_pos <- length(ypos) # number of non-NA ys
indx_pos <- which(!is.na(Y), arr.ind=TRUE) # index of the non-NAs
col_indx_pos <- as.vector(indx_pos[,"col"])
row_indx_pos <- as.vector(indx_pos[,"row"])
mod <- rstan::stan(model_code = scode,
   data = list(y=ypos, TT=ncol(Y), N=nrow(Y), n_pos=n_pos,
      col_indx_pos=col_indx_pos, row_indx_pos=row_indx_pos),
   pars = c("sd_q","x", "sd_r", "u", "x0"),
   chains = 3,
   iter = 1000,
   thin = 1)
We use extract() to extract the parameters from the fitted model and then compute the means and 95% credible intervals.
pars <- rstan::extract(mod)
means <- apply(pars$x, c(2,3), mean)
upperCI <- apply(pars$x, c(2,3), quantile, 0.975)
lowerCI <- apply(pars$x, c(2,3), quantile, 0.025)
colnames(means) <- colnames(upperCI) <- colnames(lowerCI) <- rownames(Y)
Now let’s say that the plants have different owners, Sue and Aneesh, and we want to have \(\beta\) for the air flow effect vary by owner. If the plant is in the north and owned by Sue, the model is
\[\begin{equation}
\tag{2.17}
stack.loss_i = \alpha_n + \beta_s air_i + e_i, \text{ where } e_i \sim \text{N}(0,\sigma^2)
\end{equation}\]
dat = cbind(dat, owner=c("s","a"))
dat
Air.Flow Water.Temp Acid.Conc. stack.loss reg owner
1 80 27 89 42 n s
2 80 27 88 37 s a
coef(lm(stack.loss ~ -1 + Air.Flow:owner + reg, data=dat))
regn regs Air.Flow:ownera Air.Flow:owners
-38.0 -3.0 0.5 1.0
Notice that we have 4 data points and are estimating 4 parameters. We are not going to be able to estimate any more parameters than data points. If we want to estimate any more, we’ll need to use the full stackloss dataset (which has 21 data points).
2.5.1 Owner \(\beta\)’s in Form 1
Written in Form 1, this model is
\[\begin{equation}
\tag{2.19}
\begin{bmatrix}stack.loss_1\\ stack.loss_2\\ stack.loss_3\\ stack.loss_4\end{bmatrix}
=
\begin{bmatrix}1&0&0&air_1\\ 0&1&air_2&0\\ 1&0&0&air_3\\ 0&1&air_4&0\end{bmatrix}
\begin{bmatrix}\alpha_n \\ \alpha_s \\ \beta_a \\ \beta_s\end{bmatrix}
+
\begin{bmatrix}e_1\\ e_2\\ e_3\\ e_4\end{bmatrix}
\end{equation}\]
The air data have been written to the right of the 1s and 0s for north/south intercepts because that is how lm() writes this model in Form 1 and I want to duplicate that (for teaching purposes). Also the \(\beta\)’s are ordered to be alphabetical because lm() writes the \(\mathbf{Z}\) matrix like that.
Now our model is more complicated and using model.matrix() to get our \(\mathbf{Z}\) saves us a lot of tedious matrix building.
fit3=lm(stack.loss ~ -1 + Air.Flow:owner + reg, data=dat)
Z=model.matrix(fit3)
Z[1:4,]
regn regs Air.Flow:ownera Air.Flow:owners
1 1 0 0 80
2 0 1 80 0
3 1 0 0 75
4 0 1 62 0
Notice the matrix output by model.matrix() looks exactly like \(\mathbf{Z}\) in Equation (2.19) (ignore the attributes info). Now we can solve for the parameters:
y=matrix(dat$stack.loss, ncol=1)
solve(t(Z)%*%Z)%*%t(Z)%*%y
[,1]
regn -38.0
regs -3.0
2.5.2 Owner \(\beta\)’s in Form 2
To write this model in Form 2, we just add subscripts to the \(\beta\)’s in our Form 2 \(\mathbf{Z}\) matrix:
\[\begin{equation}
\tag{2.20}
\begin{bmatrix}stack.loss_1\\ stack.loss_2\\ stack.loss_3\\ stack.loss_4\end{bmatrix}
\end{equation}\]
We change the \(\beta\)’s in our \(\mathbf{Z}\) list matrix to have owner designations:
y=matrix(dat$stack.loss, ncol=1)
x=matrix(c(1,dat$Air.Flow),ncol=1)
n=nrow(dat)
k=1
Z=matrix(list(0),n,k*n+1)
Z[seq(1,n,2),1]="alpha.n"
Z[seq(2,n,2),1]="alpha.s"
diag(Z[1:n,1+1:n])=rep(c("beta.s","beta.a"),n)[1:n]
P=MARSS:::convert.model.mat(Z)$free[,,1]
M=kronecker(t(x),diag(n))%*%P
solve(t(M)%*%M)%*%t(M)%*%y
[,1]
alpha.n -38.0
alpha.s -3.0
2.8 Models with confounded parameters*
Try adding region as another factor in your model along with quarter and fit with lm():
coef(lm(stack.loss ~ -1 + Air.Flow + reg + qtr, data=fulldat))
Air.Flow regn regs qtrqtr2 qtrqtr3 qtrqtr4
1.066524 -49.024320 -44.831760 -3.066094 3.499428 NA
The estimate for quarter 1 is gone (actually it was set to 0) and the estimate for quarter 4 is NA. Look at the \(\mathbf{Z}\) matrix for Form 1 and see if you can figure out the problem. Try also writing out the model for the 1st plant and you’ll see what part of the problem is and why the estimate for quarter 1 is fixed at 0.
fit=lm(stack.loss ~ -1 + Air.Flow + reg + qtr, data=fulldat)
Z=model.matrix(fit)
But why is the estimate for quarter 4 equal to NA? What if the ordering of north and south regions was different, say 1 through 4 north, 5 through 8 south, 9 through 12 north, etc?
fulldat2=fulldat
fulldat2$reg2 = rep(c("n","n","n","n","s","s","s","s"),3)[1:21]
fit=lm(stack.loss ~ Air.Flow + reg2 + qtr, data=fulldat2)
coef(fit)
(Intercept) Air.Flow reg2s qtrqtr2 qtrqtr3 qtrqtr4
-45.6158421 1.0407975 -3.5754722 0.7329027 3.0389763 3.6960928
Now an estimate for quarter 4 appears.
2.2 Matrix Form 1
In this form, we have the explanatory variables in a matrix on the left of our parameter matrix:
\[\begin{equation}
\tag{2.2}
\begin{bmatrix}stack.loss_1\\stack.loss_2\\stack.loss_3\\stack.loss_4\end{bmatrix}
=
\begin{bmatrix}1&air_1\\ 1&air_2\\ 1&air_3\\ 1&air_4\end{bmatrix}
\begin{bmatrix}\alpha\\ \beta\end{bmatrix}
+
\begin{bmatrix}e_1\\e_2\\e_3\\e_4\end{bmatrix}
\end{equation}\]
You should work through the matrix algebra to make sure you understand why Equation (2.2) is Equation (2.1) for all the \(i\) data points together.
We can write the first line of Equation (2.2) succinctly as
\[\begin{equation}
\tag{2.3}
\mathbf{y} = \mathbf{Z}\mathbf{x} + \mathbf{e}
\end{equation}\]
where \(\mathbf{x}\) are our parameters, \(\mathbf{y}\) are our response variables, and \(\mathbf{Z}\) are our explanatory variables (with a 1 column for the intercept). The lm() function uses Form 1, and we can recover the \(\mathbf{Z}\) matrix for Form 1 by using the model.matrix() function on the output from a lm() call:
fit=lm(stack.loss ~ Air.Flow, data=dat)
Z=model.matrix(fit)
Z[1:4,]
(Intercept) Air.Flow
1 1 80
2 1 80
2.2.1 Solving for the parameters
Note: You will not need to know how to solve linear matrix equations for this course. This section just shows you what the lm() function is doing to estimate the parameters.
Notice that \(\mathbf{Z}\) is not a square matrix and its inverse does not exist, but the inverse of \(\mathbf{Z}^\top\mathbf{Z}\) exists—if this is a solvable problem. We can go through the following steps to solve for \(\mathbf{x}\), our parameters \(\alpha\) and \(\beta\).
Start with \(\mathbf{y} = \mathbf{Z}\mathbf{x} + \mathbf{e}\) and multiply by \(\mathbf{Z}^\top\) on the left to get
\[\begin{equation*}
\mathbf{Z}^\top\mathbf{y} = \mathbf{Z}^\top\mathbf{Z}\mathbf{x} + \mathbf{Z}^\top\mathbf{e}
\end{equation*}\]
Multiply both sides on the left by \((\mathbf{Z}^\top\mathbf{Z})^{-1}\) and move \(\mathbf{x}\) to the right by itself, to get
\[\begin{equation*}
(\mathbf{Z}^\top\mathbf{Z})^{-1}\mathbf{Z}^\top\mathbf{y} - (\mathbf{Z}^\top\mathbf{Z})^{-1}\mathbf{Z}^\top\mathbf{e} = \mathbf{x}
\end{equation*}\]
Let’s assume our errors, the \(\mathbf{e}\), are i.i.d. which means that
\[\begin{equation*}
\mathbf{e} \sim \text{MVN}\begin{pmatrix}0,
\begin{bmatrix}
\sigma^2&0&0&0\\
0&\sigma^2&0&0\\
0&0&\sigma^2&0\\
0&0&0&\sigma^2
\end{bmatrix}
\end{pmatrix}
\end{equation*}\]
This equation means \(\mathbf{e}\) is drawn from a multivariate normal distribution with a variance-covariance matrix that is diagonal with equal variances. Under that assumption, the expected value of \((\mathbf{Z}^\top\mathbf{Z})^{-1}\mathbf{Z}^\top\mathbf{e}\) is zero. So we can solve for \(\mathbf{x}\) as
\[\begin{equation*}
\mathbf{x} = (\mathbf{Z}^\top\mathbf{Z})^{-1}\mathbf{Z}^\top\mathbf{y}
\end{equation*}\]
Let’s try that with R and compare to what you get with lm():
y=matrix(dat$stack.loss, ncol=1)
Z=cbind(1,dat$Air.Flow) #or use model.matrix() to get Z
solve(t(Z)%*%Z)%*%t(Z)%*%y
[,1]
[1,] -11.6159170
[2,] 0.6412918
coef(lm(stack.loss ~ Air.Flow, data=dat))
(Intercept) Air.Flow
-11.6159170 0.6412918
As you see, you get the same values.
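As a small added aside (not from the original text), crossprod() computes the same quantities as the t(Z)%*%Z form above, so this one-liner gives the same estimates:
solve(crossprod(Z), crossprod(Z,y)) #same as solve(t(Z)%*%Z)%*%t(Z)%*%y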
2.2.2 Form 1 with multiple explanatory variables
We can easily extend Form 1 to multiple explanatory variables. Let’s say we wanted to fit this model:
\[\begin{equation}
\tag{2.4}
stack.loss_i = \alpha + \beta_1 air_i + \beta_2 water_i + \beta_3 acid_i + e_i
\end{equation}\]
With lm(), we can fit this with
fit1.mult=lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc., data=dat)
Written in matrix form (Form 1), this is
\[\begin{equation}
\tag{2.5}
\begin{bmatrix}stack.loss_1\\stack.loss_2\\stack.loss_3\\stack.loss_4\end{bmatrix}
=
\begin{bmatrix}1&air_1&water_1&acid_1\\1&air_2&water_2&acid_2\\1&air_3&water_3&acid_3\\1&air_4&water_4&acid_4\end{bmatrix}
\begin{bmatrix}\alpha\\ \beta_1\\ \beta_2\\ \beta_3\end{bmatrix}
+
\begin{bmatrix}e_1\\e_2\\e_3\\e_4\end{bmatrix}
\end{equation}\]
Now \(\mathbf{Z}\) is a matrix with 4 columns and \(\mathbf{x}\) is a column vector with 4 rows. We can show the \(\mathbf{Z}\) matrix again directly from our lm() fit:
Z=model.matrix(fit1.mult)
Z
(Intercept) Air.Flow Water.Temp Acid.Conc.
1 1 80 27 89
2 1 80 27 88
attr(,"assign")
[1] 0 1 2 3
We can solve for \(\mathbf{x}\) just like before and compare to what we get with lm():
y=matrix(dat$stack.loss, ncol=1)
Z=cbind(1,dat$Air.Flow, dat$Water.Temp, dat$Acid.Conc)
#or Z=model.matrix(fit1.mult)
solve(t(Z)%*%Z)%*%t(Z)%*%y
[,1]
[1,] -524.904762
[2,] -1.047619
[3,] 7.619048
[4,] 5.000000
coef(fit1.mult)
(Intercept) Air.Flow Water.Temp Acid.Conc.
-524.904762 -1.047619 7.619048 5.000000
Take a look at the \(\mathbf{Z}\) we made in R. It looks exactly like what is in our model written in matrix form (Equation (2.5)).
2.2.4 Matrix Form 1b: The transpose of Form 1
We could also write Form 1 as follows:
\[\begin{equation}
\tag{2.6}
\begin{split}
\begin{bmatrix}stack.loss_1&stack.loss_2&stack.loss_3&stack.loss_4\end{bmatrix}
= \\
\begin{bmatrix}\alpha&\beta_1&\beta_2&\beta_3\end{bmatrix}
\begin{bmatrix}1&1&1&1\\
air_1&air_2&air_3&air_4\\
water_1&water_2&water_3&water_4\\
acid_1&acid_2&acid_3&acid_4\end{bmatrix}
+
\begin{bmatrix}e_1&e_2&e_3&e_4\end{bmatrix}
\end{split}
\end{equation}\]
This is just the transpose of Form 1. Work through the matrix algebra to make sure you understand why Equation (2.6) is Equation (2.1) for all the \(i\) data points together and why it is equal to the transpose of Equation (2.2). You’ll need the relationship \((\mathbf{A}\mathbf{B})^\top=\mathbf{B}^\top \mathbf{A}^\top\).
Let’s write Equation (2.6) as \(\mathbf{y} = \mathbf{D}\mathbf{d}\), where \(\mathbf{D}\) contains our parameters. Then we can solve for \(\mathbf{D}\) following the steps in Section 2.2.1 but multiplying from the right instead of from the left. Work through the steps to show that \(\mathbf{D} = \mathbf{y}\mathbf{d}^\top(\mathbf{d}\mathbf{d}^\top)^{-1}\).
y=matrix(dat$stack.loss, nrow=1)
d=rbind(1, dat$Air.Flow, dat$Water.Temp, dat$Acid.Conc)
y%*%t(d)%*%solve(d%*%t(d))
[,1] [,2] [,3] [,4]
[1,] -524.9048 -1.047619 7.619048 5
coef(fit1.mult)
(Intercept) Air.Flow Water.Temp Acid.Conc.
-524.904762 -1.047619 7.619048 5.000000
2.3 Matrix Form 2
In this form, we have the explanatory variables in a matrix on the right of our parameter matrix as in Form 1b but we arrange everything a little differently:
\[\begin{equation}
\tag{2.7}
\begin{bmatrix}stack.loss_1\\stack.loss_2\\stack.loss_3\\stack.loss_4\end{bmatrix}
=
\begin{bmatrix}
\alpha&\beta&0&0&0\\
\alpha&0&\beta&0&0\\
\alpha&0&0&\beta&0\\
\alpha&0&0&0&\beta
\end{bmatrix}
\begin{bmatrix}1\\air_1\\air_2\\air_3\\air_4\end{bmatrix}
+
\begin{bmatrix}e_1\\e_2\\e_3\\e_4\end{bmatrix}
\end{equation}\]
Work through the matrix algebra to make sure you understand why Equation (2.7) is the same as Equation (2.1) for all the \(i\) data points together.
We will write Form 2 succinctly as
\[\begin{equation}
\tag{2.8}
\mathbf{y}=\mathbf{Z}\mathbf{x}+\mathbf{e}
\end{equation}\]
2.3.1 Form 2 with multiple explanatory variables
The \(\mathbf{x}\) is a column vector of the explanatory variables. If we have more explanatory variables, we add them to the column vector at the bottom. So if we had air flow, water temperature and acid concentration as explanatory variables, \(\mathbf{x}\) looks like
\[\begin{equation}
\tag{2.9}
\begin{bmatrix}1 \\ air_1 \\ air_2 \\ air_3 \\ air_4 \\ water_1 \\ water_2 \\ water_3 \\ water_4 \\ acid_1 \\ acid_2 \\ acid_3 \\ acid_4 \end{bmatrix}
\end{equation}\]
and we add columns to \(\mathbf{Z}\) for each new explanatory variable:
\[\begin{equation}
\begin{bmatrix}
\alpha&\beta_1&0&0&0&\beta_2&0&0&0&\beta_3&0&0&0\\
\alpha&0&\beta_1&0&0&0&\beta_2&0&0&0&\beta_3&0&0\\
\alpha&0&0&\beta_1&0&0&0&\beta_2&0&0&0&\beta_3&0\\
\alpha&0&0&0&\beta_1&0&0&0&\beta_2&0&0&0&\beta_3
\end{bmatrix}
\end{equation}\]
The number of rows of \(\mathbf{Z}\) is always \(n\), the number of rows of \(\mathbf{y}\), because the number of rows on the left and right of the equal sign must match. The number of columns in \(\mathbf{Z}\) is determined by the size of \(\mathbf{x}\). If there is an intercept, there is a 1 in \(\mathbf{x}\). Then each explanatory variable (like air flow and wind) appears \(n\) times. So if the number of explanatory variables is \(k\), the number of columns in \(\mathbf{Z}\) is \(1+k \times n\) if there is an intercept term and \(k \times n\) if there is not.
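To make the dimension rule concrete, here is an added sketch (the names alpha, beta1 and beta2 are illustrative) that builds a Form 2 \(\mathbf{Z}\) as a list matrix for \(n=4\) data points and \(k=2\) explanatory variables, giving \(1 + 2 \times 4 = 9\) columns:
n=4; k=2
Z=matrix(list(0),n,1+k*n)
Z[,1]="alpha" #intercept column
diag(Z[1:n,1+1:n])="beta1" #columns for the 1st explanatory variable
diag(Z[1:n,1+n+1:n])="beta2" #columns for the 2nd explanatory variable
dim(Z) #4 rows and 9 columns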
2.3.2 When does Form 2 arise?
Form 2 is similar to how multivariate time series models are typically written for reading by humans (on a whiteboard or paper). In these models, we see equations like this:
\[\begin{equation}
\tag{2.10}
\begin{bmatrix}y_1\\y_2\\y_3\\y_4\end{bmatrix}_t
=
\begin{bmatrix}
\beta_1&\beta_2\\
\beta_2&\beta_1\\
\beta_1&0\\
0&\beta_2
\end{bmatrix}
\begin{bmatrix}x_1\\x_2\end{bmatrix}_t
+
\begin{bmatrix}e_1\\e_2\\e_3\\e_4\end{bmatrix}_t
\end{equation}\]
In this case, \(\mathbf{y}_t\) is the set of 4 observations at time \(t\) and \(\mathbf{x}_t\) is the set of 2 explanatory variables at time \(t\). The \(\mathbf{Z}\) is showing how we are modeling the effects of \(x_1\) and \(x_2\) on the \(y\)s. Notice that the effects are not consistent across the \(x\) and \(y\). This model would not be possible to fit with lm() but will be easy to fit with MARSS().
2.3.3 Solving for the parameters for Form 2
To solve for \(\alpha\) and \(\beta\), we need our parameters in a column matrix like so \(\left[ \begin{smallmatrix}\alpha\\\beta\end{smallmatrix} \right]\). We do this by rewriting \(\mathbf{Z}\mathbf{x}\) in Equation (2.8) in ‘vec’ form: if \(\mathbf{Z}\) is an \(n \times m\) matrix and \(\mathbf{x}\) is a matrix with 1 column and \(m\) rows, then \(\mathbf{Z}\mathbf{x} = (\mathbf{x}^\top \otimes \mathbf{I}_n)\,\text{vec}(\mathbf{Z})\). The symbol \(\otimes\) means Kronecker product; you can ignore it since you’ll never see it again in our course (or google ‘kronecker product’ if you are curious). The “vec” of a matrix is that matrix rearranged as a single column:
\[\begin{equation*}
\,\text{vec} \begin{bmatrix}
1&2\\
3&4
\end{bmatrix}
=
\begin{bmatrix}
1\\3\\2\\4
\end{bmatrix}
\end{equation*}\]
Notice how you just take each column one by one and stack them under each other. In R, the vec is
A=matrix(1:6,nrow=2,byrow=TRUE)
vecA = matrix(A,ncol=1)
\(\mathbf{I}_n\) is an \(n \times n\) identity matrix, a diagonal matrix with all 0s on the off-diagonals and all 1s on the diagonal. In R, this is simply diag(n).
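As an added numeric check (not in the original text), you can verify the identity \(\mathbf{Z}\mathbf{x} = (\mathbf{x}^\top \otimes \mathbf{I}_n)\,\text{vec}(\mathbf{Z})\) on a small example; kronecker() is the R function for \(\otimes\):
Z=matrix(1:6, nrow=2) #a 2 x 3 matrix, so n=2
x=matrix(c(1,2,3), ncol=1) #a 3 x 1 column vector
all.equal(Z%*%x, kronecker(t(x),diag(2))%*%matrix(Z,ncol=1)) #TRUE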
To show how we solve for \(\alpha\) and \(\beta\), let’s use an example with only 3 data points so Equation (2.7) becomes:
\[\begin{equation}
\tag{2.11}
\begin{bmatrix}stack.loss_1\\stack.loss_2\\stack.loss_3\end{bmatrix}
=
\begin{bmatrix}
\alpha&\beta&0&0\\
\alpha&0&\beta&0\\
\alpha&0&0&\beta
\end{bmatrix}
\begin{bmatrix}1\\air_1\\air_2\\air_3\end{bmatrix}
+
\begin{bmatrix}e_1\\e_2\\e_3\end{bmatrix}
\end{equation}\]
The vec of this \(\mathbf{Z}\) stacks its columns into one column, and that column can be written as a matrix of 1s and 0s times the parameter vector:
\[\begin{equation}
\,\text{vec}(\mathbf{Z}) =
\begin{bmatrix}\alpha\\ \alpha\\ \alpha\\ \beta\\ 0\\ 0\\ 0\\ \beta\\ 0\\ 0\\ 0\\ \beta\end{bmatrix}
=
\begin{bmatrix}1&0\\ 1&0\\ 1&0\\ 0&1\\ 0&0\\ 0&0\\ 0&0\\ 0&1\\ 0&0\\ 0&0\\ 0&0\\ 0&1\end{bmatrix}
\begin{bmatrix}
\alpha\\
\beta
\end{bmatrix} = \mathbf{P}\mathbf{p}
\end{equation}\]
where \(\mathbf{P}\) is the permutation matrix and \(\mathbf{p}=\left[ \begin{smallmatrix}\alpha\\\beta\end{smallmatrix} \right]\). Thus,
\[\begin{equation}
\tag{2.12}
\mathbf{y}=\mathbf{Z}\mathbf{x}+\mathbf{e} = (\mathbf{x}^\top \otimes \mathbf{I}_n)\mathbf{P}\begin{bmatrix}\alpha\\ \beta\end{bmatrix} = \mathbf{M}\mathbf{p} + \mathbf{e}
\end{equation}\]
where \(\mathbf{M}=(\mathbf{x}^\top \otimes \mathbf{I}_n)\mathbf{P}\). We can solve for \(\mathbf{p}\), the parameters, using \[(\mathbf{M}^\top\mathbf{M})^{-1}\mathbf{M}^\top\mathbf{y}\] as before.
2.3.4 Code to solve for parameters in Form 2
In the homework, you will use the R code in this section to solve for the parameters in Form 2. Later when you are fitting multivariate time series models, you will not solve for parameters this way but you will need to both construct \(\mathbf{Z}\) matrices in R and read \(\mathbf{Z}\) matrices. The homework will give you practice creating \(\mathbf{Z}\) matrices in R.
#make your y and x matrices
y=matrix(dat$stack.loss, ncol=1)
x=matrix(c(1,dat$Air.Flow),ncol=1)
#make the Z matrix
n=nrow(dat) #number of rows in our data file
k=1
#Z has n rows and 1 col for intercept, and n cols for the n air data points
#a list matrix allows us to combine "characters" and numbers
Z=matrix(list(0),n,k*n+1)
Z[,1]="alpha"
diag(Z[1:n,1+1:n])="beta"
#this function creates that permutation matrix for you
P=MARSS:::convert.model.mat(Z)$free[,,1]
M=kronecker(t(x),diag(n))%*%P
solve(t(M)%*%M)%*%t(M)%*%y
[,1]
alpha -11.6159170
beta 0.6412918
coef(lm(dat$stack.loss ~ dat$Air.Flow))
(Intercept) dat$Air.Flow
-11.6159170 0.6412918
Go through this code line by line at the R command line. Look at Z. It is a list matrix that allows you to combine numbers (the 0s) with character strings (names of parameters). Look at the permutation matrix P. Try MARSS:::convert.model.mat(Z)$free and see that it returns a 3D matrix, which is why the [,,1] appears (to get us a 2D matrix). To use more data points, you can redefine dat to say dat=stackloss to use all 21 data points.
2.4 Groups of intercepts
Let’s say that the odd numbered plants are in the north and the even numbered are in the south. We want to include this as a factor in our model that affects the intercept. Let’s go back to just having air flow be our explanatory variable. Now if the plant is in the north our model is
\[\begin{equation}
\tag{2.13}
stack.loss_i = \alpha_n + \beta air_i + e_i, \text{ where } e_i \sim \text{N}(0,\sigma^2)
\end{equation}\]
If the plant is in the south, our model is
\[\begin{equation}
\tag{2.14}
stack.loss_i = \alpha_s + \beta air_i + e_i, \text{ where } e_i \sim \text{N}(0,\sigma^2)
\end{equation}\]
We’ll add north/south as a factor called `reg’ (region) to our dataframe:
dat = cbind(dat, reg=rep(c("n","s"),n)[1:n])
dat
Air.Flow Water.Temp Acid.Conc. stack.loss reg
1 80 27 89 42 n
2 80 27 88 37 s
3 75 25 90 37 n
4 62 24 87 28 s
And we can easily fit this model with lm().
fit2 = lm(stack.loss ~ -1 + Air.Flow + reg, data=dat)
coef(fit2)
Air.Flow regn regs
0.5358166 -2.0257880 -5.5429799
The -1 is added to the lm() call to get rid of \(\alpha\). We just want the \(\alpha_n\) and \(\alpha_s\) intercepts coming from our regions.
Written in matrix form, Form 1 for this model is
\[\begin{equation}
\tag{2.15}
\begin{bmatrix}stack.loss_1\\ stack.loss_2\\ stack.loss_3\\ stack.loss_4\end{bmatrix}
=
\begin{bmatrix}air_1&1&0\\ air_2&0&1\\ air_3&1&0\\ air_4&0&1\end{bmatrix}
\begin{bmatrix}\beta\\ \alpha_n\\ \alpha_s\end{bmatrix}
+
\begin{bmatrix}e_1\\e_2\\e_3\\e_4\end{bmatrix}
\end{equation}\]
Notice that odd plants get \(\alpha_n\) and even plants get \(\alpha_s\). Use model.matrix() to see that this is the \(\mathbf{Z}\) matrix that lm() formed. Notice the matrix output by model.matrix() looks exactly like \(\mathbf{Z}\) in Equation (2.15).
Z=model.matrix(fit2)
Z[1:4,]
Air.Flow regn regs
1 80 1 0
2 80 0 1
3 75 1 0
4 62 0 1
We can solve for the parameters using \(\mathbf{x} = (\mathbf{Z}^\top\mathbf{Z})^{-1}\mathbf{Z}^\top\mathbf{y}\) as we did for Form 1 before. We just need to build the \(\mathbf{Z}\) with the 1s and 0s columns we see in Equation (2.15). We could build this \(\mathbf{Z}\) using the following R code:
Z=cbind(dat$Air.Flow,c(1,0,1,0),c(0,1,0,1))
colnames(Z)=c("beta","regn","regs")
Or just use model.matrix(). This will save time when models are more complex.
Z=model.matrix(fit2)
Z[1:4,]
Air.Flow regn regs
1 80 1 0
2 80 0 1
3 75 1 0
4 62 0 1
Now we can solve for the parameters:
y=matrix(dat$stack.loss, ncol=1)
solve(t(Z)%*%Z)%*%t(Z)%*%y
[,1]
Air.Flow 0.5358166
regn -2.0257880
regs -5.5429799
Compare to the output from lm() and you will see it is the same.
coef(fit2)
Air.Flow regn regs
0.5358166 -2.0257880 -5.5429799
We would write this model in Form 2 as
\[\begin{equation}
\tag{2.16}
\begin{bmatrix}stack.loss_1\\ stack.loss_2\\ stack.loss_3\\ stack.loss_4\end{bmatrix}
=
\begin{bmatrix}
\alpha_n&\beta&0&0&0\\
\alpha_s&0&\beta&0&0\\
\alpha_n&0&0&\beta&0\\
\alpha_s&0&0&0&\beta
\end{bmatrix}\begin{bmatrix}1\\air_1\\air_2\\air_3\\air_4\end{bmatrix}
+
\begin{bmatrix}e_1\\e_2\\e_3\\e_4\end{bmatrix}=\mathbf{Z}\mathbf{x}+\mathbf{e}
\end{equation}\]
To estimate the parameters, we need to be able to write a list matrix that looks like \(\mathbf{Z}\) in Equation (2.16). We can use the same code we used in Section 2.3.4 with \(\mathbf{Z}\) changed to look like that in Equation (2.16).
y=matrix(dat$stack.loss, ncol=1)
x=matrix(c(1,dat$Air.Flow),ncol=1)
n=nrow(dat)
k=1
#list matrix allows us to combine numbers and character strings
Z=matrix(list(0),n,k*n+1)
Z[seq(1,n,2),1]="alphanorth"
Z[seq(2,n,2),1]="alphasouth"
diag(Z[1:n,1+1:n])="beta"
P=MARSS:::convert.model.mat(Z)$free[,,1]
M=kronecker(t(x),diag(n))%*%P
solve(t(M)%*%M)%*%t(M)%*%y
[,1]
alphanorth -2.0257880
alphasouth -5.5429799
beta 0.5358166
Make sure you understand the code used to form the \(\mathbf{Z}\) matrix. Also notice that class(Z[1,3])="numeric" while class(Z[1,2])="character". This is important. 0 in R is a number while "0" would be a character (the name of a parameter).
We will start by regressing stack loss against air flow. In R using the lm() function this is
#the dat data.frame is defined on the first page of the chapter
lm(stack.loss ~ Air.Flow, data=dat)
This fits the following model for the \(i\)-th measurement: \[\begin{equation} \tag{2.1} stack.loss_i = \alpha + \beta air_i + e_i, \text{ where } e_i \sim \text{N}(0,\sigma^2) \end{equation}\]
We will write the model for all the measurements together in two different ways, Form 1 and Form 2.
For the homework questions, we will use part of the airquality data set in R. Load that as
data(airquality, package="datasets")
#remove any rows with NAs
airquality=na.omit(airquality)
#make Month a factor (i.e., the Month number is a name rather than a number)
airquality$Month=as.factor(airquality$Month)
#add a region factor
airquality$region = rep(c("north","south"),60)[1:111]
#Only use 5 data points for the homework so you can show the matrices easily
homeworkdat = airquality[1:5,]
Using Form 1 \(\mathbf{y}=\mathbf{Z}\mathbf{x}+\mathbf{e}\), write out the model, showing the \(\mathbf{Z}\) and \(\mathbf{x}\) matrices, being fit by this command
fit=lm(Ozone ~ Wind + Temp, data=homeworkdat)
For the above model, write out the following R code.
Add -1 to your lm() call in question 1:
fit=lm(Ozone ~ -1 + Wind + Temp, data=homeworkdat)
For the model for question 1,
A model of the ozone data with only a region (north/south) effect can be written:
fit=lm(Ozone ~ -1 + region, data=homeworkdat)
Using the same model from question 5,
Write the \(\mathbf{Z}\) and \(\mathbf{x}\) in R code.
Solve for the parameters from the lm() call. To do this, you adapt the code from subsection 2.3.4.
Write the model below in Form 2 as an equation. Show the \(\mathbf{Z}\), \(\mathbf{y}\) and \(\mathbf{x}\) matrices.
fit=lm(Ozone ~ Temp:region, data=homeworkdat)
Using the airquality dataset with 111 data points
fit=lm(Ozone ~ -1 + Temp:region + Month, data=airquality)
2.6 Seasonal effect as a factor
Let’s imagine that the data were taken consecutively in time by quarter. We want to model the seasonal effect as an intercept change. We will drop all other effects for now.
If the data were collected in quarter 1, the model is
\[\begin{equation}
\tag{2.21}
stack.loss_i = \alpha_1 + e_i, \text{ where } e_i \sim \text{N}(0,\sigma^2)
\end{equation}\]
If the data were collected in quarter 2, the model is
\[\begin{equation}
\tag{2.22}
stack.loss_i = \alpha_2 + e_i, \text{ where } e_i \sim \text{N}(0,\sigma^2)
\end{equation}\]
etc.
We add a column to our dataframe to account for season:
dat = cbind(dat, qtr=paste(rep("qtr",n),1:4,sep=""))
dat
Air.Flow Water.Temp Acid.Conc. stack.loss reg owner qtr
1 80 27 89 42 n s qtr1
2 80 27 88 37 s a qtr2
3 75 25 90 37 n s qtr3
4 62 24 87 28 s a qtr4
And we can easily fit this model with lm().
coef(lm(stack.loss ~ -1 + qtr, data=dat))
qtrqtr1 qtrqtr2 qtrqtr3 qtrqtr4
42 37 37 28
The -1 is added to the lm() call to get rid of \(\alpha\). We just want the \(\alpha_1\), \(\alpha_2\), etc. intercepts coming from our quarters.
For comparison look at
coef(lm(stack.loss ~ qtr, data=dat))
(Intercept) qtrqtr2 qtrqtr3 qtrqtr4
42 -5 -5 -14
Why does it look like that when -1 is missing from the lm() call? Where did the intercept for quarter 1 go and why are the other intercepts so much smaller?
Remembering that lm() puts models in Form 1, look at the \(\mathbf{Z}\) matrix for Form 1:
fit4=lm(stack.loss ~ -1 + qtr, data=dat)
Z=model.matrix(fit4)
Z[1:4,]
qtrqtr1 qtrqtr2 qtrqtr3 qtrqtr4
1 1 0 0 0
2 0 1 0 0
3 0 0 1 0
4 0 0 0 1
Written in Form 1, this model is
\[\begin{equation}
\tag{2.23}
\begin{bmatrix}stack.loss_1\\ stack.loss_2\\ stack.loss_3\\ stack.loss_4\end{bmatrix}
=
\begin{bmatrix}1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\end{bmatrix}
\begin{bmatrix}\alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \end{bmatrix}
+
\begin{bmatrix}e_1\\e_2\\e_3\\e_4\end{bmatrix}=\mathbf{Z}\mathbf{x}+\mathbf{e}
\end{equation}\]
Compare to the model that lm() is using when the intercept is included. What does this model look like written in matrix form?
fit5=lm(stack.loss ~ qtr, data=dat)
Z=model.matrix(fit5)
Z[1:4,]
(Intercept) qtrqtr2 qtrqtr3 qtrqtr4
1 1 0 0 0
2 1 1 0 0
3 1 0 1 0
4 1 0 0 1
We do not need to add 1s and 0s to our \(\mathbf{Z}\) matrix in Form 2; we just add subscripts to our intercepts like we did when we had north-south intercepts. In this model, we do not have any explanatory variables (except intercept) so our \(\mathbf{x}\) is just a \(1 \times 1\) matrix:
\[\begin{equation}
\tag{2.24}
\begin{bmatrix}stack.loss_1\\ stack.loss_2\\ stack.loss_3\\ stack.loss_4\end{bmatrix}
=
\begin{bmatrix}\alpha_1\\ \alpha_2\\ \alpha_3\\ \alpha_4
\end{bmatrix}\begin{bmatrix}1\end{bmatrix}
+
\begin{bmatrix}e_1\\e_2\\e_3\\e_4\end{bmatrix}=\mathbf{Z}\mathbf{x}+\mathbf{e}
\end{equation}\]
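As an added sketch (the parameter names alpha1 to alpha4 are just labels), you could solve for the seasonal intercepts in Form 2 by adapting the code from Section 2.3.4; here \(\mathbf{x}\) is the \(1 \times 1\) matrix \([1]\):
y=matrix(dat$stack.loss, ncol=1)
x=matrix(1,1,1)
Z=matrix(list("alpha1","alpha2","alpha3","alpha4"),4,1)
P=MARSS:::convert.model.mat(Z)$free[,,1]
M=kronecker(t(x),diag(4))%*%P
solve(t(M)%*%M)%*%t(M)%*%y #matches coef(lm(stack.loss ~ -1 + qtr, data=dat))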
With our 4 data points, we are limited to estimating 4 parameters. Let’s use the full 21 data points so we can estimate some more complex models. We’ll add an owner variable and a quarter variable to the stackloss dataset.
data(stackloss, package="datasets")
fulldat=stackloss
n=nrow(fulldat)
fulldat=cbind(fulldat,
  owner=rep(c("sue","aneesh","joe"),n)[1:n],
  qtr=paste("qtr",rep(1:4,n)[1:n],sep=""),
  reg=rep(c("n","s"),n)[1:n])
Let’s fit a model where there is only an effect of air flow, but that effect varies by owner and by quarter. We also want a different intercept for each quarter. So if datapoint \(i\) is from quarter \(j\) on a plant owned by owner \(k\), the model is \[\begin{equation} \tag{2.25} stack.loss_i = \alpha_j + \beta_{j,k} air_i + e_i \end{equation}\]
So there are \(4 \times 3\) \(\beta\)’s (4 quarters and 3 owners) and 4 \(\alpha\)’s (4 quarters).
With lm(), we fit the model as:
fit7 = lm(stack.loss ~ -1 + qtr + Air.Flow:qtr:owner, data=fulldat)
Take a look at \(\mathbf{Z}\) for Form 1 using model.matrix(). It’s not shown since it is large:
model.matrix(fit7)
The \(\mathbf{x}\) will be \[\begin{bmatrix}\alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \\ \beta_{1,a} \\ \beta_{2,a} \\ \beta_{3,a} \\ \dots \end{bmatrix}\]
Take a look at the model matrix that lm() is using and make sure you understand how \(\mathbf{Z}\mathbf{x}\) produces Equation (2.25).
Z=model.matrix(fit7)
For Form 2, our \(\mathbf{Z}\) size doesn’t change; the number of rows is \(n\) (the number of data points) and the number of columns is 1 (for intercept) plus the number of explanatory variables times \(n\). So in this case, we only have one explanatory variable (air flow) so \(\mathbf{Z}\) has 1+21 columns. To allow the intercept to vary by quarter, we use \(\alpha_1\) in the rows of \(\mathbf{Z}\) where the data is from quarter 1, use \(\alpha_2\) where the data is from quarter 2, etc. Similarly we use the appropriate \(\beta_{j,k}\) depending on the quarter and owner for that data point.
We could construct \(\mathbf{Z}\), \(\mathbf{x}\) and \(\mathbf{y}\) for Form 2 using
y=matrix(fulldat$stack.loss, ncol=1)
x=matrix(c(1,fulldat$Air.Flow),ncol=1)
n=nrow(fulldat)
k=1
Z=matrix(list(0),n,k*n+1)
#give the intercepts names based on qtr
Z[,1]=paste(fulldat$qtr)
#give the betas names based on qtr and owner
diag(Z[1:n,1+1:n])=paste("beta",fulldat$qtr,fulldat$owner,sep=".")
P=MARSS:::convert.model.mat(Z)$free[,,1]
M=kronecker(t(x),diag(n))%*%P
solve(t(M)%*%M)%*%t(M)%*%y
Note, the estimates are the same as for lm() but are not listed in the same order.
Make sure to look at the \(\mathbf{Z}\) and \(\mathbf{x}\) for these models and that you understand why they look the way they do.
When we are looking at data over a large geographic region, we might make the assumption that the different census regions are measuring a single population if we think animals are moving sufficiently such that the whole area (multiple regions together) is “well-mixed”. We write a model of the total population abundance for this case as:
\[\begin{equation}
n_t = \,\text{exp}(u + w_t) n_{t-1},
\tag{7.2}
\end{equation}\]
where \(n_t\) is the total count in year \(t\), \(u\) is the mean population growth rate, and \(w_t\) is the deviation from that average in year \(t\). We then take the log of both sides and write the model in log space:
\[\begin{equation}
x_t = x_{t-1} + u + w_t, \textrm{ where } w_t \sim \,\text{N}(0,q)
\tag{7.3}
\end{equation}\]
\(x_t=\log{n_t}\). When there is one effective population, there is one \(x\), therefore \(\mathbf{x}_t\) is a \(1 \times 1\) matrix. This is our state model and \(x\) is called the “state”. This is just the jargon used in this type of model (state-space model) for the hidden state that you are estimating from the data. “Hidden” means that you observe this state with error.
We assume that all four regional time series are observations of this one population trajectory but they are scaled up or down relative to that trajectory. In effect, we think of each regional survey as an index of the total population. With this model, we do not think the regions represent independent subpopulations but rather independent observations of one population.
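As an added illustration (the parameter values are arbitrary, not estimates from the seal data), you can simulate the log-space model in Equation (7.3) to see what one population trajectory with drift looks like:
set.seed(123)
TT <- 22; u <- 0.07; q <- 0.01 #arbitrary illustrative values
x <- numeric(TT); x[1] <- log(1000)
for(t in 2:TT) x[t] <- x[t-1] + u + rnorm(1, 0, sqrt(q))
plot(x, type="l", xlab="year", ylab="log abundance")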
Our model for the data, \(\mathbf{y}_t = \mathbf{Z} \mathbf{x}_t + \mathbf{a} + \mathbf{v}_t\), is written as:
\[\begin{equation}
\left[ \begin{array}{c}
y_{1} \\
y_{2} \\
y_{3} \\
y_{4} \end{array} \right]_t =
\left[ \begin{array}{c}
1\\
1\\
1\\
1\end{array} \right] x_t +
\left[ \begin{array}{c}
a_1 \\
a_2 \\
a_3 \\
a_4 \end{array} \right] +
\left[ \begin{array}{c}
v_{1} \\
v_{2} \\
v_{3} \\
v_{4} \end{array} \right]_t
\tag{7.4}
\end{equation}\]
Each \(y_{i}\) is the observed time series of counts for a different region. The \(a\)’s are the bias between the regional sample and the total population. \(\mathbf{Z}\) specifies which observation time series, \(y_i\), is associated with which population trajectory, \(x_j\). In this case, \(\mathbf{Z}\) is a matrix with 1 column since each region is an observation of the one population trajectory.
We allow that each region could have a unique observation variance and that the observation errors are independent between regions. We assume that the observation errors on log(counts) are normal and thus the errors on (counts) are log-normal. The assumption of normality is not unreasonable since these regional counts are the sum of counts across multiple haul-outs. We specify independent observation errors with different variances by specifying that \(\mathbf{v} \sim \,\text{MVN}(0,\mathbf{R})\), where
\[\begin{equation}
\mathbf{R} = \begin{bmatrix}
r_1 & 0 & 0 & 0 \\
0 & r_2 & 0 & 0 \\
0 & 0 & r_3 & 0 \\
0 & 0 & 0 & r_4 \end{bmatrix}
\tag{7.5}
\end{equation}\]
This is a diagonal matrix with unequal variances. The shortcut for this structure in MARSS() is "diagonal and unequal".
We need to write the model in the form of Equation (7.1) with each parameter written as a matrix. The observation model (Equation (7.4)) is already in matrix form. Let’s write the state model in matrix form too:
\[\begin{equation}
[x]_t = [1][x]_{t-1} + [u] + [w]_t, \textrm{ where } [w]_t \sim \,\text{N}(0,[q])
\tag{7.6}
\end{equation}\]
It is very simple since all terms are \(1 \times 1\) matrices.
To fit our model with MARSS(), we set up a list which precisely describes the size and structure of each parameter matrix. Fixed values in a matrix are designated with their numeric value and estimated values are given a character name and put in quotes. Our model list for a single well-mixed population is:
mod.list.0 <- list(
B=matrix(1),
U=matrix("u"),
Q=matrix("q"),
Z=matrix(1,4,1),
A="scaling",
R="diagonal and unequal",
x0=matrix("mu"),
tinitx=0 )
and fit:
fit.0 <- MARSS(dat, model=mod.list.0)
Success! abstol and log-log tests passed at 32 iterations.
Alert: conv.test.slope.tol is 0.5.
Test with smaller values (<0.1) to ensure convergence.
Standard errors have not been calculated.
Use MARSSparamCIs to compute CIs and bias estimates.
We already discussed that the short-cut "diagonal and unequal" means a diagonal matrix with each diagonal element having a different value. The short-cut "scaling" means the form of \(\mathbf{a}\) in Equation (7.4) with one value set to 0 and the rest estimated. You should run the code in the list to make sure you see that each parameter in the list has the same form as in our mathematical equation for the model.
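As an added sketch of what those two short-cuts stand in for here (the parameter names are illustrative, and MARSS() decides which element of \(\mathbf{a}\) is fixed at 0 based on \(\mathbf{Z}\)):
R.du <- matrix(list(0),4,4)
diag(R.du) <- paste0("r",1:4) #"diagonal and unequal"
A.scaling <- matrix(list(0,"a2","a3","a4"),4,1) #"scaling": one value fixed at 0, the rest estimated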
The model fits fine but look at the model residuals (Figure 7.2). They have problems.
par(mfrow=c(2,2))
resids <- residuals(fit.0)
for(i in 1:4){
  plot(resids$model.residuals[i,],ylab="model residuals", xlab="")
  abline(h=0)
  title(rownames(dat)[i])
}
Here we show you how to fit a MARSS model for the harbor seal data using JAGS. We will focus on four time series from inland Washington and set up the data as follows:
data(harborSealWA, package="MARSS")
sites <- c("SJF","SJI","EBays","PSnd")
Y <- harborSealWA[,sites]
Y <- t(Y) # time across columns
We will fit the model with four temporally independent subpopulations with the same population growth rate (\(u\)) and year-to-year variance (\(q\)). This is the model in Section 7.4.
The first step is to write this model in JAGS. See Chapter 12 for more information on and examples of JAGS models.
jagsscript <- cat("
model {
   U ~ dnorm(0, 0.01);
   tauQ~dgamma(0.001,0.001);
   Q <- 1/tauQ;

   # Estimate the initial state vector of population abundances
   for(i in 1:nSites) {
      X[i,1] ~ dnorm(3,0.01); # vague normal prior
   }

   # Autoregressive process for remaining years
   for(t in 2:nYears) {
      for(i in 1:nSites) {
         predX[i,t] <- X[i,t-1] + U;
         X[i,t] ~ dnorm(predX[i,t], tauQ);
      }
   }

   # Observation model
   # The Rs are different in each site
   for(i in 1:nSites) {
      tauR[i]~dgamma(0.001,0.001);
      R[i] <- 1/tauR[i];
   }
   for(t in 1:nYears) {
      for(i in 1:nSites) {
         Y[i,t] ~ dnorm(X[i,t],tauR[i]);
      }
   }
}

",file="marss-jags.txt")
Then we write the data list, parameter list, and pass the model to the jags() function:
jags.data <- list("Y"=Y, nSites=nrow(Y), nYears = ncol(Y)) # named list
jags.params <- c("X","U","Q","R")
model.loc <- "marss-jags.txt" # name of the txt file
mod_1 <- jags(jags.data, parameters.to.save=jags.params,
  model.file=model.loc, n.chains = 3,
  n.burnin=5000, n.thin=1, n.iter=10000, DIC=TRUE)
We can plot any of the variables we chose to return to R in the jags.params list. Let’s focus on the X. Looking at the dimensions of X, we can use the apply() function to calculate the means and 95 percent CIs of the estimated states.
#attach.jags attaches the jags.params to our workspace
attach.jags(mod_1)
means <- apply(X,c(2,3),mean)
upperCI <- apply(X,c(2,3),quantile,0.975)
lowerCI <- apply(X,c(2,3),quantile,0.025)
par(mfrow = c(2,2))
nYears <- ncol(Y)
for(i in 1:nrow(means)) {
  plot(means[i,],lwd=3,ylim=range(c(lowerCI[i,],upperCI[i,])),
    type="n",main=colnames(Y)[i],ylab="log abundance", xlab="time step")
  polygon(c(1:nYears,nYears:1,1),
    c(upperCI[i,],rev(lowerCI[i,]),upperCI[i,1]),col="skyblue",lty=0)
  lines(means[i,],lwd=3)
  title(rownames(Y)[i])
}
detach.jags()
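An added, optional check (assuming the R2jags interface used above): the print and traceplot methods give quick summaries of convergence for the saved parameters.
print(mod_1) #posterior summaries, including Rhat, for the parameters in jags.params
traceplot(mod_1) #visual check of chain mixing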
As discussed in Chapter 6, the MARSS package fits multivariate state-space models in this form:
\[\begin{equation}
\begin{gathered}
\mathbf{x}_t = \mathbf{B} \mathbf{x}_{t-1}+\mathbf{u}+\mathbf{w}_t \text{ where } \mathbf{w}_t \sim \,\text{N}(0,\mathbf{Q}) \\
\mathbf{y}_t = \mathbf{Z}\mathbf{x}_{t}+\mathbf{a}+\mathbf{v}_t \text{ where } \mathbf{v}_t \sim \,\text{N}(0,\mathbf{R})
\end{gathered}
\tag{7.1}
\end{equation}\]
where each of the bolded terms are matrices. Those that are bolded and small (not capitalized) have one column only, so are column matrices.
To fit a multivariate time series model with the MARSS package, you need to first determine the size and structure of each of the parameter matrices: \(\mathbf{B}\), \(\mathbf{u}\), \(\mathbf{Q}\), \(\mathbf{Z}\), \(\mathbf{a}\), \(\mathbf{R}\) and \(\boldsymbol{\mu}\). This requires first writing down your model in matrix form. We will illustrate this with a series of models for the temporal population dynamics of West coast harbor seals.
For these questions, use the harborSealWA data set in MARSS. The data are already logged, but you will need to remove the year column and have time going across the columns not down the rows.
require(MARSS)
data(harborSealWA, package="MARSS")
dat <- t(harborSealWA[,2:6])
The sites are San Juan de Fuca (SJF 3), San Juan Islands (SJI 4), Eastern Bays (EBays 5), Puget Sound (PSnd 6) and Hood Canal (HC 7).
Plot the harbor seal data. Use whatever plotting functions you wish (e.g. ggplot(), plot(); points(); lines(), matplot()).
Fit a panmictic population model that assumes that each of the 5 sites is observing one “Inland WA” harbor seal population with trend \(u\). Assume the observation errors are independent and identical. This means 1 variance on diagonal and 0s on off-diagonal. This is the default assumption for MARSS().
Write the \(\mathbf{Z}\) for this model. The code to use for making a matrix in Rmarkdown is
$$\begin{bmatrix}a & b & 0\\d & e & f\\0 & h & i\end{bmatrix}$$
Write the \(\mathbf{Z}\) matrix in R using Z=matrix(...) and using the factor short-cut for specifying \(\mathbf{Z}\): Z=factor(c(...)).
Fit the model using MARSS(). What is the estimated trend (\(u\))? How fast was the population increasing (percent per year) based on this estimated \(u\)?
Compute the confidence intervals for the parameter estimates. Compare the intervals using the Hessian approximation and using a parametric bootstrap. What differences do you see between the two approaches? Use this code:
library(broom)
tidy(fit)
# set nboot low so it doesn't take forever
tidy(fit, method="parametric",nboot=100)
What does an estimate of \(\mathbf{Q}=0\) mean? What would the estimated state (\(x\)) look like when \(\mathbf{Q}=0\)?
Using the same panmictic population model, compare 3 assumptions about the observation error structure.
Write the \(\mathbf{R}\) variance-covariance matrices for each assumption.
Create each R matrix in R. To combine numbers and characters in a matrix, use a list matrix like so:
A <- matrix(list(0),3,3)
A[1,1] <- "sigma2"
Fit each model using MARSS() and compute the confidence intervals (CIs) for the estimated parameters. Compare the estimated \(u\) (the population long-term trend) along with their CIs. Does the assumption about the observation errors change the \(u\) estimate?
Plot the state residuals, the ACF of the state residuals, and the histogram of the state residuals for each fit. Are there any issues that you see? Use this code to get your state residuals:
residuals(fit)$state.residuals[1,]
You need the [1,] since the residuals are returned as a matrix.
Fit a model with 3 subpopulations. 1=SJF,SJI; 2=PS,EBays; 3=HC. The \(x\) part of the model is the population structure. Assume that the observation errors are identical and independent (R="diagonal and equal"). Assume that the process errors are unique and independent (Q="diagonal and unequal"). Assume that the \(u\) are unique among the 3 subpopulations.
Write the \(\mathbf{x}\) equation. Make sure each matrix in the equation has the right number of rows and columns.
Write the \(\mathbf{Z}\) matrix.
Fit the model with MARSS().
What do the estimated \(u\) and \(\mathbf{Q}\) imply about the population dynamics in the 3 subpopulations?
Repeat the fit from Question 4 but assume that the 3 subpopulations covary. Use Q="unconstrained".
What does the estimated \(\mathbf{Q}\) matrix tell you about how the 3 subpopulations covary?
Compare the AICc from the model in Question 4 and the one with Q="unconstrained". Which is more supported?
Fit the model with Q="equalvarcov". Is this more supported based on AICc?
Develop the following alternative models for the structure of the inland harbor seal population. For each model assume that the observation errors are identical and independent (R="diagonal and equal"). Assume that the process errors covary with equal variance and covariances (Q="equalvarcov").
Fit each model using MARSS().
Prepare a table of each model with a column for the AICc values and a column for \(\Delta AICc\) (AICc minus the lowest AICc in the group). What is the most supported model?
Do diagnostics on the model residuals for the 3 subpopulation model from question 4. Use the following code to get your model residuals. This will put NAs in the model residuals where there is missing data. Then do the tests on each row of resids.
resids <- residuals(fit)$model.residuals
resids[is.na(dat)] <- NA
Plot the model residuals.
Plot the ACF of the model residuals. Use acf(..., na.action=na.pass).
The model for one well-mixed population was not very good. Another reasonable assumption is that the different census regions are measuring four different temporally independent subpopulations. We write a model of the log subpopulation abundances for this case as:
\[\begin{equation}
\begin{gathered}
\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}_t =
\begin{bmatrix}1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\end{bmatrix}
\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}_{t-1} +
\begin{bmatrix}u\\u\\u\\u\end{bmatrix} +
\begin{bmatrix}w_1\\w_2\\w_3\\w_4\end{bmatrix}_t \\
\textrm{where } \begin{bmatrix}w_1\\w_2\\w_3\\w_4\end{bmatrix}_t \sim \,\text{MVN}\begin{pmatrix}0,
\begin{bmatrix}q&0&0&0\\ 0&q&0&0\\ 0&0&q&0\\ 0&0&0&q\end{bmatrix}\end{pmatrix}
\end{gathered}
\tag{7.7}
\end{equation}\]
The \(\mathbf{Q}\) matrix is diagonal with one variance value. This means that the process variance (variance in year-to-year population growth rates) is independent (good and bad years are not correlated) but the level of variability is the same across regions. We made the \(\mathbf{u}\) matrix with one \(u\) value. This means that we assume the population growth rates are the same across regions.
Notice that we set the \(\mathbf{B}\) matrix equal to a diagonal matrix with 1 on the diagonal. This is the “identity” matrix and it is like a 1 but for matrices. We do not need \(\mathbf{B}\) for our model, but MARSS() requires a value.
In this model, each survey is an observation of a different \(x\):
\[\begin{equation}
\left[ \begin{array}{c}
y_{1} \\
y_{2} \\
y_{3} \\
y_{4} \end{array} \right]_t =
\left[ \begin{array}{cccc}
1&0&0&0\\
0&1&0&0\\
0&0&1&0\\
0&0&0&1 \end{array} \right]
\left[ \begin{array}{c}
x_{1} \\
x_{2} \\
x_{3} \\
x_{4} \end{array} \right]_t +
\left[ \begin{array}{c}
0 \\
0 \\
0 \\
0 \end{array} \right] +
\left[ \begin{array}{c}
v_{1} \\
v_{2} \\
v_{3} \\
v_{4} \end{array} \right]_t
\end{equation}\]
No \(a\)’s can be estimated since we do not have multiple observations of a given \(x\) time series. Our \(\mathbf{R}\) matrix doesn’t change; the observation errors are still assumed to be independent with different variances.
Notice that our \(\mathbf{Z}\) matrix changed. \(\mathbf{Z}\) is specifying which \(y_i\) goes to which \(x_j\). The one we have specified means that \(y_1\) is observing \(x_1\), \(y_2\) observes \(x_2\), etc. We could have set up \(\mathbf{Z}\) like so
\[\begin{equation}
\begin{bmatrix}
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0
\end{bmatrix}
\end{equation}\]
This would mean that \(y_1\) observes \(x_2\), \(y_2\) observes \(x_1\), \(y_3\) observes \(x_4\), and \(y_4\) observes \(x_3\). Which \(x\) goes to which \(y\) is arbitrary; we need to make sure it is one-to-one. We will stay with \(\mathbf{Z}\) as an identity matrix since \(y_i\) observing \(x_i\) makes it easier to remember which \(x\) goes with which \(y\).
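As an added sketch, that alternative \(\mathbf{Z}\) could be written in R either as a numeric matrix or with the factor short-cut (the level names x1 to x4 are just labels for which \(x\) each \(y\) observes):
Z.alt <- matrix(c(0,1,0,0,
  1,0,0,0,
  0,0,0,1,
  0,0,1,0), 4, 4, byrow=TRUE)
Z.alt <- factor(c("x2","x1","x4","x3")) #row i gives the x that y_i observes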
We set up the model list for MARSS() as:
mod.list.1 <- list(
B="identity",
U="equal",
Q="diagonal and equal",
Z="identity",
A="scaling",
R="diagonal and unequal",
x0="unequal",
tinitx=0 )
We introduced a few more short-cuts. "equal" means all the values in the matrix are the same. "diagonal and equal" means that the matrix is diagonal with one value on the diagonal. "unequal" means that all values in the matrix are different.
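As an added sketch of the matrices these short-cuts stand in for with \(m=4\) states (the names are illustrative):
U.equal <- matrix("u",4,1) #"equal": one shared value
Q.de <- matrix(list(0),4,4)
diag(Q.de) <- "q" #"diagonal and equal": diagonal with a single value
x0.unequal <- matrix(paste0("x0",1:4),4,1) #"unequal": all values different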
We can then fit our model for 4 subpopulations as:
fit.1 <- MARSS::MARSS(dat, model=mod.list.1)
Only the \(\mathbf{Z}\) matrices change for our model. We will set up a base model list used for all models.
mod.list = list(
B = "identity",
U = "unequal",
Q = "equalvarcov",
Z = "placeholder",
A = "scaling",
R = "diagonal and equal",
x0 = "unequal",
tinitx = 0 )
Then we set up the \(\mathbf{Z}\) matrices using the factor short-cut.
Z.models <- list(
H1 = factor(c("pnw","pnw",rep("ps",4),"ca","ca","pnw","pnw","ps")),
H2 = factor(c(rep("coast",2),rep("ps",4),rep("coast",4),"ps")),
H3 = factor(c(rep("N",6),"S","S","N","S","N")),
H4 = factor(c("nc","nc","is","is","ps","ps","sc","sc","nc","sc","is")),
H5 = factor(rep("pan",11)),
H6 = factor(1:11) #site
)
names(Z.models) <-
  c("stock","coast+PS","N+S","NC+strait+PS+SC","panmictic","site")
We loop through the models, fit and store the results:
out.tab <- NULL
fits <- list()
for(i in 1:length(Z.models)){
  mod.list$Z <- Z.models[[i]]
  fit <- MARSS::MARSS(sealData, model=mod.list,
    silent=TRUE, control=list(maxit=1000))
  out <- data.frame(H=names(Z.models)[i],
    logLik=fit$logLik, AICc=fit$AICc,
    num.param=fit$num.params,
    m=length(unique(Z.models[[i]])),
    num.iter=fit$numIter,
    converged=!fit$convergence)
  out.tab <- rbind(out.tab,out)
  fits <- c(fits,list(fit))
}
We will use AICc and AIC weights to summarize the data support for the different hypotheses. First we will sort the fits based on AICc:
min.AICc <- order(out.tab$AICc)
out.tab.1 <- out.tab[min.AICc,]
Next we add the \(\Delta\)AICc values by subtracting the lowest AICc:
out.tab.1 <- cbind(out.tab.1,
  delta.AICc=out.tab.1$AICc-out.tab.1$AICc[1])
Relative likelihood is defined as \(\,\text{exp}(-\Delta \mathrm{AICc}/2)\).
out.tab.1 <- cbind(out.tab.1,
  rel.like=exp(-1*out.tab.1$delta.AICc/2))
The AIC weight for a model is its relative likelihood divided by the sum of all the relative likelihoods.
out.tab.1 <- cbind(out.tab.1,
  AIC.weight = out.tab.1$rel.like/sum(out.tab.1$rel.like))
Let’s look at the model weights (out.tab.1):
H delta.AICc AIC.weight converged
NC+strait+PS+SC 0.00 0.979 TRUE
For our next example, we will use MARSS models to test hypotheses about the population structure of harbor seals on the west coast. For this example, we will evaluate the support for different population structures (numbers of subpopulations) using different \(\mathbf{Z}\)s to specify how survey regions map onto subpopulations. We will assume correlated process errors with the same magnitude of process variance and covariance. We will assume independent observation errors with equal variances at each site. We could do unequal variances but it takes a long time to fit so for this example, the observation variances are set equal.
The dataset we will use is harborSeal, a 29-year dataset of abundance indices for 12 regions along the U.S. west coast between 1975-2004 (Figure 7.5).
We start by setting up our data matrix. We will leave off Hood Canal.
dat <- MARSS::harborSeal
years <- dat[,"Year"]
good <- !(colnames(dat)%in%c("Year","HoodCanal"))
sealData <- t(dat[,good])
The survey methodologies were consistent throughout the 20 years of the data but we do not know what fraction of the population that each region represents nor do we know the observation-error variance for each region. Given differences between the numbers of haul-outs in each region, the observation errors may be quite different. The regions have had different levels of sampling; the best sampled region has only 4 years missing while the worst has over half the years missing (Figure 7.1).
The harbor seal data are included in the MARSS package as a matrix with years in column 1 and the logged counts in the other columns. Let’s look at the first few years of data:
data(harborSealWA, package="MARSS")
print(harborSealWA[1:8,], digits=3)
Year SJF SJI EBays PSnd HC
[1,] 1978 6.03 6.75 6.63 5.82 6.6
[2,] 1979 NA NA NA NA NA
[7,] 1984 6.93 7.74 7.45 NA NA
[8,] 1985 7.16 7.53 7.26 6.60 NA
We are going to leave out Hood Canal (HC) since that region is somewhat isolated from the others and experiencing very different conditions due to hypoxic events and periodic intense killer whale predation. We will set up the data as follows:
dat <- MARSS::harborSealWA
years <- dat[,"Year"]
dat <- dat[,!(colnames(dat) %in% c("Year", "HC"))]
dat <- t(dat) # transpose to have years across columns
colnames(dat) <- years
n <- nrow(dat)-1
Here is an example where we have both process and observation error but the covariates only affect the process:
\[\begin{equation}
\begin{gathered}
\mathbf{x}_t = \mathbf{B}\mathbf{x}_{t-1} + \mathbf{C}_t\mathbf{c}_t + \mathbf{w}_t, \text{ where } \mathbf{w}_t \sim \text{MVN}(0,\mathbf{Q})\\
\mathbf{y}_t = \mathbf{Z}\mathbf{x}_t + \mathbf{a} + \mathbf{v}_t, \text{ where } \mathbf{v}_t \sim \text{MVN}(0,\mathbf{R})
\end{gathered}
\tag{8.5}
\end{equation}\]
\(\mathbf{x}\) is the true algae abundance and \(\mathbf{y}\) is the observation of the \(\mathbf{x}\)’s. Let’s say we knew that the observation variance on the algae measurements was about 0.16 and we wanted to include that known value in the model. To do that, we can simply add \(\mathbf{R}\) to the model list from the process-error only model in the last example.
-D <- d <- A <- U <- "zero"; Z <- "identity"
-B <- "diagonal and unequal"
-Q <- "equalvarcov"
-C <- "unconstrained"
-c <- covariates
-R <- diag(0.16,2)
-x0 <- "unequal"
-tinitx <- 1
-model.list <- list(B=B,U=U,Q=Q,Z=Z,A=A,R=R,D=D,d=d,C=C,c=c,x0=x0,tinitx=tinitx)
-kem <- MARSS(dat, model=model.list)
D <- d <- A <- U <- "zero"; Z <- "identity"
+B <- "diagonal and unequal"
+Q <- "equalvarcov"
+C <- "unconstrained"
+c <- covariates
+R <- diag(0.16,2)
+x0 <- "unequal"
+tinitx <- 1
+model.list <- list(B=B,U=U,Q=Q,Z=Z,A=A,R=R,D=D,d=d,C=C,c=c,x0=x0,tinitx=tinitx)
+kem <- MARSS(dat, model=model.list)
Success! abstol and log-log tests passed at 36 iterations.
Alert: conv.test.slope.tol is 0.5.
Test with smaller values (<0.1) to ensure convergence.
Standard errors have not been calculated.
Use MARSSparamCIs to compute CIs and bias estimates.
Note that our estimates of the effect of temperature and total phosphorous are not that different from what you get from a simple multiple regression (our first example). This might be because the autoregressive component is small, meaning the estimated diagonals on the \(\mathbf{B}\) matrix are small.
Here is an example where we have both process and observation error but the covariates only affect the observation process:
\[\begin{equation}
\begin{gathered}
\mathbf{x}_t = \mathbf{B}\mathbf{x}_{t-1} + \mathbf{w}_t, \text{ where } \mathbf{w}_t \sim \text{MVN}(0,\mathbf{Q})\\
\mathbf{y}_t = \mathbf{Z}\mathbf{x}_t + \mathbf{a} + \mathbf{D}\mathbf{d}_t + \mathbf{v}_t, \text{ where } \mathbf{v}_t \sim \text{MVN}(0,\mathbf{R})
\end{gathered}
\tag{8.6}
\end{equation}\]
\(\mathbf{x}\) is the true algae abundance and \(\mathbf{y}\) is the observation of the \(\mathbf{x}\)’s.
-C <- c <- A <- U <- "zero"; Z <- "identity"
-B <- "diagonal and unequal"
-Q <- "equalvarcov"
-D <- "unconstrained"
-d <- covariates
-R <- diag(0.16,2)
-x0 <- "unequal"
-tinitx <- 1
-model.list <- list(B=B,U=U,Q=Q,Z=Z,A=A,R=R,D=D,d=d,C=C,c=c,x0=x0,tinitx=tinitx)
-kem <- MARSS(dat, model=model.list)
C <- c <- A <- U <- "zero"; Z <- "identity"
+B <- "diagonal and unequal"
+Q <- "equalvarcov"
+D <- "unconstrained"
+d <- covariates
+R <- diag(0.16,2)
+x0 <- "unequal"
+tinitx <- 1
+model.list <- list(B=B,U=U,Q=Q,Z=Z,A=A,R=R,D=D,d=d,C=C,c=c,x0=x0,tinitx=tinitx)
+kem <- MARSS(dat, model=model.list)
Success! abstol and log-log tests passed at 45 iterations.
Alert: conv.test.slope.tol is 0.5.
Test with smaller values (<0.1) to ensure convergence.
For these problems, use the following code to load in the phytoplankton data and covariates and to z-score all the data. Then use dat and covars directly in your code.
phytos <- c("Cryptomonas", "Diatoms", "Greens",
- "Unicells", "Other.algae")
-yrs <- lakeWAplanktonTrans[,"Year"]%in%1985:1994
-dat <- t(lakeWAplanktonTrans[yrs,phytos])
-#z-score the data
-avg <- apply(dat, 1, mean, na.rm=TRUE)
-sd <- sqrt(apply(dat, 1, var, na.rm=TRUE))
-dat <- (dat-avg)/sd
-rownames(dat)=phytos
-#z-score the covariates
-covars <- rbind(Temp=lakeWAplanktonTrans[yrs,"Temp"],
- TP=lakeWAplanktonTrans[yrs,"TP"])
-avg <- apply(covars, 1, mean)
-sd <- sqrt(apply(covars, 1, var, na.rm=TRUE))
-covars <- (covars-avg)/sd
-rownames(covars) <- c("Temp","TP")
-
-#always check that the mean and variance are 1 after z-scoring
-apply(dat,1,mean, na.rm=TRUE) #this should be 0
phytos <- c("Cryptomonas", "Diatoms", "Greens",
+ "Unicells", "Other.algae")
+yrs <- lakeWAplanktonTrans[,"Year"]%in%1985:1994
+dat <- t(lakeWAplanktonTrans[yrs,phytos])
+#z-score the data
+avg <- apply(dat, 1, mean, na.rm=TRUE)
+sd <- sqrt(apply(dat, 1, var, na.rm=TRUE))
+dat <- (dat-avg)/sd
+rownames(dat)=phytos
+#z-score the covariates
+covars <- rbind(Temp=lakeWAplanktonTrans[yrs,"Temp"],
+ TP=lakeWAplanktonTrans[yrs,"TP"])
+avg <- apply(covars, 1, mean)
+sd <- sqrt(apply(covars, 1, var, na.rm=TRUE))
+covars <- (covars-avg)/sd
+rownames(covars) <- c("Temp","TP")
+
+#always check that the mean and variance are 1 after z-scoring
+apply(dat,1,mean, na.rm=TRUE) #this should be 0
Cryptomonas Diatoms Greens Unicells Other.algae
2.329499e-17 1.463504e-17 -2.761472e-19 3.546358e-17 1.507041e-18
apply(dat, 1, var, na.rm=TRUE) # this should be 1
Cryptomonas Diatoms Greens Unicells Other.algae
1 1 1 1 1
Here are some guidelines to help you answer the questions:
Use Z="identity".
Use B="diagonal and unequal". This implies that each of the taxa is operating under varying degrees of density-dependence and that they do not interact with any of the other taxa.
Use U="zero" and A="zero". Make sure to check that the means of the data are 0 and the variances are 1.
Use tinitx=0 (the default). Do not change to tinitx=1. You can try it to see what happens, but answer the questions with tinitx=0. Normally, tinitx=1 will make your models fit more easily when you are estimating \(\mathbf{B}\), but in this case it causes a problem. Why does \(\mathbf{R}\) tend to go to zero when tinitx=1 for the models we are fitting?
Use MARSS() without any additional arguments. If you want, you can try using control=list(maxit=1000) to increase the number of iterations, or you can try method="BFGS" in your MARSS() call. This will use the BFGS optimization method; however, it may throw an error for these data.
We will examine some basic model diagnostics for these three approaches by looking at plots of the model residuals and their autocorrelation functions (ACFs) for all five taxa using the following code:
for(i in 1:3) {
  dev.new()
  modn <- paste("seas.mod", i, sep=".")
  for(j in 1:5) {
    plot.ts(residuals(get(modn))$model.residuals[j,],
            ylab="Residual", main=phytos[j])
    abline(h=0, lty="dashed")
    acf(residuals(get(modn))$model.residuals[j,])
  }
}
We can estimate the effect of the covariates using a process-error only model, an observation-error only model, or a model with both types of error. An observation-error only model is a multivariate regression, and we will start here so you see the relationship of the MARSS model to more familiar linear regression models.
In a standard multivariate linear regression, we only have an observation model with independent errors (the state process does not appear in the model):
\[\begin{equation}
\mathbf{y}_t = \mathbf{a} + \mathbf{D}\mathbf{d}_t + \mathbf{v}_t, \text{ where } \mathbf{v}_t \sim \text{MVN}(0,\mathbf{R})
\tag{8.2}
\end{equation}\]
Written out for the two response time series and the two covariates, this is
\[\begin{equation}
\begin{split}
\begin{bmatrix} y_{1} \\ y_{2} \end{bmatrix}_t =
\begin{bmatrix} a_{1} \\ a_{2} \end{bmatrix} +
\begin{bmatrix} d_{1,\mathrm{temp}} & d_{1,\mathrm{TP}} \\ d_{2,\mathrm{temp}} & d_{2,\mathrm{TP}} \end{bmatrix}
\begin{bmatrix} \mathrm{temp} \\ \mathrm{TP} \end{bmatrix}_t +
\begin{bmatrix} v_{1} \\ v_{2} \end{bmatrix}_t
\end{split}
\tag{8.3}
\end{equation}\]
Let’s fit this model with MARSS. The \(\mathbf{x}\) part of the model is irrelevant so we want to fix the parameters in that part of the model. We won’t set \(\mathbf{B}=0\) or \(\mathbf{Z}=0\) since that might cause numerical issues for the Kalman filter. Instead we fix them as identity matrices and fix \(\mathbf{x}_0=0\) so that \(\mathbf{x}_t=0\) for all \(t\).
-Q <- U <- x0 <- "zero"; B <- Z <- "identity"
-d <- covariates
-A <- "zero"
-D <- "unconstrained"
-y <- dat # to show relationship between dat & the equation
-model.list <- list(B=B,U=U,Q=Q,Z=Z,A=A,D=D,d=d,x0=x0)
-kem <- MARSS(y, model=model.list)
Q <- U <- x0 <- "zero"; B <- Z <- "identity"
+d <- covariates
+A <- "zero"
+D <- "unconstrained"
+y <- dat # to show relationship between dat & the equation
+model.list <- list(B=B,U=U,Q=Q,Z=Z,A=A,D=D,d=d,x0=x0)
+kem <- MARSS(y, model=model.list)
Success! algorithm run for 15 iterations. abstol and log-log tests passed.
Alert: conv.test.slope.tol is 0.5.
Test with smaller values (<0.1) to ensure convergence.
Standard errors have not been calculated.
Use MARSSparamCIs to compute CIs and bias estimates.
-We set A="zero"
because the data and covariates have been demeaned. Of course, one can do multiple regression in R using, say, lm()
, and that would be much, much faster. The EM algorithm is over-kill here, but it is shown so that you see how a standard multivariate linear regression model is written as a MARSS model in matrix form.
We set A="zero"
because the data and covariates have been demeaned. Of course, one can do multiple regression in R using, say, lm()
, and that would be much, much faster. The EM algorithm is over-kill here, but it is shown so that you see how a standard multivariate linear regression model is written as a MARSS model in matrix form.
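As a rough check, here is a minimal sketch of the equivalent regression fit with lm(), assuming dat and covariates are the z-scored matrices used in this chapter, with covariate rows named Temp and TP; the row chosen and the no-intercept formula are illustrative choices, not part of the chapter code.
# A sketch only; lm() fits each response separately. Regressing one row of the
# response matrix on the covariates should give coefficients close to that row
# of the D matrix estimated by MARSS above.
df <- data.frame(y = dat[1, ],
                 Temp = covariates["Temp", ],
                 TP = covariates["TP", ])
fit.lm <- lm(y ~ Temp + TP - 1, data = df) # no intercept since data are demeaned
coef(fit.lm)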
A multivariate autoregressive state-space (MARSS) model with covariate effects in both the process and observation components is written as:
\[\begin{equation}
\begin{gathered}
\mathbf{x}_t = \mathbf{B}_t\mathbf{x}_{t-1} + \mathbf{u}_t + \mathbf{C}_t\mathbf{c}_t + \mathbf{w}_t, \text{ where } \mathbf{w}_t \sim \text{MVN}(0,\mathbf{Q}_t) \\
\mathbf{y}_t = \mathbf{Z}_t\mathbf{x}_{t} + \mathbf{a}_t + \mathbf{D}_t\mathbf{d}_t + \mathbf{v}_t, \text{ where } \mathbf{v}_t \sim \text{MVN}(0,\mathbf{R}_t)
\end{gathered}
\tag{8.1}
\end{equation}\]
where \(\mathbf{c}_t\) is the \(p \times 1\) vector of covariates (e.g., temperature, rainfall) which affect the states and \(\mathbf{d}_t\) is a \(q \times 1\) vector of covariates (potentially the same as \(\mathbf{c}_t\)), which affect the observations. \(\mathbf{C}_t\) is an \(m \times p\) matrix of coefficients relating the effects of \(\mathbf{c}_t\) to the \(m \times 1\) state vector \(\mathbf{x}_t\), and \(\mathbf{D}_t\) is an \(n \times q\) matrix of coefficients relating the effects of \(\mathbf{d}_t\) to the \(n \times 1\) observation vector \(\mathbf{y}_t\).
With the MARSS() function, one can fit this model by passing in model$c and/or model$d in the model argument as a \(p \times T\) or \(q \times T\) matrix, respectively. The form for \(\mathbf{C}_t\) and \(\mathbf{D}_t\) is similarly specified by passing in model$C and/or model$D. \(\mathbf{C}\) and \(\mathbf{D}\) are specified as 2-dimensional matrices, just as you would specify the other parameter matrices.
We will prepare the data by z-scoring. The original data lakeWAplanktonTrans were already z-scored, but we changed the mean when we subsampled the years so we need to z-score again.
# z-score the response variables
the.mean <- apply(dat, 1, mean, na.rm=TRUE)
the.sigma <- sqrt(apply(dat, 1, var, na.rm=TRUE))
dat <- (dat - the.mean) * (1/the.sigma)
Next we set up the covariate data, temperature and total phosphorous. We z-score the covariates to standardize and remove the mean.
the.mean <- apply(covariates, 1, mean, na.rm=TRUE)
the.sigma <- sqrt(apply(covariates, 1, var, na.rm=TRUE))
covariates <- (covariates - the.mean) * (1/the.sigma)
Read Section 8.8 for the data and tips on answering the questions and setting up your models. Note that the questions asking about the effects on growth rate are asking about the \(\mathbf{C}\) matrix in
\[\mathbf{x}_t=\mathbf{B}\mathbf{x}_{t-1}+\mathbf{C}\mathbf{c}_t+\mathbf{w}_t\]
The \(\mathbf{C}\mathbf{c}_t+\mathbf{w}_t\) are the process errors and represent the growth rates (growth above or below what you would expect given \(\mathbf{x}_{t-1}\)). Use your raw data in the MARSS model. You do not need to difference the data to get at the growth rates since the process model is modeling that. A minimal sketch of one such model specification is shown below.
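This sketch is only an illustration of the setup, not the answer to any particular question; it assumes the z-scored dat and covars matrices from Section 8.8, and the parameter choices shown are assumptions you may want to change.
# A sketch only: process-error model with covariate effects on the growth rates.
model.list <- list(
  B = "diagonal and unequal", # taxon-specific density dependence
  U = "zero",                 # data are z-scored
  Q = "diagonal and unequal", # independent process errors
  Z = "identity",             # one state per observed taxon
  A = "zero",                 # data are demeaned
  R = "diagonal and equal",   # similar observation error across taxa
  C = "unconstrained",        # covariate effects on the growth rates
  c = covars
)
fit <- MARSS::MARSS(dat, model = model.list)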
Now let’s model the data as an autoregressive process observed without error, and incorporate the covariates into the process model. Note that this is much different from typical linear regression models. The \(\mathbf{x}\) part represents our model of the data (in this case plankton species). How is this different from the autoregressive observation errors? Well, we are modeling our data as autoregressive so data at \(t-1\) affects the data at \(t\). Population abundances are inherently autoregressive so this model is a bit closer to the underlying mechanism generating the data. Here is our new process model for plankton abundance.
\[\begin{equation}
\mathbf{x}_t = \mathbf{x}_{t-1} + \mathbf{C}\mathbf{c}_t + \mathbf{w}_t, \text{ where } \mathbf{w}_t \sim \text{MVN}(0,\mathbf{Q})
\tag{8.4}
\end{equation}\]
We can fit this as follows:
-R <- A <- U <- "zero"; B <- Z <- "identity"
-Q <- "equalvarcov"
-C <- "unconstrained"
-model.list <- list(B=B,U=U,Q=Q,Z=Z,A=A,R=R,C=C,c=covariates)
-kem <- MARSS(dat, model=model.list)
R <- A <- U <- "zero"; B <- Z <- "identity"
+Q <- "equalvarcov"
+C <- "unconstrained"
+model.list <- list(B=B,U=U,Q=Q,Z=Z,A=A,R=R,C=C,c=covariates)
+kem <- MARSS(dat, model=model.list)
Success! algorithm run for 15 iterations. abstol and log-log tests passed.
Alert: conv.test.slope.tol is 0.5.
Test with smaller values (<0.1) to ensure convergence.
Standard errors have not been calculated.
Use MARSSparamCIs to compute CIs and bias estimates.
Now, it looks like temperature has a strong negative effect on algae? Also our log-likelihood dropped a lot. Well, the data do not look at all like a random walk model (where \(\mathbf{B}=1\)), which we can see from the plot of the data (Figure 8.1). The data are fluctuating about some mean so let’s switch to a better autoregressive model—a mean-reverting model. To do this, we will allow the diagonal elements of \(\mathbf{B}\) to be something other than 1.
model.list$B <- "diagonal and unequal"
kem <- MARSS(dat, model=model.list)
Success! algorithm run for 15 iterations. abstol and log-log tests passed.
Alert: conv.test.slope.tol is 0.5.
Test with smaller values (<0.1) to ensure convergence.
Use MARSSparamCIs to compute CIs and bias estimates.
Notice that the log-likelihood goes up quite a bit, which means that the mean-reverting model fits the data much better.
With this model, we are estimating \(\mathbf{x}_0\). If we set model$tinitx=1, we will get an error message that the \(\mathbf{R}\) diagonals are equal to 0 and we need to fix x0. Because \(\mathbf{R}=0\), if we set the initial states at \(t=1\), then they are fully determined by the data.
x0 <- dat[,1,drop=FALSE]
model.list$tinitx <- 1
model.list$x0 <- x0
kem <- MARSS(dat, model=model.list)
Success! algorithm run for 15 iterations. abstol and log-log tests passed.
Alert: conv.test.slope.tol is 0.5.
Test with smaller values (<0.1) to ensure convergence.
Time-series data are often collected at intervals with some implicit “seasonality.” For example, quarterly earnings for a business, monthly rainfall totals, or hourly air temperatures. In those cases, it is often helpful to extract any recurring seasonal patterns that might otherwise mask some of the other temporal dynamics we are interested in examining.
Here we show a few approaches for including seasonal effects using the Lake Washington plankton data, which were collected monthly. The following examples will use all five phytoplankton species from Lake Washington. First, let’s set up the data.
years <- fulldat[,"Year"] >= 1965 & fulldat[,"Year"] < 1975
phytos <- c("Diatoms", "Greens", "Bluegreens",
            "Unicells", "Other.algae")
dat <- t(fulldat[years, phytos])

# z-score the data because we changed the mean when we subsampled
the.mean <- apply(dat, 1, mean, na.rm=TRUE)
the.sigma <- sqrt(apply(dat, 1, var, na.rm=TRUE))
dat <- (dat - the.mean) * (1/the.sigma)
# number of time periods/samples
TT <- dim(dat)[2]
One common approach for estimating seasonal effects is to treat each one as a fixed factor, such that the number of parameters equals the number of “seasons” (e.g., 24 hours per day, 4 quarters per year). The plankton data are collected monthly, so we will treat each month as a fixed factor. To fit a model with fixed month effects, we create a \(12 \times T\) covariate matrix \(\mathbf{c}\) with one row for each month (Jan, Feb, …) and one column for each time point. We put a 1 in the January row for each column corresponding to a January time point, a 1 in the February row for each column corresponding to a February time point, and so on. All other values of \(\mathbf{c}\) equal 0. The following code will create such a \(\mathbf{c}\) matrix.
-# number of "seasons" (e.g., 12 months per year)
-period <- 12
-# first "season" (e.g., Jan = 1, July = 7)
-per.1st <- 1
-# create factors for seasons
-c.in <- diag(period)
-for(i in 2:(ceiling(TT/period)))
- {c.in <- cbind(c.in,diag(period))}
-# trim c.in to correct start & length
-c.in <- c.in[,(1:TT)+(per.1st-1)]
-# better row names
-rownames(c.in) <- month.abb
Next we need to set up the form of the \(\mathbf{C}\) matrix which defines any constraints we want to set on the month effects. \(\mathbf{C}\) is a \(5 \times 12\) matrix: five taxa and 12 month effects. If we wanted each taxon to have the same month effect, i.e. a common month effect across all taxa, then we would have the same value in each \(\mathbf{C}\) column:
C <- matrix(month.abb, 5, 12, byrow=TRUE)
C
     [,1]  [,2]  [,3]  [,4]  [,5]  [,6]  [,7]  [,8]  [,9]  [,10] [,11] [,12]
[1,] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
[2,] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
[3,] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
[4,] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
[5,] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
Notice that \(\mathbf{C}\) only has 12 values in it, the 12 common month effects. However, for this example, we will let each taxon have a different month effect, thus allowing different seasonality for each taxon. For this model, we want each value in \(\mathbf{C}\) to be unique:
C <- "unconstrained"
# number of "seasons" (e.g., 12 months per year)
+period <- 12
+# first "season" (e.g., Jan = 1, July = 7)
+per.1st <- 1
+# create factors for seasons
+c.in <- diag(period)
+for(i in 2:(ceiling(TT/period)))
+ {c.in <- cbind(c.in,diag(period))}
+# trim c.in to correct start & length
+c.in <- c.in[,(1:TT)+(per.1st-1)]
+# better row names
+rownames(c.in) <- month.abb
Next we need to set up the form of the \(\mathbf{C}\) matrix which defines any constraints we want to set on the month effects. \(\mathbf{C}\) is a \(5 \times 12\) matrix. Five taxon and 12 month effects. +If we wanted each taxon to have the same month effect, i.e. there is a common month effect across all taxon, then +we have the same value in each \(\mathbf{C}\) column:
+ + [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
+[1,] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
+[2,] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
+[3,] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
+[4,] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
+[5,] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
+Notice, that \(\mathbf{C}\) only has 12 values in it, the 12 common month effects. +However, for this example, we will let each taxon have a different month effect thus allowing different seasonality for each taxon. For this model, we want each value in \(\mathbf{C}\) to be unique:
Now \(\mathbf{C}\) has 5 \(\times\) 12 = 60 separate effects.
Then we set up the form for the rest of the model parameters. We make the following assumptions:
# Each taxon has unique density-dependence
B <- "diagonal and unequal"
# Assume independent process errors
Q <- "diagonal and unequal"
# We have demeaned the data & are fitting a mean-reverting model
# by estimating a diagonal B, thus
U <- "zero"
# Each obs time series is associated with only one process
Z <- "identity"
# The data are demeaned & fluctuate around a mean
A <- "zero"
# We assume observation errors are independent, but they
# have similar variance due to similar collection methods
R <- "diagonal and equal"
# We are not including covariate effects in the obs equation
D <- "zero"
d <- "zero"
Now we can set up the model list for MARSS and fit the model (results are not shown since they are verbose with 60 different month effects).
model.list <- list(B=B, U=U, Q=Q, Z=Z, A=A, R=R, C=C, c=c.in, D=D, d=d)
seas.mod.1 <- MARSS(dat, model=model.list, control=list(maxit=1500))

# Get the estimated seasonal effects
# rows are taxa, cols are seasonal effects
seas.1 <- coef(seas.mod.1, type="matrix")$C
rownames(seas.1) <- phytos
colnames(seas.1) <- month.abb
The top panel in Figure 8.2 shows the estimated seasonal effects for this model. Note that if we had set U="unequal", we would need to set one of the columns of \(\mathbf{C}\) to zero because the model would be under-determined (infinite number of solutions). If we subtracted the mean January abundance off each time series, we could set the January column in \(\mathbf{C}\) to 0 and get rid of 5 estimated effects.
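For illustration only, here is a minimal sketch of how such a constrained \(\mathbf{C}\) could be specified as a list matrix; it assumes the phytos vector defined above, the effect names are arbitrary labels, and it does not replace the data adjustment described in the text.
# A sketch only: fix the January column of C at 0 and estimate the rest,
# naming each estimated effect by taxon and month.
C.jan0 <- matrix(list(0), 5, 12)
for(i in 1:5) {
  for(j in 2:12) {
    C.jan0[i, j] <- paste(phytos[i], month.abb[j], sep=".")
  }
}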
The fixed factor approach required estimating 60 effects. Another approach is to model the month effect as a 3rd-order (or higher) polynomial: \(a+b\times m + c\times m^2 + d \times m^3\) where \(m\) is the month number. This approach has less flexibility but requires only 20 estimated parameters (i.e., 4 regression parameters times 5 taxa). To do so, we create a \(4 \times T\) covariate matrix \(\mathbf{c}\) with the rows corresponding to 1, \(m\), \(m^2\), and \(m^3\), and the columns again corresponding to the time points. Here is how to set up this matrix:
-# number of "seasons" (e.g., 12 months per year)
-period <- 12
-# first "season" (e.g., Jan = 1, July = 7)
-per.1st <- 1
-# order of polynomial
-poly.order <- 3
-# create polynomials of months
-month.cov <- matrix(1,1,period)
-for(i in 1:poly.order) {month.cov = rbind(month.cov,(1:12)^i)}
-# our c matrix is month.cov replicated once for each year
-c.m.poly <- matrix(month.cov, poly.order+1, TT+period, byrow=FALSE)
-# trim c.in to correct start & length
-c.m.poly <- c.m.poly[,(1:TT)+(per.1st-1)]
-
-# Everything else remains the same as in the previous example
-model.list <- list(B=B,U=U,Q=Q,Z=Z,A=A,R=R,C=C,c=c.m.poly,D=D,d=d)
-seas.mod.2 <- MARSS(dat, model=model.list, control=list(maxit=1500))
# number of "seasons" (e.g., 12 months per year)
+period <- 12
+# first "season" (e.g., Jan = 1, July = 7)
+per.1st <- 1
+# order of polynomial
+poly.order <- 3
+# create polynomials of months
+month.cov <- matrix(1,1,period)
+for(i in 1:poly.order) {month.cov = rbind(month.cov,(1:12)^i)}
+# our c matrix is month.cov replicated once for each year
+c.m.poly <- matrix(month.cov, poly.order+1, TT+period, byrow=FALSE)
+# trim c.in to correct start & length
+c.m.poly <- c.m.poly[,(1:TT)+(per.1st-1)]
+
+# Everything else remains the same as in the previous example
+model.list <- list(B=B,U=U,Q=Q,Z=Z,A=A,R=R,C=C,c=c.m.poly,D=D,d=d)
+seas.mod.2 <- MARSS(dat, model=model.list, control=list(maxit=1500))
The effect of month \(m\) for taxon \(i\) is \(a_i + b_i \times m + c_i \times m^2 + d_i \times m^3\), where \(a_i\), \(b_i\), \(c_i\) and \(d_i\) are in the \(i\)-th row of \(\mathbf{C}\). We can now calculate the matrix of seasonal effects as follows, where each row is a taxon and each column is a month:
C.2 <- coef(seas.mod.2, type="matrix")$C
seas.2 <- C.2 %*% month.cov
rownames(seas.2) <- phytos
colnames(seas.2) <- month.abb
The middle panel in Figure 8.2 shows the estimated seasonal effects for this polynomial model.
Note: Setting the covariates up like this means that our covariates are collinear, since \(m\), \(m^2\) and \(m^3\) are correlated. A better approach is to use the poly() function to create an orthogonal polynomial covariate matrix c.m.poly.o:
month.cov.o <- cbind(1, poly(1:period, poly.order))
c.m.poly.o <- matrix(t(month.cov.o), poly.order+1, TT+period, byrow=FALSE)
c.m.poly.o <- c.m.poly.o[,(1:TT)+(per.1st-1)]
The factor approach required estimating 60 effects, and the 3rd-order polynomial model was an improvement at only 20 parameters. A third option is to use a discrete Fourier series, which is a combination of sine and cosine waves; it would require only 10 parameters. Specifically, the effect of month \(m\) on taxon \(i\) is \(a_i \times \cos(2 \pi m/p) + b_i \times \sin(2 \pi m/p)\), where \(p\) is the period (e.g., 12 months, 4 quarters), and \(a_i\) and \(b_i\) are contained in the \(i\)-th row of \(\mathbf{C}\).
We begin by defining the \(2 \times T\) seasonal covariate matrix \(\mathbf{c}\) as a combination of 1 cosine and 1 sine wave:
cos.t <- cos(2 * pi * seq(TT) / period)
sin.t <- sin(2 * pi * seq(TT) / period)
c.Four <- rbind(cos.t, sin.t)
Everything else remains the same and we can fit this model as follows:
model.list <- list(B=B, U=U, Q=Q, Z=Z, A=A, R=R, C=C, c=c.Four, D=D, d=d)
seas.mod.3 <- MARSS(dat, model=model.list, control=list(maxit=1500))
We make our seasonal effect matrix as follows:
C.3 <- coef(seas.mod.3, type="matrix")$C
# The time series of net seasonal effects
seas.3 <- C.3 %*% c.Four[,1:period]
rownames(seas.3) <- phytos
colnames(seas.3) <- month.abb
The bottom panel in Figure 8.2 shows the estimated seasonal effects for this seasonal-effects model based on a discrete Fourier series.
Rather than rely on our eyes to judge model fits, we should formally assess which of the 3 approaches offers the most parsimonious fit to the data. Here is a table of AICc values for the 3 models:
-data.frame(Model=c("Fixed", "Cubic", "Fourier"),
- AICc=round(c(seas.mod.1$AICc,
- seas.mod.2$AICc,
- seas.mod.3$AICc),1))
data.frame(Model=c("Fixed", "Cubic", "Fourier"),
+ AICc=round(c(seas.mod.1$AICc,
+ seas.mod.2$AICc,
+ seas.mod.3$AICc),1))
Model AICc
1 Fixed 1188.4
2 Cubic 1144.9
The specific formulation of Equation (8.1) creates restrictions on the assumptions regarding the covariate data. You have to assume that your covariate data has no error, which is probably not true. You cannot have missing values in your covariate data, again unlikely. You cannot combine instrument time series; for example, if you have two temperature recorders with different error rates and biases. Also, what if you have one noisy temperature sensor in the first part of your time series and then you switch to a much better sensor in the second half of your time series? All these problems require pre-analysis massaging of the covariate data, leaving out noisy and gappy covariate data, and making what can feel like arbitrary choices about which covariate time series to include.
To circumvent these potential problems and allow more flexibility in how we incorporate covariate data, one can instead treat the covariates as components of an auto-regressive process by including them in both the process and observation models. Beginning with the process equation, we can write
\[\begin{equation}
\begin{gathered}
\begin{bmatrix}\mathbf{x}^{(v)} \\ \mathbf{x}^{(c)}\end{bmatrix}_t
= \begin{bmatrix}\mathbf{B}^{(v)} & \mathbf{C} \\ 0 & \mathbf{B}^{(c)}\end{bmatrix}
\begin{bmatrix}\mathbf{x}^{(v)} \\ \mathbf{x}^{(c)}\end{bmatrix}_{t-1}
+ \begin{bmatrix}\mathbf{u}^{(v)} \\ \mathbf{u}^{(c)}\end{bmatrix}
+ \mathbf{w}_t, \text{ where } \mathbf{w}_t \sim \text{MVN}\left(0, \begin{bmatrix}\mathbf{Q}^{(v)} & 0 \\ 0 & \mathbf{Q}^{(c)}\end{bmatrix}\right)
\end{gathered}
\tag{11.1}
\end{equation}\]
The elements with superscript \({(v)}\) are for the \(k\) variate states and those with superscript \({(c)}\) are for the \(q\) covariate states. The dimension of \(\mathbf{x}^{(c)}\) is \(q \times 1\) and \(q\) is not necessarily equal to \(p\), the number of covariate observation time series in your dataset. Imagine, for example, that you have two temperature sensors and you are combining these data. Then you have two covariate observation time series (\(p=2\)) but only one underlying covariate state time series (\(q=1\)). The matrix \(\mathbf{C}\) is dimension \(k \times q\), and \(\mathbf{B}^{(c)}\) and \(\mathbf{Q}^{(c)}\) are dimension \(q \times q\). The dimension of \(\mathbf{x}^{(v)}\) is \(k \times 1\), and \(\mathbf{B}^{(v)}\) and \(\mathbf{Q}^{(v)}\) are dimension \(k \times k\). The dimension of \(\mathbf{x}\) is always denoted \(m\). If your process model includes only variates, then \(k=m\), but now your process model includes \(k\) variates and \(q\) covariate states so \(m=k+q\).
Next, we can write the observation equation in an analogous manner, such that
\[\begin{equation}
\begin{gathered}
\begin{bmatrix} \mathbf{y}^{(v)} \\ \mathbf{y}^{(c)} \end{bmatrix}_t
= \begin{bmatrix}\mathbf{Z}^{(v)} & \mathbf{D} \\ 0 & \mathbf{Z}^{(c)}\end{bmatrix}
\begin{bmatrix}\mathbf{x}^{(v)} \\ \mathbf{x}^{(c)}\end{bmatrix}_t
+ \begin{bmatrix}\mathbf{a}^{(v)} \\ \mathbf{a}^{(c)}\end{bmatrix}
+ \mathbf{v}_t, \text{ where } \mathbf{v}_t \sim \text{MVN}\left(0, \begin{bmatrix}\mathbf{R}^{(v)} & 0 \\ 0 & \mathbf{R}^{(c)}\end{bmatrix}\right)
\end{gathered}
\tag{11.2}
\end{equation}\]
The dimension of \(\mathbf{y}^{(c)}\) is \(p \times 1\), where \(p\) is the number of covariate observation time series in your dataset. The dimension of \(\mathbf{y}^{(v)}\) is \(l \times 1\), where \(l\) is the number of variate observation time series in your dataset. The total dimension of \(\mathbf{y}\) is \(l+p\). The matrix \(\mathbf{D}\) is dimension \(l \times q\), \(\mathbf{Z}^{(c)}\) is dimension \(p \times q\), and \(\mathbf{R}^{(c)}\) is dimension \(p \times p\). The dimension of \(\mathbf{Z}^{(v)}\) is \(l \times k\), and \(\mathbf{R}^{(v)}\) is dimension \(l \times l\).
The \(\mathbf{D}\) matrix would presumably have a number of all-zero rows in it, as would the \(\mathbf{C}\) matrix. The covariates that affect the states would often be different than the covariates that affect the observations. For example, mean annual temperature might affect population growth rates for many species while having little or no effect on observability, and turbidity might strongly affect observability in many types of aquatic surveys but have little effect on population growth rate. A minimal sketch of such a \(\mathbf{C}\) is shown below.
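For illustration only (the state dimensions and effect names here are hypothetical), a \(\mathbf{C}\) in which some states do not respond to the covariate states could be specified as a list matrix:
# A sketch only: 3 variate states and 2 covariate states.
# All-zero rows mean that state is unaffected by the covariate states.
C <- matrix(list(0), 3, 2)
C[1, 1] <- "temp.effect.sp1"      # hypothetical effects to be estimated
C[2, 1] <- "temp.effect.sp2"
C[3, 2] <- "turbidity.effect.sp3"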
Our MARSS model with covariates now looks on the surface like a regular MARSS model:
\[\begin{equation}
\begin{gathered}
\mathbf{x}_t = \mathbf{B}\mathbf{x}_{t-1} + \mathbf{u} + \mathbf{w}_t, \text{ where } \mathbf{w}_t \sim \,\text{MVN}(0,\mathbf{Q}) \\
\mathbf{y}_t = \mathbf{Z}\mathbf{x}_{t} + \mathbf{a} + \mathbf{v}_t, \text{ where } \mathbf{v}_t \sim \,\text{MVN}(0,\mathbf{R})
\end{gathered}
\tag{11.3}
\end{equation}\]
Note \(\mathbf{Q}\) and \(\mathbf{R}\) are written as block-diagonal matrices, but you could allow covariances if that made sense. \(\mathbf{u}\) and \(\mathbf{a}\) are column vectors here. We can fit the model (Equation (11.3)) as usual using the MARSS() function.
The log-likelihood that is returned by MARSS will include the log-likelihood of the covariates under the covariate state model. If you want only the log-likelihood of the non-covariate data, you will need to subtract off the log-likelihood of the covariate model:
\[\begin{equation}
\begin{gathered}
\mathbf{x}^{(c)}_t = \mathbf{B}^{(c)}\mathbf{x}_{t-1}^{(c)} + \mathbf{u}^{(c)} + \mathbf{w}_t, \text{ where } \mathbf{w}_t \sim \,\text{MVN}(0,\mathbf{Q}^{(c)}) \\
\mathbf{y}^{(c)}_t = \mathbf{Z}^{(c)}\mathbf{x}_{t}^{(c)} + \mathbf{a}^{(c)} + \mathbf{v}_t, \text{ where } \mathbf{v}_t \sim \,\text{MVN}(0,\mathbf{R}^{(c)})
\end{gathered}
\tag{11.4}
\end{equation}\]
An easy way to get this log-likelihood for the covariate data only is to use the augmented model (Equation (11.2) with terms defined as in Equation (11.3)) but pass in missing values for the non-covariate data. The following code shows how to do this.
y.aug <- rbind(data, covariates)
fit.aug <- MARSS(y.aug, model=model.aug)
fit.aug is the MLE object that can be passed to MARSSkf(). You need to make a version of this MLE object with the non-covariate data filled with NAs so that you can compute the log-likelihood without the covariates. This needs to be done in the marss element since that is what is used by MARSSkf(). Below is code to do this.
fit.cov <- fit.aug
fit.cov$marss$data[1:dim(data)[1], ] <- NA
extra.LL <- MARSSkf(fit.cov)$logLik
Note that when you fit the augmented model, the estimates of \(\mathbf{C}\) and \(\mathbf{B}^{(c)}\) are affected by the non-covariate data since the models for the non-covariate and covariate data are estimated simultaneously and are not independent (since the covariate states affect the non-covariate states). If you want the covariate model to be unaffected by the non-covariate data, you can fit the covariate model separately and use the estimates for \(\mathbf{B}^{(c)}\) and \(\mathbf{Q}^{(c)}\) as fixed values in your augmented model. A rough sketch of that two-step approach follows.
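This sketch only outlines the idea; the model forms chosen for the covariate-only fit are assumptions, and how the fixed values are placed in the augmented \(\mathbf{B}\) and \(\mathbf{Q}\) depends on how model.aug was constructed.
# A sketch only. Step 1: fit a state-space model to the covariate data alone.
fit.c <- MARSS(covariates,
               model = list(B = "diagonal and unequal",
                            Q = "diagonal and unequal",
                            Z = "identity",
                            R = "diagonal and equal"))
B.c <- coef(fit.c, type = "matrix")$B
Q.c <- coef(fit.c, type = "matrix")$Q
# Step 2: insert B.c and Q.c as fixed numeric values in the covariate block
# of the augmented model's B and Q, then refit the augmented model.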
A variation of the random walk model described previously is the autoregressive time series model of order 1, AR(1). This model is essentially the same as the random walk model but it introduces an estimated coefficient, which we will call \(\phi\). The parameter \(\phi\) controls the degree to which the random walk reverts to the mean: when \(\phi = 1\), the model is identical to the random walk, but at smaller values, the model will revert back to the mean (which in this case is zero). Also, \(\phi\) can take on negative values, which we’ll discuss more in future lectures. The math to describe the AR(1) model is:
\[y_t = \phi y_{t-1} + e_{t}\]
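To build intuition for how \(\phi\) controls mean reversion, here is a small base R simulation sketch; the seed, sample size, and \(\phi\) value are arbitrary choices and it is not part of the atsar workflow.
# Simulate an AR(1) with phi = 1 (a random walk) and phi = 0.5 (mean-reverting).
set.seed(123)
n.sim <- 100
e <- rnorm(n.sim)
y.rw <- cumsum(e)                 # phi = 1: a random walk
y.ar <- numeric(n.sim)
for (t in 2:n.sim) y.ar[t] <- 0.5 * y.ar[t - 1] + e[t]  # phi = 0.5
plot.ts(cbind(random.walk = y.rw, ar1 = y.ar))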
The fit_stan() function can fit higher order AR models, but for now we just want to fit an AR(1) model and make a histogram of phi.
ar1 <- atsar::fit_stan(y = Temp, x = matrix(1, nrow = length(Temp), ncol = 1),
                       model_name = "ar", est_drift = FALSE, P = 1)
First load the plankton dataset from the MARSS package.
library(MARSS)
data(lakeWAplankton, package="MARSS")
# we want lakeWAplanktonTrans, which has been transformed
# so the 0s are replaced with NAs and the data z-scored
dat <- lakeWAplanktonTrans
# use only the 10 years from 1980-1989
plankdat <- dat[dat[,"Year"]>=1980 & dat[,"Year"]<1990,]
# create vector of phytoplankton group names
phytoplankton <- c("Cryptomonas", "Diatoms", "Greens",
                   "Unicells", "Other.algae")
# get only the phytoplankton
dat.spp.1980 <- t(plankdat[,phytoplankton])
# z-score the data since we subsetted time
dat.spp.1980 <- dat.spp.1980 - apply(dat.spp.1980, 1, mean, na.rm=TRUE)
dat.spp.1980 <- dat.spp.1980 / sqrt(apply(dat.spp.1980, 1, var, na.rm=TRUE))
# check our z-score
apply(dat.spp.1980, 1, mean, na.rm=TRUE)
Cryptomonas Diatoms Greens Unicells Other.algae
4.951913e-17 -1.337183e-17 3.737694e-18 -5.276451e-18 4.365269e-18
apply(dat.spp.1980, 1, var, na.rm=TRUE)
Cryptomonas Diatoms Greens Unicells Other.algae
1 1 1 1 1
Plot the data.
# make into ts since easier to plot
dat.ts <- ts(t(dat.spp.1980), frequency=12, start=c(1980,1))
par(mfrow=c(3,2), mar=c(2,2,2,2))
for(i in 1:5)
  plot(dat.ts[,i], type="b",
       main=colnames(dat.ts)[i], col="blue", pch=16)
Run a 3-trend model on these data.
mod_3 <- atsar::fit_dfa(y = dat.spp.1980, num_trends=3)
Rotate the estimated trends and look at what it produces.
rot <- atsar::rotate_trends(mod_3)
names(rot)
[1] "Z_rot"        "trends"       "Z_rot_mean"   "trends_mean"  "trends_lower"
[6] "trends_upper"
Plot the estimate of the trends.
matplot(t(rot$trends_mean), type="l", lwd=2, ylab="mean trend")
We will fit multiple DFA models with different numbers of trends and use leave-one-out (LOO) cross-validation to choose the best model.
mod_1 <- atsar::fit_dfa(y = dat.spp.1980, num_trends=1)
mod_2 <- atsar::fit_dfa(y = dat.spp.1980, num_trends=2)
mod_3 <- atsar::fit_dfa(y = dat.spp.1980, num_trends=3)
mod_4 <- atsar::fit_dfa(y = dat.spp.1980, num_trends=4)
mod_5 <- atsar::fit_dfa(y = dat.spp.1980, num_trends=5)
Warning: There were 92 transitions after warmup that exceeded the maximum treedepth. Increase max_treedepth above 10. See
http://mc-stan.org/misc/warnings.html#maximum-treedepth-exceeded
Warning: Examine the pairs() plot to diagnose sampling problems
Warning: The largest R-hat is 1.12, indicating chains have not mixed.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#r-hat
Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#bulk-ess
Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
Running the chains for more iterations may help. See
http://mc-stan.org/misc/warnings.html#tail-ess
We will compute the Leave One Out Information Criterion (LOOIC) using the loo package. Like AIC, lower is better.
loo::loo(loo::extract_log_lik(mod_1))$looic
[1] 1662.225
Table of the LOOIC values:
looics <- c(
  loo::loo(loo::extract_log_lik(mod_1))$looic,
  loo::loo(loo::extract_log_lik(mod_2))$looic,
  loo::loo(loo::extract_log_lik(mod_3))$looic,
  loo::loo(loo::extract_log_lik(mod_4))$looic,
  loo::loo(loo::extract_log_lik(mod_5))$looic
)
looic.table <- data.frame(trends=1:5, LOOIC=looics)
looic.table
  trends    LOOIC
1      1 1662.225
2      2 1593.104
3      3 1469.747
4      4 1425.165
5      5 1414.991
In our first model, the errors were independent in time. We’re going to modify this to model autocorrelated errors. Autocorrelated errors are widely used in ecology and other fields; for a greater discussion, see Morris and Doak (2002) Quantitative Conservation Biology. To make the errors autocorrelated, we start by defining the error in the first time step, \({e}_{1} = y_{1} - \beta\). The expectation of \({Y_t}\) in each time step is then written as
\[E[{Y_t}] = \beta + \phi e_{t-1}\]
In addition to affecting the expectation, the correlation parameter \(\phi\) also affects the variance of the errors, so that
\[{ \sigma }^{ 2 }={ \psi }^{ 2 }\left( 1-{ \phi }^{ 2 } \right)\]
Like in our first model, we assume that the data follow a normal likelihood (or equivalently that the residuals are normally distributed), \(y_t = E[Y_t] + e_t\), or \(Y_t \sim N(E[{Y_t}], \sigma)\). Thus, it is possible to express the subsequent deviations as \({e}_{t} = {y}_{t} - E[{Y_t}]\), or equivalently as \({e}_{t} = {y}_{t} - \beta -\phi {e}_{t-1}\).
We can fit this regression with autocorrelated errors by changing the model name to ‘regression_cor’
lm_intercept_cor <- atsar::fit_stan(y = Temp, x = rep(1, length(Temp)),
                                    model_name = "regression_cor",
                                    mcmc_list = list(n_mcmc = 1000, n_burn = 1, n_chain = 1, n_thin = 1))
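To see what this error structure implies for simulated data, here is a small base R sketch; the parameter values are arbitrary and the simulation is separate from the atsar fit above.
# Simulate a constant-mean series with AR(1) errors: e_t = phi*e_{t-1} + noise.
set.seed(1)
n <- 100; beta <- 10; phi <- 0.7
e <- numeric(n)
for (t in 2:n) e[t] <- phi * e[t - 1] + rnorm(1)
y <- beta + e
acf(y - beta)  # the residuals are clearly autocorrelated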
We’ll start with the simplest time series model possible: linear regression with only an intercept, so that the predicted values of all observations are the same. There are several ways we can write this equation. First, the predicted values can be written as \(E[Y_{t}] = \beta x\), where \(x=1\). Assuming that the residuals are normally distributed, the model linking our predictions to observed data is written as
\[y_t = \beta x + e_{t}, e_{t} \sim N(0,\sigma), x=1\]
An equivalent way to think about this model is that, instead of thinking of the residuals as normally distributed with mean zero, we can think of the data \(y_t\) as being drawn from a normal distribution with a mean of the intercept and the same residual standard deviation:
\[Y_t \sim N(E[Y_{t}],\sigma)\]
Remember that in linear regression models, the residual error is interpreted as independent and identically distributed observation error.
To run this model using our package, we’ll need to specify the response and predictor variables. The covariate matrix with an intercept only is a matrix of 1s. To double check, you could always look at
x <- model.matrix(lm(Temp~1))
Fitting the model using our function is done with this code,
lm_intercept <- atsar::fit_stan(y = as.numeric(Temp), x = rep(1, length(Temp)),
                                model_name = "regression")
Coarse summaries of stanfit objects can be examined by typing one of the following
lm_intercept
# this is huge
summary(lm_intercept)
But to get more detailed output for each parameter, you have to use the extract() function,
pars <- rstan::extract(lm_intercept)
names(pars)
[1] "beta"    "sigma"   "pred"    "log_lik" "lp__"
extract() will return the draws from the posterior for your parameters and any derived variables specified in your stan code. In this case, our model is
\[y_t = \beta \times 1 + e_t, e_t \sim N(0,\sigma)\]
so our estimated parameters are \(\beta\) and \(\sigma\). Our stan code computed the derived variables: predicted \(y_t\), which is \(\hat{y}_t = \beta \times 1\), and the log-likelihood. lp__ is the log posterior, which is automatically returned.
We can then make basic plots or summaries of each of these parameters,
hist(pars$beta, 40, col="grey", xlab="Intercept", main="")
quantile(pars$beta, c(0.025,0.5,0.975))
     2.5%       50%     97.5% 
 4.628010  8.704677 12.990291
One of the other useful things we can do is look at the predicted values of our model (\(\hat{y}_t=\beta \times 1\)) and overlay the data. The predicted values are pars$pred.
plot(apply(pars$pred, 2, mean), main="Predicted values", lwd=2,
     ylab="Wind", ylim=c(min(pars$pred), max(pars$pred)), type="l")
lines(apply(pars$pred, 2, quantile, 0.025))
lines(apply(pars$pred, 2, quantile, 0.975))
points(Wind, col="red")
To illustrate the effects of the burn-in/warmup period and thinning, we can re-run the above model, but for just 1 MCMC chain (the default is 3).
lm_intercept <- atsar::fit_stan(y = Temp, x = rep(1, length(Temp)),
                                model_name = "regression",
                                mcmc_list = list(n_mcmc = 1000, n_burn = 1, n_chain = 1, n_thin = 1))
Here is a plot of the time series of beta with one chain and no burn-in. Based on visual inspection, when does the chain converge?
pars <- rstan::extract(lm_intercept)
plot(pars$beta)
\[y_t = y_{t-1} + e_{t}\]
And the \({e}_{t} \sim N(0, \sigma)\). Remember back to the autocorrelated model (or MA(1) models) that we assumed that the errors \(e_t\) followed a random walk. In contrast, this model assumes that the errors are independent, but that the state of nature follows a random walk. Note also that this model as written doesn’t include a drift term (this can be turned on/off using the est_drift argument).
We can fit the random walk model using the argument model_name = "rw" passed to the fit_stan() function.
rw <- atsar::fit_stan(y = Temp, est_drift = FALSE, model_name = "rw")
We will look at the effect of missing data on the uncertainty intervals on the estimated states using a DFA on the harbor seal dataset.
-data(harborSealWA, package="MARSS")
-#the first column is year
-matplot(harborSealWA[,1],harborSealWA[,-1],type="l",
- ylab="Log abundance", xlab="")
data(harborSealWA, package="MARSS")
+#the first column is year
+matplot(harborSealWA[,1],harborSealWA[,-1],type="l",
+ ylab="Log abundance", xlab="")
Assume they are all observing a single trend.
seal.mod <- atsar::fit_dfa(y = t(harborSealWA[,-1]), num_trends = 1)
pars <- rstan::extract(seal.mod)
pred_mean <- c(apply(pars$x, c(2,3), mean))
pred_lo <- c(apply(pars$x, c(2,3), quantile, 0.025))
pred_hi <- c(apply(pars$x, c(2,3), quantile, 0.975))

plot(pred_mean, type="l", lwd=3, ylim=range(c(pred_mean, pred_lo, pred_hi)), main="Trend")
lines(pred_lo)
lines(pred_hi)
At this point, we’ve fit models with observation or process error, but we haven’t tried to estimate both simultaneously. We will do so here, and introduce some new notation to describe the process model and observation model. We use the notation \({x_t}\) to denote the latent state or state of nature (which is unobserved) at time \(t\) and \({y_t}\) to denote the observed data. For introductory purposes, we’ll make the process model autoregressive (similar to our AR(1) model),
\[x_{t} = \phi x_{t-1} + e_{t}, e_{t} \sim N(0,q)\]
For the process model, there are a number of ways to parameterize the first ‘state’, and we’ll talk about this more in the class, but for the sake of this model, we’ll place a weakly informative prior on \(x_1\), \(x_1 \sim N(0, 0.01)\). Second, we need to construct an observation model linking the estimated unseen states of nature \(x_t\) to the data \(y_t\). For simplicity, we’ll assume that the observation errors are independent and identically distributed, with no observation component. Mathematically, this model is
\[Y_t \sim N(x_t, r)\]
In the two models above, we’ll refer to \(q\) as the process error standard deviation and \(r\) as the observation error standard deviation.
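As a concrete illustration, here is a minimal sketch in base R (not part of the original lab; the parameter values are arbitrary) that simulates data from this state-space AR(1) model, so the distinct roles of \(q\) and \(r\) are easy to see:
set.seed(42)
TT <- 50
phi <- 0.8   ## AR coefficient of the latent state
q <- 0.3     ## process error SD
r <- 0.5     ## observation error SD
x <- numeric(TT)
x[1] <- rnorm(1, 0, 1)   ## first state; the text places a prior on x_1
for(t in 2:TT) {
  x[t] <- phi * x[t-1] + rnorm(1, 0, q)   ## process model
}
y <- rnorm(TT, mean = x, sd = r)          ## observation model
plot.ts(x, ylab = "state x and observations y")
points(y, col = "red")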
We can fit the state-space AR(1) and random walk models using the fit_stan() function:
ss_ar <- atsar::fit_stan(y = Temp, est_drift=FALSE, model_name = "ss_ar")
ss_rw <- atsar::fit_stan(y = Temp, est_drift=FALSE, model_name = "ss_rw")
This is the seasonal effect (\(s_t\)), assuming \(\lambda = 1/9\), but \(s_t\) includes the remainder \(e_t\) as well. Instead we can estimate the mean seasonal effect (\(s_t\)).
seas_2 <- decompose(xx)$seasonal
par(mai = c(0.9,0.9,0.1,0.1), omi = c(0,0,0,0))
plot.ts(seas_2, las = 1, ylab = "")
ee <- decompose(xx)$random
par(mai = c(0.9,0.9,0.1,0.1), omi = c(0,0,0,0))
plot.ts(ee, las = 1, ylab = "")
\[ \{ x_1,x_2,x_3,\dots,x_n \} \]
For example,
\[ \{ 10,31,27,42,53,15 \} \]
It can be further classified.
Interval across real time; \(x(t)\)
Let’s repeat the decomposition with the log of the airline data.
lx <- log(AirPassengers)
par(mai = c(0.9,0.9,0.1,0.1), omi = c(0,0,0,0))
plot.ts(lx, las = 1, ylab = "")
le <- lx - pp - seas_2
par(mai = c(0.9,0.9,0.1,0.1), omi = c(0,0,0,0))
plot.ts(le, las = 1, ylab = "")
data(WWWusage, package="datasets")
par(mai = c(0.9,0.9,0.1,0.1), omi = c(0,0,0,0))
plot.ts(WWWusage, ylab = "", las = 1, col = "blue", lwd = 2)
data(lynx, package="datasets")
par(mai = c(0.9,0.9,0.1,0.1), omi = c(0,0,0,0))
plot.ts(lynx, ylab = "", las = 1, col = "blue", lwd = 2)
White noise: \(x_t \sim N(0,1)\)
par(mai = c(0.9,0.9,0.1,0.1), omi = c(0,0,0,0))
matplot(ww, type="l", lty="solid", las = 1,
  ylab = expression(italic(x[t])), xlab = "Time",
  col = gray(0.5, 0.4))
Random walk: \(x_t = x_{t-1} + w_t,~\text{with}~w_t \sim N(0,1)\)
par(mai = c(0.9,0.9,0.1,0.1), omi = c(0,0,0,0))
matplot(apply(ww, 2, cumsum), type="l", lty="solid", las = 1,
  ylab = expression(italic(x[t])), xlab = "Time",
  col = gray(0.5, 0.4))
Autoregressive models of order \(p\), abbreviated AR(\(p\)), are commonly used in time series analyses. In particular, AR(1) models (and their multivariate extensions) see considerable use in ecology as we will see later in the course. Recall from lecture that an AR(\(p\)) model is written as
\[\begin{equation}
\tag{4.22}
x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \dots + \phi_p x_{t-p} + w_t,
\end{equation}\]
where \(\{w_t\}\) is a white noise sequence with zero mean and some variance \(\sigma^2\). For our purposes we usually assume that \(w_t \sim \text{N}(0,q)\). Note that the random walk in Equation (4.18) is a special case of an AR(1) model where \(\phi_1=1\) and \(\phi_k=0\) for \(k \geq 2\).
Note that you can omit the ma element entirely if you have an AR(\(p\)) model, or omit the ar element if you have an MA(\(q\)) model. If you omit the sd element, arima.sim() will assume you want normally distributed errors with SD = 1. Also note that you can pass arima.sim() your own time series of random errors or the name of a function that will generate the errors (e.g., you could use rpois() if you wanted a model with Poisson errors). Type ?arima.sim for more details.
Let’s begin by simulating some AR(1) models and comparing their behavior. First, let’s choose models with contrasting AR coefficients. Recall that in order for an AR(1) model to be stationary, \(\lvert \phi \rvert < 1\), so we’ll try 0.1 and 0.9. We’ll again set the random number seed so we will get the same answers.
set.seed(456)
## list description for AR(1) model with small coef
AR.sm <- list(order=c(1,0,0), ar=0.1, sd=0.1)
## list description for AR(1) model with large coef
AR.lg <- list(order=c(1,0,0), ar=0.9, sd=0.1)
## simulate AR(1)
AR1.sm <- arima.sim(n=50, model=AR.sm)
AR1.lg <- arima.sim(n=50, model=AR.lg)
Now let’s plot the 2 simulated series.
## setup plot region
par(mfrow=c(1,2))
## get y-limits for common plots
ylm <- c(min(AR1.sm,AR1.lg), max(AR1.sm,AR1.lg))
## plot the ts
plot.ts(AR1.sm, ylim=ylm,
  ylab=expression(italic(x)[italic(t)]),
  main=expression(paste(phi," = 0.1")))
plot.ts(AR1.lg, ylim=ylm,
  ylab=expression(italic(x)[italic(t)]),
  main=expression(paste(phi," = 0.9")))
What do you notice about the two plots in Figure 4.22? It looks like the time series with the smaller AR coefficient is more “choppy” and seems to stay closer to 0 whereas the time series with the larger AR coefficient appears to wander around more. Remember that as the coefficient in an AR(1) model goes to 0, the model approaches a WN sequence, which is stationary in both the mean and variance. As the coefficient goes to 1, however, the model approaches a random walk, which is not stationary in either the mean or variance.
Next, let’s generate two AR(1) models that have the same magnitude coefficient, but opposite signs, and compare their behavior.
set.seed(123)
## list description for AR(1) model with positive coef
AR.pos <- list(order=c(1,0,0), ar=0.5, sd=0.1)
## list description for AR(1) model with negative coef
AR.neg <- list(order=c(1,0,0), ar=-0.5, sd=0.1)
## simulate AR(1)
AR1.pos <- arima.sim(n=50, model=AR.pos)
AR1.neg <- arima.sim(n=50, model=AR.neg)
OK, let’s plot the 2 simulated series.
## setup plot region
par(mfrow=c(1,2))
## get y-limits for common plots
ylm <- c(min(AR1.pos,AR1.neg), max(AR1.pos,AR1.neg))
## plot the ts
plot.ts(AR1.pos, ylim=ylm,
  ylab=expression(italic(x)[italic(t)]),
  main=expression(paste(phi[1]," = 0.5")))
plot.ts(AR1.neg,
  ylab=expression(italic(x)[italic(t)]),
  main=expression(paste(phi[1]," = -0.5")))
Now it appears like both time series vary around the mean by about the same amount, but the model with the negative coefficient produces a much more “sawtooth” time series. It turns out that any AR(1) model with \(-1<\phi<0\) will exhibit the 2-point oscillation you see here.
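The alternation also shows up in the correlation structure. Here is a quick sketch (not in the original lab) showing the sign-flipping ACF of an AR(1) with a negative coefficient:
## ACF of an AR(1) with phi = -0.5; note the alternating signs at successive lags
set.seed(123)
acf(arima.sim(n = 1000, model = list(ar = -0.5)), lag.max = 10)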
We can simulate higher order AR(\(p\)) models in the same manner, but care must be exercised when choosing a set of coefficients that result in a stationary model, or else arima.sim() will fail and report an error. For example, an AR(2) model with both coefficients equal to 0.5 is not stationary, and therefore this function call will not work:
arima.sim(n=100, model=list(order(2,0,0), ar=c(0.5,0.5)))
If you try, R will respond that the “'ar' part of model is not stationary”.
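If you want to screen a candidate set of coefficients first, one option (a sketch, not from the original lab) is to check the roots of the AR characteristic polynomial: an AR(\(p\)) model is stationary when all roots of \(1 - \phi_1 z - \dots - \phi_p z^p\) lie outside the unit circle.
## helper (hypothetical name): TRUE if the AR coefficients give a stationary model
ar_stationary <- function(phi) {
  all(Mod(polyroot(c(1, -phi))) > 1)
}
ar_stationary(c(0.5, 0.5))   ## FALSE; arima.sim() would throw an error
ar_stationary(c(0.5, 0.3))   ## TRUE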
Let’s review what we learned in lecture about the general behavior of the ACF and PACF for AR(\(p\)) models. To do so, we’ll simulate four stationary AR(\(p\)) models of increasing order \(p\) and then examine their ACF’s and PACF’s. Let’s use a really big \(n\) so as to make them “pure”, which will provide a much better estimate of the correlation structure.
set.seed(123)
## the 4 AR coefficients
ARp <- c(0.7, 0.2, -0.1, -0.3)
## empty list for storing models
AR.mods <- list()
## loop over orders of p
for(p in 1:4) {
  ## assume SD=1, so not specified
  AR.mods[[p]] <- arima.sim(n=10000, list(ar=ARp[1:p]))
}
Now that we have our four AR(\(p\)) models, let’s look at plots of the time series, ACF’s, and PACF’s.
## set up plot region
par(mfrow=c(4,3))
## loop over orders of p
for(p in 1:4) {
  plot.ts(AR.mods[[p]][1:50],
    ylab=paste("AR(",p,")",sep=""))
  acf(AR.mods[[p]], lag.max=12)
  pacf(AR.mods[[p]], lag.max=12, ylab="PACF")
}
ARMA(\(p,q\)) models have a rich history in the time series literature, but they are not nearly as common in ecology as plain AR(\(p\)) models. As we discussed in lecture, both the ACF and PACF are important tools when trying to identify the appropriate order of \(p\) and \(q\). Here we will see how to simulate time series from AR(\(p\)), MA(\(q\)), and ARMA(\(p,q\)) processes, as well as fit time series models to data based on insights gathered from the ACF and PACF.
We can write an ARMA(\(p,q\)) as a mixture of AR(\(p\)) and MA(\(q\)) models, such that
\[\begin{equation}
\tag{4.25}
x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \dots + \phi_p x_{t-p} + w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2} + \dots + \theta_q w_{t-q},
\end{equation}\]
and the \(w_t\) are white noise.
arima()
\[\begin{equation}
\tag{4.26}
x_t = \mu + \phi (x_{t-1} - \mu) + w_t.
\end{equation}\]
If you know for a fact that the time series data have a mean of zero (e.g., you already subtracted the mean from them), you should include the argument include.mean=FALSE, which is set to TRUE by default. Note that ignoring and not estimating a mean in ARMA(\(p,q\)) models when one exists will bias the estimates of all other parameters.
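Here is a small sketch (not part of the original lab; the series xm and its mean of 10 are made up for illustration) of what that bias looks like in practice:
set.seed(1)
## AR(1) series shifted to have a mean of 10
xm <- arima.sim(n = 500, model = list(ar = 0.6)) + 10
coef(arima(xm, order = c(1, 0, 0)))                        ## intercept is estimated near 10
coef(arima(xm, order = c(1, 0, 0), include.mean = FALSE))  ## AR coefficient is biased (pushed toward 1)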
Let’s see an example of how arima() works. First we’ll simulate an ARMA(2,2) model and then estimate the parameters to see how well we can recover them. In addition, we’ll add in a constant to create a non-zero mean, which arima() reports as intercept in its output.
set.seed(123)
## ARMA(2,2) description for arima.sim()
ARMA22 <- list(order=c(2,0,2), ar=c(-0.7,0.2), ma=c(0.7,0.2))
## mean of process
mu <- 5
## simulated process (+ mean)
ARMA.sim <- arima.sim(n=10000, model=ARMA22) + mu
## estimate parameters
arima(x=ARMA.sim, order=c(2,0,2))
Call:
arima(x = ARMA.sim, order = c(2, 0, 2))
4.9.2 Searching over model orders
In an ideal situation, you could examine the ACF and PACF of the time series of interest and immediately decipher what orders of \(p\) and \(q\) must have generated the data, but that doesn’t always work in practice. Instead, we are often left with the task of searching over several possible model forms and seeing which of them provides the most parsimonious fit to the data. There are two easy ways to do this for ARIMA models in R. The first is to write a little script that loops over the possible dimensions of \(p\) and \(q\). Let’s try that for the process we simulated above and search over orders of \(p\) and \(q\) from 0-3 (it will take a few moments to run and will likely report an error about a “possible convergence problem”, which you can ignore).
## empty list to store model fits
ARMA.res <- list()
## set counter
cc <- 1
## loop over AR
for(p in 0:3) {
  ## loop over MA
  for(q in 0:3) {
    ARMA.res[[cc]] <- arima(x=ARMA.sim, order=c(p,0,q))
    cc <- cc + 1
  }
}
Warning in arima(x = ARMA.sim, order = c(p, 0, q)): possible convergence
problem: optim gave code = 1
## get AIC values for model evaluation
ARMA.AIC <- sapply(ARMA.res, function(x) x$aic)
## model with lowest AIC is the best
ARMA.res[[which(ARMA.AIC==min(ARMA.AIC))]]
Call:
arima(x = ARMA.sim, order = c(p, 0, q))
sigma^2 estimated as 0.9972: log likelihood = -14175.92, aic = 28363.84
It looks like our search worked, so let’s look at the other method for fitting ARIMA models. The auto.arima() function in the forecast package will conduct an automatic search over all possible orders of ARIMA models that you specify. For details, type ?auto.arima after loading the package. Let’s repeat our search using the same criteria.
## find best ARMA(p,q) model
auto.arima(ARMA.sim, start.p=0, max.p=3, start.q=0, max.q=3)
Series: ARMA.sim
ARIMA(2,0,2) with non-zero mean
Autocorrelation is the correlation of a variable with itself at differing time lags. Recall from lecture that we defined the sample autocovariance function (ACVF), \(c_k\), for some lag \(k\) as
\[\begin{equation}
\tag{4.10}
c_k = \frac{1}{n}\sum_{t=1}^{n-k} \left(x_t-\bar{x}\right) \left(x_{t+k}-\bar{x}\right)
\end{equation}\]
Note that the sample autocovariance of \(\{x_t\}\) at lag 0, \(c_0\), equals the sample variance of \(\{x_t\}\) calculated with a denominator of \(n\). The sample autocorrelation function (ACF) is defined as
\[\begin{equation}
\tag{4.11}
r_k = \frac{c_k}{c_0} = \text{Cor}(x_t,x_{t+k})
\end{equation}\]
Recall also that an approximate 95% confidence interval on the ACF can be estimated by
\[\begin{equation}
\tag{4.12}
-\frac{1}{n} \pm \frac{2}{\sqrt{n}}
\end{equation}\]
where \(n\) is the number of data points used in the calculation of the ACF.
It is important to remember two things here. First, although the confidence interval is commonly plotted and interpreted as a horizontal line over all time lags, the interval itself actually grows as the lag increases because the number of data points \(n\) used to estimate the correlation decreases by 1 for every integer increase in lag. Second, care must be exercised when interpreting the “significance” of the correlation at various lags because we should expect, a priori, that approximately 1 out of every 20 correlations will be significant based on chance alone.
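As a quick illustration (a sketch, not in the original lab; it assumes the co2 ts object used throughout this chapter), the bounds from Equation (4.12) can be computed by hand and compared with the dashed lines drawn by acf(), which uses roughly \(\pm 1.96/\sqrt{n}\) and is nearly identical for large \(n\):
## approximate 95% CI on the ACF for the CO2 series
n <- length(co2)
-1/n + c(-2, 2)/sqrt(n)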
We can use the acf() function in R to compute the sample ACF (note that adding the option type="covariance" will return the sample auto-covariance (ACVF) instead of the ACF; type ?acf for details). Calling the function by itself will automatically produce a correlogram (i.e., a plot of the autocorrelation versus time lag). The argument lag.max allows you to set the number of positive and negative lags. Let’s try it for the CO\(_2\) data.
## correlogram of the CO2 data
acf(co2, lag.max=36)
As an alternative to the default plots for acf objects, let’s define a new plot function for acf objects with some better features:
## better ACF plot
plot.acf <- function(ACFobj) {
  rr <- ACFobj$acf[-1]
  kk <- length(rr)
  nn <- ACFobj$n.used
  plot(seq(kk), rr, type="h", lwd=2, yaxs="i", xaxs="i",
    ylim=c(floor(min(rr)),1), xlim=c(0,kk+1),
    xlab="Lag", ylab="Correlation", las=1)
  abline(h=-1/nn+c(-2,2)/sqrt(nn), lty="dashed", col="blue")
  abline(h=0)
}
Now we can assign the result of acf() to a variable and then use the information contained therein to plot the correlogram with our new plot function.
## acf of the CO2 data
co2.acf <- acf(co2, lag.max=36)
## correlogram of the CO2 data
plot.acf(co2.acf)
Notice that all of the relevant information is still there (Figure 4.11), but now \(r_0=1\) is not plotted at lag-0 and the lags on the \(x\)-axis are displayed correctly as integers.
Before we move on to the PACF, let’s look at the ACF for some deterministic time series, which will help you identify interesting properties (e.g., trends, seasonal effects) in a stochastic time series, and account for them in time series models–an important topic in this course. First, let’s look at a straight line.
## length of ts
nn <- 100
## create straight line
tt <- seq(nn)
## set up plot area
par(mfrow=c(1,2))
## plot line
plot.ts(tt, ylab=expression(italic(x[t])))
## get ACF
line.acf <- acf(tt, plot=FALSE)
## plot ACF
plot.acf(line.acf)
The correlogram for a straight line is itself a linearly decreasing function over time (Figure 4.12).
Now let’s examine the ACF for a sine wave and see what sort of pattern arises.
## create sine wave
tt <- sin(2*pi*seq(nn)/12)
## set up plot area
par(mfrow=c(1,2))
## plot line
plot.ts(tt, ylab=expression(italic(x[t])))
## get ACF
sine.acf <- acf(tt, plot=FALSE)
## plot ACF
plot.acf(sine.acf)
Perhaps not surprisingly, the correlogram for a sine wave is itself a sine wave whose amplitude decreases linearly over time (Figure 4.13).
Now let’s examine the ACF for a sine wave with a linear downward trend and see what sort of patterns arise.
## create sine wave with trend
tt <- sin(2*pi*seq(nn)/12) - seq(nn)/50
## set up plot area
par(mfrow=c(1,2))
## plot line
plot.ts(tt, ylab=expression(italic(x[t])))
## get ACF
sili.acf <- acf(tt, plot=FALSE)
## plot ACF
plot.acf(sili.acf)
The partial autocorrelation function (PACF) measures the linear correlation of a series \(\{x_t\}\) and a lagged version of itself \(\{x_{t+k}\}\) with the linear dependence of \(\{x_{t-1},x_{t-2},\dots,x_{t-(k-1)}\}\) removed. Recall from lecture that we define the PACF as
\[\begin{equation}
\tag{4.13}
f_k =
\begin{cases}
\text{Cor}(x_1,x_0) = r_1 & \text{if } k = 1;\\
\text{Cor}(x_k-x_k^{k-1},x_0-x_0^{k-1}) & \text{if } k \geq 2;
\end{cases}
\end{equation}\]
It’s easy to compute the PACF for a variable in R using the pacf() function, which will automatically plot a correlogram when called by itself (similar to acf()). Let’s look at the PACF for the CO\(_2\) data.
## PACF of the CO2 data
pacf(co2, lag.max=36)
The default plot for PACF is a bit better than for ACF, but here is another plotting function that might be useful.
## better PACF plot
plot.pacf <- function(PACFobj) {
  rr <- PACFobj$acf
  kk <- length(rr)
  nn <- PACFobj$n.used
  plot(seq(kk), rr, type="h", lwd=2, yaxs="i", xaxs="i",
    ylim=c(floor(min(rr)),1), xlim=c(0,kk+1),
    xlab="Lag", ylab="PACF", las=1)
  abline(h=-1/nn+c(-2,2)/sqrt(nn), lty="dashed", col="blue")
  abline(h=0)
}
Notice in Figure 4.15 that the partial autocorrelation at lag-1 is very high (it equals the ACF at lag-1), but the other values at lags > 1 are relatively small, unlike what we saw for the ACF. We will discuss this in more detail later on in this lab.
Notice also that the PACF plot again has real-valued indices for the time lag, but it does not include any value for lag-0 because it is impossible to remove any intermediate autocorrelation between \(t\) and \(t-k\) when \(k=0\), and therefore the PACF does not exist at lag-0. If you would like, you can use the plot.acf() function we defined above to plot the PACF estimates because acf() and pacf() produce identical list structures (results not shown here).
## PACF of the CO2 data
co2.pacf <- pacf(co2)
## correlogram of the CO2 data
plot.acf(co2.pacf)
As with the ACF, we will see later on how the PACF can also be used to help identify the appropriate order of \(p\) and \(q\) in ARMA(\(p\),\(q\)) models.
Often we are interested in looking for relationships between 2 different time series. There are many ways to do this, but a simple method is via examination of their cross-covariance and cross-correlation.
We begin by defining the sample cross-covariance function (CCVF) in a manner similar to the ACVF, in that
\[\begin{equation}
\tag{4.14}
g_k^{xy} = \frac{1}{n}\sum_{t=1}^{n-k} \left(y_t-\bar{y}\right) \left(x_{t+k}-\bar{x}\right),
\end{equation}\]
but now we are estimating the correlation between a variable \(y\) and a different time-shifted variable \(x_{t+k}\). The sample cross-correlation function (CCF) is then defined analogously to the ACF, such that
\[\begin{equation}
\tag{4.15}
r_k^{xy} = \frac{g_k^{xy}}{\sqrt{\text{SD}_x\text{SD}_y}};
\end{equation}\]
SD\(_x\) and SD\(_y\) are the sample standard deviations of \(\{x_t\}\) and \(\{y_t\}\), respectively. It is important to re-iterate here that \(r_k^{xy} \neq r_{-k}^{xy}\), but \(r_k^{xy} = r_{-k}^{yx}\). Therefore, it is very important to pay particular attention to which variable you call \(y\) (i.e., the “response”) and which you call \(x\) (i.e., the “predictor”).
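That asymmetry is easy to verify numerically. Here is a small sketch (the series x and y are made up for illustration and are not from the lab) showing that ccf(x, y) is not symmetric in its arguments, but that flipping the lags of ccf(y, x) reproduces it:
set.seed(1)
x <- rnorm(100)
y <- c(rnorm(2), x[1:98])   ## y is roughly x delayed by 2 steps
cxy <- ccf(x, y, lag.max = 5, plot = FALSE)
cyx <- ccf(y, x, lag.max = 5, plot = FALSE)
## r_k^{xy} should match r_{-k}^{yx}, i.e., cyx with its lags reversed
data.frame(lag = c(cxy$lag),
           r_xy = round(c(cxy$acf), 2),
           r_yx_lag_flipped = round(rev(c(cyx$acf)), 2))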
As with the ACF, an approximate 95% confidence interval on the CCF can be estimated by
\[\begin{equation}
\tag{4.16}
-\frac{1}{n} \pm \frac{2}{\sqrt{n}}
\end{equation}\]
where \(n\) is the number of data points used in the calculation of the CCF, and the same assumptions apply to its interpretation.
Computing the CCF in R is easy with the function ccf(), and it works just like acf(). In fact, ccf() is just a “wrapper” function that calls acf(). As an example, let’s examine the CCF between sunspot activity and the number of lynx trapped in Canada, as in the classic paper by Moran.
To begin, let’s get the data, which are conveniently part of the datasets package in the base installation of R. Before calculating the CCF, however, we need to find the matching years of data. Again, we’ll use the ts.intersect() function.
## get the matching years of sunspot data
suns <- ts.intersect(lynx, sunspot.year)[,"sunspot.year"]
## get the matching lynx data
lynx <- ts.intersect(lynx, sunspot.year)[,"lynx"]
Here are plots of the time series.
## plot time series
plot(cbind(suns,lynx), yax.flip=TRUE)
It is important to remember which of the 2 variables you call \(y\) and \(x\) when calling ccf(x, y, ...). In this case, it seems most relevant to treat lynx as the \(y\) and sunspots as the \(x\), in which case we are mostly interested in the CCF at negative lags (i.e., when sunspot activity predates inferred lynx abundance). Furthermore, we’ll use log-transformed lynx trappings.
## CCF of sunspots and lynx
ccf(suns, log(lynx), ylab="Cross-correlation")
Plotting time series data is an important first step in analyzing their various components. Beyond that, however, we need a more formal means for identifying and removing characteristics such as a trend or seasonal variation. As discussed in lecture, the decomposition model reduces a time series into 3 components: trend, seasonal effects, and random errors. In turn, we aim to model the random errors as some form of stationary process.
Let’s begin with a simple, additive decomposition model for a time series \(x_t\)
\[\begin{equation}
\tag{4.1}
x_t = m_t + s_t + e_t,
\end{equation}\]
where, at time \(t\), \(m_t\) is the trend, \(s_t\) is the seasonal effect, and \(e_t\) is a random error that we generally assume to have zero-mean and to be correlated over time. Thus, by estimating and subtracting both \(\{m_t\}\) and \(\{s_t\}\) from \(\{x_t\}\), we hope to have a time series of stationary residuals \(\{e_t\}\).
In lecture we discussed how linear filters are a common way to estimate trends in time series. One of the most common linear filters is the moving average, which for time lags from \(-a\) to \(a\) is defined as
\[\begin{equation}
\tag{4.2}
\hat{m}_t = \sum_{k=-a}^{a} \left(\frac{1}{1+2a}\right) x_{t+k}.
\end{equation}\]
This model works well for moving windows of odd-numbered lengths, but should be adjusted for even-numbered lengths by adding only \(\frac{1}{2}\) of the 2 most extreme lags so that the filtered value at time \(t\) lines up with the original observation at time \(t\). So, for example, in a case with monthly data such as the atmospheric CO\(_2\) concentration where a 12-point moving average would be an obvious choice, the linear filter would be
\[\begin{equation}
\tag{4.3}
\hat{m}_t = \frac{\frac{1}{2}x_{t-6} + x_{t-5} + \dots + x_{t-1} + x_t + x_{t+1} + \dots + x_{t+5} + \frac{1}{2}x_{t+6}}{12}
\end{equation}\]
It is important to note here that our time series of the estimated trend \(\{\hat{m}_t\}\) is actually shorter than the observed time series by \(2a\) units.
Conveniently, R has the built-in function filter() for estimating moving-average (and other) linear filters. In addition to specifying the time series to be filtered, we need to pass in the filter weights (and 2 other arguments we won’t worry about here; type ?filter to get more information). The easiest way to create the filter is with the rep() function:
## weights for moving avg
fltr <- c(1/2, rep(1, times=11), 1/2)/12
Now let’s get our estimate of the trend \(\{\hat{m}_t\}\) with filter() and plot it:
## estimate of trend
co2.trend <- filter(co2, filter=fltr, method="convo", sides=2)
## plot the trend
plot.ts(co2.trend, ylab="Trend", cex=1)
The trend is a more-or-less smoothly increasing function over time, the average slope of which does indeed appear to be increasing over time as well (Figure 4.3).
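As a quick check of the point made after Equation (4.3) that the estimated trend is \(2a\) units shorter than the observed series, here is a small sketch (assuming the co2.trend object created above) counting the NA values the filter leaves at the ends:
## with a = 6, filter() leaves 6 NAs at each end of the series
sum(is.na(co2.trend))   ## should be 12
head(co2.trend, 8)      ## the first 6 values are NA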
Once we have an estimate of the trend for time \(t\) (\(\hat{m}_t\)) we can easily obtain an estimate of the seasonal effect at time \(t\) (\(\hat{s}_t\)) by subtraction
\[\begin{equation}
\tag{4.4}
\hat{s}_t = x_t - \hat{m}_t,
\end{equation}\]
which is really easy to do in R:
## seasonal effect over time
co2.1T <- co2 - co2.trend
This estimate of the seasonal effect for each time \(t\) also contains the random error \(e_t\), however, which can be seen by plotting the time series and careful comparison of Equations (4.1) and (4.4).
## plot the monthly seasonal effects
plot.ts(co2.1T, ylab="Seasonal effect", xlab="Month", cex=1)
We can obtain the overall seasonal effect by averaging the estimates of \(\{\hat{s}_t\}\) for each month and repeating this sequence over all years.
## length of ts
ll <- length(co2.1T)
## frequency (ie, 12)
ff <- frequency(co2.1T)
## number of periods (years); %/% is integer division
periods <- ll %/% ff
## index of cumulative month
index <- seq(1, ll, by=ff) - 1
## get mean by month
mm <- numeric(ff)
for(i in 1:ff) {
  mm[i] <- mean(co2.1T[index+i], na.rm=TRUE)
}
## subtract mean to make overall mean=0
mm <- mm - mean(mm)
Before we create the entire time series of seasonal effects, let’s plot them for each month to see what is happening within a year:
## plot the monthly seasonal effects
plot.ts(mm, ylab="Seasonal effect", xlab="Month", cex=1)
It looks like, on average, the CO\(_2\) concentration is highest in spring (March) and lowest in summer (August) (Figure 4.5). (Aside: Do you know why this is?)
Finally, let’s create the entire time series of seasonal effects \(\{\hat{s}_t\}\):
## create ts object for season
co2.seas <- ts(rep(mm, periods+1)[seq(ll)],
  start=start(co2.1T),
  frequency=ff)
The last step in completing our full decomposition model is obtaining the random errors \(\{\hat{e}_t\}\), which we can get via simple subtraction
\[\begin{equation}
\tag{4.5}
\hat{e}_t = x_t - \hat{m}_t - \hat{s}_t.
\end{equation}\]
Again, this is really easy in R:
## random errors over time
co2.err <- co2 - co2.trend - co2.seas
Now that we have all 3 of our model components, let’s plot them together with the observed data \(\{x_t\}\). The results are shown in Figure 4.6.
## plot the obs ts, trend & seasonal effect
plot(cbind(co2,co2.trend,co2.seas,co2.err), main="", yax.flip=TRUE)
Using decompose() for decomposition
Now that we have seen how to estimate and plot the various components of a classical decomposition model in a piecewise manner, let’s see how to do this in one step in R with the function decompose(), which accepts a ts object as input and returns an object of class decomposed.ts.
## decomposition of CO2 data
co2.decomp <- decompose(co2)
co2.decomp is a list with the following elements, which should be familiar by now:
x: the observed time series \(\{x_t\}\)
seasonal: time series of the estimated seasonal component \(\{\hat{s}_t\}\)
figure: mean seasonal effect (length(figure) == frequency(x))
trend: time series of the estimated trend \(\{\hat{m}_t\}\)
random: time series of the random errors \(\{\hat{e}_t\}\)
type: type of error ("additive" or "multiplicative")
We can easily make plots of the output and compare them to those in Figure 4.6:
## plot the obs ts, trend & seasonal effect
plot(co2.decomp, yax.flip=TRUE)
The results obtained with decompose() (Figure 4.7) are identical to those we estimated previously.
Another nice feature of the decompose() function is that it can be used for decomposition models with multiplicative (i.e., non-additive) errors (e.g., if the original time series had a seasonal amplitude that increased with time). To do so, pass in the argument type="multiplicative", which is set to type="additive" by default.
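For example, a short sketch (not part of the original lab) applying a multiplicative decomposition to the AirPassengers series, whose seasonal amplitude grows along with its trend:
## multiplicative decomposition of the classic airline passenger data
ap.decomp <- decompose(AirPassengers, type="multiplicative")
plot(ap.decomp, yax.flip=TRUE)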
4.3 Differencing to remove a trend or seasonal effects
An alternative to decomposition for removing trends is differencing. We saw in lecture how the difference operator works and how it can be used to remove linear and nonlinear trends as well as various seasonal features that might be evident in the data. As a reminder, we define the difference operator as
\[\begin{equation}
\tag{4.6}
\nabla x_t = x_t - x_{t-1},
\end{equation}\]
and, more generally, for order \(d\)
\[\begin{equation}
\tag{4.7}
\nabla^d x_t = (1-\mathbf{B})^d x_t,
\end{equation}\]
where \(\mathbf{B}\) is the backshift operator (i.e., \(\mathbf{B}^k x_t = x_{t-k}\) for \(k \geq 1\)).
So, for example, a random walk is one of the most simple and widely used time series models, but it is not stationary. We can write a random walk model as
\[\begin{equation}
\tag{4.8}
x_t = x_{t-1} + w_t, \text{ with } w_t \sim \text{N}(0,q).
\end{equation}\]
Applying the difference operator to Equation (4.8) will yield a time series of Gaussian white noise errors \(\{w_t\}\):
\[\begin{equation}
\tag{4.9}
\begin{aligned}
\nabla (x_t &= x_{t-1} + w_t) \\
x_t - x_{t-1} &= x_{t-1} - x_{t-1} + w_t \\
x_t - x_{t-1} &= w_t
\end{aligned}
\end{equation}\]
4.3.1 Using the diff() function
In R we can use the diff() function for differencing a time series, which requires 3 arguments: x (the data), lag (the lag at which to difference), and differences (the order of differencing; \(d\) in Equation (4.7)). For example, first-differencing a time series will remove a linear trend (i.e., differences=1); twice-differencing will remove a quadratic trend (i.e., differences=2). In addition, first-differencing a time series at a lag equal to the period will remove a seasonal trend (e.g., set lag=12 for monthly data).
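To make the three arguments concrete, here is a tiny sketch (the numbers are made up for illustration) on a short vector:
diff(c(1, 3, 6, 10), lag = 1, differences = 1)   ## 2 3 4
diff(c(1, 3, 6, 10), lag = 1, differences = 2)   ## 1 1
diff(c(1, 3, 6, 10), lag = 2)                    ## 5 7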
Let’s use diff() to remove the trend and seasonal signal from the CO\(_2\) time series, beginning with the trend. Close inspection of Figure 4.1 would suggest that there is a nonlinear increase in CO\(_2\) concentration over time, so we’ll set differences=2:
## twice-difference the CO2 data
co2.D2 <- diff(co2, differences=2)
## plot the differenced data
plot(co2.D2, ylab=expression(paste(nabla^2, "CO"[2])))
We were apparently successful in removing the trend, but the seasonal effect still appears obvious (Figure 4.8). Therefore, let’s go ahead and difference that series at lag-12 because our data were collected monthly.
## difference the differenced CO2 data
co2.D2D12 <- diff(co2.D2, lag=12)
## plot the newly differenced data
plot(co2.D2D12,
  ylab=expression(paste(nabla, "(", nabla^2, "CO"[2], ")")))
4.8 Moving-average (MA) models
A moving-average process of order \(q\), or MA(\(q\)), is a weighted sum of the current random error plus the \(q\) most recent errors, and can be written as
\[\begin{equation}
\tag{4.23}
x_t = w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2} + \dots + \theta_q w_{t-q},
\end{equation}\]
where \(\{w_t\}\) is a white noise sequence with zero mean and some variance \(\sigma^2\); for our purposes we usually assume that \(w_t \sim \text{N}(0,q)\). Of particular note is that because MA processes are finite sums of stationary errors, they themselves are stationary.
Of interest to us are so-called “invertible” MA processes that can be expressed as an infinite AR process with no error term. The term invertible comes from the inversion of the backshift operator (\(\mathbf{B}\)) that we discussed in class (i.e., \(\mathbf{B} x_t= x_{t-1}\)). So, for example, an MA(1) process with \(\lvert \theta \rvert < 1\) is invertible because it can be written using the backshift operator as
\[\begin{equation}
\tag{4.24}
\begin{aligned}
x_t &= w_t - \theta w_{t-1} \\
&\;\;\vdots
\end{aligned}
\end{equation}\]
4.8.1 Simulating an MA(\(q\)) process
We can simulate MA(\(q\)) processes just as we did for AR(\(p\)) processes using arima.sim(). Here are 3 different ones with contrasting \(\theta\)’s:
set.seed(123)
## list description for MA(1) model with small coef
MA.sm <- list(order=c(0,0,1), ma=0.2, sd=0.1)
## list description for MA(1) model with large coef
MA.lg <- list(order=c(0,0,1), ma=0.8, sd=0.1)
## list description for MA(1) model with negative coef
MA.neg <- list(order=c(0,0,1), ma=-0.5, sd=0.1)
## simulate MA(1)
MA1.sm <- arima.sim(n=50, model=MA.sm)
MA1.lg <- arima.sim(n=50, model=MA.lg)
MA1.neg <- arima.sim(n=50, model=MA.neg)
with their associated plots.
## setup plot region
par(mfrow=c(1,3))
## plot the ts
plot.ts(MA1.sm,
  ylab=expression(italic(x)[italic(t)]),
  main=expression(paste(theta," = 0.2")))
plot.ts(MA1.lg,
  ylab=expression(italic(x)[italic(t)]),
  main=expression(paste(theta," = 0.8")))
plot.ts(MA1.neg,
  ylab=expression(italic(x)[italic(t)]),
  main=expression(paste(theta," = -0.5")))
4.8.2 Correlation structure of MA(\(q\)) processes
We saw in lecture and above how the ACF and PACF have distinctive features for AR(\(p\)) models, and they do for MA(\(q\)) models as well. Here are examples of four MA(\(q\)) processes. As before, we’ll use a really big \(n\) so as to make them “pure”, which will provide a much better estimate of the correlation structure.
set.seed(123)
## the 4 MA coefficients
MAq <- c(0.7, 0.2, -0.1, -0.3)
## empty list for storing models
MA.mods <- list()
## loop over orders of q
for(q in 1:4) {
  ## assume SD=1, so not specified
  MA.mods[[q]] <- arima.sim(n=1000, list(ma=MAq[1:q]))
}
Now that we have our four MA(\(q\)) models, let’s look at plots of the time series, ACF’s, and PACF’s.
## set up plot region
par(mfrow=c(4,3))
## loop over orders of q
for(q in 1:4) {
  plot.ts(MA.mods[[q]][1:50],
    ylab=paste("MA(",q,")",sep=""))
  acf(MA.mods[[q]], lag.max=12)
  pacf(MA.mods[[q]], lag.max=12, ylab="PACF")
}
4.10 Problems
We have seen how to do a variety of introductory time series analyses with R. Now it is your turn to apply the information you learned here and in lecture to complete some analyses. You have been asked by a colleague to help analyze some time series data she collected as part of an experiment on the effects of light and nutrients on the population dynamics of phytoplankton. Specifically, after controlling for differences in light and temperature, she wants to know if the natural log of population density can be modeled with some form of ARMA(\(p,q\)) model.
The data are expressed as the number of cells per milliliter recorded every hour for one week beginning at 8:00 AM on December 1, 2014. You can load the data using
data(hourlyphyto, package = "atsalibrary")
pDat <- hourlyphyto
Use the information above to do the following:
Convert pDat, which is a data.frame object, into a ts object. This bit of code might be useful to get you started:
## what day of 2014 is Dec 1st?
dBegin <- as.Date("2014-12-01")
dayOfYear <- (dBegin - as.Date("2014-01-01") + 1)
Plot the time series of phytoplankton density and provide a brief description of any notable features.
Although you do not have the actual measurements for the specific temperature and light regimes used in the experiment, you have been informed that they follow a regular light/dark period with accompanying warm/cool temperatures. Thus, estimating a fixed seasonal effect is justifiable. Also, the instrumentation is precise enough to preclude any systematic change in measurements over time (i.e., you can assume \(m_t = 0\) for all \(t\)). Obtain the time series of the estimated log-density of phytoplankton absent any hourly effects caused by variation in temperature or light. (Hint: You will need to do some decomposition.)
Use diagnostic tools to identify the possible order(s) of ARMA model(s) that most likely describes the log of population density for this particular experiment. Note that at this point you should be focusing your analysis on the results obtained in Question 3.
4.6 Random walks (RW)
Random walks receive considerable attention in time series analyses because of their ability to fit a wide range of data despite their surprising simplicity. In fact, random walks are the most simple non-stationary time series model. A random walk is a time series \(\{x_t\}\) where
\[\begin{equation}
\tag{4.18}
x_t = x_{t-1} + w_t,
\end{equation}\]
and \(w_t\) is a discrete white noise series where all values are independent and identically distributed (IID) with a mean of zero. In practice, we will almost always assume that the \(w_t\) are Gaussian white noise, such that \(w_t \sim \text{N}(0,q)\). We will see later that a random walk is a special case of an autoregressive model.
4.6.1 Simulating a random walk
Simulating a RW model in R is straightforward with a for loop and the use of rnorm() to generate Gaussian errors (type ?rnorm to see details on the function and its useful relatives dnorm() and pnorm()). Let’s create 100 obs (we’ll also set the random number seed so everyone gets the same results).
## set random number seed
set.seed(123)
## length of time series
TT <- 100
## initialize {x_t} and {w_t}
xx <- ww <- rnorm(n=TT, mean=0, sd=1)
## compute values 2 thru TT
for(t in 2:TT) { xx[t] <- xx[t-1] + ww[t] }
Now let’s plot the simulated time series and its ACF.
## setup plot area
par(mfrow=c(1,2))
## plot line
plot.ts(xx, ylab=expression(italic(x[t])))
## plot ACF
plot.acf(acf(xx, plot=FALSE))
4.6.2 Alternative formulation of a random walk
As an aside, let’s use an alternative formulation of a random walk model to see an even shorter way to simulate an RW in R. Based on our definition of a random walk in Equation (4.18), it is easy to see that
\[\begin{equation}
\tag{4.19}
\begin{aligned}
x_t &= x_{t-1} + w_t \\
x_{t-1} &= x_{t-2} + w_{t-1} \\
x_{t-2} &= x_{t-3} + w_{t-2} \\
&\; \; \vdots
\end{aligned}
\end{equation}\]
Therefore, if we substitute \(x_{t-2} + w_{t-1}\) for \(x_{t-1}\) in the first equation, and then \(x_{t-3} + w_{t-2}\) for \(x_{t-2}\), and so on in a recursive manner, we get
\[\begin{equation}
\tag{4.20}
x_t = w_t + w_{t-1} + w_{t-2} + \dots + w_{t-\infty} + x_{t-\infty}.
\end{equation}\]
In practice, however, the time series will not start an infinite time ago, but rather at some \(t=1\), in which case we can write
\[\begin{equation}
\tag{4.21}
\begin{aligned}
x_t &= w_1 + w_2 + \dots + w_t \\
&= \sum_{t=1}^{T} w_t.
\end{aligned}
\end{equation}\]
From Equation (4.21) it is easy to see that the value of an RW process at time step \(t\) is the sum of all the random errors up through time \(t\). Therefore, in R we can easily simulate a realization from an RW process using the cumsum(x) function, which does cumulative summation of the vector x over its entire length. If we use the same errors as before, we should get the same results.
## simulate RW
x2 <- cumsum(ww)
Let’s plot both time series to see if it worked.
## setup plot area
par(mfrow=c(1,2))
## plot 1st RW
plot.ts(xx, ylab=expression(italic(x[t])))
## plot 2nd RW
plot.ts(x2, ylab=expression(italic(x[t])))
4.1.1 ts objects
The first, frequency, is a bit of a misnomer because it does not really refer to the number of cycles per unit time, but rather the number of observations/samples per cycle. So, for example, if the data were collected each hour of a day then frequency=24.
The second, start, specifies the first sample in terms of (\(day\), \(hour\)), (\(year\), \(month\)), etc. So, for example, if the data were collected monthly beginning in November of 1969, then frequency=12 and start=c(1969,11). If the data were collected annually, then you simply specify start as a scalar (e.g., start=1991) and omit frequency (i.e., R will set frequency=1 by default).
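As a tiny sketch of the hourly case mentioned above (the values here are random and purely illustrative), a series sampled every hour beginning at 08:00 on day 1 would be created with frequency = 24 and start = c(1, 8):
## hypothetical hourly series: 72 samples starting at hour 8 of day 1
hourly <- ts(rnorm(72), frequency=24, start=c(1,8))
window(hourly, start=c(2,1), end=c(2,3))   ## the first 3 samples of day 2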
The Mauna Loa time series is collected monthly and begins in March of 1958, which we can get from the data themselves, and then pass to ts().
## create a time series (ts) object from the CO2 data
co2 <- ts(data=CO2$ppm, frequency=12,
  start=c(CO2[1,"year"], CO2[1,"month"]))
Now let’s plot the data using plot.ts(), which is designed specifically for ts objects like the one we just created above. It’s nice because we don’t need to specify any \(x\)-values as they are taken directly from the ts object.
## plot the ts
plot.ts(co2, ylab=expression(paste("CO"[2], " (ppm)")))
4.1.2 Combining and plotting multiple ts objects
Before we examine the CO\(_2\) data further, however, let’s see a quick example of how you can combine and plot multiple time series together. We’ll use the data on monthly mean temperature anomalies for the Northern Hemisphere (Temp). First convert Temp to a ts object.
temp.ts <- ts(data=Temp$Value, frequency=12, start=c(1880,1))
Before we can plot the two time series together, however, we need to line up their time indices because the temperature data start in January of 1880, but the CO\(_2\) data start in March of 1958. Fortunately, the ts.intersect() function makes this really easy once the data have been transformed to ts objects by trimming the data to a common time frame. Also, ts.union() works in a similar fashion, but it pads one or both series with the appropriate number of NA’s. Let’s try both.
## intersection (only overlapping times)
datI <- ts.intersect(co2, temp.ts)
## dimensions of common-time data
dim(datI)
[1] 682 2
## union (all times)
datU <- ts.union(co2, temp.ts)
## dimensions of all-time data
dim(datU)
[1] 1647 2
As you can see, the intersection of the two data sets is much smaller than the union. If you compare them, you will see that the first 938 rows of datU contain NA in the co2 column.
It turns out that the regular plot() function in R is smart enough to recognize a ts object and use the information contained therein appropriately. Here’s how to plot the intersection of the two time series together with the y-axes on alternate sides (results are shown in Figure 4.2):
## plot the ts
plot(datI, main="", yax.flip=TRUE)
4.5 White noise (WN)
A time series \(\{w_t\}\) is a discrete white noise series (DWN) if the \(w_1, w_2, \dots, w_t\) are independent and identically distributed (IID) with a mean of zero. For most of the examples in this course we will assume that the \(w_t \sim \text{N}(0,q)\), and therefore we refer to the time series \(\{w_t\}\) as Gaussian white noise. If our time series model has done an adequate job of removing all of the serial autocorrelation in the time series with trends, seasonal effects, etc., then the model residuals (\(e_t = y_t - \hat{y}_t\)) will be a WN sequence with the following properties for its mean (\(\bar{e}\)), covariance (\(c_k\)), and autocorrelation (\(r_k\)):
\[\begin{equation}
\tag{4.17}
\begin{aligned}
\bar{x} &= 0 \\
c_k &=
\begin{cases}
q & \text{if } k = 0,\\
0 & \text{if } k \neq 0,
\end{cases} \\
r_k &=
\begin{cases}
1 & \text{if } k = 0,\\
0 & \text{if } k \neq 0.
\end{cases}
\end{aligned}
\end{equation}\]
4.5.1 Simulating white noise
Simulating WN in R is straightforward with a variety of built-in random number generators for continuous and discrete distributions. Once you know R’s abbreviation for the distribution of interest, you add an \(\texttt{r}\) to the beginning to get the function’s name. For example, a Gaussian (or normal) distribution is abbreviated \(\texttt{norm}\) and so the function is rnorm(). All of the random number functions require two things: the number of samples from the distribution and the parameters for the distribution itself (e.g., mean & SD of a normal). Check the help file for the distribution of interest to find out what parameters you must specify (e.g., type ?rnorm to see the help for a normal distribution).
Here’s how to generate 100 samples from a normal distribution with mean of 5 and standard deviation of 0.2, and 50 samples from a Poisson distribution with a rate (\(\lambda\)) of 20.
set.seed(123)
## random normal variates
GWN <- rnorm(n=100, mean=5, sd=0.2)
## random Poisson variates
PWN <- rpois(n=50, lambda=20)
Here are plots of the time series. Notice that on one occasion the same number was drawn twice in a row from the Poisson distribution, which is discrete. That is virtually guaranteed to never happen with a continuous distribution.
## set up plot region
par(mfrow=c(1,2))
## plot normal variates with mean
plot.ts(GWN)
abline(h=5, col="blue", lty="dashed")
## plot Poisson variates with mean
plot.ts(PWN)
abline(h=20, col="blue", lty="dashed")
Now let’s examine the ACF for the 2 white noise series and see if there is, in fact, zero autocorrelation for lags \(\geq\) 1.
## set up plot region
par(mfrow=c(1,2))
## ACF of the normal variates
acf(GWN, main="", lag.max=20)
## ACF of the Poisson variates
acf(PWN, main="", lag.max=20)
6.8 A random walk model of animal movement
A simple random walk model of movement with drift (directional movement) but no correlation is
\[\begin{gather}
x_{1,t} = x_{1,t-1} + u_1 + w_{1,t}, \;\; w_{1,t} \sim \,\text{N}(0,\sigma^2_1)\\
x_{2,t} = x_{2,t-1} + u_2 + w_{2,t}, \;\; w_{2,t} \sim \,\text{N}(0,\sigma^2_2)
\tag{6.8}
\end{gather}\]
where \(x_{1,t}\) and \(x_{2,t}\) are the true locations along two axes (e.g., longitude and latitude) at time \(t\). We observe these locations with error:
\[\begin{gather}
y_{1,t} = x_{1,t} + v_{1,t}, \;\; v_{1,t} \sim \,\text{N}(0,\eta^2_1)\\
y_{2,t} = x_{2,t} + v_{2,t}, \;\; v_{2,t} \sim \,\text{N}(0,\eta^2_2).
\tag{6.9}
\end{gather}\]
This model is composed of two separate univariate state-space models. Note that \(y_1\) depends only on \(x_1\) and \(y_2\) depends only on \(x_2\). There are no actual interactions between these two univariate models. However, we can write the model down in the form of a multivariate model using diagonal variance-covariance matrices and a diagonal design (\(\mathbf{Z}\)) matrix. Because the variance-covariance matrices and \(\mathbf{Z}\) are diagonal, the \(x_1\):\(y_1\) and \(x_2\):\(y_2\) processes will be independent as intended. Here are Equations (6.8) and (6.9) written as a MARSS model (in matrix form):
\[\begin{gather}
\begin{bmatrix}x_{1,t}\\x_{2,t}\end{bmatrix}
= \begin{bmatrix}x_{1,t-1}\\x_{2,t-1}\end{bmatrix}
+ \begin{bmatrix}u_1\\u_2\end{bmatrix}
+ \begin{bmatrix}w_{1,t}\\w_{2,t}\end{bmatrix},
\;\; \begin{bmatrix}w_{1,t}\\w_{2,t}\end{bmatrix} \sim \,\text{MVN}\left(0, \begin{bmatrix}\sigma^2_1 & 0\\0 & \sigma^2_2\end{bmatrix}\right)
\tag{6.10}
\end{gather}\]
\[\begin{gather}
\begin{bmatrix}y_{1,t}\\y_{2,t}\end{bmatrix}
= \begin{bmatrix}1 & 0\\0 & 1\end{bmatrix}\begin{bmatrix}x_{1,t}\\x_{2,t}\end{bmatrix}
+ \begin{bmatrix}v_{1,t}\\v_{2,t}\end{bmatrix},
\;\; \begin{bmatrix}v_{1,t}\\v_{2,t}\end{bmatrix} \sim \,\text{MVN}\left(0, \begin{bmatrix}\eta^2_1 & 0\\0 & \eta^2_2\end{bmatrix}\right)
\tag{6.11}
\end{gather}\]
In compact matrix notation, this is
\[\begin{gather}
\mathbf{x}_t = \mathbf{x}_{t-1} + \mathbf{u} + \mathbf{w}_t, \;\; \mathbf{w}_t \sim \,\text{MVN}(0,\mathbf{Q}) \\
\mathbf{y}_t = \mathbf{x}_{t} + \mathbf{v}_t, \;\; \mathbf{v}_t \sim \,\text{MVN}(0,\mathbf{R}).
\tag{6.12}
\end{gather}\]
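A minimal sketch of how this diagonal model could be passed to MARSS() is shown below (not from the original text; it assumes dat is a 2 x T matrix of observed locations and uses standard MARSS text shortcuts):
## sketch: 2-D random walk with drift, independent axes
mod.move <- list(
  B = "identity",              # each x depends only on its own past value
  U = "unequal",               # separate drifts u_1 and u_2
  Q = "diagonal and unequal",  # independent process variances
  Z = "identity",              # y_1 observes x_1, y_2 observes x_2
  A = "zero",
  R = "diagonal and unequal")  # independent observation variances
# fit.move <- MARSS(dat, model = mod.move)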
6.5 Basic diagnostics
The first diagnostic that you do with any statistical analysis is to check that your residuals correspond to your assumed error structure. We have two types of errors in a univariate state-space model: process errors, the \(w_t\), and observation errors, the \(v_t\).
They should not have a temporal trend. To get the residuals from most types of fits in R, you can use residuals(fit). MARSS() calls the \(v_t\) "model residuals" and the \(w_t\) "state residuals". We can plot these using the following code (Figure 6.3).
par(mfrow=c(1,2))
resids <- residuals(kem.0)
plot(resids$model.residuals[1,],
  ylab="model residual", xlab="", main="flat level")
abline(h=0)
plot(resids$state.residuals[1,],
  ylab="state residual", xlab="", main="flat level")
abline(h=0)
The residuals should also not be autocorrelated in time. We can check the autocorrelation with the function acf(). We won't do this for the state residuals for the flat level or linear trend models since for those models \(w_t=0\). The autocorrelation plots are shown in Figure 6.4. The stochastic level model looks the best in that its model residuals (the \(v_t\)) are fine but the state model still has problems. Clearly the state is not a simple random walk. This is not surprising. The Aswan Low Dam was completed in 1902 and changed the mean flow. The Aswan High Dam was completed in 1970 and also affected the flow. You can see these perturbations in Figure 6.1.
par(mfrow=c(2,2))
resids <- residuals(kem.0)
acf(resids$model.residuals[1,], main="flat level v(t)")
resids <- residuals(kem.1)
acf(resids$model.residuals[1,], main="linear trend v(t)")
resids <- residuals(kem.2)
acf(resids$model.residuals[1,], main="stoc level v(t)")
acf(resids$state.residuals[1,], main="stoc level w(t)", na.action=na.pass)
6.4 Comparing models with AIC and model weights
To get the AIC or AICc value for a model fit from a MARSS fit, use fit$AIC or fit$AICc. The log-likelihood is in fit$logLik and the number of estimated parameters in fit$num.params. For fits from other functions, try AIC(fit) or look at the function documentation.
Let's put the AICc values for the four Nile models together:
nile.aic <- c(kem.0$AICc, kem.1$AICc, kem.2$AICc, kem.3$AICc)
Then we calculate the \(\Delta\text{AICc}\) for each model and compute the model weights. \(\Delta\text{AICc}\) is a model's AICc minus the minimum AICc in the model set.
delAIC <- nile.aic - min(nile.aic)
relLik <- exp(-0.5*delAIC)
aicweight <- relLik/sum(relLik)
And this leads to our model weights table:
aic.table <- data.frame(
  AICc=nile.aic,
  delAIC=delAIC,
  relLik=relLik,
  weight=aicweight)
rownames(aic.table) <- c("flat level", "linear trend", "stoc level", "stoc level w drift")
Here the table is printed using round() to limit the number of digits shown.
round(aic.table, digits=3)
                        AICc delAIC relLik weight
flat level          1313.155 31.379  0.000  0.000
linear trend        1290.882  9.106  0.011  0.007
stoc level          1281.776  0.000  1.000  0.647
stoc level w drift  1283.026  1.250  0.535  0.346
6.2 Examples using the Nile river data
We will use the data from the Nile River (Figure 6.1). We will fit different flow models to the data and compare the models with AIC.
library(datasets)
dat <- as.vector(Nile)
6.2.1 Flat level model
We will start by modeling these data as a simple average river flow with variability around some level \(\mu\).
\[\begin{equation}
y_t = \mu + v_t \text{ where } v_t \sim \,\text{N}(0,r)
\tag{6.3}
\end{equation}\]
where \(y_t\) is the river flow volume at year \(t\).
We can write this model as a univariate state-space model as follows. We use \(x_t\) to model the average flow level. \(y_t\) is just an observation of this flat \(x_t\). Work through \(x_1\), \(x_2\), \(\dots\) starting from \(x_0\) to convince yourself that \(x_t\) will always equal \(\mu\).
\[\begin{equation}
\begin{gathered}
x_t = 1 \times x_{t-1}+ 0 + w_t \text{ where } w_t \sim \,\text{N}(0,0) \\
y_t = 1 \times x_t + 0 + v_t \text{ where } v_t \sim \,\text{N}(0,r) \\
x_0 = \mu
\end{gathered}
\tag{6.4}
\end{equation}\]
The model is specified as a list as follows:
mod.nile.0 <- list(
  B=matrix(1), U=matrix(0), Q=matrix(0),
  Z=matrix(1), A=matrix(0), R=matrix("r"),
  x0=matrix("mu"), tinitx=0 )
We then fit the model:
kem.0 <- MARSS(dat, model=mod.nile.0)
Output not shown, but here are the estimates and AICc.
c(coef(kem.0, type="vector"), LL=kem.0$logLik, AICc=kem.0$AICc)
R.r x0.mu LL AICc
28351.5675 919.3500 -654.5157 1313.1552
6.2.2 Linear trend in flow model
Figure 6.2 shows the fit for the flat average river flow model. Looking at the data, we might expect that a declining average river flow would be better. In MARSS form, that model would be:
\[\begin{equation}
\begin{gathered}
x_t = 1 \times x_{t-1}+ u + w_t \text{ where } w_t \sim \,\text{N}(0,0) \\
y_t = 1 \times x_t + 0 + v_t \text{ where } v_t \sim \,\text{N}(0,r) \\
x_0 = \mu
\end{gathered}
\tag{6.5}
\end{equation}\]
where \(u\) is now the average per-year decline in river flow volume. The model is specified as follows:
mod.nile.1 <- list(
  B=matrix(1), U=matrix("u"), Q=matrix(0),
  Z=matrix(1), A=matrix(0), R=matrix("r"),
  x0=matrix("mu"), tinitx=0 )
We then fit the model:
kem.1 <- MARSS(dat, model=mod.nile.1)
Here are the estimates, log-likelihood and AICc:
c(coef(kem.1, type="vector"), LL=kem.1$logLik, AICc=kem.1$AICc)
R.r U.u x0.mu LL AICc
22213.595453 -2.692106 1054.935067 -642.315910 1290.881821
Figure 6.2 shows the fits for the two deterministic models (flat and declining mean river flow) along with their AICc values (smaller AICc is better). The AICc for the model with a declining river flow is lower by over 20 units (which is a lot).
6.2.3 Stochastic level model
Looking at the flow levels, we might suspect that a model that allows the average flow to change would fit the data better, and we might suspect that there have been sudden, and anomalous, changes in the river flow level.
We will now model the average river flow at year \(t\) as a random walk, specifically an autoregressive process, which means that average river flow in year \(t\) is a function of average river flow in year \(t-1\).
\[\begin{equation}
\begin{gathered}
x_t = x_{t-1}+w_t \text{ where } w_t \sim \,\text{N}(0,q) \\
y_t = x_t+v_t \text{ where } v_t \sim \,\text{N}(0,r) \\
x_0 = \mu
\end{gathered}
\tag{6.6}
\end{equation}\]
As before, \(y_t\) is the river flow volume at year \(t\), and \(x_t\) is the mean level. The model is specified as:
mod.nile.2 <- list(
  B=matrix(1), U=matrix(0), Q=matrix("q"),
  Z=matrix(1), A=matrix(0), R=matrix("r"),
  x0=matrix("mu"), tinitx=0 )
We could also use the text shortcuts to specify the model. Because \(\mathbf{R}\) and \(\mathbf{Q}\) are \(1 \times 1\) matrices, “unconstrained”, “diagonal and unequal”, “diagonal and equal” and “equalvarcov” will all lead to a \(1 \times 1\) matrix with one estimated element. For \(\mathbf{a}\) and \(\mathbf{u}\), the following shortcut could be used:
-A <- "zero"; U <- "zero"
+
Because \(\mathbf{x}_0\) is \(1 \times 1\), it could be specified as “unequal”, “equal” or “unconstrained”.
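For example, a shortcut-based specification that should be equivalent to mod.nile.2 might look like the sketch below (the object name mod.nile.2.short is ours, not from the original text):
## sketch: the stochastic level model written with text shortcuts
mod.nile.2.short <- list(
  B="identity", U="zero", Q="unconstrained",
  Z="identity", A="zero", R="unconstrained",
  x0="unequal", tinitx=0)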
kem.2 <- MARSS(dat, model=mod.nile.2)
Here are the estimates, log-likelihood and AICc:
c(coef(kem.2, type="vector"), LL=kem.2$logLik, AICc=kem.2$AICc)
R.r Q.q x0.mu LL AICc
15065.6121 1425.0030 1111.6338 -637.7631 1281.7762
6.2.4 Stochastic level model with drift
We can add a drift term to our random walk; the \(u\) in the process model (\(x\)) is the drift term. This causes the random walk to tend to trend up or down.
\[\begin{equation}
\begin{gathered}
x_t = x_{t-1}+u+w_t \text{ where } w_t \sim \,\text{N}(0,q) \\
y_t = x_t+v_t \text{ where } v_t \sim \,\text{N}(0,r) \\
x_0 = \mu
\end{gathered}
\tag{6.7}
\end{equation}\]
The model is then specified by changing U to indicate that a \(u\) is estimated:
mod.nile.3 <- list(
  B=matrix(1), U=matrix("u"), Q=matrix("q"),
  Z=matrix(1), A=matrix(0), R=matrix("r"),
  x0=matrix("mu"), tinitx=0)
kem.3 <- MARSS(dat, model=mod.nile.3)
Here are the estimates, log-likelihood and AICc:
c(coef(kem.3, type="vector"), LL=kem.3$logLik, AICc=kem.3$AICc)
         R.r          U.u          Q.q        x0.mu           LL         AICc
15585.278194    -3.248793  1088.987455  1124.044484  -637.302692  1283.026436
Figure 6.2 shows all the models along with their AICc values.
6.1 Fitting a state-space model with MARSS
The MARSS package fits multivariate auto-regressive state-space models of this form:
\[\begin{equation}
\begin{gathered}
\mathbf{x}_t = \mathbf{B} \mathbf{x}_{t-1}+\mathbf{u}+\mathbf{w}_t \text{ where } \mathbf{w}_t \sim \,\text{N}(0,\mathbf{Q}) \\
\mathbf{y}_t = \mathbf{Z}\mathbf{x}_{t}+\mathbf{a}+\mathbf{v}_t \text{ where } \mathbf{v}_t \sim \,\text{N}(0,\mathbf{R}) \\
\mathbf{x}_0 = \boldsymbol{\mu}
\end{gathered}
\tag{6.1}
\end{equation}\]
To fit your time series model with the MARSS package, you need to put your model into the form above. The \(\mathbf{B}\), \(\mathbf{Z}\), \(\mathbf{u}\), \(\mathbf{a}\), \(\mathbf{Q}\), \(\mathbf{R}\) and \(\boldsymbol{\mu}\) are parameters that are (potentially) estimated. The \(\mathbf{y}\) are your data. The \(\mathbf{x}\) are the hidden state(s). Everything in bold is a matrix; if it is a small bolded letter, it is a matrix with 1 column.
Important: In the state-space model equation, \(\mathbf{y}\) is always the data and \(\mathbf{x}\) is a hidden random walk estimated from the data.
A basic MARSS() call looks like fit=MARSS(y, model=list(...)). The argument model tells the function what form the parameters take. The list has elements with the names B, U, Q, etc. The names correspond to the parameters with the same names in Equation (6.1), except that \(\boldsymbol{\mu}\) is called x0. tinitx indicates whether the initial \(\mathbf{x}\) is specified at \(t=0\) (so \(\mathbf{x}_0\)) or at \(t=1\) (so \(\mathbf{x}_1\)).
Here's an example. Let's say we want to fit a univariate AR(1) model observed with error. Here is that model:
\[\begin{equation}
\begin{gathered}
x_t = b x_{t-1} + w_t \text{ where } w_t \sim \,\text{N}(0,q) \\
y_t = x_t + v_t \text{ where } v_t \sim \,\text{N}(0,r) \\
x_0 = \mu
\end{gathered}
\tag{6.2}
\end{equation}\]
To fit this with MARSS(), we need to write Equation (6.2) as Equation (6.1). Equation (6.1) is in MATRIX form. In the model list, the parameters must be written EXACTLY like they would be written for Equation (6.1). For example, 1 is the number 1 in R. It is not a matrix:
class(1)
[1] "numeric"
If you need a 1 (or 0) in your model, you need to pass in the parameter as a \(1 \times 1\) matrix: matrix(1).
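As a quick sanity check (not in the original text), you can confirm that matrix(1) really is a \(1 \times 1\) matrix:
class(matrix(1))  # "matrix" (and "array" in R >= 4.0)
dim(matrix(1))    # 1 1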
With that in mind, our model list for Equation (6.2) is:
mod.list <- list(
  B=matrix(1), U=matrix(0), Q=matrix("q"),
  Z=matrix(1), A=matrix(0), R=matrix("r"),
  x0=matrix("mu"), tinitx=0 )
We can simulate some AR(1) plus error data (here with \(b=1\), i.e., a random walk plus error) like so:
q <- 0.1; r <- 0.1; n <- 100
y <- cumsum(rnorm(n,0,sqrt(q)))+rnorm(n,0,sqrt(r))
And then fit with MARSS() using mod.list above:
fit <- MARSS(y, model=mod.list)
Success! abstol and log-log tests passed at 16 iterations.
Alert: conv.test.slope.tol is 0.5.
Test with smaller values (<0.1) to ensure convergence.
Standard errors have not been calculated.
Use MARSSparamCIs to compute CIs and bias estimates.
If we wanted to fix \(q=0.1\), then \(\mathbf{Q}=[0.1]\) (a \(1 \times 1\) matrix with 0.1). We just change mod.list$Q and re-fit:
mod.list$Q <- matrix(0.1)
fit <- MARSS(y, model=mod.list)
6.6 Fitting with JAGS
Here we show how to fit the stochastic level model, model 3 Equation (6.7), with JAGS. This is a model where the level is a random walk with drift and the Nile River flow is that level plus error.
library(datasets)
y <- as.vector(Nile)
This section requires that you have JAGS installed and the R2jags, rjags and coda R packages loaded.
library(R2jags)
library(rjags)
library(coda)
The first step is to write the model for JAGS to a file (filename in model.loc):
model.loc <- "ss_model.txt"
jagsscript <- cat("
  model {
  # priors on parameters
  mu ~ dnorm(Y1, 1/(Y1*100)); # prior on the initial level, centered on the first observation
  tau.q ~ dgamma(0.001,0.001); # gamma prior on precision (inverse gamma on the variance)
  sd.q <- 1/sqrt(tau.q); # sd is treated as derived parameter
  tau.r ~ dgamma(0.001,0.001); # gamma prior on precision (inverse gamma on the variance)
  sd.r <- 1/sqrt(tau.r); # sd is treated as derived parameter
  u ~ dnorm(0, 0.01);

  # Because init X is specified at t=0
  X0 <- mu
  X[1] ~ dnorm(X0+u, tau.q);
  Y[1] ~ dnorm(X[1], tau.r);

  for(i in 2:TT) {
  predX[i] <- X[i-1]+u;
  X[i] ~ dnorm(predX[i], tau.q); # Process variation
  Y[i] ~ dnorm(X[i], tau.r); # Observation variation
  }
  }
  ", file=model.loc)
Next we specify the data (and any other input) that the JAGS code needs. In this case, we need to pass in y and the number of time steps, since the latter is used in the for loop. We also specify the parameters that we want to monitor. We need to specify at least one, but we will monitor all of them so we can plot them after fitting. Note that the hidden state is a parameter in the Bayesian context (but not in the maximum likelihood context).
jags.data <- list("Y"=y, "TT"=length(y), Y1=y[1])
jags.params <- c("sd.q", "sd.r", "X", "mu", "u")
Now we can fit the model:
mod_ss <- jags(jags.data, parameters.to.save=jags.params,
  model.file=model.loc, n.chains = 3,
  n.burnin=5000, n.thin=1, n.iter=10000, DIC=TRUE)
We can then show the posteriors along with the MLEs from MARSS on top (Figure 6.5) using the code below.
attach.jags(mod_ss)
par(mfrow=c(2,2))
hist(mu)
abline(v=coef(kem.3)$x0, col="red")
hist(u)
abline(v=coef(kem.3)$U, col="red")
hist(log(sd.q^2))
abline(v=log(coef(kem.3)$Q), col="red")
hist(log(sd.r^2))
abline(v=log(coef(kem.3)$R), col="red")
detach.jags()
To plot the estimated states (Figure 6.6), we write a helper function:
plotModelOutput <- function(jagsmodel, Y) {
  attach.jags(jagsmodel)
  x <- seq(1, length(Y))
  XPred <- cbind(apply(X,2,quantile,0.025), apply(X,2,mean), apply(X,2,quantile,0.975))
  ylims <- c(min(c(Y,XPred), na.rm=TRUE), max(c(Y,XPred), na.rm=TRUE))
  plot(Y, col="white", ylim=ylims, xlab="", ylab="State predictions")
  polygon(c(x,rev(x)), c(XPred[,1], rev(XPred[,3])), col="grey70", border=NA)
  lines(XPred[,2])
  points(Y)
}

plotModelOutput(mod_ss, y)
The following object is masked _by_ .GlobalEnv:
mu
lines(kem.3$states[1,], col="red")
lines(1.96*kem.3$states.se[1,]+kem.3$states[1,], col="red", lty=2)
lines(-1.96*kem.3$states.se[1,]+kem.3$states[1,], col="red", lty=2)
title("State estimate and data from\nJAGS (black) versus MARSS (red)")
6.7 Fitting with Stan
Let’s fit the same model with Stan using the rstan package. If you have not already, you will need to install the rstan package. This package depends on a number of other packages which should install automatically when you install rstan.
library(datasets)
library(rstan)
y <- as.vector(Nile)
First we write the model. We could write this to a file (recommended), but for this example, we write it as a character object. Though the syntax is different from the JAGS code, it has many similarities. Note that, unlike JAGS, Stan does not allow any NAs in your data. Thus we have to specify the locations of the non-NA values in our data. The Nile data do not have NAs, but we want to write the code so it would work even if there were NAs.
-scode <- "
-data {
- int<lower=0> TT;
- int<lower=0> n_pos; // number of non-NA values
- int<lower=0> indx_pos[n_pos]; // index of the non-NA values
- vector[n_pos] y;
-}
-parameters {
- real x0;
- real u;
- vector[TT] pro_dev;
- real<lower=0> sd_q;
- real<lower=0> sd_r;
-}
-transformed parameters {
- vector[TT] x;
- x[1] = x0 + u + pro_dev[1];
- for(i in 2:TT) {
- x[i] = x[i-1] + u + pro_dev[i];
- }
-}
-model {
- x0 ~ normal(y[1],10);
- u ~ normal(0,2);
- sd_q ~ cauchy(0,5);
- sd_r ~ cauchy(0,5);
- pro_dev ~ normal(0, sd_q);
- for(i in 1:n_pos){
- y[i] ~ normal(x[indx_pos[i]], sd_r);
- }
-}
-generated quantities {
- vector[n_pos] log_lik;
- for (i in 1:n_pos) log_lik[i] = normal_lpdf(y[i] | x[indx_pos[i]], sd_r);
-}
-"
+scode <- "
+data {
+ int<lower=0> TT;
+ int<lower=0> n_pos; // number of non-NA values
+ int<lower=0> indx_pos[n_pos]; // index of the non-NA values
+ vector[n_pos] y;
+}
+parameters {
+ real x0;
+ real u;
+ vector[TT] pro_dev;
+ real<lower=0> sd_q;
+ real<lower=0> sd_r;
+}
+transformed parameters {
+ vector[TT] x;
+ x[1] = x0 + u + pro_dev[1];
+ for(i in 2:TT) {
+ x[i] = x[i-1] + u + pro_dev[i];
+ }
+}
+model {
+ x0 ~ normal(y[1],10);
+ u ~ normal(0,2);
+ sd_q ~ cauchy(0,5);
+ sd_r ~ cauchy(0,5);
+ pro_dev ~ normal(0, sd_q);
+ for(i in 1:n_pos){
+ y[i] ~ normal(x[indx_pos[i]], sd_r);
+ }
+}
+generated quantities {
+ vector[n_pos] log_lik;
+ for (i in 1:n_pos) log_lik[i] = normal_lpdf(y[i] | x[indx_pos[i]], sd_r);
+}
+"
Then we call stan() and pass in the data, the names of the parameters we wish to have returned, and information on the number of chains, samples (iter), and thinning. The output is verbose (hidden here) and may have some warnings.
# We pass in the non-NA ys as a vector
ypos <- y[!is.na(y)]
n_pos <- sum(!is.na(y)) # number of non-NA ys
indx_pos <- which(!is.na(y)) # index of the non-NAs
mod <- rstan::stan(model_code = scode,
  data = list("y"=ypos, "TT"=length(y), "n_pos"=n_pos, "indx_pos"=indx_pos),
  pars = c("sd_q", "x", "sd_r", "u", "x0"),
  chains = 3, iter = 1000, thin = 1)
We use extract() to extract the parameters from the fitted model, and then we can plot them. The estimated level is x, and we will plot it with its 95% credible intervals.
pars <- rstan::extract(mod)
pred_mean <- apply(pars$x, 2, mean)
pred_lo <- apply(pars$x, 2, quantile, 0.025)
pred_hi <- apply(pars$x, 2, quantile, 0.975)
plot(pred_mean, type = "l", lwd = 3,
  ylim = range(c(pred_mean, pred_lo, pred_hi)), ylab = "Nile River Level")
lines(pred_lo)
lines(pred_hi)
points(y, col="blue")
Here is a ggplot() version of the plot.
library(ggplot2)
nile <- data.frame(y=y, year=1871:1970)
h <- ggplot(nile, aes(year))
h + geom_ribbon(aes(ymin = pred_lo, ymax = pred_hi), fill = "grey70") +
  geom_line(aes(y = pred_mean), size=1) +
  geom_point(aes(y = y), color="blue") +
  labs(y = "Nile River level")
We can plot the histograms of the samples against the values estimated via maximum likelihood.
par(mfrow = c(2, 2))
hist(pars$x0)
abline(v = coef(kem.3)$x0, col = "red")
hist(pars$u)
abline(v = coef(kem.3)$U, col = "red")
hist(log(pars$sd_q^2))
abline(v = log(coef(kem.3)$Q), col = "red")
hist(log(pars$sd_r^2))
abline(v = log(coef(kem.3)$R), col = "red")
6.9 Problems
Write the equations for each of these models: ARIMA(0,0,0), ARIMA(0,1,0), ARIMA(1,0,0), ARIMA(0,0,1), ARIMA(1,0,1). Read the help file for the Arima() function (in the forecast package) if you are fuzzy on the arima notation.
The MARSS package includes a data set of sharp-tailed grouse in Washington. Load the data to use as follows:
library(MARSS)
dat <- log(grouse[,2])
Consider these two models for the data:
Model 1 random walk with no drift observed with no error
Model 2 random walk with drift observed with no error
Written as a univariate state-space model, model 1 is
\[\begin{equation}
\begin{gathered}
x_t = x_{t-1}+w_t \text{ where } w_t \sim \,\text{N}(0,q)\\
y_t = x_t \\
x_0 = \mu
\end{gathered}
\tag{6.13}
\end{equation}\]
and model 2 is
\[\begin{equation}
\begin{gathered}
x_t = x_{t-1}+u+w_t \text{ where } w_t \sim \,\text{N}(0,q)\\
y_t = x_t \\
x_0 = \mu
\end{gathered}
\tag{6.14}
\end{equation}\]
\(y\) is the log grouse count in year \(t\).
Plot the data. The year is in column 1 of grouse.
Fit each model using MARSS().
Which one appears better supported given AICc?
Load the forecast package. Use ?auto.arima to learn what it does. Then use auto.arima(dat) to fit the data. Next run auto.arima(dat, trace=TRUE) to see all the ARIMA models that the function compared. Note, ARIMA(0,1,0) is a random walk with b=1. ARIMA(0,1,0) with drift would be a random walk (b=1) with drift (with \(u\)).
Is the difference in the AICc values between a random walk with and without drift comparable between MARSS() and auto.arima()?
Note when using auto.arima(), an AR(1) model of the following form will be fit (notice the \(b\)): \(x_t = b x_{t-1}+w_t\). auto.arima() refers to the model \(x_t = x_{t-1}+w_t\), which is also AR(1) but with \(b=1\), as ARIMA(0,1,0). This says that the first difference of the data (that's the 1 in the middle) is an ARMA(0,0) process (the 0s in the 1st and 3rd spots). So ARIMA(0,1,0) means this: \(x_t - x_{t-1} = w_t\).
Create a random walk with drift time series using cumsum() and rnorm(). Look at the rnorm() help file (?rnorm) to make sure you know what the arguments to rnorm() are.
dat <- cumsum(rnorm(100,0.1,1))
What is the order of this random walk written as ARIMA(p, d, q)? "What is the order" means "what are \(p\), \(d\), and \(q\)?" Model "order" is how arima() and Arima() specify arima models.
Fit that model using Arima() in the forecast package. You'll need to specify the arguments order and include.drift. Use ?Arima to review what that function does if needed.
Write out the equation for this random walk as a univariate state-space model. Notice that there is no observation error, but still write this as a state-space model.
Fit that model with MARSS().
How are the two estimates from Arima() and MARSS() different?
The first-difference of dat used in the previous problem is:
diff.dat <- diff(dat)
Use ?diff to check what the diff() function does.
If \(x_t\) denotes a time series, what is the first difference of \(x\)? What is the second difference?
Fit with MARSS(). You will need to write the model for diff.dat as a state-space model. If you've done this right, the estimated parameters using Arima() and MARSS() will now be the same.
This question should clue you into the fact that Arima() is not exactly fitting Equation (6.1). It's very similar, but not quite written that way. By the way, Equation (6.1) is how structural time series observed with error are written (state-space models). To recover the estimates that a function like arima() or Arima() returns, you need to write your state-space model in a specific way (as seen above).
Arima() will also fit what it calls an "AR(1) with drift". An AR(1) with drift is NOT this model:
\[\begin{equation}
x_t = b x_{t-1}+u+w_t \text{ where } w_t \sim \,\text{N}(0,q)
\tag{6.15}
\end{equation}\]
In the population dynamics literature, this equation is called the Gompertz model and is a type of density-dependent population model.
Write R code to simulate Equation (6.15). Make \(b\) less than 1 and greater than 0. Set \(u\) and \(x_0\) to whatever you want. You can use a for loop.
Plot the trajectories and show that this model does not “drift” upward or downward. It fluctuates about a mean value.
We will fit what Arima() calls "AR(1) with drift" models in the chapter on MARSS models with covariates.
The MARSS package includes a data set of gray whales. Load the data to use as follows:
library(MARSS)
dat <- log(graywhales[,2])
Fit a random walk with drift model observed with error to the data:
\[\begin{equation}
\begin{gathered}
x_t = x_{t-1}+u+w_t \text{ where } w_t \sim \,\text{N}(0,q) \\
y_t = x_t+v_t \text{ where } v_t \sim \,\text{N}(0,r) \\
x_0 = \mu
\end{gathered}
\tag{6.16}
\end{equation}\]
\(y\) is the whale count in year \(t\). \(x\) is interpreted as the 'true' unknown population size that we are trying to estimate.
Fit this model with MARSS().
Plot the estimated \(x\) as a line with the actual counts added as points. \(x\) is in fit$states. It is a matrix. To plot using plot(), you will need to change it to a vector using as.vector() or fit$states[1,].
Compute the AICc’s for each model and likelihood or deviance (-2 * log likelihood). Where to find these? Try names(fit)
. logLik()
is the standard R function to return log-likelihood from fits.
Calculate a table of \(\Delta\text{AICc}\) values and AICc weights.
Show the acf of the model and state residuals for the best model. You will need a vector of the residuals to do this. If fit is the fit from a call like fit = MARSS(dat), you get the residuals using this code:
residuals(fit)$state.residuals[1,]
residuals(fit)$model.residuals[1,]
Do the acf’s suggest any problems?
Evaluate the predictive accuracy of forecasts using the forecast package with the airmiles dataset. Load the data to use as follows:
library(forecast)
dat <- log(airmiles)
n <- length(dat)
training.dat <- dat[1:(n-3)]
test.dat <- dat[(n-2):n]
This will prepare the training data and set aside the last 3 data points for validation.
Fit the following four models using Arima(): ARIMA(0,0,0), ARIMA(1,0,0), ARIMA(0,0,1), ARIMA(1,0,1).
Which model is best supported based on the MASE statistic?
The WhaleNet Archive of STOP Data has movement data on loggerhead turtles on the east coast of the US from ARGOS tags. The MARSS package loggerheadNoisy dataset is lat/lon data on eight individuals; however, we have corrupted these data severely by adding random errors in order to create a "bad tag" problem (very noisy). Use head(loggerheadNoisy) to get an idea of the data. Then load the data on one turtle, MaryLee. MARSS needs time across the columns, so you need to transpose the data (as shown).
turtlename <- "MaryLee"
dat <- loggerheadNoisy[which(loggerheadNoisy$turtle==turtlename), 5:6]
dat <- t(dat)
Plot MaryLee's locations (as a line, not dots). Put the latitude locations on the y-axis and the longitude on the x-axis. You can use rownames(dat) to see which is in which row. You can just use plot() for the homework. But if you want, you can look at the MARSS Manual chapter on animal movement to see how to plot the turtle locations on a map using the maps package.
Analyze the data with a state-space model (movement observed with error) using
fit0 <- MARSS(dat)
Look at the output from the above MARSS call. What is the meaning of the parameters output from MARSS in terms of turtle movement? What exactly is the \(u\) estimate for example? Look at the data and think about the model you fit.
What assumption did the default MARSS model make about observation error and process error? What does that assumption mean in terms of how steps in the N-S and E-W directions are related? What does that assumption mean in terms of our assumption about the latitudinal and longitudinal observation errors?
Does MaryLee move faster in the latitude direction versus longitude direction?
Add MaryLee's estimated "true" positions to your plot of her locations. You can use lines(x, y, col="red") (with x and y replaced with your x and y data). The true position is the "state". This is in the states element of an output from MARSS: fit0$states.
Fit the following models with different assumptions regarding the movement in the lat/lon direction:
Lat/lon movements are independent but the variance is the same
Lat/lon movements are correlated and the lat/lon variances are the same.
You only need to change the Q specification. Your MARSS call will now look like the following, with ... replaced with your Q specification.
fit1 <- MARSS(dat, list(Q=...))
Plot your state residuals (true location residuals). What are the problems? Discuss in reference to your plot of the location data. Here is how to get the state residuals from MARSS() output:
resids <- residuals(fit0)$state.residuals
The lon residuals are in row 1 and lat residuals are in row 2 (same order as the data).
6.3 The StructTS function
The StructTS function in the stats package in R will also fit the stochastic level model:
fit.sts <- StructTS(dat, type="level")
fit.sts
Call:
StructTS(x = dat, type = "level")
Variances:
  level  epsilon
   1469    15099
The estimates from StructTS() will be different (though similar) from MARSS() because StructTS() uses \(x_1 = y_1\); that is, the hidden state at \(t=1\) is fixed to be the data at \(t=1\). That is fine if you have a long data set, but would be disastrous for the short data sets typical in fisheries and ecology.
StructTS() is much, much faster for long time series. The example in ?StructTS is pretty much instantaneous with StructTS() but takes minutes with the EM algorithm that is the default in MARSS(). With the BFGS algorithm, it is much closer to StructTS():
trees <- window(treering, start = 0)
fitts <- StructTS(trees, type = "level")
fitem <- MARSS(as.vector(trees), mod.nile.2)
fitbf <- MARSS(as.vector(trees), mod.nile.2, method="BFGS")
Note that mod.nile.2 specifies a univariate stochastic level model, so we can use it just fine with other univariate data sets.
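To see the speed difference yourself, a rough timing comparison along these lines could be used (a sketch; timings vary by machine and this code is not from the original text):
## time StructTS versus MARSS with BFGS on the tree-ring series
system.time(StructTS(trees, type = "level"))
system.time(MARSS(as.vector(trees), mod.nile.2, method = "BFGS", silent = TRUE))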
In addition, fitted(fit.sts), where fit.sts is a fit from StructTS(), is very different from fit.marss$states from MARSS().
t <- 10
fitted(fit.sts)[t]
[1] 1162.904
This is the expected value of \(y_{t+1}\) (in this case \(y_{11}\) since we set \(t=10\)) given the data up to \(y_t\) (in this case, up to \(y_{10}\)). It is called the one-step ahead prediction.
We are not going to use the one-step ahead predictions unless we are forecasting or doing cross-validation.
Typically, when we analyze fisheries and ecological data, we want to know the estimate of the state, the \(x_t\), given ALL the data. For example, we might need an estimate of the population size in year 1990 given a time series of counts from 1930 to 2015. We don't want to use only the data up to 1989; we want to use all the information. fit.marss$states from MARSS() is the expected value of \(x_t\) given all the data. For the stochastic level model, that is equal to the expected value of \(y_t\) given all the data except \(y_t\).
If you needed the one-step predictions from MARSS(), you can get them from the Kalman filter output:
kf <- print(kem.2, what="kfs")
kf$xtt1[1,t]
Passing in what="kfs" returns the Kalman filter/smoother output. The expected value of \(x_t\) conditioned on \(y_1\) to \(y_{t-1}\) is in kf$xtt1. The expected value of \(x_t\) conditioned on all the data is in kf$xtT.