forked from chbianco/GCAM-SoilC-Dynamics
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Project_Markdown.Rmd
514 lines (414 loc) · 27.5 KB
/
Project_Markdown.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
---
title: "Comparing GCAM soil organic carbon predictions to empirical land use results"
output: bookdown::html_document2
date: "`r format(Sys.time(), '%d %B %Y')`"
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# Abstract
Much of the world's carbon is sequestered in soil. Changes in land use greatly impact whether this carbon remains stored or if it is emitted back into the atmosphere as carbon dioxide, which as broad climate change and global warming implications. These changes to the carbon cycle and their effects are modeled by many Integrated Assessment Models (IAMs). One notable IAM is the Global Change Analysis Model (GCAM). GCAM uses a highly simplified process to represent soil carbon changes, which has not yet been compared to experimental process. This study compared data from two papers aggregating experimental SOC data from previous studies to GCAM's soil carbon output data to determine the accuracy of GCAM's modeling. A meta-analysis was used to determine if such a difference existed and whether the difference was consistent across all tested geographical areas. We found that both the experimental rates of SOC change and the turnover rate constant differed significantly from the same values outputted from GCAM. Specifically, we found that GCAM significantly overestimates the rate of SOC turnover during land use transition. These results highlight the importance of updating how GCAM models soil carbon content change during land use transition.
# Introduction
There has been ample research on empirical soil organic carbon (SOC) levels worldwide, and several papers have compiled the results of these studies (Post & Kwon 200, Wei et al. 2014). Specifically, there are many measurements of the rate of SOC change during land use transition. SOC measurements have also been made for many different types of land use transitions, providing a wide range of measurements across different environmental conditions. However, there has not yet been an effort to directly compare these measured SOC rates to those projected by Integrated Assessment Models (IAMs) . The factors that influence these rates are highly varied and complex, leading to high levels of uncertainty (Sulman et al. 2018). This highlights a need to check how SOC rates are modeled against experimental results.
\newline
SOC storage has wide ranging climate impacts: roughly 25% of anthropogenic carbon emissions are absorbed by land ecosystems (Reuhr et al. 2023). Additionally, land use modification is a potential strategy to limit atmospheric CO2 and reduce the overall need to lower carbon emissions (Wise et al. 2009). For these reasons, accurate modeling and predicting of SOC levels and change is important to correctly gauge the effectiveness of various climate change mitigation strategies.
\newline
One of the IAMs that models SOC is the Global Change Assessment Model (GCAM ). GCAM is an open source, multi-sector dynamics model developed primarily by Pacific Northwest National Laboratory's Joint Global Change Research Institute. GCAM represents the behaviors of and interactions between earth's energy and water systems, the climate, and the global economy to answer "what if" scenarios about future global change. A major component of GCAM's analysis is land use worldwide. GCAM splits the world into 384 land use regions determined by the intersection of river basins and sociopolitical boundaries. Each of these 384 land use regions is then assigned different proportions of up to 8 different land types based on economic factors. These proportions change with time, allowing GCAM to model land use change based on economic, environmental, and sociopolitical factors.
\newline
One of the parameters measured by GCAM is the soil organic carbon (SOC) content of soil. SOC arises from decomposing plants and animals that are broken down by microbes. While some of the carbon from this organic matter is emitted back into the atmosphere, much of it remains in the soil. However, many factors can cause SOC to be emitted back into the atmosphere even after it has already been sequestered. One primary cause of this change is land use transition. For example, when grassland soil is tilled for conversion to farmland, much of the surface soil is disturbed and exposed to aeration, emitting CO2. In GCAM, SOC change during land use transition is solely dependent on land use and region. When land use transition occurs, the change in SOC content is simply modeled as a [shape ] change between the two SOC values for the initial and final land use over a prescribed number of years, called the soil timescale. Each of GCAM's 32 geopolitical regions have a set soil timescale, ranging from 25 years (for the United States) to 200 years (Russia). While this kind of simplified modeling approach can be accurate, GCAMs approach should still be tested (Menichetti et al. 2019). The origin of the soil timescale values is also unclear (?) and this project aims to help determine their accuracy.
# Methods
To begin the analysis, SOC data was taken from a basic run of GCAM, called a reference scenario. This models “business as usual” for the world, without any extra climate policies and assuming a certain level of overall warming . Table S1 contains all relevant data from this reference scenario used in our analysis, which was performed using R. Note that the SOC data, GCAM's soil timescale list, as well as both the geopolitical regions list and the land use regions list are separate files, and must be merged before any additional steps can be performed. Below is the code used to load the data and libraries, as well as to merge the four files listed above. The final product is a data set where each SOC data point has an associated timescale value, land use region, and geopolitical region.
```{r Read-In-GCAM-Data, message=FALSE}
library(dplyr)
library(ggplot2)
library(metafor)
#GCAM's SOC data
soilC <- read.csv(file = 'Data/GCAM_soilC.csv')
#GCAM's soil timescale list
read.csv(file = 'Data/soil_timescales.csv') %>%
mutate(GCAM_region_ID = seq(1,32,by=1))-> timescales
#GCAM's land use region list
# answered below [what does GLU stand for?????]
#Geographic Land Unit
#https://gcims.pnnl.gov/modeling/moirai-land-data-amalgamation-gcams-aglu-module
glus <- read.csv(file = 'Data/GLU_codes.csv')
#GCAM's geopolitical region list
regions <- read.csv('Data/GCAM_regions.csv')
#Merging the four above data sets
#flagging
#issue here for me -> "GCAM_region_ID" does not exist in original file versions
soilC %>%
mutate(GLU_code = GLU) %>%
right_join(glus, by='GLU_code') %>%
right_join(regions, by='GCAM_region_ID') %>%
right_join(timescales, by='GCAM_region_ID')-> soilC_regions
```
Much of the data contained within the SOC data set is irrelevant to this analysis, so it was removed.
```{r Select-relevant-data}
soilC_regions %>%
select(Land_Type, soil_c, GLU_code, soilTimeScale, GCAM_region_ID, Basin_long_name) -> simple_soilC_regions
```
The experimental data compared to GCAM's outputs was taken from Post & Kwon (2000) and Wei et al. (2014), both reviews of SOC measurements from the past 70 years. This data was manually digitized into a CSV file for analysis. This data did not explicitly list site locations for each measurement, so this information was determined from the paper referenced by the two studies. Once the location was determined, it was compared to GCAM's 384 land use regions using the rmap tool and GCAM's basin code mapping file. Then, the GLU code was manually added to the digitized data from Post & Kwon.
Once fully digitized, the experimental data was loaded into R as well. Note that there were several entries with missing timeframes or where the location of the sample could not be determined. These were assigned a value of 'NA'.
```{r Read-In-Experimental-Data}
PostKwon <- read.csv(file= 'Data/Experimental Data.csv', na.strings = c("", "NA"))
Wei <- read.csv(file= 'Data/Wei et al Data.csv', na.strings = c("", "NA"))
```
Once the experimental data was loaded in, each observation was mapped to is corresponding GCAM data point. This was done by simply adding the previously created GCAM data set to the experimental one by land use region, geopolitical region, and land use. This process was done twice--once for the initial land use and once for the final land use. This meant that, ultimately, each experimental data point had two associated GCAM SOC values--one for the initial land use, and one for the final land use.
This then allowed us to compute the GCAM SOC rate during land use change. This was done by simply subtracting the initial SOC value from the final value and dividing that by GCAMs soil timescale parameter. Finally, the k value (explained in depth in Wei et al.) was computed for both the experimental and GCAM data. The experimental and GCAM k values were computed using equations 1 and 2, respectively, below.
\begin{equation}
\tag{1}
k = -\frac{ln \left(\frac{C_f}{C_0}\right)}{t}
\end{equation}
\begin{equation}
\tag{2}
k = -\frac{ln (Rate \times t + 1)}{t}
\end{equation}
\newline
Experimental data from Wei et al. did not contain any raw rate information. Instead, the study only reported percent decrease in SOC concentration. This means we were only able to compare k values between the Wei et. al. data and GCAM. Additionally, k had to be computed with a modified equation, 3, below.
\begin{equation}
\tag{3}
k = -\frac{ln \left(\frac{1}{\frac{|OC Decrease|}{100} + 1}\right)}{t}
\end{equation}
The code for merging the GCAM and experimental, as well as for computing the rate and k value is below. The process was the same for the Wei et al. data.
```{r create_comparison_Post_Kwon}
PostKwon %>%
select(Initial_Land_Use, Final_Land_Use, GLU_code, GCAM_region_ID, Time, Exp_Rate) %>%
na.omit() %>%
mutate(Land_Type = Initial_Land_Use) %>%
right_join(simple_soilC_regions, by = c('GLU_code', 'Land_Type', 'GCAM_region_ID')) %>%
rename(initial_soil_c = soil_c) %>%
select(-Land_Type) %>%
mutate(Land_Type = Final_Land_Use) %>%
right_join(simple_soilC_regions, by = c('GLU_code', 'Land_Type', 'GCAM_region_ID')) %>%
rename(final_soil_c = soil_c) %>%
select(-Land_Type, -soilTimeScale.x, -Basin_long_name.x) %>%
rename(soilTimeScale = soilTimeScale.y, Basin_long_name = Basin_long_name.y) %>%
na.omit() %>%
mutate(GCAM_Rate = (final_soil_c - initial_soil_c)/soilTimeScale, Rate_Difference = Exp_Rate - GCAM_Rate,
Exp_k = -log(abs(Exp_Rate)*Time +1)/Time,
GCAM_k = -log(final_soil_c/initial_soil_c)/soilTimeScale,
source = 'Post & Kwon'
) %>%
#This next line corrects the sign of Exp_k--we had to take the absolute value to avoid NaNs, so this accounts for that
#This did not have to be done for the GCAM_k because the argument inside the natural log will always be positive--it is a positive value divided by another positive value--so sign correction was not necesary.
mutate(Exp_k = ifelse(sign(Exp_k) == sign(Exp_Rate), Exp_k*(-1), Exp_k)) -> PostKwon_Comparison
```
Note that, because of how the k value is computed for the experimental data, the sign was often mismatched. The sign of k corresponds to the opposite of the sign of the rate, which had to be corrected manually.
```{r create_comparison_Wei, echo = FALSE}
#Creating Wei comparison data
Wei %>%
select(Initial_Land_Use, Final_Land_Use, GLU_code, GCAM_region_ID, OC_decrease, Time) %>%
na.omit() %>%
mutate(Land_Type = Initial_Land_Use) %>%
right_join(simple_soilC_regions, by = c('GLU_code', 'Land_Type', 'GCAM_region_ID')) %>%
rename(initial_soil_c = soil_c) %>%
select(-Land_Type) %>%
mutate(Land_Type = Final_Land_Use) %>%
right_join(simple_soilC_regions, by = c('GLU_code', 'Land_Type', 'GCAM_region_ID')) %>%
rename(final_soil_c = soil_c) %>%
select(-Land_Type, -soilTimeScale.x, -Basin_long_name.x) %>%
rename(soilTimeScale = soilTimeScale.y, Basin_long_name = Basin_long_name.y) %>%
na.omit() %>%
mutate(GCAM_Rate = (final_soil_c - initial_soil_c)/soilTimeScale,
GCAM_k = -log(final_soil_c/initial_soil_c)/soilTimeScale,
Exp_k = -log(1/((abs(OC_decrease)/100) +1))/Time,
source = 'Wei et al'
) %>%
#This next line corrects the sign of Exp_k--we had to take the absolute value to avoid NaNs, so this accounts for that
mutate(Exp_k = ifelse(sign(Exp_k) == sign(OC_decrease), Exp_k, Exp_k*(-1))) -> Wei_Comparison
```
\Newline
Once the experimental data was matched to its corresponding GCAM outputs, the two data sets (Post& Kwon and Wei et al.) were merged. Importantly, because the rate data does not exist for Wei et al., the rate data was not included in the aggregated data set.
```{r Create-Full-Comparison}
Full_Comparison <- bind_rows(
select(PostKwon_Comparison, -Exp_Rate, -Rate_Difference),
select(Wei_Comparison, -OC_decrease)
)
#Now, we want to create another data frame that has all the repeated regions between the two studies
Full_Comparison %>%
mutate(source_short = ifelse(source == 'Post & Kwon', 'PostKwon', 'Wei')) %>%
mutate(basin_source = paste(Basin_long_name, source_short)) %>%
filter((paste(Basin_long_name, 'PostKwon') %in% basin_source) & (paste(Basin_long_name, 'Wei') %in% basin_source)) -> Duplicate_Comparison
counts = table(Duplicate_Comparison$basin_source)
Duplicate_Comparison %>%
filter(
(counts[paste(Basin_long_name, 'Wei')] > 1) & (counts[paste(Basin_long_name, 'PostKwon')] > 1)
) -> Duplicate_Comparison
```
Once the k values were calculated, meta-analyses were performed on the data. Three separate analyses were performed--one on the Post & Kwon data, one on the Wei et al. data, and one on the aggregated data set. All three examined the k value data so the results from all three tests could be reasonably compared to one another. The Standardized Mean Difference effect size was used for the analyses. Specifically, Hedges' g was used to calculate the Standardized Mean difference once the data was grouped by GCAM land use region. This allowed us to determine if there was a pattern in any potential discrepancies between GCAM and experimental data, or if any variance was localized geographically and/or random.
\newline
A fixed effect model was used for the analysis. This means that it is assumed that all the data being examined (ie all the SOC measurements) come from the same general population, and that all the data was collected in the same manner. This obviously applies to the GCAM data, as it is all from the same run of a basic GCAM scenario. Because data was reported the same way regardless of the collection location, the same assumptions apply to the Post & Kwon and Wei et al. data.
```{r Meta-Analysis-full, warning=FALSE}
#Meta-analysis for the aggregated data set
#First, we need to get mean, std dev, and n
Full_Comparison %>%
group_by(Basin_long_name) %>%
summarize(mean_control = mean(Exp_k), sd_control= sd(Exp_k), n_control = n(),
mean_GCAM = mean(GCAM_k), sd_GCAM = sd(GCAM_k), n_GCAM = n()
) -> Full_MA_data
#Now, we can use escalc to get the standardized mean difference
Full_effect_sizes <-
escalc('SMD',
m1i = mean_control, n1i = n_control, sd1i = sd_control,
m2i = mean_GCAM, n2i = n_GCAM, sd2i = sd_GCAM,
data = Full_MA_data
)
#We're going to use a fixed effect model for this analysis, which we set up below
Full_fixed_effect_results <- rma(yi, vi, method = 'FE',
slab = Basin_long_name,
data = Full_effect_sizes)
```
```{r meta-analyses-individual, echo = FALSE, message=FALSE, warning=FALSE}
#Now, for the Post & Kwon rates
#First, we need to get mean, std dev, and n
PostKwon_Comparison %>%
group_by(Basin_long_name) %>%
summarize(mean_control = mean(Exp_Rate), sd_control= sd(Exp_Rate), n_control = n(),
mean_GCAM = mean(GCAM_Rate), sd_GCAM = sd(GCAM_Rate), n_GCAM = n()
) -> PostKwon_MA_data
#Now, we can use escalc to get the standardized mean difference
PostKwon_effect_sizes <-
escalc('SMD',
m1i = mean_control, n1i = n_control, sd1i = sd_control,
m2i = mean_GCAM, n2i = n_GCAM, sd2i = sd_GCAM,
data = PostKwon_MA_data
)
#We're going to use a fixed effect model for this analysis, which we set up below
PostKwon_fixed_effect_results <- rma(yi, vi, method = 'FE',
slab = Basin_long_name,
data = PostKwon_effect_sizes)
#Now, we'll do one for the Post & Kwon k values
#First, we need to get mean, std dev, and n
PostKwon_Comparison %>%
group_by(Basin_long_name) %>%
summarize(mean_control = mean(Exp_k), sd_control= sd(Exp_k), n_control = n(),
mean_GCAM = mean(GCAM_k), sd_GCAM = sd(GCAM_k), n_GCAM = n()
) -> PostKwon_MA_k_data
#Now, we can use escalc to get the standardized mean difference
PostKwon_k_effect_sizes <-
escalc('SMD',
m1i = mean_control, n1i = n_control, sd1i = sd_control,
m2i = mean_GCAM, n2i = n_GCAM, sd2i = sd_GCAM,
data = PostKwon_MA_k_data
)
#We're going to use a fixed effect model for this analysis, which we set up below
PostKwon_k_fixed_effect_results <- rma(yi, vi, method = 'FE',
slab = Basin_long_name,
data = PostKwon_k_effect_sizes)
#Now we do the same for Wei et al.
#First, we need to get mean, std dev, and n
Wei_Comparison %>%
group_by(Basin_long_name) %>%
summarize(mean_control = mean(Exp_k), sd_control= sd(Exp_k), n_control = n(),
mean_GCAM = mean(GCAM_k), sd_GCAM = sd(GCAM_k), n_GCAM = n()
) -> Wei_MA_data
#Now, we can use escalc to get the standardized mean difference
Wei_effect_sizes <-
escalc('SMD',
m1i = mean_control, n1i = n_control, sd1i = sd_control,
m2i = mean_GCAM, n2i = n_GCAM, sd2i = sd_GCAM,
data = Wei_MA_data
)
#We're going to use a fixed effect model for this analysis, which we set up below
Wei_fixed_effect_results <- rma(yi, vi, method = 'FE',
slab = Basin_long_name,
data = Wei_effect_sizes)
#Now for the whole data set by transition type
#First, we need to create a column in the merged data for transition type
Full_Comparison %>%
mutate(change = paste(Initial_Land_Use, Final_Land_Use, sep = '')) -> Full_Comparison_change
#Now, we need to get mean, std dev, and n
Full_Comparison_change %>%
group_by(change) %>%
summarize(mean_control = mean(Exp_k), sd_control= sd(Exp_k), n_control = n(),
mean_GCAM = mean(GCAM_k), sd_GCAM = sd(GCAM_k), n_GCAM = n()
) -> Full_MA_data_change
#Now, we can use escalc to get the standardized mean difference
Full_effect_sizes_change <-
escalc('SMD',
m1i = mean_control, n1i = n_control, sd1i = sd_control,
m2i = mean_GCAM, n2i = n_GCAM, sd2i = sd_GCAM,
data = Full_MA_data_change
)
#We're going to use a fixed effect model for this analysis, which we set up below
Full_fixed_effect_results_change <- rma(yi, vi, method = 'FE',
slab = change,
data = Full_effect_sizes_change)
```
# Results
Overall, GCAMs rate and k value outputs matched experimental values.
```{r Historgram_Rate-Post-Kwon, echo=FALSE, fig.cap="Histograms showing overlapping GCAM and experimental data for raw soilC rates during land use transition. Here, the experimental data is taken from Post & Kwon.", message=FALSE}
#Plot overlapping rate histograms for the different rate sources
ggplot() +
geom_histogram(aes(x = PostKwon_Comparison$Exp_Rate, fill ='Post & Kwon' ), alpha = 0.5) +
geom_histogram(aes(x = PostKwon_Comparison$GCAM_Rate, fill = 'GCAM Rate'), alpha = 0.5) +
xlab(expression(Rate~(kg~C/m^2))) + ylab('Count') +
scale_fill_manual(name = "Data Source", values = c('Post & Kwon' = '#3584B0', 'GCAM Rate'='#e3962b')) +
theme_light()
```
```{r Histogram_K-Post-Kwon, echo=FALSE, fig.cap="Histograms showing overlapping GCAM and experimental data for computed k values during land use change. Here, the experimental data is taken from Post & Kwon.", message=FALSE}
#Plot overlapping k histograms for the different k sources
ggplot() +
geom_histogram(aes(x = PostKwon_Comparison$Exp_k,fill ='Post & Kwon'), alpha = 0.5) +
geom_histogram(aes(x = PostKwon_Comparison$GCAM_k, fill = 'GCAM'), alpha = 0.5) +
xlab(expression(k~(y^-1))) + ylab('Count') +
scale_fill_manual(name = "Data Source", values = c('Post & Kwon' = '#3584B0', 'GCAM'='#e3962b')) +
theme_light()
```
```{r Histogram_Wei, echo=FALSE, fig.cap="Histograms showing overlapping GCAM and experimental data for computed k values during land use change. Here, the experimental data is taken from Wei et al", message=FALSE}
#Plot overlapping k histograms for the different k sources
ggplot() +
geom_histogram(aes(x = Wei_Comparison$Exp_k,fill ='Post & Kwon'), alpha = 0.5) +
geom_histogram(aes(x = Wei_Comparison$GCAM_k, fill = 'GCAM'), alpha = 0.5) +
xlab(expression(k~(y^-1))) + ylab('Count') +
scale_fill_manual(name = "Data Source", values = c('Post & Kwon' = '#3584B0', 'GCAM'='#e3962b')) +
theme_light()
```
```{r k-Histogram-full, echo=FALSE, fig.cap="Histograms showing overlapping GCAM and experimental data for computed k values during land use change. Here, the experimental data is taken from both Post & Kwon and Wei et al.", message=FALSE}
#Plot overlapping k histograms for the different k sources
ggplot() +
geom_histogram(aes(x = Full_Comparison$Exp_k, fill = Full_Comparison$source), alpha = 0.5, bins = 75) +
geom_histogram(aes(x = Full_Comparison$GCAM_k, fill = 'GCAM'), alpha = 0.5, bins = 75) +
xlab(expression(k~(y^-1))) + ylab('Count') +
scale_fill_manual(name = "Data Source",
values = c('GCAM'='#e3962b', 'Wei et al' = '#45912c', 'Post & Kwon' = '#3584B0')) +
theme_light()
```
Several t-tests were also performed. One was performed on each of the different k value sources (Post & Kwon, Wei et al., and the aggregated data), and another was performed on the Post & Kwon rate data. As can be seen below, the difference between means of the rate data sets was not statistically significant, but it was for the k values.
```{r t-test-Post-Kwon}
t.test(PostKwon_Comparison$Exp_Rate, PostKwon_Comparison$GCAM_Rate, alternative = 'two.sided')
t.test(PostKwon_Comparison$Exp_k, PostKwon_Comparison$GCAM_k, alternative = 'two.sided')
```
```{r t-test-Wei}
t.test(Wei_Comparison$Exp_k, Wei_Comparison$GCAM_k, alternative = 'two.sided')
```
```{r t-test-full}
t.test(Full_Comparison$Exp_k, Full_Comparison$GCAM_k, alternative = 'two.sided')
```
\newline
Results from the analysis on the Post & Kwon data:
```{r meta-analysis-results-Post-Kwon}
PostKwon_k_fixed_effect_results
```
```{r Forest_Plot-Post-Kwon, echo = FALSE, fig.cap ='Forest plot detailing the meta analysis performed on the Post & Kwon SOC Data. The data was grouped by GCAM land region (represented by GLU code).' , message = FALSE}
#Forest plot for Post & Kwon k vals
forest(
PostKwon_k_effect_sizes$yi, PostKwon_k_effect_sizes$vi,
annotate = FALSE, showweights = TRUE,
header = c('Region', 'Weight SMD [95% CI]'),
slab = PostKwon_k_fixed_effect_results$slab,
xlab = 'Standardized Mean Difference',
#Below sets the size of study labels, shape of bars, and size of x labels
cex = .8, pch = 15, cex.lab = 1
)
#Adding the summary effect size
addpoly(
PostKwon_k_fixed_effect_results,
col = 'orange', cex = 1, annotate = TRUE, mlab = 'Summary'
)
```
Results from the analysis on the Wei et al. data:
```{r meta-analysis-results-Wei}
Wei_fixed_effect_results
```
```{r Forest_Plot-Wei, echo = FALSE, fig.cap ='Forest plot detailing the meta analysis performed on the Wei et al. SOC Data. The data was grouped by GCAM land region (represented by GLU code).' , message = FALSE}
#Forest plot for Wei
forest(
Wei_effect_sizes$yi, Wei_effect_sizes$vi,
annotate = FALSE, showweights = TRUE,
header = c('Region', 'Weight SMD [95% CI]'),
slab = Wei_fixed_effect_results$slab,
xlab = 'Standardized Mean Difference',
#Below sets the size of study labels, shape of bars, and size of x labels
cex = .8, pch = 15, cex.lab = 1
)
#Adding the summary effect size
addpoly(
Wei_fixed_effect_results,
col = 'orange', cex = 1, annotate = TRUE, mlab = 'Summary'
)
```
Results from the analysis on the data aggregated by transition type:
```{r Forest_Plot_transition_type, echo = FALSE, fig.cap ='Forest plot detailing the meta analysis performed on the aggregated SOC Data. The data was grouped by GCAM land region (represented by GLU code).' , message = FALSE}
#Now, a meta analysis by transition type for the entire dataset
#First, we need to create a column in the merged data for transition type
Full_Comparison %>%
mutate(change = paste(Initial_Land_Use, Final_Land_Use, sep = '')) -> Full_Comparison_change
#First, we need to get mean, std dev, and n
Full_Comparison_change %>%
group_by(change) %>%
summarize(mean_control = mean(Exp_k), sd_control= sd(Exp_k), n_control = n(),
mean_GCAM = mean(GCAM_k), sd_GCAM = sd(GCAM_k), n_GCAM = n()
) -> Full_MA_data_change
#Now, we can use escalc to get the standardized mean difference
Full_effect_sizes_change <-
escalc('SMD',
m1i = mean_control, n1i = n_control, sd1i = sd_control,
m2i = mean_GCAM, n2i = n_GCAM, sd2i = sd_GCAM,
data = Full_MA_data_change
)
#We're going to use a fixed effect model for this analysis, which we set up below
Full_fixed_effect_results_change <- rma(yi, vi, method = 'FE',
slab = change,
data = Full_effect_sizes_change)
#Forest plot for full dataset
forest(
Full_effect_sizes_change$yi, Full_effect_sizes_change$vi,
annotate = TRUE, showweights = TRUE,
header = c('Transition Type', 'Weight SMD [95% CI]'),
slab = Full_fixed_effect_results_change$slab,
xlab = 'Standardized Mean Difference',
#Below sets the size of study labels, shape of bars, and size of x labels
cex = .8, pch = 15, cex.lab = 1
)
#Adding the summary effect size
addpoly(
Full_fixed_effect_results_change,
col = 'orange', cex = 1, annotate = TRUE, mlab = 'Summary'
)
```
```{r meta-analysis-results-Full}
Full_fixed_effect_results_change
```
A meta-analysis was also performed only on the regions that were examined by both Post & Kwon and Wei et al.
```{r meta_analysis_duplicate_forest}
#Meta analysis for the shared regions
#First, we need to get mean, std dev, and n
Duplicate_Comparison %>%
group_by(basin_source) %>%
summarize(mean_control = mean(Exp_k), sd_control= sd(Exp_k), n_control = n(),
mean_GCAM = mean(GCAM_k), sd_GCAM = sd(GCAM_k), n_GCAM = n()
) -> Duplicate_MA_data
#Now, we can use escalc to get the standardized mean difference
Duplicate_effect_sizes <-
escalc('SMD',
m1i = mean_control, n1i = n_control, sd1i = sd_control,
m2i = mean_GCAM, n2i = n_GCAM, sd2i = sd_GCAM,
data = Duplicate_MA_data
)
#We're going to use a fixed effect model for this analysis, which we set up below
Duplicate_fixed_effect_results <- rma(yi, vi, method = 'FE',
slab = basin_source,
data = Duplicate_effect_sizes)
#Forest plot for duplicates
forest(
Duplicate_effect_sizes$yi, Duplicate_effect_sizes$vi,
annotate = TRUE,showweights = TRUE,
header = c('Region', 'Weight SMD [95% CI]'),
slab = Duplicate_fixed_effect_results$slab,
xlab = 'Standardized Mean Difference',
#Below sets the size of study labels, shape of bars, and size of x labels
cex = .8, pch = 15, cex.lab = 1
)
#Adding the summary effect size
addpoly(
Duplicate_fixed_effect_results,
col = 'orange', cex = 1, annotate = TRUE, mlab = 'Summary'
)
```
```{r meta_analysis_duplicate_results}
Duplicate_fixed_effect_results
```
# Discussion
# Conclusion