Data dictionary (#89)

RMI-PACTA · Sep 4, 2024 · 5d291f2
1 parent 6d5ce38
commit 5d291f2
Showing 7 changed files with 677 additions and 1 deletion.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -59,11 +59,13 @@ Suggests:
     rmarkdown,
     usethis,
     testthat (>= 3.1.9),
-    writexl
+    writexl,
+    DT
 Config/testthat/edition: 3
 Roxygen: list(markdown = TRUE)
 Config/Needs/website: rmi-pacta/pacta.pkgdown.rmitemplate
 VignetteBuilder: knitr
 URL:
     https://rmi-pacta.github.io/workflow.multi.loanbook/,
     https://github.com/RMI-PACTA/workflow.multi.loanbook/
+LazyData: true
diff --git a/R/data.R b/R/data.R
@@ -0,0 +1,16 @@
+#' Data dictionary
+#'
+#' An overview of the output data sets generated by the package, their data types,
+#' and the definitions of the variables.
+#'
+#' @format ## `data_dictionary`
+#' \describe{
+#'   \item{dataset}{Name of the dataset}
+#'   \item{column}{Name of the column}
+#'   \item{typeof}{Data type of the column}
+#'   \item{definition}{Description of what the column stands for}
+#'   \item{value}{Which values are allowed for the column}
+#'   ...
+#' }
+#' @source internal
+"data_dictionary"
diff --git a/R/sysdata.rda b/R/sysdata.rda
diff --git a/data-raw/data_dictionary.R b/data-raw/data_dictionary.R
diff --git a/data/data_dictionary.rda b/data/data_dictionary.rda
diff --git a/man/data_dictionary.Rd b/man/data_dictionary.Rd
diff --git a/vignettes/data_dictionary.Rmd b/vignettes/data_dictionary.Rmd
@@ -0,0 +1,267 @@
+---
+title: data dictionary
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{data_dictionary}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
+
+```{r, include = FALSE}
+knitr::opts_chunk$set(
+  collapse = TRUE,
+  comment = "#>"
+)
+```
+
+```{r setup}
+library(workflow.multi.loanbook)
+```
+# Intro
+
+In many cases, users of this package will want to use the outputs of the analyses for further processing, such as additional analyses or visualizations that follow the design guide of their own organisation. To facilitate such use cases and to simplify interpretation of the outputs generated with this package, this data dictionary documents each type of output table in detail, focusing on data types and definitions.
+
+This article is structured based on the output tables generated by `workflow.multi.loanbook` and follows the standard flow of the user experience as much as possible, so it can be read in the same sequence as the analysis is run.
+
+# Tables
+
+The main steps that generate output tables are:
+
+* Diagnostics and coverage
+* Standard PACTA analysis
+* Aggregated PACTA metrics
+
+## Diagnostics
+
+The diagnostics section is split into determining the match success rate of the loan books analysed and inspecting the real economy activity related to the financing made by the banks through the matched loan books. The former is influenced by the quality of the input loan book data and the completeness of the reference production data against which the loan books are matched. The latter, while it depends on a solid match success rate, is mainly driven by the financing decisions and the portfolio allocation made by the banks.
+
+### Match success rate
+
+```{r dd_lbk_match_success_rate}
+dd_lbk_match_success_rate <- dplyr::filter(data_dictionary, .data[["dataset"]] == "lbk_match_success_rate")
+DT::datatable(dd_lbk_match_success_rate)
+
+```
+
+### Loan book coverage
+
+```{r dd_summary_statistics_loanbook_coverage}
+dd_summary_statistics_loanbook_coverage <- dplyr::filter(data_dictionary, .data[["dataset"]] == "summary_statistics_loanbook_coverage")
+DT::datatable(dd_summary_statistics_loanbook_coverage)
+
+```
+
+## Standard PACTA analysis
+
+The standard PACTA analysis is run across all input banking books and produces the same output metrics as the `r2dii.*` packages. Results are given at portfolio level, grouped by banking book. Beyond the standard output format, tables are provided for each of the standard sectors and technologies that can be used as input for visualizations.
+
+### Target Market Share results (all groups)
+
+Target market share results at the portfolio level for each included banking book.
+
+```{r dd_tms_results_all_groups}
+dd_tms_results_all_groups <- dplyr::filter(data_dictionary, .data[["dataset"]] == "tms_results_all_groups")
+DT::datatable(dd_tms_results_all_groups)
+
+```
+
+
+### Sectoral Decarbonization Approach results (all groups)
+
+SDA results at the portfolio level for each included banking book.
+
+```{r dd_sda_results_all_groups}
+dd_sda_results_all_groups <- dplyr::filter(data_dictionary, .data[["dataset"]] == "sda_results_all_groups")
+DT::datatable(dd_sda_results_all_groups)
+
+```
+
+
+### Data tech mix
+
+Results for a given portfolio and sector, tailored to be used in the tech mix chart.
+
+```{r dd_data_tech_mix}
+dd_data_tech_mix <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_tech_mix")
+DT::datatable(dd_data_tech_mix)
+
+```
+
+
+### Data trajectory
+
+Results for a given portfolio, sector and technology, tailored to be used in the volume trajectory chart.
+
+```{r dd_data_trajectory}
+dd_data_trajectory <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_trajectory")
+DT::datatable(dd_data_trajectory)
+
+```
+
+
+### Data emission intensity
+
+Results for a given portfolio and sector, tailored to be used in the emission intensity chart.
+
+```{r dd_data_emission_intensity}
+dd_data_emission_intensity <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_emission_intensity")
+DT::datatable(dd_data_emission_intensity)
+
+```
+
+
+### Companies included
+
+Lists all companies, including exposures, that were analysed for the given loan book and are therefore included in the data to be visualized.
+
+```{r dd_companies_included}
+dd_companies_included <- dplyr::filter(data_dictionary, .data[["dataset"]] == "companies_included")
+DT::datatable(dd_companies_included)
+
+```
+
+
+## Aggregated PACTA metrics
+
+The aggregated PACTA metrics are also run across all input banking books. The calculations produce the net aggregate alignment metric, which is defined in `pacta.multi.loanbook.analysis` and allows producing the corresponding plots using `pacta.multi.loanbook.plot`. Results are grouped at the level defined by the `by_group` parameter.
+
+### Company technology deviation
+
+For each company in the analyzed banking books, shows the deviation of the technology build-out in the final year of the analysis from the corresponding allocated scenario value. This is an intermediate result that is further processed in the calculation of the net aggregate alignment metric. Only available for sectors that have technology level calculations using the target market share, namely `automotive, coal, oil and gas, power`.
+
+```{r dd_company_technology_deviation_tms}
+dd_company_technology_deviation_tms <- dplyr::filter(data_dictionary, .data[["dataset"]] == "company_technology_deviation_tms")
+DT::datatable(dd_company_technology_deviation_tms)
+
+```
+
+### Company net alignment metric for TMS sectors
+
+For each company in the analyzed banking books, shows the net aggregate alignment metric for sectors that have technology level calculations using the target market share, namely `automotive, coal, oil and gas, power`. See the [`pacta.multi.loanbook.analysis` website](https://rmi-pacta.github.io/pacta.multi.loanbook.analysis/articles/company_alignment_metric.html) for methodological documentation.
+
+```{r dd_company_alignment_net_tms}
+dd_company_alignment_net_tms <- dplyr::filter(data_dictionary, .data[["dataset"]] == "company_alignment_net_tms")
+DT::datatable(dd_company_alignment_net_tms)
+
+```
+
+### Disaggregated company buildout/phaseout alignment metric for TMS sectors
+
+For each company in the analyzed banking books, shows the aggregate alignment metric - disaggregated into its buildout and phaseout components - for sectors that have technology level calculations using the target market share, namely `automotive, coal, oil and gas, power`. See the [`pacta.multi.loanbook.analysis` website](https://rmi-pacta.github.io/pacta.multi.loanbook.analysis/articles/company_alignment_metric.html) for methodological documentation.
+
+```{r dd_company_alignment_bo_po_tms}
+dd_company_alignment_bo_po_tms <- dplyr::filter(data_dictionary, .data[["dataset"]] == "company_alignment_bo_po_tms")
+DT::datatable(dd_company_alignment_bo_po_tms)
+
+```
+
+### Company net alignment metric for SDA sectors
+
+For each company in the analyzed banking books, shows the net aggregate alignment metric for sectors that have sector level calculations using the sectoral decarbonization approach (SDA), namely `aviation, cement, steel`. See the [`pacta.multi.loanbook.analysis` website](https://rmi-pacta.github.io/pacta.multi.loanbook.analysis/articles/company_alignment_metric.html) for methodological documentation.
+
+```{r dd_company_alignment_net_sda}
+dd_company_alignment_net_sda <- dplyr::filter(data_dictionary, .data[["dataset"]] == "company_alignment_net_sda")
+DT::datatable(dd_company_alignment_net_sda)
+
+```
+
+### Company net aggregate alignment metric with financial exposures
+
+For each company in the analyzed banking books, shows the net aggregate alignment metric for all available sectors. This table includes the financial exposure to each of the analyzed parts of the banking books, split as defined in `by_group`.
+
+```{r dd_company_exposure_net_aggregate_alignment}
+dd_company_exposure_net_aggregate_alignment <- dplyr::filter(data_dictionary, .data[["dataset"]] == "company_exposure_net_aggregate_alignment")
+DT::datatable(dd_company_exposure_net_aggregate_alignment)
+
+```
+
+### Disaggregated company buildout/phaseout alignment metric with financial exposures
+
+For each company in the analyzed banking books, shows the net aggregate alignment metric - disaggregated by its buildout and phaseout components - for all sectors that use technology level TMS calculations, namely `automotive, coal, oil and gas, power`. This table includes the financial exposure to each of the analyzed parts of the banking books, split as defined in `by_group`. Note that only the alignment metric is disaggregated; the financial exposure is not.
+
+```{r dd_company_exposure_bo_po_aggregate_alignment}
+dd_company_exposure_bo_po_aggregate_alignment <- dplyr::filter(data_dictionary, .data[["dataset"]] == "company_exposure_bo_po_aggregate_alignment")
+DT::datatable(dd_company_exposure_bo_po_aggregate_alignment)
+
+```
+
+### Loan book net aggregate alignment metric with financial exposures
+
+For each loan book level group (split as defined in `by_group`), shows the net aggregate alignment metric for all available sectors. This table includes the financial exposure to each of the analyzed parts of the banking books. Company level results are aggregated to the loan book level, using their relative financial exposure as weights.
+
+```{r dd_loanbook_exposure_net_aggregate_alignment}
+dd_loanbook_exposure_net_aggregate_alignment <- dplyr::filter(data_dictionary, .data[["dataset"]] == "loanbook_exposure_net_aggregate_alignment")
+DT::datatable(dd_loanbook_exposure_net_aggregate_alignment)
+
+```
+
+### Disaggregated loan book buildout/phaseout alignment metric with financial exposures
+
+For each loan book level group (split as defined in `by_group`), shows the net aggregate alignment metric - disaggregated by its buildout and phaseout components - for all sectors using technology level TMS calculations, namely `automotive, coal, oil and gas, power`. Company level results are aggregated to the loan book level, using their relative financial exposure as weights.
+
+```{r dd_loanbook_exposure_bo_po_aggregate_alignment}
+dd_loanbook_exposure_bo_po_aggregate_alignment <- dplyr::filter(data_dictionary, .data[["dataset"]] == "loanbook_exposure_bo_po_aggregate_alignment")
+DT::datatable(dd_loanbook_exposure_bo_po_aggregate_alignment)
+
+```
+
+### Input data for Sankey plot
+
+Data set meant to be used as input into `pacta.multi.loanbook.plot::plot_sankey()`.
+
+```{r dd_data_sankey}
+dd_data_sankey <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_sankey")
+DT::datatable(dd_data_sankey)
+
+``` 
+
+### Input data for alignment-by-exposure scatter plot
+
+Data set meant to be used as input into `pacta.multi.loanbook.plot::plot_scatter_alignment_exposure()`.
+
+```{r dd_data_scatter_alignment_exposure}
+dd_data_scatter_alignment_exposure <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_scatter_alignment_exposure")
+DT::datatable(dd_data_scatter_alignment_exposure)
+
+``` 
+
+### Input data for buildout/phaseout scatter plot
+
+Data set meant to be used as input into `pacta.multi.loanbook.plot::plot_scatter()`.
+
+```{r dd_data_scatter_sector}
+dd_data_scatter_sector <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_scatter_sector")
+DT::datatable(dd_data_scatter_sector)
+
+``` 
+
+### Input data for animated buildout/phaseout scatter plot
+
+Data set meant to be used as input into `pacta.multi.loanbook.plot::plot_scatter_animated()`.
+
+```{r dd_data_scatter_sector_animated}
+dd_data_scatter_sector_animated <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_scatter_sector_animated")
+DT::datatable(dd_data_scatter_sector_animated)
+
+``` 
+
+### Input data for net timeline plot
+
+Data set meant to be used as input into `pacta.multi.loanbook.plot::plot_timeline()`.
+
+```{r dd_data_timeline_net}
+dd_data_timeline_net <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_timeline_net")
+DT::datatable(dd_data_timeline_net)
+
+``` 
+
+### Input data for buildout/phaseout timeline plot
+
+Data set meant to be used as input into `pacta.multi.loanbook.plot::plot_timeline()`.
+
+```{r dd_data_timeline_bo_po}
+dd_data_timeline_bo_po <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_timeline_bo_po")
+DT::datatable(dd_data_timeline_bo_po)
+
+``` 
+
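Every chunk in the vignette above repeats the same step: subset `data_dictionary` to one dataset name, then render the rows. As a minimal sketch, that lookup could be wrapped in a small helper that fails loudly when a dataset name matches no rows, which would surface a mistyped or copy-pasted name at build time. Both `subset_dictionary()` and `toy_dictionary` are hypothetical and not part of the package; the toy rows are made up, and only the five column names are taken from `R/data.R`. Base R is used so the sketch runs without `dplyr` or `DT`.

```r
# Toy stand-in for the packaged `data_dictionary`. The rows are invented for
# illustration; only the column names follow the documentation in R/data.R.
toy_dictionary <- data.frame(
  dataset = c("data_tech_mix", "data_tech_mix", "data_trajectory"),
  column = c("sector", "technology_share", "year"),
  typeof = c("character", "double", "integer"),
  definition = c(
    "hypothetical definition of sector",
    "hypothetical definition of technology_share",
    "hypothetical definition of year"
  ),
  value = c(NA_character_, NA_character_, NA_character_),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of workflow.multi.loanbook): return the
# dictionary rows for one dataset and stop if the name matches nothing.
subset_dictionary <- function(dictionary, dataset_name) {
  # `%in%` never returns NA, so missing dataset names cannot leak NA rows
  keep <- dictionary[["dataset"]] %in% dataset_name
  if (!any(keep)) {
    stop("No data dictionary entries for dataset: ", dataset_name)
  }
  rows <- dictionary[keep, , drop = FALSE]
  rownames(rows) <- NULL
  rows
}

dd_data_tech_mix <- subset_dictionary(toy_dictionary, "data_tech_mix")
print(dd_data_tech_mix[, c("column", "typeof")])
```

In the vignette itself, the result of such a helper could be passed to `DT::datatable()` exactly as the existing chunks do; whether the extra indirection is worth it for a documentation page is a judgment call.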