Data dictionary #89

Merged
merged 18 commits into from
Sep 4, 2024
4 changes: 3 additions & 1 deletion DESCRIPTION
Expand Up @@ -59,11 +59,13 @@ Suggests:
rmarkdown,
usethis,
testthat (>= 3.1.9),
writexl
writexl,
DT
Config/testthat/edition: 3
Roxygen: list(markdown = TRUE)
Config/Needs/website: rmi-pacta/pacta.pkgdown.rmitemplate
VignetteBuilder: knitr
URL:
https://rmi-pacta.github.io/workflow.multi.loanbook/,
https://github.com/RMI-PACTA/workflow.multi.loanbook/
LazyData: true
16 changes: 16 additions & 0 deletions R/data.R
@@ -0,0 +1,16 @@
#' Data dictionary
#'
#' An overview of the output data sets generated by the package, their data types,
#' and the definitions of the variables.
#'
#' @format ## `data_dictionary`
#' \describe{
#' \item{dataset}{Name of the dataset}
#' \item{column}{Name of the column}
#' \item{typeof}{Data type of the column}
#' \item{definition}{Description of what the column stands for}
#' \item{value}{Which values are allowed for the column}
#' ...
#' }
#' @source internal
"data_dictionary"
Binary file modified R/sysdata.rda
Binary file not shown.
361 changes: 361 additions & 0 deletions data-raw/data_dictionary.R

Large diffs are not rendered by default.

Binary file added data/data_dictionary.rda
Binary file not shown.
30 changes: 30 additions & 0 deletions man/data_dictionary.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

267 changes: 267 additions & 0 deletions vignettes/data_dictionary.Rmd
@@ -0,0 +1,267 @@
---
title: data dictionary
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{data_dictionary}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```

```{r setup}
library(workflow.multi.loanbook)
```
# Intro

Users of this package will often want to process its outputs further, for example in additional analyses or in visualizations that follow their own organisation's design guide. To support such use cases and to simplify interpretation of the outputs, this data dictionary documents each type of output table in detail, focusing on data types and definitions.

This article is structured around the output tables generated by `workflow.multi.loanbook` and follows the order in which the analysis is run as closely as possible, so it can be read in the same sequence.
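
As a minimal sketch of how the dictionary might support downstream processing, the snippet below checks the column types of an output table against the documented `typeof` values. The rows of `dd_example` and the table `output_example` are invented for illustration; only the column names `dataset`, `column`, `typeof`, `definition` and `value` are taken from `data_dictionary`.

```r
# Hypothetical stand-in for `data_dictionary`; the rows are illustrative only.
dd_example <- data.frame(
  dataset = c("example_table", "example_table"),
  column = c("group_id", "exposure"),
  typeof = c("character", "double"),
  definition = c("Identifier of the loan book group", "Financial exposure"),
  value = c("any string", "non-negative number"),
  stringsAsFactors = FALSE
)

# Hypothetical output table to be checked against the dictionary.
output_example <- data.frame(
  group_id = c("bank_a", "bank_b"),
  exposure = c(100, 250),
  stringsAsFactors = FALSE
)

# Select the dictionary entries that document this dataset.
dd_subset <- dd_example[dd_example$dataset == "example_table", ]

# Compare the documented type of each column with its actual type.
actual_types <- vapply(output_example[dd_subset$column], typeof, character(1))
type_check <- data.frame(
  column = dd_subset$column,
  documented = dd_subset$typeof,
  actual = unname(actual_types),
  matches = dd_subset$typeof == unname(actual_types),
  stringsAsFactors = FALSE
)
print(type_check)
```

With the real package, `dd_example` would be replaced by `data_dictionary` and `output_example` by one of the generated output tables.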

# Tables

The main steps that generate output tables are:

* Diagnostics and coverage
* Standard PACTA analysis
* Aggregated PACTA metrics

## Diagnostics

The diagnostics section is split into determining the match success rate of the loan books analysed and inspecting the real economy activity related to the financing made by the banks through the matched loan books. The former is influenced by the quality of the input loan book data and the completeness of the reference production data against which the loan books are matched. The latter, while it depends on a solid match success rate, is mainly driven by the financing decisions and the portfolio allocation made by the banks.

### Match success rate

```{r dd_lbk_match_success_rate}
dd_lbk_match_success_rate <- dplyr::filter(data_dictionary, .data[["dataset"]] == "lbk_match_success_rate")
DT::datatable(dd_lbk_match_success_rate)

```

### Loan book coverage

```{r dd_summary_statistics_loanbook_coverage}
dd_summary_statistics_loanbook_coverage <- dplyr::filter(data_dictionary, .data[["dataset"]] == "summary_statistics_loanbook_coverage")
DT::datatable(dd_summary_statistics_loanbook_coverage)

```
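
Every section of this article repeats the same lookup: filter the dictionary to one dataset, then display the rows. A sketch of how that pattern could be wrapped in a helper is shown below. `lookup_dictionary()` is a hypothetical function, not part of the package, and `dd_example` is an invented stand-in for `data_dictionary`.

```r
# Hypothetical helper, not part of the package: return the dictionary rows
# for one dataset and fail loudly if the dataset name is unknown.
lookup_dictionary <- function(dictionary, dataset_name) {
  stopifnot(is.data.frame(dictionary), "dataset" %in% names(dictionary))
  rows <- dictionary[dictionary$dataset == dataset_name, , drop = FALSE]
  if (nrow(rows) == 0) {
    stop("No dictionary entries found for dataset: ", dataset_name)
  }
  rows
}

# Illustrative stand-in for `data_dictionary`; the rows are made up.
dd_example <- data.frame(
  dataset = c("table_a", "table_a", "table_b"),
  column = c("id", "amount", "id"),
  typeof = c("character", "double", "character"),
  stringsAsFactors = FALSE
)

rows_a <- lookup_dictionary(dd_example, "table_a")
print(rows_a)
```

The returned data frame could then be passed to `DT::datatable()`, as in the chunks of this article. Failing on an unknown name guards against misspelled dataset names, which would otherwise silently produce an empty table.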

## Standard PACTA analysis

The standard PACTA analysis is run across all input banking books and produces the same output metrics as the `r2dii.*` packages. Results are given at the portfolio level, grouped by banking book. Beyond the standard output format, tables are provided that can be used as input for visualizations for each of the standard sectors and technologies.

### Target Market Share results (all groups)

Target market share results at the portfolio level for each included banking book.

```{r dd_tms_results_all_groups}
dd_tms_results_all_groups <- dplyr::filter(data_dictionary, .data[["dataset"]] == "tms_results_all_groups")
DT::datatable(dd_tms_results_all_groups)

```


### Sectoral Decarbonization Approach results (all groups)

SDA results at the portfolio level for each included banking book.

```{r dd_sda_results_all_groups}
dd_sda_results_all_groups <- dplyr::filter(data_dictionary, .data[["dataset"]] == "sda_results_all_groups")
DT::datatable(dd_sda_results_all_groups)

```


### Data tech mix

Results for a given portfolio and sector, tailored to be used in the tech mix chart.

```{r dd_data_tech_mix}
dd_data_tech_mix <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_tech_mix")
DT::datatable(dd_data_tech_mix)

```


### Data trajectory

Results for a given portfolio, sector and technology, tailored to be used in the volume trajectory chart.

```{r dd_data_trajectory}
dd_data_trajectory <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_trajectory")
DT::datatable(dd_data_trajectory)

```


### Data emission intensity

Results for a given portfolio and sector, tailored to be used in the emission intensity chart.

```{r dd_data_emission_intensity}
dd_data_emission_intensity <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_emission_intensity")
DT::datatable(dd_data_emission_intensity)

```


### Companies included

Lists all companies, including their exposures, that were analysed for the given loan book and are therefore included in the data to be visualized.

```{r dd_companies_included}
dd_companies_included <- dplyr::filter(data_dictionary, .data[["dataset"]] == "companies_included")
DT::datatable(dd_companies_included)

```


## Aggregated PACTA metrics

The aggregated PACTA metrics are also run across all input banking books. The calculations produce the net aggregate alignment metric, which is defined in `pacta.multi.loanbook.analysis` and allows producing the corresponding plots using `pacta.multi.loanbook.plot`. Results are grouped at the level defined by the `by_group` parameter.

### Company technology deviation

For each company in the analyzed banking books, shows the deviation of the technology build-out in the final year of the analysis from the corresponding allocated scenario value. This is an intermediate result that is further processed in the calculation of the net aggregate alignment metric. Only available for sectors that have technology-level calculations using the target market share approach, namely `automotive, coal, oil and gas, power`.

```{r dd_company_technology_deviation_tms}
dd_company_technology_deviation_tms <- dplyr::filter(data_dictionary, .data[["dataset"]] == "company_technology_deviation_tms")
DT::datatable(dd_company_technology_deviation_tms)

```

### Company net alignment metric for TMS sectors

For each company in the analyzed banking books, shows the net aggregate alignment metric for sectors that have technology-level calculations using the target market share approach, namely `automotive, coal, oil and gas, power`. See the [`pacta.multi.loanbook.analysis` website](https://rmi-pacta.github.io/pacta.multi.loanbook.analysis/articles/company_alignment_metric.html) for methodological documentation.

```{r dd_company_alignment_net_tms}
dd_company_alignment_net_tms <- dplyr::filter(data_dictionary, .data[["dataset"]] == "company_alignment_net_tms")
DT::datatable(dd_company_alignment_net_tms)

```

### Disaggregated company buildout/phaseout alignment metric for TMS sectors

For each company in the analyzed banking books, shows the aggregate alignment metric, disaggregated into its buildout and phaseout components, for sectors that have technology-level calculations using the target market share approach, namely `automotive, coal, oil and gas, power`. See the [`pacta.multi.loanbook.analysis` website](https://rmi-pacta.github.io/pacta.multi.loanbook.analysis/articles/company_alignment_metric.html) for methodological documentation.

```{r dd_company_alignment_bo_po_tms}
dd_company_alignment_bo_po_tms <- dplyr::filter(data_dictionary, .data[["dataset"]] == "company_alignment_bo_po_tms")
DT::datatable(dd_company_alignment_bo_po_tms)

```

### Company net alignment metric for SDA sectors

For each company in the analyzed banking books, shows the net aggregate alignment metric for sectors that have sector-level calculations using the sectoral decarbonization approach (SDA), namely `aviation, cement, steel`. See the [`pacta.multi.loanbook.analysis` website](https://rmi-pacta.github.io/pacta.multi.loanbook.analysis/articles/company_alignment_metric.html) for methodological documentation.

```{r dd_company_alignment_net_sda}
dd_company_alignment_net_sda <- dplyr::filter(data_dictionary, .data[["dataset"]] == "company_alignment_net_sda")
DT::datatable(dd_company_alignment_net_sda)

```

### Company net aggregate alignment metric with financial exposures

For each company in the analyzed banking books, shows the net aggregate alignment metric for all available sectors. This table includes the financial exposure to each of the analyzed parts of the banking books, split as defined in `by_group`.

```{r dd_company_exposure_net_aggregate_alignment}
dd_company_exposure_net_aggregate_alignment <- dplyr::filter(data_dictionary, .data[["dataset"]] == "company_exposure_net_aggregate_alignment")
DT::datatable(dd_company_exposure_net_aggregate_alignment)

```

### Disaggregated company buildout/phaseout alignment metric with financial exposures

For each company in the analyzed banking books, shows the net aggregate alignment metric, disaggregated by its buildout and phaseout components, for all sectors that use technology-level TMS calculations, namely `automotive, coal, oil and gas, power`. This table includes the financial exposure to each of the analyzed parts of the banking books, split as defined in `by_group`. Note that only the alignment metric is disaggregated; the financial exposure is not.

```{r dd_company_exposure_bo_po_aggregate_alignment}
dd_company_exposure_bo_po_aggregate_alignment <- dplyr::filter(data_dictionary, .data[["dataset"]] == "company_exposure_bo_po_aggregate_alignment")
DT::datatable(dd_company_exposure_bo_po_aggregate_alignment)

```

### Loan book net aggregate alignment metric with financial exposures

For each loan book level group (split as defined in `by_group`), shows the net aggregate alignment metric for all available sectors. This table includes the financial exposure to each of the analyzed parts of the banking books. Company level results are aggregated to the loan book level, using their relative financial exposure as weights.

```{r dd_loanbook_exposure_net_aggregate_alignment}
dd_loanbook_exposure_net_aggregate_alignment <- dplyr::filter(data_dictionary, .data[["dataset"]] == "loanbook_exposure_net_aggregate_alignment")
DT::datatable(dd_loanbook_exposure_net_aggregate_alignment)

```
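
The exposure-weighted aggregation described in this section can be illustrated with a small base R sketch. The company names, alignment values and exposures are invented, and the column names `group_id`, `alignment_metric` and `exposure` are assumptions made for this example. The metric itself is defined in `pacta.multi.loanbook.analysis`, and the actual calculation may differ in detail.

```r
# Hypothetical company-level results for two loan book groups.
company_results <- data.frame(
  group_id = c("bank_a", "bank_a", "bank_b", "bank_b"),
  company = c("company_1", "company_2", "company_1", "company_3"),
  alignment_metric = c(0.10, -0.30, 0.10, 0.20),
  exposure = c(100, 300, 50, 150),
  stringsAsFactors = FALSE
)

# Aggregate per group, weighting each company by its share of the
# group's total financial exposure.
groups <- split(company_results, company_results$group_id)
loanbook_results <- data.frame(
  group_id = names(groups),
  alignment_metric = vapply(
    groups,
    function(g) weighted.mean(g$alignment_metric, w = g$exposure),
    numeric(1)
  ),
  exposure = vapply(groups, function(g) sum(g$exposure), numeric(1)),
  row.names = NULL,
  stringsAsFactors = FALSE
)
print(loanbook_results)
```

Because `weighted.mean()` normalizes the weights internally, absolute exposures can be passed directly and do not need to be converted to shares first.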

### Disaggregated loan book buildout/phaseout alignment metric with financial exposures

For each loan book level group (split as defined in `by_group`), shows the net aggregate alignment metric - disaggregated by its buildout and phaseout components - for all sectors using technology level TMS calculations, namely `automotive, coal, oil and gas, power`. Company level results are aggregated to the loan book level, using their relative financial exposure as weights.

```{r dd_loanbook_exposure_bo_po_aggregate_alignment}
dd_loanbook_exposure_bo_po_aggregate_alignment <- dplyr::filter(data_dictionary, .data[["dataset"]] == "loanbook_exposure_bo_po_aggregate_alignment")
DT::datatable(dd_loanbook_exposure_bo_po_aggregate_alignment)

```

### Input data for Sankey plot

Data set meant to be used as input into `pacta.multi.loanbook.plot::plot_sankey()`.

```{r dd_data_sankey}
dd_data_sankey <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_sankey")
DT::datatable(dd_data_sankey)

```

### Input data for alignment-by-exposure scatter plot

Data set meant to be used as input into `pacta.multi.loanbook.plot::plot_scatter_alignment_exposure()`.

```{r dd_data_scatter_alignment_exposure}
dd_data_scatter_alignment_exposure <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_scatter_alignment_exposure")
DT::datatable(dd_data_scatter_alignment_exposure)

```

### Input data for buildout/phaseout scatter plot

Data set meant to be used as input into `pacta.multi.loanbook.plot::plot_scatter()`.

```{r dd_data_scatter_sector}
dd_data_scatter_sector <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_scatter_sector")
DT::datatable(dd_data_scatter_sector)

```

### Input data for animated buildout/phaseout scatter plot

Data set meant to be used as input into `pacta.multi.loanbook.plot::plot_scatter_animated()`.

```{r dd_data_scatter_sector_animated}
dd_data_scatter_sector_animated <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_scatter_sector_animated")
DT::datatable(dd_data_scatter_sector_animated)

```

### Input data for net timeline plot

Data set meant to be used as input into `pacta.multi.loanbook.plot::plot_timeline()`.

```{r dd_data_timeline_net}
dd_data_timeline_net <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_timeline_net")
DT::datatable(dd_data_timeline_net)

```

### Input data for buildout/phaseout timeline plot

Data set meant to be used as input into `pacta.multi.loanbook.plot::plot_timeline()`.

```{r dd_data_timeline_bo_po}
dd_data_timeline_bo_po <- dplyr::filter(data_dictionary, .data[["dataset"]] == "data_timeline_bo_po")
DT::datatable(dd_data_timeline_bo_po)

```
