feat: gains isic_classification Revision 5 (#347)
jdhoffa authored Mar 6, 2024
1 parent 0ff80f1 commit be3f02e
Showing 14 changed files with 953 additions and 42 deletions.
3 changes: 2 additions & 1 deletion NEWS.md
@@ -1,6 +1,7 @@
# r2dii.data (development version)

* Begin deprecation of `cnb_classification` and `isic_classification` (#329).
* `isic_classification` updated to revision 5 (#329).
* Begin deprecation of `cnb_classification` (#329).
* Complete deprecation of `ald_demo` in favor of `abcd_demo` (#328).
* Complete deprecation of `green_or_brown` in favor of `increasing_or_decreasing` (#307).

12 changes: 8 additions & 4 deletions R/classification_bridge.R
@@ -80,12 +80,16 @@
#' @inherit nace_classification title
#' @inherit nace_classification description
#'
#' @description
#' `r lifecycle::badge("deprecated")`
#' @section Definitions:
#' `r define('isic_classification')`
#'
#' See `?sector_classifications`
#' @template info_classification-datasets
#'
#' @keywords internal
#' @family datasets for bridging sector classification codes
#' @seealso [data_dictionary].
#'
#' @examples
#' head(isic_classification)
"isic_classification"

#' @inherit nace_classification title
52 changes: 51 additions & 1 deletion data-raw/classification_bridge.R
@@ -7,7 +7,7 @@ nace_classification_raw <- read_bridge(
file.path("data-raw", "nace_classification.csv")
)

nace_classification <- convert_superseded_nace_code(
nace_classification <- prepend_letter_nace_code(
nace_classification_raw,
col_from = "original_code",
col_to = "code"
@@ -34,3 +34,53 @@ psic_classification <- read_bridge(
file.path("data-raw", "psic_classification.csv")
)
usethis::use_data(psic_classification, overwrite = TRUE)

isic_classification_raw <- read_csv_(
file.path("data-raw", "isic_classification_rev_5.csv")
)

isic_classification <- prepend_letter_isic_code(
isic_classification_raw,
col_from = "ISIC Rev 5 Code",
col_to = "code"
)

isic_classification <- dplyr::mutate(
isic_classification,
sector = dplyr::case_when(
grepl("^B05", code) ~ "coal",
grepl("^B06", code) ~ "oil and gas",
grepl("^B091", code) ~ "oil and gas", # borderline
grepl("^B099", code) ~ "coal", # borderline
grepl("^C2394", code) ~ "cement",
grepl("^C2395", code) ~ "cement", # borderline
grepl("^C241", code) ~ "steel",
grepl("^C2431", code) ~ "steel", # borderline
grepl("^C291", code) ~ "automotive", # borderline
grepl("^C292", code) ~ "automotive", # borderline
grepl("^C293", code) ~ "automotive", # borderline
grepl("^D351", code) ~ "power", # some of these are borderline
grepl("^H50", code) ~ "shipping",
grepl("^H51", code) ~ "aviation",
TRUE ~ "not in scope"
),
borderline = dplyr::case_when(
grepl("^B091", code) ~ TRUE,
grepl("^B099", code) ~ TRUE,
grepl("^C2395", code) ~ TRUE,
grepl("^C2431", code) ~ TRUE,
grepl("^C291", code) ~ TRUE,
grepl("^C292", code) ~ TRUE,
grepl("^C293", code) ~ TRUE,
code == "D351" ~ TRUE,
grepl("^D3513", code) ~ TRUE,
TRUE ~ FALSE
)
)

isic_classification <- dplyr::mutate(
isic_classification,
revision = "5"
)

usethis::use_data(isic_classification, overwrite = TRUE)
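
The prefix rules in this hunk can be tried out on a small data frame. The following is a minimal sketch, not part of the commit: the codes in `toy` are hypothetical values chosen for illustration, and only a subset of the rules is reproduced. It shows that `dplyr::case_when()` returns the value of the first matching condition, and that the `borderline` rule for power flags the exact code `D351` and anything starting with `D3513`, but not other `D351x` codes.

```r
library(dplyr)

# Hypothetical codes for illustration only; not rows from the real dataset.
toy <- data.frame(
  code = c("B0510", "B0910", "C2394", "D351", "D3511", "D3513", "A0111"),
  stringsAsFactors = FALSE
)

toy <- mutate(
  toy,
  # Subset of the sector rules from the hunk above; first match wins.
  sector = case_when(
    grepl("^B05", code) ~ "coal",
    grepl("^B091", code) ~ "oil and gas",
    grepl("^C2394", code) ~ "cement",
    grepl("^D351", code) ~ "power",
    TRUE ~ "not in scope"
  ),
  # Subset of the borderline rules: exact match on "D351",
  # prefix match on "D3513", everything else not listed is FALSE.
  borderline = case_when(
    grepl("^B091", code) ~ TRUE,
    code == "D351" ~ TRUE,
    grepl("^D3513", code) ~ TRUE,
    TRUE ~ FALSE
  )
)

print(toy)
```

Under these rules `D3511` is assigned to "power" without the borderline flag, while `D351` and `D3513` are assigned to "power" and flagged.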
10 changes: 6 additions & 4 deletions data-raw/data_dictionary/isic_classification.csv
@@ -1,5 +1,7 @@
dataset,column,typeof,definition
isic_classification,code,character,Original ISIC code
isic_classification,code_level,double,Level of granularity of ISIC code
isic_classification,sector,character,Associated 2dii sector
isic_classification,borderline,logical,Flag indicating if 2dii sector and classification code are a borderline match. The value TRUE indicates that the match is uncertain between the 2dii sector and the classification. The value FALSE indicates that the match is certainly perfect or the classification is certainly out of 2dii's scope.
isic_classification,ISIC Rev 5 Code,character,Original ISIC Rev 5 code
isic_classification,ISIC Rev 5 Title,character,Original ISIC Rev 5 title
isic_classification,code,character,ISIC Rev 5 code with top-level letter prepended
isic_classification,sector,character,Associated PACTA sector
isic_classification,borderline,logical,Flag indicating if PACTA sector and classification code are a borderline match. The value TRUE indicates that the match is uncertain between the PACTA sector and the classification. The value FALSE indicates that the match is certainly perfect or the classification is certainly out of PACTA's scope.
isic_classification,revision,character,Column identifying to which ISIC revision the code belongs.
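
As a rough illustration of how the columns described in this dictionary fit together, the sketch below builds a small stand-in data frame with the same column names and types and selects confident in-scope matches. It is not taken from the commit: the rows, titles, and the assumption that the original code column carries no section letter are all hypothetical.

```r
# Hypothetical stand-in for isic_classification, using the column names
# and types listed in the data dictionary. Values are made up.
toy <- data.frame(
  `ISIC Rev 5 Code` = c("0510", "0910", "0111"),
  `ISIC Rev 5 Title` = c("Placeholder title 1", "Placeholder title 2", "Placeholder title 3"),
  code = c("B0510", "B0910", "A0111"),
  sector = c("coal", "oil and gas", "not in scope"),
  borderline = c(FALSE, TRUE, FALSE),
  revision = "5",
  check.names = FALSE,       # keep the column names that contain spaces
  stringsAsFactors = FALSE
)

# Rows mapped to a PACTA sector.
in_scope <- toy[toy$sector != "not in scope", ]

# Of those, rows where the match is not flagged as borderline.
confident <- in_scope[!in_scope$borderline, ]

print(confident[, c("code", "sector", "revision")])
```

Column names containing spaces need backticks (or `[["ISIC Rev 5 Code"]]`) when accessed, which is why `check.names = FALSE` is set here.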