Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closes #161 Coercion date datatype #214

Merged
merged 29 commits into from
Feb 13, 2024
Merged
Show file tree
Hide file tree
Changes from 19 commits
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
21676a6
Add metadata type
cpiraux Dec 22, 2023
ce0b5e4
Add test for date coercion to character
cpiraux Jan 10, 2024
1f50d9b
Update test-length for datetime variables
cpiraux Jan 12, 2024
9625a77
Remove format from type
cpiraux Jan 12, 2024
1841c4e
Udpate function description to remove reference to DT, DTM and TM
cpiraux Jan 17, 2024
7ec86fb
Added Changes Description
cpiraux Jan 17, 2024
c2c24bf
Updated function description
cpiraux Jan 17, 2024
c7e86e1
Merge branch 'main' into 161-coercion-date-datatype
cpiraux Jan 17, 2024
b96350c
fixed small bug
cpiraux Jan 17, 2024
b24ae25
Removed blank lines
cpiraux Jan 17, 2024
fbf7839
Removed blank line
cpiraux Jan 17, 2024
2877651
Merge branch 'main' into 161-coercion-date-datatype
vedhav Jan 17, 2024
e93b119
chore: update docs with new options
vedhav Jan 17, 2024
04f8fca
Update description in type.R
cpiraux Jan 22, 2024
701fb5e
Update option character_metadata_types
cpiraux Jan 22, 2024
53bffb9
Merge branch '161-coercion-date-datatype' of github.com:atorus-resear…
cpiraux Jan 22, 2024
be591a0
update description in NEWS.md
cpiraux Jan 22, 2024
3c3b291
Merge branch 'main' into 161-coercion-date-datatype
cpiraux Jan 22, 2024
d38bc17
Merge branch 'main' into 161-coercion-date-datatype
cpiraux Jan 30, 2024
2d5fe41
fix: pass the domain name explicitly
vedhav Feb 1, 2024
7c1ccde
Add numeric to R numeric type
cpiraux Feb 5, 2024
0979306
Merge branch 'main' into 161-coercion-date-datatype
cpiraux Feb 5, 2024
388223c
Split too long line
cpiraux Feb 5, 2024
f3b1c8d
Merge branch '161-coercion-date-datatype' of github.com:atorus-resear…
cpiraux Feb 5, 2024
7bf0e4d
Merge branch 'main' into 161-coercion-date-datatype
cpiraux Feb 5, 2024
3ead6ae
Merge branch 'main' into 161-coercion-date-datatype
bms63 Feb 11, 2024
892f530
Merge branch 'main' into 161-coercion-date-datatype
bms63 Feb 11, 2024
66a07ee
Merge branch 'main' into 161-coercion-date-datatype
bms63 Feb 11, 2024
37c520e
Merge branch 'main' into 161-coercion-date-datatype
bms63 Feb 11, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,8 +13,10 @@
* Added a check for character variable lengths up to 200 bytes in `xpt_validate()`(#91, #189).
* File name check is moved to strict_checks condition to allow underscores in the file name. Underscores are allowed in xpt but not per FDA requirements. (#126)
* It is now possible to get and set the xportr options using the helper function `xportr_options()` (#130)
* Added `xportr.character_metadata_types` and `xportr.numeric_metadata_types` to list the metadata types that are character or numeric. Updated `xportr.character_types` and `xportr.numeric_types` to list only the R types that are character and the R types that are numeric. This ensures that all R types, including dates, are now managed by xportr_type. If the R type differs from the metadata type, the variable is coerced (#161)..
* Adds argument assertions to public functions using `{checkmate}` (#175)


## Deprecation and Breaking Changes

* The `domain` argument for xportr functions will no longer be dynamically
Expand Down
8 changes: 6 additions & 2 deletions R/options.R
Original file line number Diff line number Diff line change
Expand Up @@ -41,11 +41,15 @@
#' The default argument for the 'verbose' argument for `xportr_length`.
#' \item{xportr.type_verbose}{defaults to `"label"`}:
#' The default argument for the 'verbose' argument for `xportr_type`.
#' \item{xportr.character_types}{defaults to `c("character", "char", "text", "date", "posixct", "posixt",
#' \item{xportr.character_types}{defaults to `"character"`}:
#' The default character vector used to explicitly coerce R classes to character XPT types.
#' \item{xportr.character_metadata_types}{defaults to `c("character", "char", "text", "date", "posixct", "posixt",
#' "datetime", "time", "partialdate", "partialtime", "partialdatetime",
#' "incompletedatetime", "durationdatetime", "intervaldatetime")`}:
#' The default character vector used to explicitly coerce R classes to character XPT types.
#' \item{xportr.numeric_types}{defaults to `c("integer", "numeric", "num", "float")`}:
#' \item{xportr.numeric_metadata_types}{defaults to `c("integer", "numeric", "num", "float")`}:
#' The default character vector used to explicitly coerce R classes to numeric XPT types.
#' \item{xportr.numeric_types}{defaults to `c("integer", "float", "posixct", "posixt", "time", "date")`}:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the changes in the options docs 2/2

#' The default character vector used to explicitly coerce R classes to numeric XPT types.
#' }
#'
Expand Down
39 changes: 15 additions & 24 deletions R/type.R
Original file line number Diff line number Diff line change
Expand Up @@ -5,19 +5,14 @@
#' 'xportr.character_types' option is used to explicitly collapse the class of a
#' column to character using `as.character`. Similarly, 'xportr.numeric_types'
#' will collapse a column to a numeric type. If no type is passed for a
#' variable and it isn't identified as a timing variable, it is assumed to be
#' numeric and coerced with `as.numeric`.
#' variable, it is assumed to be numeric and coerced with `as.numeric()`.
#'
#' Certain care should be taken when using timing variables. R serializes dates
#' based on a reference date of 01/01/1970 where XPT uses 01/01/1960. This can
#' result in dates being 10 years off when outputting from R to XPT if you're
#' using a date class. For this reason, `xportr` will try to determine what
#' should happen with variables that appear to be used to denote time.
#'
#' For variables that end in `DT`, `DTM`, or, `TM`, if they are not explicitly noted
#' in 'xportr.numeric_types' or 'xportr.character_types', they are coerced to
#' numeric results.
#'
#' @inheritParams xportr_length
#'
#' @section Messaging: `type_log()` is the primary messaging tool for
Expand All @@ -37,26 +32,22 @@
#' "dataset". This is the column subset by the 'domain' argument in the
#' function.
#'
#' 2) Format Name - passed as the 'xportr.format_name' option. Default:
#' "format". Character values to update the '`format.sas`' attribute of the
#' column. This is passed to `haven::write` to note the format.
#'
#' 3) Variable Name - passed as the 'xportr.variable_name' option. Default:
#' 2) Variable Name - passed as the 'xportr.variable_name' option. Default:
#' "variable". This is used to match columns in '.df' argument and the
#' metadata.
#'
#' 4) Variable Type - passed as the 'xportr.type_name'. Default: "type". This
#' 3) Variable Type - passed as the 'xportr.type_name'. Default: "type". This
#' is used to note the XPT variable "type" options are numeric or character.
#'
#' 5) (Option only) Character Types - The list of classes that should be
#' explicitly coerced to a XPT Character type. Default: `c( "character",
#' 4) (Option only) Character Types - The list of classes that should be
#' explicitly coerced to a XPT Character type. Default: c( "character",
#' "char", "text", "date", "posixct", "posixt", "datetime", "time",
#' "partialdate", "partialtime", "partialdatetime", "incompletedatetime",
#' "durationdatetime", "intervaldatetime")`
#'
#' 6) (Option only) Numeric Types - The list of classes that should be
#' explicitly coerced to a XPT numeric type. Default: `c("integer", "numeric",
#' "num", "float")`
#' 5) (Option only) Numeric Types - The list of classes that should be
#' explicitly coerced to a XPT numeric type. Default: c("integer", "numeric",
#' "num", "float")
#'
#' @return Returns the modified table.
#' @export
Expand All @@ -65,8 +56,7 @@
#' metadata <- data.frame(
#' dataset = "test",
#' variable = c("Subj", "Param", "Val", "NotUsed"),
#' type = c("numeric", "character", "numeric", "character"),
#' format = NA
#' type = c("numeric", "character", "numeric", "character")
#' )
#'
#' .df <- data.frame(
Expand Down Expand Up @@ -115,8 +105,9 @@ xportr_type <- function(.df,
variable_name <- getOption("xportr.variable_name")
type_name <- getOption("xportr.type_name")
characterTypes <- c(getOption("xportr.character_types"), "_character")
characterMetadataTypes <- c(getOption("xportr.character_metadata_types"), "_character")
numericMetadataTypes <- c(getOption("xportr.numeric_metadata_types"), "_numeric")
numericTypes <- c(getOption("xportr.numeric_types"), "_numeric")
format_name <- getOption("xportr.format_name")

if (inherits(metadata, "Metacore")) metadata <- metadata$var_spec

Expand All @@ -126,7 +117,7 @@ xportr_type <- function(.df,
}

metacore <- metadata %>%
select(!!sym(variable_name), !!sym(type_name), !!sym(format_name))
select(!!sym(variable_name), !!sym(type_name))

# Common check for multiple variables name
check_multiple_var_specs(metadata, variable_name)
Expand All @@ -144,14 +135,14 @@ xportr_type <- function(.df,
# _character is used here as a mask of character, in case someone doesn't
# want 'character' coerced to character
type.x = if_else(type.x %in% characterTypes, "_character", type.x),
type.x = if_else(type.x %in% numericTypes | (grepl("DT$|DTM$|TM$", variable) & !is.na(format)),
type.x = if_else(type.x %in% numericTypes,
"_numeric",
type.x
),
type.y = if_else(is.na(type.y), type.x, type.y),
type.y = tolower(type.y),
type.y = if_else(type.y %in% characterTypes | (grepl("DTC$", variable) & is.na(format)), "_character", type.y),
type.y = if_else(type.y %in% numericTypes, "_numeric", type.y)
type.y = if_else(type.y %in% characterMetadataTypes, "_character", type.y),
type.y = if_else(type.y %in% numericMetadataTypes, "_numeric", type.y)
)

# It is possible that a variable exists in the table that isn't in the metadata
Expand Down
13 changes: 10 additions & 3 deletions R/xportr-package.R
Original file line number Diff line number Diff line change
Expand Up @@ -64,15 +64,22 @@
#' }
#' \item{
#' xportr.character_types - The default character vector used to explicitly
#' coerce R classes to character XPT types. Default: `c("character", "char",
#' coerce R classes to character XPT types. Default: "character"
#' }
#' \item{
#' xportr.character_metadata_types - The default character vector used to explicitly
#' coerce R classes to character XPT types. Default: c("character", "char",
#' "text", "date", "posixct", "posixt", "datetime", "time", "partialdate",
#' "partialtime", "partialdatetime", "incompletedatetime", "durationdatetime",
#' "intervaldatetime")`
#' }
#' \item{
#' xportr.numeric_metadata_types - The default character vector used to explicitly
#' coerce R classes to numeric XPT types. Default: c("integer", "numeric", "num", "float")
#' }
#' \item{
#' xportr.numeric_types - The default character vector used to explicitly
#' coerce R classes to numeric XPT types. Default: `c("integer", "numeric",
#' "num", "float")`
#' coerce R classes to numeric XPT types. Default: c("integer", "float", "posixct", "posixt", "time", "date")
#' }
#' }
#'
Expand Down
14 changes: 11 additions & 3 deletions R/zzz.R
Original file line number Diff line number Diff line change
Expand Up @@ -16,8 +16,9 @@ xportr_options_list <- list(
xportr.label_verbose = getOption("xportr.label_verbose", "none"),
xportr.length_verbose = getOption("xportr.length_verbose", "none"),
xportr.type_verbose = getOption("xportr.type_verbose", "none"),
xportr.character_types = getOption(
"xportr.character_types",
xportr.character_types = getOption("xportr.character_types", "character"),
xportr.character_metadata_types = getOption(
"xportr.character_metadata_types",
c(
"character", "char", "text", "date", "posixct",
"posixt", "datetime", "time", "partialdate",
Expand All @@ -26,7 +27,14 @@ xportr_options_list <- list(
"intervaldatetime"
)
),
xportr.numeric_types = getOption("xportr.numeric_types", c("integer", "numeric", "num", "float"))
xportr.numeric_metadata_types = getOption(
"xportr.numeric_metadata_types",
c("integer", "numeric", "num", "float")
),
xportr.numeric_types = getOption(
"xportr.numeric_types",
c("integer", "float", "posixct", "posixt", "time", "date")
)
cpiraux marked this conversation as resolved.
Show resolved Hide resolved
)

.onLoad <- function(libname, pkgname) {
Expand Down
15 changes: 13 additions & 2 deletions man/xportr-package.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

10 changes: 7 additions & 3 deletions man/xportr_options.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion man/xportr_options_list.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

14 changes: 2 additions & 12 deletions man/xportr_type.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions tests/testthat/test-length.R
Original file line number Diff line number Diff line change
Expand Up @@ -167,8 +167,8 @@ test_that("xportr_length: Column length of known/unkown character types is 200/8
expect_equal(impute_length(123), 8)
expect_equal(impute_length(123L), 8)
expect_equal(impute_length("string"), 200)
expect_equal(impute_length(Sys.Date()), 200)
expect_equal(impute_length(Sys.time()), 200)
expect_equal(impute_length(Sys.Date()), 8)
expect_equal(impute_length(Sys.time()), 8)

withr::local_options(list(xportr.character_types = c("character", "date")))
expect_equal(impute_length(Sys.time()), 8)
Expand Down
33 changes: 33 additions & 0 deletions tests/testthat/test-type.R
Original file line number Diff line number Diff line change
Expand Up @@ -305,6 +305,39 @@ test_that("xportr_type: Drops factor levels", {
expect_null(attributes(df2$Val))
})

df <- data.frame(
STUDYID = c("PILOT01", "PILOT01", "PILOT01"),
USUBJID = c("01-1130", "01-1133", "01-1133"),
TRTEDT = c("2014-08-16", "2013-04-28", "2013-01-12")
) %>%
mutate(
TRTEDT = as.Date(TRTEDT),
EXSTDTC = TRTEDT
)

metadata <- data.frame(
dataset = c("df", "df", "df", "df"),
variable = c("STUDYID", "USUBJID", "TRTEDT", "EXSTDTC"),
type = c("character", "character", "numeric", "date"),
format = c(NA, NA, "DATE9.", NA)
)

test_that("xportr_metadata: Var date types (--DTC) coerced as expected and raise messages", {
# Remove empty lines in cli theme
local_cli_theme()

(
df2 <- xportr_metadata(df, metadata) %>%
xportr_type()
) %>%
expect_message("Variable type mismatches found.") %>%
expect_message("[0-9+] variables coerced")

expect_equal(purrr::map_chr(df2, class), c(
STUDYID = "character", USUBJID = "character",
TRTEDT = "Date", EXSTDTC = "character"
))
})

test_that("xportr_type: Works as expected with only one domain in metadata", {
adsl <- data.frame(
Expand Down
Loading