fix(FIMSFrame): Assume yyyy-mm-dd format for dates
When a data frame is written to a csv file and read back in, the date
formatting can be lost, e.g., 0001-01-01 turns into 1-1-1. FIMS requires
the yyyy-mm-dd format. as.Date() is now used to create a Date object from
the character input, but only if the input is in the correct format. If
it is in the wrong format, e.g., yyyy/mm/dd, then the function errors.
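
A minimal sketch of the behavior described above, assuming made-up date strings rather than FIMS data. With `tryFormats` restricted to `"%Y-%m-%d"`, `as.Date()` parses conforming strings and raises an error for a string such as `2020/01/01`. Note that `as.Date()` picks the format from the first non-missing element only, so this is not a per-element check.

```r
# Only the yyyy-mm-dd format is allowed, as in the commit
date_formats <- c("%Y-%m-%d")

# Hypothetical input values used for illustration only
good <- c("0001-01-01", "0004-12-31")
parsed <- as.Date(good, tryFormats = date_formats)

# A slash-separated date does not match "%Y-%m-%d", so as.Date() errors;
# tryCatch() captures the condition instead of stopping the script
bad_result <- tryCatch(
  as.Date("2020/01/01", tryFormats = date_formats),
  error = function(e) e
)

class(parsed)
inherits(bad_result, "error")
```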

Right now start_year and end_year are stored as integers because I thought
year 1 would be better than year 0001 for plotting, but we can change
this. @ian-taylor-NOAA what do you think?
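
To make the integer versus zero-padded choice concrete, here is a small sketch with an assumed example date. The variable names `year_int` and `year_padded` are illustrative and do not appear in the commit. Going through `as.integer()` avoids relying on how `format()` pads years below 1000, which the R documentation describes as platform-dependent; `sprintf()` can restore the padding when a label such as 0001 is wanted.

```r
# Assumed example date; any yyyy-mm-dd string works the same way
start_date <- as.Date("0001-01-01", tryFormats = "%Y-%m-%d")

# Integer year, as start_year and end_year are stored in the commit
year_int <- as.integer(format(start_date, "%Y"))

# Zero-padded four-character label, if that form is preferred for display
year_padded <- sprintf("%04d", year_int)
```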
Kelli.Johnson committed Jul 2, 2024
1 parent fd98f61 commit fd3bb80
Showing 1 changed file with 32 additions and 10 deletions.
42 changes: 32 additions & 10 deletions R/fimsframe.R
@@ -226,11 +226,24 @@ setValidity(
}

errors <- c(errors, validate_data_colnames(object@data))

# Add checks for other slots
# Check the format for acceptable variants of the ideal yyyy-mm-dd
grepl_datestart <- grepl(
"[0-9]{1,4}-[0-9]{1,2}-[0-9]{1,2}",
data_mile1[["datestart"]]
)
grepl_dateend <- grepl(
"[0-9]{1,4}-[0-9]{1,2}-[0-9]{1,2}",
data_mile1[["dateend"]]
)
if (!all(grepl_datestart)) {
errors <- c(errors, "datestart must be in 'yyyy-mm-dd' format")
}
if (!all(grepl_dateend)) {
errors <- c(errors, "dateend must be in 'yyyy-mm-dd' format")
}

# TODO: Add checks for other slots

# Return
if (length(errors) == 0) {
return(TRUE)
@@ -299,14 +312,23 @@ FIMSFrame <- function(data) {
paste(errors, sep = "\n", collapse = "\n")
)
}
# Get the earliest and latest year of data and use to calculate n years for
# population simulation
start_year <- as.integer(
strsplit(min(data[["datestart"]], na.rm = TRUE), "-")[[1]][1]
)
end_year <- as.integer(
strsplit(max(data[["dateend"]], na.rm = TRUE), "-")[[1]][1]
)
# datestart and dateend need to be date classes so leading zeros are present
# but writing and reading from csv file removes the classes so they must be
# enforced here
# e.g., 0004-01-01 for January 01 0004
date_formats <- c("%Y-%m-%d")
data[["datestart"]] <- as.Date(data[["datestart"]], tryFormats = date_formats)
data[["dateend"]] <- as.Date(data[["dateend"]], tryFormats = date_formats)

# Get the earliest and latest year as integers, e.g., 1 rather than 0001
start_year <- as.integer(format(
as.Date(min(data[["datestart"]], na.rm = TRUE), tryFormats = date_formats),
"%Y"
))
end_year <- as.integer(format(
as.Date(max(data[["dateend"]], na.rm = TRUE), tryFormats = date_formats),
"%Y"
))
n_years <- as.integer(end_year - start_year + 1)
years <- start_year:end_year
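
The format check added to the validity function can be tried on its own with the sketch below. The helper name `is_ymd` and the sample strings are hypothetical. The quantifier for the day is written as `{1,2}`, the standard bounded form, and `^` and `$` anchors are added here as an assumption so that trailing characters are rejected; the pattern in the diff is unanchored.

```r
# Hypothetical helper: TRUE for strings shaped like y-m-d with 1-4 year digits
# and 1-2 digits each for month and day; it does not check calendar validity
is_ymd <- function(x) {
  grepl("^[0-9]{1,4}-[0-9]{1,2}-[0-9]{1,2}$", x)
}

# Made-up inputs covering the cases mentioned in the commit message
dates <- c("0001-01-01", "1-1-1", "2020/01/01", "2020-01-01")
ok <- is_ymd(dates)

# Collect error messages the same way the validity function does
errors <- character()
if (!all(ok)) {
  errors <- c(errors, "datestart must be in 'yyyy-mm-dd' format")
}
```

Because the pattern accepts the shortened form 1-1-1, it only screens out other separators or layouts; the later `as.Date()` call is what turns accepted strings into Date objects.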
