Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New variable and template code removed from L1a #34

Open
bpbond opened this issue Aug 24, 2023 · 3 comments
Open

New variable and template code removed from L1a #34

bpbond opened this issue Aug 24, 2023 · 3 comments

Comments

@bpbond
Copy link
Member

bpbond commented Aug 24, 2023

New variables code that was in L1a

    # Compute new (derived) variables
    message("\tComputing derived variables")
    for(i in seq_len(nrow(nvt))) {
        new_rn <- nvt$new_research_name[i]
        rn_needed <- strsplit(nvt$needs_research_names[i], ",", fixed = TRUE)[[1]]
        # Isolate needed data and site/plot/tree/timestamp
        dat_mrg %>% 
            filter(research_name %in% rn_needed) %>% 
            select(Site, Plot, Tree, TIMESTAMP, research_name, value) %>% 
            pivot_wider(names_from = "research_name", values_from = "value") ->
            dat_subset
        
        message("\t\t", new_rn, " <- ", nvt$needs_research_names[i],
                " (using ", nrow(dat_subset), " rows)")
        
        # Call the computation function based on the name of the new variable
        # New variable functions MUST add "value" and "units" columns; optionally "OOB"
        dat_subset <- do.call(paste0("NEWVAR_", new_rn), list(dat_subset))
        # Clean up: add research_name...
        dat_subset$research_name <- new_rn
        # remove old research name columns (from the pivot_wider above)
        dat_subset[rn_needed] <- NULL
        # ...and add information entries in the design_link and Logger columns 
        if("design_link" %in% names(dat_mrg)) dat_subset$design_link <- "<derived>"
        if("Logger" %in% names(dat_mrg)) dat_subset$Logger <- "<derived>"
        # This is inefficient but assuming we won't have many
        # Also, new variables might be needed by further ones?!?
        dat_mrg <- bind_rows(dat_mrg, dat_subset)
    }
@bpbond
Copy link
Member Author

bpbond commented Sep 2, 2023

NEWVARS_TABLE: "newvars_table.csv"

New variables table "`r NEWVARS_TABLE`" has `r nrow(nvt)` rows.
```{r newvar_code}
#| include: false

# This chunk holds dedicated functions to compute new variables
# These must be named "NEWVAR_[new_research_name]", where "new_research_name"
# is given in the `newvars_table` CSV file
# Functions are given a data frame that is guaranteed to have needed variables
# (research names) as defined in the `newvars_table` CSV file
# Functions must return the data frame they're given, with "value" and "units" 
# columns added (optionally, also a "OOB" column)

# Compute sapflow_avg, the average of shallow and deep sapflow values
NEWVAR_sapflow_avg <- function(x) {
    x$value <- (x$sapflow + x$sapflow_deep) / 2.0
    x$units <- "??"
    return(x)
}

@bpbond
Copy link
Member Author

bpbond commented Sep 2, 2023

Template-reading code that was in L1a

OUTPUT_TEMPLATES: "L1a_output_templates/"

    # Go through the output templates to generate output tables
    smry_all <- list()
    for(template in names(templates)) {
        pt_ex <- templates[[template]] # the expanded 'plot table'
        
        # Join plot table with data
        message("\t------------------------------------------------")
        message("\tJoining with output template (plot table) ", template)
        dat_mrg <- left_join(pt_ex, dat, by = c("Site", "design_link"), 
                      relationship = "many-to-many")
        # Remove unneeded columns unless needed for debugging
        if(!params$debug) {
            message("\tDropping logger/design info columns")
            dat_mrg <- select(dat_mrg, -Table, -loggernet_variable, -design_link, -Logger)
        }
        
        smry_all[[template]] <- data.frame(File = basename(dir_name),
                                           Template = template,
                                           Rows = nrow(dat_mrg),
                                           Note = "")
        
        missing <- anti_join(pt_ex, dat, by = c("Site", "design_link"))
        if(nrow(missing)) {
            dat_issues <<- dat_issues + 1
            message("ERROR: some design links in plot table not found: ", 
                    paste(missing$design_link, collapse = ", "))
            smry_all[[template]]$Note <- "Missing design_link(s)"
            next
        }

        # Put columns in order: first the ones in the template (if available),
        # then TIMESTAMP, research_name, and value; and then any other columns
        # in alphabetized order
        left_cols <- c(base::intersect(colnames(pt_ex), colnames(dat_mrg)), 
                   "research_name", "TIMESTAMP", "value")
        # Note that this order is important, as L1b depends on the fact that
        # everything to the left of TIMESTAMP is a grouping variable
        dat_mrg <- dat_mrg[c(left_cols, sort(setdiff(names(dat_mrg), left_cols)))]

        write_to_folders(dat_mrg, root_dir = out_dir, data_level = "L1a",
                         site = dat$Site[1], logger = dat$Logger[1], table = template)   
    }

@bpbond bpbond changed the title New variable code temporarily removed in multi-table-output New variable and template code removed from L1a Sep 2, 2023
@bpbond
Copy link
Member Author

bpbond commented Sep 2, 2023

Output table templates are in r params$OUTPUT_TEMPLATES; there are r length(templates) of them.

# The template tables in $OUTPUT_TEMPLATES are 'plot tables' in RR's terminology
# Read them into a list
ot <- list.files(file.path(params$DATA_ROOT, params$OUTPUT_TEMPLATES),
                 pattern = "*.csv$", full.names = TRUE)
templates <- lapply(ot, read_csv, show_col_types = FALSE)
names(templates) <- gsub("\\.csv", "", basename(ot))
# For compactness, plot tables may have expansions
templates <- lapply(templates, compasstools::expand_df)

# Read new variables table
NEWVARS_TABLE <- file.path(params$DATA_ROOT, params$NEWVARS_TABLE)
nvt <- read_csv(NEWVARS_TABLE, col_types = "cc")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant