Skip to content

Commit

Permalink
Merge branch 'release-0.2.2'
Browse files Browse the repository at this point in the history
  • Loading branch information
MatthewHeun committed May 21, 2018
2 parents b2a197b + 983b58f commit 58acbd2
Show file tree
Hide file tree
Showing 14 changed files with 496 additions and 79 deletions.
3 changes: 2 additions & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Package: matsindf
Type: Package
Title: Work with matrices in data frames
Version: 0.2.1
Version: 0.2.2
Authors@R: c(person("Matthew", "Heun", role = c("aut", "cre"),
email = "[email protected]"))
Author: Matthew Heun [aut, cre]
Expand All @@ -18,6 +18,7 @@ Imports:
dplyr,
ggplot2,
magrittr,
purrr,
rlang,
tibble,
tidyr
Expand Down
2 changes: 2 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,8 @@
export(collapse_to_matrices)
export(expand_to_tidy)
export(mat_to_rowcolval)
export(matsindf_apply)
export(matsindf_apply_types)
export(rowcolval_to_mat)
importFrom(dplyr,case_when)
importFrom(dplyr,filter)
Expand Down
5 changes: 5 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,10 @@
# News for `matsindf`

## matsindf 0.2.2 (2018-05-20)

New `matsindf_apply` function.
All functions now use `matsindf_apply`.

## matsindf 0.2.1 (2018-02-23)

Now uses renamed `matsbyname` package.
Expand Down
42 changes: 23 additions & 19 deletions R/collapse.R
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
#' These column names are specified as strings by the \code{matnames}, \code{rownames}, \code{colnames},
#' \code{rowtypes}, \code{coltypes}, and \code{values} arguments to \link{collapse_to_matrices}, respectively.
#' A \pkg{matsindf}-style matrix has named rows and columns.
#' In addition, \pkg{matsindf}-style-style matrices have "types" for row and column information,
#' In addition, \pkg{matsindf}-style matrices have "types" for row and column information,
#' such as "Commodities", "Industries", "Products", or "Machines".
#' The row and column types for the \pkg{matsindf}-style matrices are stored as attributes on the matrix
#' (\code{rowtype} and \code{coltype}),
Expand All @@ -15,13 +15,15 @@
#' Row and column types are both respected and propagated by the various \code{_byname} functions
#' of the \pkg{matsbyname} package.
#' Use the \code{_byname} functions when you do operations on the \pkg{matsindf}-style matrices.
#' The \pkg{matsindf}-style-style matrices will be stored
#' The \pkg{matsindf}-style matrices will be stored
#' in a column with same name as the incoming \code{values} column.

#' This function is similar to \code{\link[tidyr]{nest}}, which stores data frames into a cell of a data frame.
#' With \code{collapse_to_matrices}, matrices are created.
#' This function is similar to \code{\link{summarise}} in that groups are respected.
#' (In fact, calls to this function may not work properly unless grouping is provided.
#' Errors of the form "Error: Duplicate identifiers for rows ..." are usually fixed by
#' grouping \code{.data} prior to calling this function.)
#' grouping \code{.DF} prior to calling this function.)
#' The usual approach is to \code{\link{group_by}} the \code{matnames} column
#' and any other columns to be preserved in the output.
#' Note that execution is halted if any of
Expand All @@ -33,18 +35,20 @@
#'
#' Groups are not preserved on output.
#'
#' @param .data the "tidy" data frame
#' @param matnames a string identifying the column in \code{.data} containing matrix names for matrices to be created
#' @param rownames a string identifying the column in \code{.data} containing row names for matrices to be created
#' @param colnames a string identifying the column in \code{.data} containing column names for matrices to be created
#' @param rowtypes optional string identifying the column in \code{.data} containing the type of values in rows of the matrices to be created
#' @param coltypes optional string identifying the column in \code{.data} containing the type of values in columns of the matrices to be created
#' @param values a string identifying the column in \code{.data} containing values to be inserted into the matrices to be created.
#' @param .DF the "tidy" data frame
#' @param matnames a string identifying the column in \code{.DF} containing matrix names for matrices to be created
#' @param rownames a string identifying the column in \code{.DF} containing row names for matrices to be created
#' @param colnames a string identifying the column in \code{.DF} containing column names for matrices to be created
#' @param rowtypes optional string identifying the column in \code{.DF} containing the type of values in rows of the matrices to be created
#' @param coltypes optional string identifying the column in \code{.DF} containing the type of values in columns of the matrices to be created
#' @param values a string identifying the column in \code{.DF} containing values to be inserted into the matrices to be created.
#' This will also be the name of the column in the output containing matrices formed from the
#' data in the \code{values} column.
#'
#' @return a data frame with matrices in columns
#'
#' @seealso \code{\link[tidyr]{nest}} and \code{\link[dplyr]{summarise}}.
#'
#' @export
#'
#' @importFrom matsbyname setrowtype
Expand Down Expand Up @@ -93,17 +97,17 @@
#' rownames = "row", colnames = "col",
#' rowtypes = "rowtype", coltypes = "coltype")
#' mats %>% spread(key = matrix, value = vals)
collapse_to_matrices <- function(.data, matnames, values, rownames, colnames,
collapse_to_matrices <- function(.DF, matnames, values, rownames, colnames,
rowtypes = NULL, coltypes = NULL){
# Ensure that none of rownames, colnames, or values is a group variable.
# These can't be in the group variables.
# If they were, we wouldn't be able to summarise them into the matrices.
if (any(c(values, rownames, colnames, rowtypes, coltypes) %in% groups(.data))) {
if (any(c(values, rownames, colnames, rowtypes, coltypes) %in% groups(.DF))) {
cant_group <- c(rownames, colnames, rowtypes, coltypes, values)
violator <- which(cant_group %in% groups(.data))
violator <- which(cant_group %in% groups(.DF))
stop(paste(cant_group[[violator]], " are grouping variables.",
"Cannot group on rownames, colnames,",
"rowtypes, coltypes, or values in argument .data of collapse_to_matrices."))
"rowtypes, coltypes, or values in argument .DF of collapse_to_matrices."))
}
# Ensure that not only one of rowtypes or coltypes is non-NULL.
if (xor(is.null(rowtypes), is.null(coltypes))) {
Expand All @@ -118,17 +122,17 @@ collapse_to_matrices <- function(.data, matnames, values, rownames, colnames,
# If we get here, both rowtypes and coltypes have been changed from default (NULL) or
# both rowtypes and coltypes have not been changed from default (NULL).
# Thus, we need to test only for the one of them being non-NULL.
.data %>%
.DF %>%
{if (!is.null(rowtypes)) {
group_by(.data, !!as.name(rowtypes), !!as.name(coltypes), add = TRUE)
group_by(.DF, !!as.name(rowtypes), !!as.name(coltypes), add = TRUE)
} else {
.data
.DF
} } %>%
dplyr::do(
# Convert .data to matrices
# Convert .DF to matrices
!!values := rowcolval_to_mat(.data, rownames = rownames, colnames = colnames, values = values,
rowtype = rowtypes, coltype = coltypes)
) %>%
select(!!!group_vars(.data), !!values) %>%
select(!!!group_vars(.DF), !!values) %>%
data.frame(check.names = FALSE)
}
178 changes: 178 additions & 0 deletions R/matsindf_apply.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,178 @@
#' Apply a function to a \code{matsindf} data frame (and more)
#'
#' Applies \code{FUN} to \code{.DF} or
#' performs the calculation specified by \code{FUN}
#' on numbers or matrices.
#' \code{FUN} must return a named list.
#'
#' If \code{...} are all named numbers or matrices
#' of the form \code{argname = m},
#' \code{.DF} is ignored, and
#' \code{m}s are passed to \code{FUN} by \code{argname}s.
#' The return value is a named list provided by \code{FUN}.
#'
#' If \code{...} are all lists of numbers or matrices
#' of the form \code{argname = l},
#' \code{.DF} is ignored, and
#' \code{FUN} is \code{Map}ped across the various \code{l}s
#' to obtain a list of named lists returned from \code{FUN}.
#' The return value is a data frame
#' whose rows are the named lists returned from \code{FUN} and
#' whose column names are the names of the list items returned from \code{FUN}.
#' The series of named lists are \code{rbind}-ed to create the output data frame.
#' Columns of \code{.DF} are not present in the return value.
#'
#' If \code{...} are all named character strings
#' of the form \code{argname = string},
#' \code{.DF} is required,
#' \code{argname}s are expected to be names of arguments to \code{FUN}, and
#' \code{string}s are expected to be column names in \code{.DF}.
#' The return value is \code{.DF} with additional columns (at right)
#' whose names are the names of list items returned from \code{FUN}.
#'
#' \code{NULL} arguments in ... are ignored for the purposes of deciding whether
#' all arguments are numbers, matrices, lists of numbers of matrieces, or named character strings.
#' However, all \code{NULL} arguments are passed to \code{FUN},
#' so \code{FUN} should be able to deal with \code{NULL} arguments appropriately.
#'
#' If \code{.DF} is present, \code{...} contains strings, and one of the \code{...} strings is not the name
#' of a column in \code{.DF},
#' \code{FUN} is called WITHOUT the argument whose column is missing.
#' I.e., that argument is treated as missing.
#' If \code{FUN} works despite the missing argument, execution proceeds.
#' If \code{FUN} cannot handle the missing argument, an error will occur in \code{FUN}.
#'
#' @param .DF the \code{matsindf} data frame
#' @param FUN the function to be applied to \code{.DF}
#' @param ... named arguments to be passed by name to \code{FUN}.
#'
#' @return a named list or a data frame. (See details.)
#'
#' @export
#'
#' @examples
#' example_fun <- function(a, b){
#' return(list(c = sum_byname(a, b), d = difference_byname(a, b)))
#' }
#' # Single values for arguments
#' matsindf_apply(FUN = example_fun, a = 2, b = 2)
#' # Matrices for arguments
#' a <- 2 * matrix(c(1,2,3,4), nrow = 2, ncol = 2, byrow = TRUE,
#' dimnames = list(c("r1", "r2"), c("c1", "c2")))
#' b <- 0.5 * a
#' matsindf_apply(FUN = example_fun, a = a, b = b)
#' # Single values in lists
#' matsindf_apply(FUN = example_fun, a = list(2, 2), b = list(1, 2))
#' # Matrices in lists
#' matsindf_apply(FUN = example_fun, a = list(a, a), b = list(b, b))
#' # Single numbers in a data frame
#' DF <- data.frame(a = c(4, 4, 5), b = c(4, 4, 4))
#' matsindf_apply(DF, FUN = example_fun, a = "a", b = "b")
#' # Matrices in data frames (matsindf)
#' DF2 <- data.frame(a = I(list(a, a)), b = I(list(b,b)))
#' matsindf_apply(DF2, FUN = example_fun, a = "a", b = "b")
matsindf_apply <- function(.DF = NULL, FUN, ...){
# dots <- list(...)
# dots_except_NULL <- dots[which(!as.logical(lapply(dots, is.null)))]
# all_dots_num <- all(lapply(dots_except_NULL, FUN = is.numeric) %>% as.logical())
# all_dots_mats <- all(lapply(dots_except_NULL, FUN = is.matrix) %>% as.logical())
# all_dots_list <- all(lapply(dots_except_NULL, FUN = is.list) %>% as.logical())
# all_dots_char <- all(lapply(dots_except_NULL, FUN = is.character) %>% as.logical())
types <- matsindf_apply_types(...)
if (is.null(.DF) & (types$all_dots_num | types$all_dots_mats)) {
# We're not in a data frame.
# Simply call FUN on ...
return(FUN(...))
}
if (is.null(.DF) & types$all_dots_list) {
# All arguments are coming in as lists.
# Map FUN across the lists.
# The result of Map is a list containing all the rows of output.
# But we want columns of output, so transpose.
out_list <- transpose(Map(f = FUN, ...))
numrows <- length(out_list[[1]])
numcols <- length(out_list)
# Create a data frame filled with NA values.
out_df <- data.frame(matrix(NA, nrow = numrows, ncol = numcols)) %>% set_names(names(out_list))
# Fill the data frame with new columns.
for (j in 1:numcols) {
out_df[[j]] <- out_list[[j]]
}
return(out_df)
}
if (types$all_dots_char) {
dots <- list(...)
# If one of the ... strings is NULL, we won't be able to
# extract a column from .DF.
# So, eliminate all NULLs from the ... strings.
dots <- dots[which(!as.logical(lapply(dots, is.null)))]
arg_cols <- lapply(dots, FUN = function(colname){
return(.DF[[colname]])
})
# If one of the ... strings is not a name of a column in .DF,
# it is, practically speaking, a missing argument, and we should treat it as such.
# If an arg is not present in .DF, it will be NULL in arg_cols.
# To treat it as "missing," we remove it from the arg_cols.
arg_cols <- arg_cols[which(!as.logical(lapply(arg_cols, is.null)))]
# Then, we call FUN, possibly with the missing argument.
# If FUN can handle the missing argument, everything will be fine.
# If not, an error will occur in FUN.
return(do.call(matsindf_apply, args = c(list(.DF = NULL, FUN = FUN), arg_cols)) %>%
bind_rows() %>%
bind_cols(.DF, .))
}

# If we get here, we don't know how to deal with our inputs.
# Try our best to give a meaningful error message.
clss <- lapply(list(...), class) %>% paste(collapse = ",")
msg <- paste(
"unknown state in matsindf_apply",
"... must be all same type, all single numbers, all matrices, all lists, or all character.",
"types are:",
clss
)
stop(msg)
}


#' Determine types of ... argument for matsindf_apply
#'
#' This is a convenience function that returns a logical list for the types of \code{...}
#' with components named \code{all_dots_num}, \code{all_dots_mats},
#' \code{all_dots_list}, and \code{all_dots_char}.
#'
#' When all items in \code{...} are single numbers, \code{all_dots_num} is \code{TRUE} and all other list members are \code{FALSE}.
#' When all items in \code{...} are matrices, \code{all_dots_mats} is \code{TRUE} and all other list members are \code{FALSE}.
#' When all items in \code{...} are lists, \code{all_dots_list} is \code{TRUE} and all other list members are \code{FALSE}.
#' When all items in \code{...} are character strings, \code{all_dots_char} is \code{TRUE} and all other list members are \code{FALSE}.
#'
#' @param ... the list of arguments to be checked
#'
#' @return A logical list with components named \code{all_dot_num}, \code{all_dots_mats},
#' \code{all_dots_list}, and \code{all_dots_char}.
#'
#' @export
#'
#' @examples
#' matsindf_apply_types(a = 1, b = 2)
#' matsindf_apply_types(a = matrix(c(1, 2)), b = matrix(c(2, 3)))
#' matsindf_apply_types(a = list(1, 2), b = list(3, 4), c = list(5, 6))
#' matsindf_apply_types(a = "a", b = "b", c = "c")
matsindf_apply_types <- function(...){
dots <- list(...)
dots_except_NULL <- dots[which(!as.logical(lapply(dots, is.null)))]
all_dots_num <- all(lapply(dots_except_NULL, FUN = is.numeric) %>% as.logical())
all_dots_mats <- all(lapply(dots_except_NULL, FUN = is.matrix) %>% as.logical())
all_dots_list <- all(lapply(dots_except_NULL, FUN = is.list) %>% as.logical())
all_dots_char <- all(lapply(dots_except_NULL, FUN = is.character) %>% as.logical())
if (all_dots_mats) {
# Matrices are numerics.
# However, when all items in ... are matrices, we want to operate as matrices, not as numbers.
# So, set all_dots_num to FALSE.
all_dots_num <- FALSE
}
list(all_dots_num = all_dots_num, all_dots_mats = all_dots_mats, all_dots_list = all_dots_list, all_dots_char = all_dots_char)
}



Loading

0 comments on commit 58acbd2

Please sign in to comment.