Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quantargo exercise - modify existing package #13

Open
wants to merge 9 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
@@ -1 +1,2 @@
exportPattern("^[[:alpha:]]+")
importFrom("stats", "quantile")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not seem that the file has been created with roxygen2. Why?

1 change: 1 addition & 0 deletions R/meanimpute.R
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
#' Meanimputation
#' @param x A vector
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please include a description of the function here.

#' @export
meanimpute <- function(x) {
x[is.na(x)] <- mean(x, na.rm = TRUE)
Expand Down
19 changes: 19 additions & 0 deletions R/transform_log.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
#' transform_log
#' Transform numerical values into their log values
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should include an empty roxygen line between header and description.

#' @param x A vector
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parameter description could be more specific here, e.g. indicating the data type.

#' @return logarithm of x
#' @export
#' @examples
#' transform_log(1)
#' transform_log(c(1, 2, 3, 4, 5))
#'
transform_log <- function(x) {
if (!is.numeric(x)) {
warning("transform_log: Expecting numeric argument")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should stop here instead of indicating a warning. Implicit data type conversions can always lead to strange results.

}
x_badval <- is.na(suppressWarnings(as.numeric(x)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No implicit conversion should be done here, see previous comment.

x[x_badval] <- 1
y <- log(as.numeric(x))
y[x_badval] <- NA
y
}
26 changes: 23 additions & 3 deletions R/windsorize.R
Original file line number Diff line number Diff line change
@@ -1,10 +1,30 @@
#' Windsorize
#'
#' Do some windsorization.
#' set all outliers to a specified percentile of the data;
#' a 90% winsorization would see all data below the 5th percentile set
#' to the 5th percentile,
#' and data above the 95th percentile set to the 95th percentile.
#'
#' @param x A vector.
#' @param p A quantile.
#' @return dataset with trimmed outliers with 10% percentile
#' @examples
#' windsorize(c(3,4,4,3,4,5,1))
#' @export
windsorize <- function(x, p = .90) {
q <- quantile(x, p)
x[x >= q] <- q
if (length(x) == 0) stop("argument should not be a empty vector")
if (all(is.na(x))) {
stop("argument should not be a vector containing only NA")
}
if (!is.numeric(x)) stop("argument should be a numeric vector")
if (!is.numeric(p)) stop("argument should be a number from 0 to 1")
if (p < 0 || p > 1) {
stop("p invalid percentale. Expected values from 0 to 1")
}
q_lower <- quantile(x, (1-p)/2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if vector x contains some NA values here?

q_upper <- quantile(x, 1 - (1-p)/2)
x[x <= q_lower] <- q_lower
x[x >= q_upper] <- q_upper
x
}

3 changes: 3 additions & 0 deletions man/meanimpute.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

24 changes: 24 additions & 0 deletions man/transform_log.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

16 changes: 15 additions & 1 deletion man/windsorize.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 4 additions & 0 deletions tests/testthat.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
library(testthat)
library(datacleaner)

test_check("datacleaner")
11 changes: 11 additions & 0 deletions tests/testthat/test_tranform_log.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
context("tranform_log")
library(datacleaner)
test_that("tranform_log is correct", {
expect_equal(as.character(transform_log(c(2, 2, 2, 2, 3))), as.character(c(0.693147180559945, 0.693147180559945, 0.693147180559945, 0.693147180559945, 1.09861228866811)))
expect_equal(as.character(transform_log(c(2, 3, NA))), as.character(c(0.693147180559945, 1.09861228866811, NA)))
})

test_that("unexpected parameters", {
expect_warning(t <- transform_log(c(2, 3, "NA")), "transform_log: Expecting numeric argument")
expect_equal(as.character(t), as.character(c(0.693147180559945, 1.09861228866811, NA)))
})
17 changes: 17 additions & 0 deletions tests/testthat/test_windorize.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
context("Windsorize")
library(datacleaner)
test_that("windorizing is correct", {
expect_equal(windsorize(c(2,2,2,2,3),.9), c(2,2,2,2,2.8) )
expect_equal(windsorize(c(2,2,2,2,1),.9), c(2,2,2,2,1.2) )
})

test_that("unexpected parameters", {
expect_error(windsorize(c(NA,NA,NA), .9), "argument should not be a vector containing only NA")
expect_error(windsorize(c(), .9), "argument should not be a empty vector")
expect_error(windsorize(c(1,2,3,"4"), .9), "argument should be a numeric vector")
expect_error(windsorize(c(1,2,3,4), ".9"), "argument should be a number from 0 to 1")
expect_error(windsorize(c(1,2,3,4), -1), "p invalid percentale. Expected values from 0 to 1")
expect_error(windsorize(c(1,2,3,4), 1.2), "p invalid percentale. Expected values from 0 to 1")
expect_error(windsorize(c(1,2,3,NA), .9))

})