Update of dataclean #43

Open
wants to merge 7 commits into
base: master
1 change: 1 addition & 0 deletions NAMESPACE
@@ -1 +1,2 @@
exportPattern("^[[:alpha:]]+")
importFrom("stats", "quantile")
2 changes: 2 additions & 0 deletions R/meanimpute.R
@@ -1,4 +1,6 @@
#' Meanimputation
#' Calculates mean of a given vector, ignores NA values.
Review comment (Member): This function actually replaces all NA values with the mean of the non-NA values of the given vector.

#' @param x A vector
#' @export
meanimpute <- function(x) {
x[is.na(x)] <- mean(x, na.rm = TRUE)
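
As an illustration of the review comment on the description, the roxygen header could be worded to match what the function does. This is only a sketch: the rest of the function body is collapsed in the diff, so the trailing `x` and the closing brace are assumptions, as are the `@return` and `@examples` entries.

```r
#' Mean imputation
#'
#' Replaces every NA in a numeric vector with the mean of its non-NA values.
#'
#' @param x A numeric vector.
#' @return `x` with NA values replaced by `mean(x, na.rm = TRUE)`.
#' @examples
#' meanimpute(c(1, NA, 3))
#' @export
meanimpute <- function(x) {
  # Compute the mean over the observed values and write it into the NA slots.
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  # Assumed: the function returns the imputed vector (not visible in the diff).
  x
}
```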
18 changes: 18 additions & 0 deletions R/transform_log.R
@@ -0,0 +1,18 @@
#' transform_log
#' Transform numerical values into their log values
#' @param x A vector
#' @return logarithm of x
#' @examples
#' transform_log(c(NA,0,-1,exp(2)))
#'@export
transform_log<-function(x){
if(!is.numeric(x))stop("function is expecting only numeric values")
x_nan<-is.na(x)
x[x_nan]<-1
Review comment (Member): NA values are first set to 1 here and then become 0 through log(1). Is this behaviour intentional?

ifelse(x<0,"OK", warning("input vector contains negative values, turned into NA"))
Review comment (Member): This statement does not warn the way it is intended to. warning() is passed as the `no` argument of ifelse(), so it is triggered by the elements of x that are NOT negative rather than by the negative ones.
A single check would be sufficient:

if(any(x < 0)) {
  warning("input vector contains negative values, turned into NA")
}
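
One way to compare the two forms is to count the warnings each one raises. The sketch below uses a hypothetical helper, `count_warnings()`, which is not part of the PR; `with_ifelse()` and `with_any()` are likewise illustrative wrappers around the two statements discussed here.

```r
# Hypothetical helper (not part of the PR): counts the warnings raised
# while evaluating an expression, muffling them so they are not printed.
count_warnings <- function(expr) {
  n <- 0L
  withCallingHandlers(
    expr,
    warning = function(w) {
      n <<- n + 1L
      invokeRestart("muffleWarning")
    }
  )
  n
}

msg <- "input vector contains negative values, turned into NA"

# Form used in the diff: warning() is the `no` argument of ifelse().
with_ifelse <- function(x) ifelse(x < 0, "OK", warning(msg))
# Suggested form: one explicit check over the whole vector.
with_any <- function(x) if (any(x < 0)) warning(msg)

all_negative <- c(-3, -2, -1)
all_positive <- c(1, 2, 3)

count_warnings(with_ifelse(all_negative))  # 0
count_warnings(with_ifelse(all_positive))  # 1
count_warnings(with_any(all_negative))     # 1
count_warnings(with_any(all_positive))     # 0
```

In base R, ifelse() only evaluates its `no` argument when at least one element fails the test, which is why the first form stays silent for an all-negative vector and warns for an all-positive one.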

y<-log(x[x>=0])
x[x>=0]<-y
x[x<0]<-NA
x[x_nan]<-NA
Review comment (Member): And here those zeros are converted back to NA (see the previous comment).
This part could probably be simplified.

x
}
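
A possible simplification along the lines of the three comments above is sketched here under a hypothetical name, `transform_log_sketch()`, so it is not mistaken for the PR's code. It relies on `log(NA)` already being NA, so no placeholder value is needed. Note also that in the code as posted, `x[x<0]<-NA` runs after the logs have been written back into `x`, so every input below 1 (whose log is negative, or `-Inf` for zero) would end up as NA as well. Returning `-Inf` for zero, as this sketch does, is an assumption about the intended behaviour.

```r
# Hypothetical simplified variant of transform_log() (a sketch, not the PR's code).
transform_log_sketch <- function(x) {
  if (!is.numeric(x)) stop("function is expecting only numeric values")

  # Flag negative inputs before taking logs; NA inputs are left untouched.
  negative <- !is.na(x) & x < 0
  if (any(negative)) {
    warning("input vector contains negative values, turned into NA")
  }
  x[negative] <- NA

  # log() propagates NA on its own; log(0) is -Inf (assumed acceptable here).
  log(x)
}

# Raises a single warning for the one negative input.
transform_log_sketch(c(NA, 0, -1, exp(2)))
```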
23 changes: 18 additions & 5 deletions R/windsorize.R
@@ -1,10 +1,23 @@
#' Windsorize
#'
#' Do some windsorization.
#'
#' Transform all outlier data to
#' (1-p)/2 percentile value for lower outliers and
#' (1+p)/2 for higher outliers.
#'
#' @param x A vector.
#' @param p A quantile.
#' @return input vector x with trimmed outliers by (1-p) percentile.
#' @examples
#' windsorize(rnorm(100,0,1))
#' @export
windsorize <- function(x, p = .90) {
q <- quantile(x, p)
x[x >= q] <- q
if(is.null(x))stop("vector is empty")
y<-x[!is.na(x)]
if(length(y)==0)stop("vector contains only NAs")
Review comment (Member): The condition could be expressed more clearly using

if( all(is.na(x)) ) stop("vector contains only NAs")

q_max <- quantile(y, (1+p)/2)
q_min<- quantile(y,(1-p)/2)
x[x >= q_max] <- q_max
x[x<= q_min]<-q_min
x
}

}
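
For reference, the two-sided clamping described in the documentation above, combined with the suggested `all(is.na(x))` check, could be sketched as follows. The name `winsorize_sketch()` is hypothetical, and the use of `pmin()`/`pmax()` and of a `length(x) == 0` check (which also covers `NULL`) are choices made for this illustration rather than the PR's implementation.

```r
# Hypothetical two-sided winsorization (a sketch, not the PR's code).
winsorize_sketch <- function(x, p = 0.90) {
  if (length(x) == 0) stop("vector is empty")
  if (all(is.na(x))) stop("vector contains only NAs")

  # Lower and upper cut-offs at the (1 - p) / 2 and (1 + p) / 2 quantiles,
  # computed on the non-NA values.
  q <- stats::quantile(x, c((1 - p) / 2, (1 + p) / 2),
                       na.rm = TRUE, names = FALSE)

  # Clamp values to [q[1], q[2]]; NA entries stay NA.
  pmin(pmax(x, q[1]), q[2])
}

winsorize_sketch(c(1:100, NA))
```

With the default `p = 0.90`, values are clamped to the 5th and 95th percentiles of the observed data.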
3 changes: 3 additions & 0 deletions man/meanimpute.Rd

22 changes: 22 additions & 0 deletions man/transform_log.Rd

15 changes: 14 additions & 1 deletion man/windsorize.Rd