diff --git a/404.html b/404.html
index 96244be..9a28c17 100644
--- a/404.html
+++ b/404.html
@@ -38,7 +38,7 @@
@@ -112,7 +112,7 @@vignettes/apply_any_r_function.Rmd
apply_any_r_function.Rmd
runner
package provides functions applied on running windows. The most universal function is runner::runner
which gives user possibility to apply any R function f
on running windows. Running windows are defined for each data window size k
, lag
with respect to their indexes. Unlike other available R packages, runner
supports any input and output type and also gives full control to manipulate window size and lag/lead.
There are different kinds of running windows and all of them are implemented in runner
.
runner
package provides functions applied on running
+windows. The most universal function is runner::runner
+which gives user possibility to apply any R function f
on
+running windows. Running windows are defined for each data window size
+k
, lag
with respect to their indexes. Unlike
+other available R packages, runner
supports any input and
+output type and also gives full control to manipulate window size and
+lag/lead.
There are different kinds of running windows and all of them are
+implemented in runner
.
The simplest window type which is similar to base::cumsum
. At each element window is defined by all elements appearing before current.
The simplest window type which is similar to
+base::cumsum
. At each element window is defined by all
+elements appearing before current.
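As a rough illustration of the cumulative window described above, the following base-R sketch lists the window belonging to each element and checks that summing each window reproduces `base::cumsum`. It does not call `runner` at all; the names `x`, `cumulative_windows` and `window_sums` are made up for this example, and it assumes the window includes the current element, as `base::cumsum` does.

```r
# Base-R sketch only; this is not how runner is implemented.
x <- c(3, 1, 4, 1, 5)

# Window i holds every element from the first one up to and including the i-th.
cumulative_windows <- lapply(seq_along(x), function(i) x[seq_len(i)])

# Summing each window gives the same values as base::cumsum().
window_sums <- vapply(cumulative_windows, sum, numeric(1))
```

Any other summary function could be applied to `cumulative_windows` in the same way, which is the idea behind passing an arbitrary `f` to running windows.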
In runner
this can be achieved as simple by:
@@ -134,10 +145,15 @@Cumulative windows
Constant sliding windows
-Second type of windows are these commonly known as running/rolling/moving/sliding windows. This types of windows moves along the index instead of cumulating like a previous one.
+
-Following diagram illustrates running windows of lengthk = 4
. Each of 15 windows contains 4 elements (except first three).Second type of windows are these commonly known as +running/rolling/moving/sliding windows. This types of windows moves +along the index instead of cumulating like a previous one.
-
+Following diagram illustrates running windows of length +k = 4
. Each of 15 windows contains 4 elements (except first +three).To obtain constant sliding windows one just needs to specify
+k
argumentTo obtain constant sliding windows one just needs to specify +
k
argument@@ -100,15 +100,32 @@# summarizing - sum of 4-elements runner( @@ -164,7 +180,18 @@
Constant sliding windows
Windows depending on date
-By default
runner
calculates on assumption that index increments by one, but sometimes data points in dataset are not equally spaced (missing weekends, holidays, other missings) and thus window size should vary to keep expected time frame. If one specifiesidx
argument, than running functions are applied on windows depending on date rather on a sequence 1-n.idx
should be the same length asx
and should be of typeDate
,POSIXt
orinteger
. Example below illustrates window of sizek = 5
lagged bylag = 1
. Note that one can specify alsok = "5 days"
andlag = "day"
as inseq.POSIXt
.
+By default
runner
calculates on assumption that index +increments by one, but sometimes data points in dataset are not equally +spaced (missing weekends, holidays, other missings) and thus window size +should vary to keep expected time frame. If one specifies +idx
argument, than running functions are applied on windows +depending on date rather on a sequence 1-n.idx
should be +the same length asx
and should be of type +Date
,POSIXt
orinteger
. Example +below illustrates window of sizek = 5
lagged by +lag = 1
. Note that one can specify also +k = "5 days"
andlag = "day"
as in +seq.POSIXt
.
In the example below in square brackets ranges for each window.
diff --git a/articles/built-in_functions.html b/articles/built-in_functions.html
index 5cc4dc9..8eb0601 100644
--- a/articles/built-in_functions.html
+++ b/articles/built-in_functions.html
@@ -39,7 +39,7 @@
@@ -172,7 +199,7 @@ Windows depending on date
# summarize - mean
runner::runner(
-  x = idx,
+  x = idx,
  k = 5, # 5-days window
  lag = 1,
  idx = idx,
@@ -182,7 +209,7 @@ Windows depending on date
# use Date or datetime sequences
runner::runner(
-  x = idx,
+  x = idx,
  k = "5 days", # 5-days window
  lag = 1,
  idx = Sys.Date() + idx,
@@ -191,7 +218,7 @@ Windows depending on date
# obtain window from above illustration
runner::runner(
-  x = idx,
+  x = idx,
  k = "5 days",
  lag = 1,
  idx = Sys.Date() + idx
@@ -200,14 +227,20 @@ Windows depending on date
running at
-Runner by default returns vector of the same size as
+x
unless one puts any-size vector toat
argument. Each element ofat
is an index on which runner calculates function. Example below illustrates output of runner forat = c(13, 27, 45, 31)
which gives windows in ranges enclosed in square brackets. Range forat = 27
is[22, 26]
which is not available in current indices.Runner by default returns vector of the same size as
x
+unless one puts any-size vector toat
argument. Each +element ofat
is an index on which runner calculates +function. Example below illustrates output of runner for +at = c(13, 27, 45, 31)
which gives windows in ranges +enclosed in square brackets. Range forat = 27
is +[22, 26]
which is not available in current indices.-idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48) # summary runner::runner( - x = 1:15, + x = 1:15, k = 5, lag = 1, idx = idx, @@ -217,13 +250,21 @@
running at# full window runner::runner( - x = idx, + x = idx, k = 5, lag = 1, idx = idx, at = c(18, 27, 48, 31) )
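The window ranges quoted above can be checked by hand. The sketch below is plain base R, not a call into `runner`: it assumes each window spans `[at - lag - k + 1, at - lag]`, which agrees with the `[22, 26]` range given in the text for `at = 27`. The names `lower`, `upper` and `windows` are illustrative only, and the `at` values are the ones used in the code example.

```r
# Base-R sketch of the window bounds; assumes the range [at - lag - k + 1, at - lag].
idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)
at <- c(18, 27, 48, 31)
k <- 5
lag <- 1

lower <- at - lag - k + 1 # 13 22 43 26
upper <- at - lag         # 17 26 47 30

# Index values that fall inside each window; the window for at = 27 is empty.
windows <- lapply(seq_along(at), function(j) {
  idx[idx >= lower[j] & idx <= upper[j]]
})
```

In this sketch an empty window is simply a zero-length vector; what `runner` itself returns for such a window depends on `f` and `na_pad` and is not modelled here.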
+
at
can also be specified as interval of the output defined by time interval which results in obtaining results on following indicesseq(min(idx), max(idx), by = "<time interval>")
. Interval can be set in the same way as inseq.POSIXt
function. It’s worth noting thatat
interval shouldn’t be more frequent than interval ofidx
- forDate
the most frequent interval is a"day"
, forPOSIXt
it’s a"sec"
.
at
can also be specified as interval of the output +defined by time interval which results in obtaining results on following +indices +seq(min(idx), max(idx), by = "<time interval>")
. +Interval can be set in the same way as inseq.POSIXt
+function. It’s worth noting thatat
interval shouldn’t be +more frequent than interval ofidx
- forDate
+the most frequent interval is a"day"
, for +POSIXt
it’s a"sec"
.@@ -420,7 +508,7 @@idx_date <- seq(Sys.Date(), Sys.Date() + 365, by = "1 month") @@ -251,20 +292,30 @@
running at
Move and stretch window in time
-One can stretch window length by
+k
and shift in time (or index) usinglag
. Both arguments can beinteger
and also time interval like for example2 months
. Ifk
orlag
are a single value then window size/lag are constant for all elements of x. User can also specifyk/lag
as vector, then size and lag will vary for each window. Bothk
andlag
can be oflength(.) == 1
,length(.) == length(x)
orlength(.) == length(at)
(ifat
is specified).lag
can be negative and positive whilek
only non-negative.One can stretch window length by
k
and shift in time (or +index) usinglag
. Both arguments can be +integer
and also time interval like for example +2 months
. Ifk
orlag
are a +single value then window size/lag are constant for all elements of x. +User can also specifyk/lag
as vector, then size and lag +will vary for each window. Bothk
andlag
can +be oflength(.) == 1
,length(.) == length(x)
+orlength(.) == length(at)
(ifat
is +specified).lag
can be negative and positive while +k
only non-negative.@@ -288,16 +341,27 @@# summarizing - concatenating runner::runner( - x = 1:10, - lag = c(-1, 2, -1, -2, 0, 0, 5, -5, -2, -3), - k = c(0, 1, 1, 1, 1, 5, 5, 5, 5, 5), + x = 1:10, + lag = c(-1, 2, -1, -2, 0, 0, 5, -5, -2, -3), + k = c(0, 1, 1, 1, 1, 5, 5, 5, 5, 5), f = paste, collapse = "," ) # full window runner::runner( - x = 1:10, + x = 1:10, lag = 1, k = c(1, 1, 1, 1, 1, 5, 5, 5, 5, 5) ) @@ -273,13 +324,15 @@
Move and stretch window in timeidx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48) runner::runner( - x = 1:15, - lag = sample(c("-2 days", "-1 days", "1 days", "2 days"), - size = 15, - replace = TRUE), - k = sample(c("5 days", "10 days", "15 days"), - size = 15, - replace = TRUE), + x = 1:15, + lag = sample(c("-2 days", "-1 days", "1 days", "2 days"), + size = 15, + replace = TRUE + ), + k = sample(c("5 days", "10 days", "15 days"), + size = 15, + replace = TRUE + ), idx = Sys.Date() + idx, f = function(x) mean(x) )
Move and stretch window in time
-NA
paddingUsing
+runner
one can also specifyna_pad = TRUE
which would returnNA
for any window which is partially out of range - meaning that there is no sufficient number of observations to fill the window. By defaultna_pad = FALSE
, which means that incomplete windows are calculated anyway.na_pad
is applied on normal cumulative windows and on windows depending on date. In example below two windows exceed range given byidx
so for these windows are empty forna_pad = TRUE
. If used setsna_pad = FALSE
first window will be empty (no single element within[-2, 3]
) and last window will return elements within matchingidx
.Using
runner
one can also specify +na_pad = TRUE
which would returnNA
for any +window which is partially out of range - meaning that there is no +sufficient number of observations to fill the window. By default +na_pad = FALSE
, which means that incomplete windows are +calculated anyway.na_pad
is applied on normal cumulative +windows and on windows depending on date. In example below two windows +exceed range given byidx
so for these windows are empty +forna_pad = TRUE
. If used setsna_pad = FALSE
+first window will be empty (no single element within +[-2, 3]
) and last window will return elements within +matchingidx
.idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48) runner::runner( - x = 1:15, - k = 5, - lag = 1, - idx = idx, + x = 1:15, + k = 5, + lag = 1, + idx = idx, at = c(4, 18, 48, 51), na_pad = TRUE, f = function(x) mean(x) @@ -306,12 +370,17 @@
Using runner with
-data.frame
User can also put
+data.frame
intox
argument and apply functions which involve multiple columns. In example below we calculate beta parameter oflm
model on 1, 2, …, n observations respectively. On the plot one can observe howlm
parameter adapt with increasing number of observation.User can also put
data.frame
intox
+argument and apply functions which involve multiple columns. In example +below we calculate beta parameter oflm
model on 1, 2, …, n +observations respectively. On the plot one can observe how +lm
parameter adapt with increasing number of +observation.-x <- cumsum(rnorm(40)) y <- 3 * x + rnorm(40) date <- Sys.Date() + cumsum(sample(1:3, 40, replace = TRUE)) # unequaly spaced time series -group <- rep(c("a", "b"), 20) +group <- rep(c("a", "b"), 20) df <- data.frame(date, group, y, x) @@ -323,7 +392,12 @@
Using runner with
data.frame
< ) plot(slope)One can also use
+runner
withdplyr
also with problematicgroup_by
operations, without need to apply group_modify. Below we apply grouped 20-days beta, by specifying window lengthk = "10 days"
and providing column name where indices (dates) are kept.One can also use
runner
withdplyr
also +with problematicgroup_by
operations, without need to apply +group_modify. +Below we apply grouped 20-days beta, by specifying window length +k = "10 days"
and providing column name where indices +(dates) are kept.-library(dplyr) @@ -344,7 +418,11 @@
Using runner with
data.frame
< summ %>% ggplot(aes(x = date, y = cumulative_mse, group = group, color = group)) + geom_line()When user executes multiple
+runner
calls indplyr
mutate, one can also userun_by
function to prespecify arguments intidyverse
pipeline. In the example belowrunner
functions are applied onk = "20 days"
calculated on"date"
column.When user executes multiple
runner
calls in +dplyr
mutate, one can also userun_by
function +to prespecify arguments intidyverse
pipeline. In the +example belowrunner
functions are applied on +k = "20 days"
calculated on"date"
column.df %>% group_by(group) %>% @@ -353,29 +431,30 @@
Using runner with
data.frame
< cumulative_mse = runner( x = ., f = function(x) { - mean((residuals(lm(y ~ x, data = x))) ^ 2) + mean((residuals(lm(y ~ x, data = x)))^2) } ), - intercept = runner( x = ., f = function(x) { coefficients(lm(y ~ x, data = x))[1] } ), - slope = runner( x = ., f = function(x) { coefficients(lm(y ~ x, data = x))[2] } - ) + ) )Parallel mode
-The
+runner
function can also compute windows in parallel mode. The function doesn’t initialize the parallel cluster automatically but one have to do this outside and pass it to therunner
throughcl
argument.The
runner
function can also compute windows in parallel +mode. The function doesn’t initialize the parallel cluster automatically +but one have to do this outside and pass it to therunner
+throughcl
argument.-library(parallel) @@ -391,14 +470,23 @@
Parallel mode) stopCluster(cl)
Executing
+runner
in parallel mode isn’t always faster than a single thread. Multiple-thread computation generates some overhead due to managing the nodes. In general, complex functions which bases on processor (e.g. loops) used to be quicker in parallel mode but one should assess itself which option has the edge in specific situation.Executing
runner
in parallel mode isn’t always +faster than a single thread. Multiple-thread computation +generates some overhead due to managing the nodes. In general, +complex functions which bases on processor (e.g. loops) used to be +quicker in parallel mode but one should assess itself which option +has the edge in specific situation.Build-in functions
-With
+runner
one can use any R functions, but some of them are optimized for speed reasons. These functions are:
-- aggregating functions -length_run
,min_run
,max_run
,minmax_run
,sum_run
,mean_run
,streak_run
-- utility functions -fill_run
,lag_run
,which_run
With
runner
one can use any R functions, but some of +them are optimized for speed reasons. These functions are:
+- aggregating functions -length_run
,min_run
, +max_run
,minmax_run
,sum_run
, +mean_run
,streak_run
+- utility functions -fill_run
,lag_run
, +which_run
Build-in functions -
Site built with pkgdown 2.0.6.
+Site built with pkgdown 2.0.7.
Built-in functions
-This tutorial presents built-in functions in runner package which goal is to maximize performance. Even if one can apply any R function with
+runner::runner
, built-in functions are multiple times faster than R equivalent. Before you proceed further to this tutorial, make sure you know what the “running functions are”.This tutorial presents built-in functions in runner package which +goal is to maximize performance. Even if one can apply any R function +with
runner::runner
, built-in functions are multiple times +faster than R equivalent. Before you proceed further to this tutorial, +make sure you know what the “running functions +are”.Running aggregations
Running
-<mean,sum,min,max>_run
Runner provides basic aggregation methods calculated within running windows. Below example showing some functions behavior for different arguments setup.
+min_run
calculates current minimum for all elements of the vector. Let’s take a look at 8’th element of the vector whichmin_run
is calculated on.
-First setup uses default values, so algorithm is looking for minimum value in all elements before actual (i=8). By default missing values are removed before calculations by argumentna_rm = TRUE
, and also window is not specified. The default is equivalent ofbase::cummin
with additional option to ignoreNA
values. In second example within window k=5, the lowest value is -3. In the last example minimum is not available due to existence ofNA
. Graphical example is reproduced below in the code.Runner provides basic aggregation methods calculated within running +windows. Below example showing some functions behavior for different +arguments setup.
min_run
calculates current minimum for all +elements of the vector. Let’s take a look at 8’th element of the vector +whichmin_run
is calculated on.
+First setup uses default values, so algorithm is looking for minimum +value in all elements before actual (i=8). By default missing values are +removed before calculations by argumentna_rm = TRUE
, and +also window is not specified. The default is equivalent of +base::cummin
with additional option to ignore +NA
values. In second example within window k=5, the lowest +value is -3. In the last example minimum is not available due to +existence ofNA
. Graphical example is reproduced below in +the code.+ narm_f = min_run(x, na_rm = FALSE) +)library(runner) @@ -116,9 +133,10 @@
Running
<mean,sum,min,max>_run data.frame( x, - default = min_run(x, na_rm = TRUE), + default = min_run(x, na_rm = TRUE), k_5 = min_run(x, k = 5, na_rm = TRUE), - narm_f = min_run(x, na_rm = FALSE))
-## x default k_5 narm_f ## 1 1 1 1 1 ## 2 -5 -5 -5 -5 @@ -132,21 +150,33 @@
Running
<mean,sum,min,max>_run## 10 NA -5 -1 NA ## 11 -2 -5 -2 NA ## 12 3 -5 -2 NA
In above example constant
+k = 5
has been used which means that for each element, current minimum is calculated on last 5-elements. It may happen that one can have time series where elements are not equally spaced in time, which effects ink = 5
not constant. In example below 5-days sum is calculated. To achieve this, one should put date variable toidx
argument.
-Illustration below shows two sums calculated in 5-days window span. In both cases 5-days fit in 3-elements windows. Equivalent R code below.In above example constant
k = 5
has been used which +means that for each element, current minimum is calculated on last +5-elements. It may happen that one can have time series where elements +are not equally spaced in time, which effects ink = 5
not +constant. In example below 5-days sum is calculated. To achieve this, +one should put date variable toidx
argument.
+Illustration below shows two sums calculated in 5-days window span. In
+both cases 5-days fit in 3-elements windows. Equivalent R code
+below.
x <- c(-0.5910, 0.0266, -1.5166, -1.3627, 1.1785, -0.9342, 1.3236, 0.6249)
-idx <- as.Date(c("1970-01-03", "1970-01-06", "1970-01-09", "1970-01-12",
- "1970-01-13", "1970-01-16", "1970-01-17", "1970-01-19"))
+idx <- as.Date(c(
+ "1970-01-03", "1970-01-06", "1970-01-09", "1970-01-12",
+ "1970-01-13", "1970-01-16", "1970-01-17", "1970-01-19"
+))
sum_run(x, k = 5, idx = idx)
-## [1] -0.5910 -0.5644 -1.4900 -2.8793 -1.7008 -1.1184 1.5679 1.0143
Specifying
+lag
argument shift of the window by number of elements or time periods (ifidx
is specified).Specifying
lag
argument shift of the window by number of +elements or time periods (ifidx
is specified).
x <- c(-0.5910, 0.0266, -1.5166, -1.3627, 1.1785, -0.9342, 1.3236, 0.6249)
-idx <- as.Date(c("1970-01-03", "1970-01-06", "1970-01-09", "1970-01-12",
- "1970-01-13", "1970-01-16", "1970-01-17", "1970-01-19"))
+idx <- as.Date(c(
+ "1970-01-03", "1970-01-06", "1970-01-09", "1970-01-12",
+ "1970-01-13", "1970-01-16", "1970-01-17", "1970-01-19"
+))
sum_run(x, k = 5, lag = 2, idx = idx)
@@ -154,15 +184,23 @@## [1] NA -0.5910 -0.5644 -1.4900 -1.5166 -0.1842 -0.1842 1.5679
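To see where the two `sum_run` outputs above come from, the date windows can be rebuilt in plain base R. The helper `window_sum` below is hypothetical and is not part of `runner`; it assumes a window covering the dates `[idx - lag - k + 1, idx - lag]` and returns `NA` when no observation falls inside. Under those assumptions it reproduces both printed vectors.

```r
x <- c(-0.5910, 0.0266, -1.5166, -1.3627, 1.1785, -0.9342, 1.3236, 0.6249)
idx <- as.Date(c(
  "1970-01-03", "1970-01-06", "1970-01-09", "1970-01-12",
  "1970-01-13", "1970-01-16", "1970-01-17", "1970-01-19"
))

# Hypothetical helper (base R only): sum of x over a k-day window ending
# `lag` days before each date; NA when the window holds no observation.
window_sum <- function(x, idx, k, lag = 0) {
  vapply(seq_along(x), function(i) {
    upper <- idx[i] - lag
    lower <- upper - k + 1
    in_window <- idx >= lower & idx <= upper
    if (any(in_window)) sum(x[in_window]) else NA_real_
  }, numeric(1))
}

sums_5d <- window_sum(x, idx, k = 5)
sums_5d_lag2 <- window_sum(x, idx, k = 5, lag = 2)
```

For example, the fifth element (1970-01-13) uses the dates 1970-01-09 to 1970-01-13, so three observations are summed even though the window is five days wide.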
Running
<mean,sum,min,max>_run
Running streak
-To count consecutive elements in specified window one can use
+streak_run
. Following figure illustrates how streak is calculated with three different options setup for 9th element of the input vectorx
. First shows default configuration, with full window andna_rm = TRUE
. Second example count withink = 4
window with count reset onNA
. Last example counting streak with continuation afterNA
. Visualization also supported with corresponding R code.To count consecutive elements in specified window one can use +
streak_run
. Following figure illustrates how streak is +calculated with three different options setup for 9th element of the +input vectorx
. First shows default configuration, with +full window andna_rm = TRUE
. Second example count within +k = 4
window with count reset onNA
. Last +example counting streak with continuation afterNA
. +Visualization also supported with corresponding R code.+ s2 = streak_run(x, k = 4) +)x <- c("A", "B", "A", "A", "B", "B", "B", NA, "B", "B", "A", "B") data.frame( - x, + x, s0 = streak_run(x), s1 = streak_run(x, k = 4, na_rm = FALSE), - s2 = streak_run(x, k = 4))
-## x s0 s1 s2 ## 1 A 1 1 1 ## 2 B 1 1 1 @@ -176,17 +214,25 @@
Running streak## 10 B 5 NA 3 ## 11 A 1 1 1 ## 12 B 1 1 1
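The basic notion of a streak, leaving aside windows, `idx` and missing values, can be sketched in base R with run-length encoding. This is only an illustration of the idea and makes no claim about how `streak_run` is implemented; the vector `x` here is a short made-up input without `NA`.

```r
# Base-R sketch: length of the current run of identical values at each position.
x <- c("A", "B", "A", "A", "B", "B", "B")

runs <- rle(x)                   # run lengths: 1 1 2 3
streak <- sequence(runs$lengths) # counts 1..n inside every run
```

Restricting the count to a window of size `k`, or deciding what happens at an `NA`, is what the `k` and `na_rm` arguments described above add on top of this.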
Streak is often used in sports to count number of wins or loses of the team/player. To count consecutive wins or loses in 5-days period, one have to specify
+k = 5
and include dates intoidx
argument. Specifyinglag
shifts window bounds by number of elements or time periods (ifidx
is specified).Streak is often used in sports to count number of wins or loses of +the team/player. To count consecutive wins or loses in 5-days period, +one have to specify
k = 5
and include dates into +idx
argument. Specifyinglag
shifts window +bounds by number of elements or time periods (ifidx
is +specified).+ streak_5d_lag = streak_run(x, k = 5, lag = 1, idx = idx) +)x <- c("W", "W", "L", "L", "L", "W", "L", "L") -idx <- as.Date(c("2019-01-03", "2019-01-06", "2019-01-09", "2019-01-12", - "2019-01-13", "2019-01-16", "2019-01-17", "2019-01-19")) +idx <- as.Date(c( + "2019-01-03", "2019-01-06", "2019-01-09", "2019-01-12", + "2019-01-13", "2019-01-16", "2019-01-17", "2019-01-19" +)) data.frame( idx, x, streak_5d = streak_run(x, k = 5, idx = idx), - streak_5d_lag = streak_run(x, k = 5, lag = 1, idx = idx))
## idx x streak_5d streak_5d_lag
## 1 2019-01-03 W 1 NA
## 2 2019-01-06 W 2 1
@@ -204,11 +250,18 @@ Utility functions
Improved lag
-Idea of lag_run
is the same as well known stats::lag
, with distinction that lag_run
can depend on time or any other indexes passed to idx
argument. This means that lag_run
can shift by lag
elements of the vector or by lag
time periods (if idx
is specified).
+Idea of lag_run
is the same as well known
+stats::lag
, with distinction that lag_run
can
+depend on time or any other indexes passed to idx
argument.
+This means that lag_run
can shift by lag
+elements of the vector or by lag
time periods (if
+idx
is specified).
x <- c(-0.5910, 0.0266, -1.5166, -1.3627, 1.1785, -0.9342, 1.3236, 0.6249)
-idx <- as.Date(c("1970-01-03", "1970-01-06", "1970-01-09", "1970-01-12",
- "1970-01-13", "1970-01-16", "1970-01-17", "1970-01-19"))
+idx <- as.Date(c(
+ "1970-01-03", "1970-01-06", "1970-01-09", "1970-01-12",
+ "1970-01-13", "1970-01-16", "1970-01-17", "1970-01-19"
+))
lag_run(x, lag = 3, idx = idx)
## [1] NA -0.5910 0.0266 -1.5166 NA 1.1785 NA -0.9342
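The output above can be reproduced by hand in base R, which may help to see what "lag by time periods" means: for each date, look up the observation dated exactly three days earlier and return `NA` if there is none. The sketch below assumes unique dates and exact matching; it is an illustration and not the `lag_run` implementation, and the names `days` and `lagged` are made up.

```r
x <- c(-0.5910, 0.0266, -1.5166, -1.3627, 1.1785, -0.9342, 1.3236, 0.6249)
idx <- as.Date(c(
  "1970-01-03", "1970-01-06", "1970-01-09", "1970-01-12",
  "1970-01-13", "1970-01-16", "1970-01-17", "1970-01-19"
))
lag <- 3

# Work on day numbers so that matching is done on plain numeric values.
days <- as.numeric(idx)

# For each date find the position of the date `lag` days earlier (NA if absent).
lagged <- x[match(days - lag, days)]
```

The fifth value is `NA` because 1970-01-13 minus three days is 1970-01-10, which does not occur in `idx`.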
@@ -216,14 +269,28 @@ Improved lag
Filling missing values
-Function used to replace NA
with previous non-NA element. To understand how fill_run
works, take a look on illustration. Row ‘x’ represents input, and another rows represent output with NA
replaced by fill_run
with different options setup (run_for_first = TRUE
and only_within = TRUE
respectively). By default, fill_run
replaces all NA
if they were preceded by any value. If NA
appeared in the beginning of the vector then it would not be replaced. But if user specify run_for_first = TRUE
initial empty values values will be replaced by next non-empty value. Option only_within = TRUE
means that NA
values would be replaced if they were surrounded by pair of identical values. No windows provided in this functionality.
+Function used to replace NA
with previous non-NA
+element. To understand how fill_run
works, take a look on
+illustration. Row ‘x’ represents input, and another rows represent
+output with NA
replaced by fill_run
with
+different options setup (run_for_first = TRUE
and
+only_within = TRUE
respectively). By default,
+fill_run
replaces all NA
if they were preceded
+by any value. If NA
appeared in the beginning of the vector
+then it would not be replaced. But if user specify
+run_for_first = TRUE
initial empty values values will be
+replaced by next non-empty value. Option only_within = TRUE
+means that NA
values would be replaced if they were
+surrounded by pair of identical values. No windows provided in this
+functionality.
x <- c(NA, NA, "b", "b", "a", NA, NA, "a", "b", NA, "a", "b")
-data.frame(x,
- f1 = fill_run(x),
- f2 = fill_run(x,run_for_first = TRUE),
- f3 = fill_run(x, only_within = TRUE))
+data.frame(x,
+ f1 = fill_run(x),
+ f2 = fill_run(x, run_for_first = TRUE),
+ f3 = fill_run(x, only_within = TRUE)
+)
## x f1 f2 f3
## 1 <NA> <NA> b <NA>
## 2 <NA> <NA> b <NA>
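The default behaviour described above (replace each `NA` with the previous non-`NA` value and leave leading `NA`s untouched) can be sketched in base R as follows. This is an illustration under that reading of the text, not the `fill_run` implementation, and it does not cover `run_for_first` or `only_within`; `last_seen` and `filled` are made-up names.

```r
x <- c(NA, NA, "b", "b", "a", NA, NA, "a", "b", NA, "a", "b")

# How many non-missing values have been seen so far (0 before the first one).
last_seen <- cumsum(!is.na(x))

# Prepend NA so that "none seen yet" maps to NA, then look up the carried value.
filled <- c(NA, x[!is.na(x)])[last_seen + 1]
```

The first two elements stay `NA`, which agrees with the `f1` column in the first rows of the output above.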
@@ -241,15 +308,25 @@ Filling missing values
Running which
-To obtain index number of element matching some condition in window, one can use which_run
, which returns index of TRUE
element appeared before n-th element of a vector. If na_rm = TRUE
is specified, missing is treated as FALSE
, and is ignored while searching for TRUE
. While user set na_rm = FALSE
like in second example, function returns NA
, because in following window TRUE
appears after missing and it’s impossible to be certain which is first (missing is an element of unknown value - could be TRUE
or FALSE
).
+To obtain index number of element matching some condition in window,
+one can use which_run
, which returns index of
+TRUE
element appeared before n-th element of a vector. If
+na_rm = TRUE
is specified, missing is treated as
+FALSE
, and is ignored while searching for
+TRUE
. While user set na_rm = FALSE
like in
+second example, function returns NA
, because in following
+window TRUE
appears after missing and it’s impossible to be
+certain which is first (missing is an element of unknown value - could
+be TRUE
or FALSE
).
x <- c(T, T, T, F, NA, T, F, NA, T, F, T, F)
data.frame(
- x,
+ x,
s0 = which_run(x, which = "first"),
s1 = which_run(x, na_rm = FALSE, k = 5, which = "first"),
- s2 = which_run(x, k = 5, which = "last"))
+ s2 = which_run(x, k = 5, which = "last")
+)
## x s0 s1 s2
## 1 TRUE 1 1 1
## 2 TRUE 1 1 2
@@ -263,7 +340,12 @@ Running which## 10 FALSE 1 6 9
## 11 TRUE 1 NA 11
## 12 FALSE 1 NA 11
-which
argument (‘first’ or ‘last’) used with which_run
determines which index of matching element should be returned from window. In below illustration in k = 4
elements window there are two TRUE
values, and depending on which
argument output is equal 2
or 4
.
which
argument (‘first’ or ‘last’) used with
+which_run
determines which index of matching element should
+be returned from window. In below illustration in k = 4
+elements window there are two TRUE
values, and depending on
+which
argument output is equal 2
or
+4
.
Site built with pkgdown 2.0.6.
+Site built with pkgdown 2.0.7.
diff --git a/articles/index.html b/articles/index.html
index f0a6026..3454cbb 100644
--- a/articles/index.html
+++ b/articles/index.html
@@ -23,7 +23,7 @@
@@ -88,7 +88,7 @@
The most fundamental function in runner
package is runner
. With runner::runner
one can apply any R function on running windows. This tutorial presents set of examples explaining how to tackle some tasks. Some of the examples are referenced to original topic on stack-overflow.
The most fundamental function in runner
package is
+runner
. With runner::runner
one can apply any
+R function on running windows. This tutorial presents set of examples
+explaining how to tackle some tasks. Some of the examples are referenced
+to original topic on stack-overflow.
library(runner)
library(dplyr)
@@ -175,9 +180,12 @@ Rolling sums for groups w
set.seed(3737)
df <- data.frame(
user_id = c(rep(27, 7), rep(11, 7)),
- date = as.Date(rep(c('2016-01-01', '2016-01-03', '2016-01-05', '2016-01-07',
- '2016-01-10', '2016-01-14', '2016-01-16'), 2)),
- value = round(rnorm(14, 15, 5), 1))
+ date = as.Date(rep(c(
+ "2016-01-01", "2016-01-03", "2016-01-05", "2016-01-07",
+ "2016-01-10", "2016-01-14", "2016-01-16"
+ ), 2)),
+ value = round(rnorm(14, 15, 5), 1)
+)
df %>%
group_by(user_id) %>%
@@ -192,7 +200,8 @@ runner with dplyr
Unique for specified time frame
-
+
library(runner)
library(dplyr)
@@ -216,14 +225,16 @@ Unique for specified time framedf %>%
group_by(user_id) %>%
mutate(
- distinct_7 = runner(category,
- k = "7 days",
- idx = as.Date(date),
- f = function(x) length(unique(x))),
- distinct_14 = runner(category,
- k = "14 days",
- idx = as.Date(date),
- f = function(x) length(unique(x)))
+ distinct_7 = runner(category,
+ k = "7 days",
+ idx = as.Date(date),
+ f = function(x) length(unique(x))
+ ),
+ distinct_14 = runner(category,
+ k = "14 days",
+ idx = as.Date(date),
+ f = function(x) length(unique(x))
+ )
)
grouped_df
+grouped_df
library(runner)
library(dplyr)
-Date <- seq(from = as.Date("2014-01-01"),
- to = as.Date("2019-12-31"),
- by = 'day')
+Date <- seq(
+ from = as.Date("2014-01-01"),
+ to = as.Date("2019-12-31"),
+ by = "day"
+)
market_return <- c(rnorm(2191))
AAPL <- data.frame(
- Company.name = "AAPL",
- Date = Date,
+ Company.name = "AAPL",
+ Date = Date,
market_return = market_return
)
MSFT <- data.frame(
- Company.name = "MSFT",
+ Company.name = "MSFT",
Date = Date,
market_return = market_return
)
df <- rbind(AAPL, MSFT)
df$stock_return <- c(rnorm(4382))
-df <- df[order(df$Date),]
+df <- df[order(df$Date), ]
df2 <- data.frame(
- Company.name2 = c(replicate(450, "AAPL"), replicate(450, "MSFT")),
+ Company.name2 = c(replicate(450, "AAPL"), replicate(450, "MSFT")),
Event_date = sample(
- seq(as.Date('2015/01/01'),
- as.Date('2019/12/31'),
- by = "day"),
- size = 900)
+ seq(as.Date("2015/01/01"),
+ as.Date("2019/12/31"),
+ by = "day"
+ ),
+ size = 900
+ )
)
@@ -299,8 +316,8 @@ Aggregating va
group_by(Company.name2) %>%
mutate(
intercept = runner(
- x = df[df$Company.name == Company.name2[1], ],
- k = "180 days",
+ x = df[df$Company.name == Company.name2[1], ],
+ k = "180 days",
lag = "5 days",
idx = df$Date[df$Company.name == Company.name2[1]],
at = Event_date,
@@ -311,8 +328,8 @@ Aggregating va
}
),
slope = runner(
- x = df[df$Company.name == Company.name2[1], ],
- k = "180 days",
+ x = df[df$Company.name == Company.name2[1], ],
+ k = "180 days",
lag = "5 days",
idx = df$Date[df$Company.name == Company.name2[1]],
at = Event_date,
@@ -344,7 +361,7 @@ Aggregating va
diff --git a/authors.html b/authors.html
index 09e1ee6..2a72303 100644
--- a/authors.html
+++ b/authors.html
@@ -23,7 +23,7 @@
Kałędkowski D (2022). +
Kałędkowski D (2024). runner: Running Operations for Vectors. -R package version 0.4.2. +R package version 0.4.4.
@Manual{, title = {runner: Running Operations for Vectors}, author = {Dawid Kałędkowski}, - year = {2022}, - note = {R package version 0.4.2}, + year = {2024}, + note = {R package version 0.4.4}, }@@ -105,7 +105,7 @@
runner(
- 1:15,
- k = 4,
+ 1:15,
+ k = 4,
lag = 2
)
idx <- Sys.Date() + c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)
runner(
- x = 1:15,
- k = "5 days",
- lag = "1 days",
+ x = 1:15,
+ k = "5 days",
+ lag = "1 days",
idx = idx
)
idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)
runner(
- x = idx,
- k = 5,
- lag = 1,
- idx = idx,
+ x = idx,
+ k = 5,
+ lag = 1,
+ idx = idx,
at = c(18, 27, 48, 31)
)
idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)
runner(
- x = idx,
- k = 5,
- lag = 1,
- idx = idx,
+ x = idx,
+ k = 5,
+ lag = 1,
+ idx = idx,
at = c(4, 18, 48, 51),
na_pad = TRUE
)
library(parallel)
-#
+#
numCores <- detectCores()
cl <- makeForkCluster(numCores)
@@ -301,10 +304,8 @@ Developers
@@ -320,7 +321,7 @@ Dev status
diff --git a/news/index.html b/news/index.html
index b28f7aa..cf12a1c 100644
--- a/news/index.html
+++ b/news/index.html
@@ -23,7 +23,7 @@
@@ -70,7 +70,11 @@ Changelog
+
+runner 0.4.22022-09-17
- fix
runner(..., na_pad)
for vectors to return NA
when windows is incomplete. Other methods already consistent.
- fix the problems when calling
runner::runner
using do.call
. (#83 and #84)
@@ -120,7 +124,8 @@ runner 0.3.2
+- added
at
argument to all functions to return output with specific indexes.
+
runner 0.2.1 2019-10-01
-- added runner function which allows to apply custom function on running windows - so far returning only numeric
+- added runner function which allows to apply custom function on running windows - so far returning only numeric
+
runner 0.2.0 2019-03-08
- all functions have additional idx argument which allows to compute running windows within specified date/time/indexes range.
@@ -164,7 +170,7 @@ runner 0.2.0
diff --git a/pkgdown.yml b/pkgdown.yml
index 270d8ae..1168289 100644
--- a/pkgdown.yml
+++ b/pkgdown.yml
@@ -1,9 +1,9 @@
-pandoc: 2.14.2
-pkgdown: 2.0.6
+pandoc: 3.1.11
+pkgdown: 2.0.7
pkgdown_sha: ~
articles:
apply_any_r_function: apply_any_r_function.html
built-in_functions: built-in_functions.html
runner_examples: runner_examples.html
-last_built: 2022-09-13T15:33Z
+last_built: 2024-03-04T06:47Z
diff --git a/reference/dot-check_unresolved_at.html b/reference/dot-check_unresolved_at.html
index bd5e573..8be0d62 100644
--- a/reference/dot-check_unresolved_at.html
+++ b/reference/dot-check_unresolved_at.html
@@ -25,7 +25,7 @@
@@ -92,7 +92,7 @@ Arguments
(integer, Date, POSIXt, character vector)
Vector of any size and any value defining output data points. Values of the
vector defines the indexes which data is computed at. Can be also POSIXt
-sequence increment used in at argument in seq.POSIXt.
+sequence increment used in at argument in base::seq.POSIXt().
See 'Specifying time-intervals' in details section.
@@ -117,7 +117,7 @@ Value
diff --git a/reference/dot-check_unresolved_difftime.html b/reference/dot-check_unresolved_difftime.html
index 34574ff..43f4d8e 100644
--- a/reference/dot-check_unresolved_difftime.html
+++ b/reference/dot-check_unresolved_difftime.html
@@ -25,7 +25,7 @@
@@ -93,7 +93,7 @@ Arguments
Denoting size of the running window. If k is a single value then window
size is constant for all elements, otherwise if length(k) == length(x)
different window size for each element. One can also specify k in the same
-way as by argument in seq.POSIXt.
+way as by argument in base::seq.POSIXt().
See 'Specifying time-intervals' in details section.
@@ -118,7 +118,7 @@ Value
diff --git a/reference/dot-check_unresolved_index.html b/reference/dot-check_unresolved_index.html
index 29eaced..26cb87a 100644
--- a/reference/dot-check_unresolved_index.html
+++ b/reference/dot-check_unresolved_index.html
@@ -25,7 +25,7 @@
@@ -118,7 +118,7 @@ Value
diff --git a/reference/dot-is_datetime_valid.html b/reference/dot-is_datetime_valid.html
index 71a64ed..e2df80a 100644
--- a/reference/dot-is_datetime_valid.html
+++ b/reference/dot-is_datetime_valid.html
@@ -23,7 +23,7 @@
@@ -80,8 +80,8 @@ Validate date time character
Arguments
- - (`character`)
-can be anything but suppose to be a character.
+ - x
+(character) can be anything but suppose to be a character.
@@ -103,7 +103,7 @@ Value
diff --git a/reference/dot-k_by.html b/reference/dot-k_by.html
index 7766e97..3287824 100644
--- a/reference/dot-k_by.html
+++ b/reference/dot-k_by.html
@@ -23,7 +23,7 @@
@@ -85,7 +85,7 @@ Arguments
Denoting size of the running window. If k is a single value then window
size is constant for all elements, otherwise if length(k) == length(x)
different window size for each element. One can also specify k in the same
-way as by argument in seq.POSIXt.
+way as by argument in base::seq.POSIXt().
See 'Specifying time-intervals' in details section.
@@ -105,7 +105,7 @@ Arguments
Examples
- k <- "1 month"
+ k <- "1 month"
idx <- seq(
as.POSIXct("2019-01-01 03:02:01"),
as.POSIXct("2020-01-01 03:02:01"),
@@ -134,7 +134,7 @@ Examples
diff --git a/reference/dot-reformat_k.html b/reference/dot-reformat_k.html
index 97ac921..0671a83 100644
--- a/reference/dot-reformat_k.html
+++ b/reference/dot-reformat_k.html
@@ -26,7 +26,7 @@
@@ -122,7 +122,7 @@ Examples
diff --git a/reference/dot-resolve_arg.html b/reference/dot-resolve_arg.html
index adc496f..5c796e7 100644
--- a/reference/dot-resolve_arg.html
+++ b/reference/dot-resolve_arg.html
@@ -25,7 +25,7 @@
@@ -109,7 +109,7 @@ Value
diff --git a/reference/dot-seq_at.html b/reference/dot-seq_at.html
index 0e69bb7..e522d52 100644
--- a/reference/dot-seq_at.html
+++ b/reference/dot-seq_at.html
@@ -23,7 +23,7 @@
@@ -101,7 +101,7 @@ Arguments
diff --git a/reference/dot-this_group.html b/reference/dot-this_group.html
index 02d2b66..5c6394d 100644
--- a/reference/dot-this_group.html
+++ b/reference/dot-this_group.html
@@ -28,7 +28,7 @@
@@ -116,7 +116,7 @@ Value
diff --git a/reference/fill_run.html b/reference/fill_run.html
index 514a98b..2138919 100644
--- a/reference/fill_run.html
+++ b/reference/fill_run.html
@@ -23,7 +23,7 @@
@@ -108,11 +108,11 @@ Value
Examples
- fill_run(c(NA, NA,1:10, NA, NA), run_for_first = TRUE)
+ fill_run(c(NA, NA, 1:10, NA, NA), run_for_first = TRUE)
#> [1] 1 1 1 2 3 4 5 6 7 8 9 10 10 10
-fill_run(c(NA, NA,1:10, NA, NA), run_for_first = TRUE)
+fill_run(c(NA, NA, 1:10, NA, NA), run_for_first = TRUE)
#> [1] 1 1 1 2 3 4 5 6 7 8 9 10 10 10
-fill_run(c(NA, NA,1:10, NA, NA), run_for_first = FALSE)
+fill_run(c(NA, NA, 1:10, NA, NA), run_for_first = FALSE)
#> [1] NA NA 1 2 3 4 5 6 7 8 9 10 10 10
fill_run(c(NA, NA, 1, 2, NA, NA, 2, 2, NA, NA, 1, NA, NA), run_for_first = TRUE, only_within = TRUE)
#> [1] 1 1 1 2 2 2 2 2 NA NA 1 NA NA
@@ -130,7 +130,7 @@ Examples
diff --git a/reference/index.html b/reference/index.html
index 585de93..d72eaa0 100644
--- a/reference/index.html
+++ b/reference/index.html
@@ -23,7 +23,7 @@
@@ -145,7 +145,7 @@ Utility functions
diff --git a/reference/lag_run.html b/reference/lag_run.html
index 9b02725..beee195 100644
--- a/reference/lag_run.html
+++ b/reference/lag_run.html
@@ -23,7 +23,7 @@
@@ -91,7 +91,7 @@ Arguments
for all elements, otherwise if length(lag) == length(x) different window
size for each element. Negative value shifts window forward. One can also
specify lag in the same way as by argument in
-seq.POSIXt. See 'Specifying time-intervals' in details
+base::seq.POSIXt(). See 'Specifying time-intervals' in details
section.
@@ -134,7 +134,7 @@ Examples
diff --git a/reference/length_run.html b/reference/length_run.html
index 556940d..2450a56 100644
--- a/reference/length_run.html
+++ b/reference/length_run.html
@@ -25,7 +25,7 @@
@@ -89,7 +89,7 @@ Arguments
Denoting size of the running window. If k is a single value then window
size is constant for all elements, otherwise if length(k) == length(x)
different window size for each element. One can also specify k in the same
-way as by argument in seq.POSIXt.
+way as by argument in base::seq.POSIXt().
See 'Specifying time-intervals' in details section.
@@ -99,7 +99,7 @@ Arguments
for all elements, otherwise if length(lag) == length(x) different window
size for each element. Negative value shifts window forward. One can also
specify lag in the same way as by argument in
-seq.POSIXt. See 'Specifying time-intervals' in details
+base::seq.POSIXt(). See 'Specifying time-intervals' in details
section.
@@ -131,7 +131,7 @@ Examples
diff --git a/reference/max_run.html b/reference/max_run.html
index 5e38e21..aefbd36 100644
--- a/reference/max_run.html
+++ b/reference/max_run.html
@@ -24,7 +24,7 @@
@@ -100,7 +100,7 @@ Arguments
Denoting size of the running window. If k is a single value then window
size is constant for all elements, otherwise if length(k) == length(x)
different window size for each element. One can also specify k in the same
-way as by argument in seq.POSIXt.
+way as by argument in base::seq.POSIXt().
See 'Specifying time-intervals' in details section.
@@ -110,7 +110,7 @@ Arguments
for all elements, otherwise if length(lag) == length(x) different window
size for each element. Negative value shifts window forward. One can also
specify lag in the same way as by argument in
-seq.POSIXt. See 'Specifying time-intervals' in details
+base::seq.POSIXt(). See 'Specifying time-intervals' in details
section.
@@ -127,7 +127,7 @@ Arguments
(integer, Date, POSIXt, character vector)
Vector of any size and any value defining output data points. Values of the
vector defines the indexes which data is computed at. Can be also POSIXt
-sequence increment used in at argument in seq.POSIXt.
+sequence increment used in at argument in base::seq.POSIXt().
See 'Specifying time-intervals' in details section.
@@ -146,22 +146,22 @@ Arguments
Value
-max numeric vector of length equals length of x.
+max (numeric) vector of length equals length of x.
Examples
set.seed(11)
-x1 <- sample( c(1,2,3), 15, replace=TRUE)
-x2 <- sample( c(NA,1,2,3), 15, replace=TRUE)
-k <- sample( 1:4, 15, replace=TRUE)
+x1 <- sample(c(1, 2, 3), 15, replace = TRUE)
+x2 <- sample(c(NA, 1, 2, 3), 15, replace = TRUE)
+k <- sample(1:4, 15, replace = TRUE)
max_run(x1) # simple cumulative maximum
#> [1] 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3
max_run(x2, na_rm = TRUE) # cumulative maximum with removing NA.
#> [1] 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3
-max_run(x2, na_rm = TRUE, k=4) # maximum in 4-element window
+max_run(x2, na_rm = TRUE, k = 4) # maximum in 4-element window
#> [1] 2 2 2 2 2 2 3 3 3 3 2 2 2 2 2
-max_run(x2, na_rm = FALSE, k=k) # maximum in varying k window size
+max_run(x2, na_rm = FALSE, k = k) # maximum in varying k window size
#> [1] 2 2 NA NA NA NA 3 3 2 1 2 2 2 2 2
@@ -177,7 +177,7 @@ Examples
diff --git a/reference/mean_run.html b/reference/mean_run.html
index 6d3f72b..1747566 100644
--- a/reference/mean_run.html
+++ b/reference/mean_run.html
@@ -23,7 +23,7 @@
@@ -93,9 +93,7 @@ Arguments
k
-(integer` vector or single value)
-Denoting size of the running window. If k is a single value then window
-size is constant for all elements, otherwise if length(k) == length(x)
+(integer`` vector or single value)\cr Denoting size of the running window. If kis a single value then window size is constant for all elements, otherwise if length(k) == length(x)`
different window size for each element.
@@ -135,14 +133,14 @@ Arguments
Value
-mean numeric vector of length equals length of x.
+mean (numeric) vector of length equals length of x.
Examples
set.seed(11)
x1 <- rnorm(15)
-x2 <- sample(c(rep(NA,5), rnorm(15)), 15, replace = TRUE)
+x2 <- sample(c(rep(NA, 5), rnorm(15)), 15, replace = TRUE)
k <- sample(1:15, 15, replace = TRUE)
mean_run(x1)
#> [1] -0.5910311 -0.2822184 -0.6936633 -0.8609108 -0.4530308 -0.5332176
@@ -152,11 +150,11 @@ Examples
#> [1] -0.18760011 -0.09022066 -0.06543317 0.03906450 -0.12188853 -0.13873536
#> [7] -0.13873536 -0.14571604 -0.12596067 -0.11116961 -0.09881996 -0.08871569
#> [13] -0.05194292 -0.04699909 -0.05704202
-mean_run(x2, na_rm = FALSE )
+mean_run(x2, na_rm = FALSE)
#> [1] -0.18760011 -0.09022066 -0.06543317 0.03906450 -0.12188853 -0.13873536
#> [7] NA NA NA NA NA NA
#> [13] NA NA NA
-mean_run(x2, na_rm = TRUE, k=4)
+mean_run(x2, na_rm = TRUE, k = 4)
#> [1] -0.18760011 -0.09022066 -0.06543317 0.03906450 -0.10546063 -0.16299272
#> [7] -0.21203756 -0.39209010 -0.13274756 -0.05603811 -0.03894684 0.01103493
#> [13] 0.09609256 0.09738460 0.04740283
@@ -174,7 +172,7 @@ Examples
diff --git a/reference/min_run.html b/reference/min_run.html
index 73687c0..fc2c1d0 100644
--- a/reference/min_run.html
+++ b/reference/min_run.html
@@ -23,7 +23,7 @@
@@ -98,7 +98,7 @@ Arguments
Denoting size of the running window. If k is a single value then window
size is constant for all elements, otherwise if length(k) == length(x)
different window size for each element. One can also specify k in the same
-way as by argument in seq.POSIXt.
+way as by argument in base::seq.POSIXt().
See 'Specifying time-intervals' in details section.
@@ -108,7 +108,7 @@ Arguments
for all elements, otherwise if length(lag) == length(x) different window
size for each element. Negative value shifts window forward. One can also
specify lag in the same way as by argument in
-seq.POSIXt. See 'Specifying time-intervals' in details
+base::seq.POSIXt(). See 'Specifying time-intervals' in details
section.
@@ -125,7 +125,7 @@ Arguments
(integer, Date, POSIXt, character vector)
Vector of any size and any value defining output data points. Values of the
vector defines the indexes which data is computed at. Can be also POSIXt
-sequence increment used in at argument in seq.POSIXt.
+sequence increment used in at argument in base::seq.POSIXt().
See 'Specifying time-intervals' in details section.
@@ -144,7 +144,7 @@ Arguments
Value
-min numeric vector of length equals length of x.
+min (numeric) vector of length equals length of x.
@@ -152,7 +152,7 @@ Examples
diff --git a/reference/minmax_run.html b/reference/minmax_run.html
index e0088d2..a7878e6 100644
--- a/reference/minmax_run.html
+++ b/reference/minmax_run.html
@@ -24,7 +24,7 @@
@@ -115,7 +115,7 @@ Value
diff --git a/reference/run_by.html b/reference/run_by.html
index d1bb6b5..85c92b1 100644
--- a/reference/run_by.html
+++ b/reference/run_by.html
@@ -1,7 +1,7 @@
-Set window parameters — run_by • runner Set window parameters — run_by • runner