Hello,
When querying large data sets containing a Date column, the text-to-date
conversion done by as.Date() can sometimes be the largest speed bottleneck in
the query. See:
https://stat.ethz.ch/pipermail/r-devel/2013-August/067141.html
Consider calling as.Date on only the unique values of the date column. For
example, this patch to PostgreSQLSupport.R reduced a 130-second query that
selected 4 million rows from a table containing a Date column to only 30
seconds:
diff PostgreSQLSupport.R PostgreSQLSupport.new
412a413,419
> as.Date2 <- function(x) { ## Significantly faster conversion with duplicated dates
>     if (anyDuplicated(x)) {
>         ux <- unique(x)
>         idx <- match(x, ux)
>         return(as.Date(ux)[idx])
>     } else as.Date(x)
> }
433c440
< rel[,i] <- as.Date(rel[,i])
---
> rel[,i] <- as.Date2(rel[,i])
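For reference, the helper from the patch can also be tried standalone, outside
PostgreSQLSupport.R; a minimal sketch (the test data here is illustrative, not
from the 4-million-row query):

```r
## Convert only the unique strings, then expand back via match();
## with many repeated dates this avoids most of the parsing work.
as.Date2 <- function(x) {
    if (anyDuplicated(x)) {
        ux <- unique(x)
        idx <- match(x, ux)
        return(as.Date(ux)[idx])
    } else as.Date(x)
}

## Example: 100,000 rows drawn from only 365 distinct dates, so
## as.Date() is called on 365 strings instead of 100,000.
x <- sample(format(as.Date("2013-01-01") + 0:364), 1e5, replace = TRUE)
d <- as.Date2(x)
```

The speedup depends on how many duplicates the column contains; with all-unique
dates the helper falls through to plain as.Date().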
Alternatively, consider parsing the character string directly without making a
system call, as in, for example, the fasttime::fastPOSIXct function. See
https://stat.ethz.ch/pipermail/r-devel/2013-August/067142.html
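To illustrate the idea (fasttime does this in C for timestamps), here is a
hedged base-R sketch of parsing the fixed "YYYY-MM-DD" text that PostgreSQL
sends for dates, using plain arithmetic instead of strptime(). The helper name
fastDate is hypothetical, and the sketch assumes well-formed input with no NAs:

```r
## Hypothetical direct parser: days since 1970-01-01 computed with the
## standard civil-calendar formula, no strptime()/time-zone lookup.
fastDate <- function(x) {
    y <- as.integer(substr(x, 1, 4))
    m <- as.integer(substr(x, 6, 7))
    d <- as.integer(substr(x, 9, 10))
    ## Shift the year to start in March so leap days come last.
    y2 <- y - (m <= 2L)
    era_days <- 365L * y2 + y2 %/% 4L - y2 %/% 100L + y2 %/% 400L
    ## Day-of-year offsets for Jan..Dec in the March-based year.
    mdays <- c(306L, 337L, 0L, 31L, 61L, 92L, 122L,
               153L, 184L, 214L, 245L, 275L)
    ## 719468 = days from 0000-03-01 to 1970-01-01.
    structure(era_days + mdays[m] + (d - 1L) - 719468L, class = "Date")
}
```

Because this is all vectorized integer arithmetic, it scales well to millions
of rows; a production version would also need to validate its input.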
Another possibility: perhaps it would be more efficient to have Date data sent
as integers (days from epoch) rather than as character strings.
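In that case the R-side conversion would be essentially free, since a Date in R
is just a numeric vector of days since 1970-01-01 with class "Date". A minimal
sketch (the integer values here are illustrative):

```r
## If the driver shipped integer day counts, no string parsing would
## be needed at all -- just attach the Date class.
days <- c(15933L, 15934L)                  # e.g. 2013-08-16, 2013-08-17
d <- as.Date(days, origin = "1970-01-01")
```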
--Robert
Original issue reported on code.google.com by [email protected] on 16 Aug 2013 at 8:39