
Grid aggregation

miturbide edited this page Oct 6, 2016 · 6 revisions

Grids can be aggregated along any of their dimensions using the function aggregateGrid.

The aggregation functions are specified as a named list of the form list(FUN = "function", ...), where ... stands for further arguments that can be passed to FUN. This allows a flexible definition of aggregation functions, which are internally passed to tapply to perform the aggregation. Note that the name of the function is given as a character string. Parallelization options exist for this function.
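
As a rough illustration of this mechanism, the base R sketch below splits such a list into the function name and its extra arguments and hands both to tapply. The helper apply_aggr and the toy data are hypothetical and are not part of transformeR; the package's internal code may well differ.

```r
# Hypothetical helper (not a transformeR function): applies an aggregation
# specification of the form list(FUN = "name", ...) to a vector, by groups.
apply_aggr <- function(x, index, aggr) {
  fun_name <- aggr[["FUN"]]             # function name as a character string
  extra <- aggr[names(aggr) != "FUN"]   # remaining arguments, e.g. na.rm
  do.call(tapply,
          c(list(X = x, INDEX = index, FUN = match.fun(fun_name)), extra))
}

# Toy data: six values belonging to two months, one of them missing
x <- c(1, 2, NA, 4, 5, 6)
months <- c("Jan", "Jan", "Jan", "Feb", "Feb", "Feb")

res <- apply_aggr(x, months, list(FUN = "sum", na.rm = TRUE))
res  # Jan = 3 (the NA is dropped), Feb = 15
```

Because the function is looked up by name, any function visible in the session, including user-defined ones, can be used in this sketch.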

Several examples follow:

Temporal aggregation

aggregateGrid has several arguments to control how the temporal aggregation is performed. The arguments aggr.d, aggr.m and aggr.y indicate the aggregation function used to obtain daily, monthly and annual data respectively. To aggregate data monthly or annually, the aggr.d and/or aggr.m functions are specified as needed. Aggregations need to be specified from bottom to top. For instance, if the data in the grid is sub-daily and aggr.d is not specified, an error will be given for monthly or annual aggregation requests. Similarly, annual aggregations require a previous specification of daily and monthly aggregation, when applicable. Special attributes in the Variable component indicate the aggregation undertaken.

In this example, daily precipitation data is loaded and the annual accumulated precipitation (sum of the daily precipitation) is obtained:

data(NCEP_Iberia_tp)
  • Obtaining the annual accumulated precipitation:
tp_annual <- aggregateGrid(NCEP_Iberia_tp, aggr.d = list(FUN = "sum", na.rm = TRUE),
                                           aggr.m = list(FUN = "sum", na.rm = TRUE),
                                           aggr.y = list(FUN = "sum", na.rm = TRUE))


## Data is already daily: 'aggr.d' option was ignored.
## [2016-05-09 10:31:50] Performing monthly aggregation...
## [2016-05-09 10:31:50] Done.
## [2016-05-09 10:31:50] Performing annual aggregation...
## [2016-05-09 10:31:50] Done.

The message returned by the function reports that the input data is already daily, so the daily aggregation option is ignored.

Note that the aggregation is done in steps, that is, data is first aggregated monthly and then annually. In this case, because we are applying the same aggregation function in all steps, we can directly perform the annual aggregation from daily data:

tp_annual <- aggregateGrid(NCEP_Iberia_tp, aggr.y = list(FUN = "sum", na.rm = TRUE))

## [2016-05-09 10:33:42] Performing annual aggregation...
## [2016-05-09 10:33:42] Done.
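
The shortcut works for sums because a sum of monthly sums equals the sum of all daily values. The base R sketch below checks this on a synthetic daily series (the data are invented for illustration and do not come from NCEP_Iberia_tp), and also shows that the same shortcut would not hold for the mean, since months have different lengths.

```r
# Synthetic daily series for 2001: each day's value is its month number
dates <- seq(as.Date("2001-01-01"), as.Date("2001-12-31"), by = "day")
pr <- as.numeric(format(dates, "%m"))
mon <- format(dates, "%Y-%m")   # monthly grouping index
yr <- format(dates, "%Y")       # annual grouping index

# Stepwise: daily -> monthly -> annual
monthly_sum <- tapply(pr, mon, sum)
annual_stepwise <- tapply(monthly_sum, substr(names(monthly_sum), 1, 4), sum)

# Direct: daily -> annual
annual_direct <- tapply(pr, yr, sum)

# The two routes agree for the sum
sums_agree <- isTRUE(all.equal(as.vector(annual_stepwise),
                               as.vector(annual_direct)))

# For the mean they do not: the mean of monthly means weights every month
# equally, whereas the direct mean weights every day equally
mean_stepwise <- mean(tapply(pr, mon, mean))   # 6.5
mean_direct <- mean(pr)                        # slightly larger than 6.5
```

When the aggregation function differs between steps, or is not additive, the stepwise specification is therefore the safer choice.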

To obtain monthly data, argument aggr.y is omitted:

tp_monthly <- aggregateGrid(NCEP_Iberia_tp, aggr.m = list(FUN = "sum", na.rm = TRUE))

## [2016-05-09 10:35:47] Performing monthly aggregation...
## [2016-05-09 10:35:47] Done.

Monthly data can also be used as input:

tp_annual <- aggregateGrid(tp_monthly, aggr.y = list(FUN = "sum", na.rm = TRUE))

## [2016-05-09 10:36:56] Performing annual aggregation...
## [2016-05-09 10:36:56] Done.

# Plots of the daily and annual precipitation
par(mfrow = c(1,2))
plotMeanGrid(NCEP_Iberia_tp)
plotMeanGrid(tp_annual)
par(mfrow = c(1,1))

In order to preserve the seasonal information in annual aggregations, a special attribute season is added to the Dates element of the output grid:

attr(tp_annual$Dates, "season")
## [1]  1  2 12

This can be retrieved more conveniently using getSeason:

getSeason(tp_annual)
## [1]  1  2 12

Computing climatologies

A particular case of time aggregation is the calculation of climatologies. This is implemented in function climatology. For instance, this is the climatology of mean (daily) DJF precipitation in Iberia:

tpclim_daily <- climatology(NCEP_Iberia_tp, clim.fun = list(FUN = "mean"))
## [2016-05-09 12:35:22] - Computing climatology...
## [2016-05-09 12:35:22] - Done.
plotMeanGrid(tpclim_daily)

Once the climatology has been calculated, function plotClimatology can be used for data visualization. This function provides many plotting options, so combining it with climatology gives a more flexible version of plotMeanGrid that also allows the aggregation function used for plotting to be defined. For this reason, plotMeanGrid is deprecated.

The following example is equivalent to function plotMeanGrid:

plotClimatology(climatology(NCEP_Iberia_tp, clim.fun = list(FUN = "mean")), 
                backdrop.theme = "coastline", 
                scales = list(draw = TRUE), 
                main = "total precipitation amount")

In the next example the 90th percentile is calculated and plotted:

plotClimatology(climatology(NCEP_Iberia_tp, clim.fun = list(FUN = "quantile", 
                probs = .9, na.rm = TRUE)), 
                scales = list(draw = TRUE),
                backdrop.theme = "coastline")

Other climatologies (for instance, the monthly or seasonal climatologies of, say, accumulated precipitation) can be obtained by combining aggregateGrid and climatology accordingly. subsetGrid may also be used to compute monthly values from seasonal time slices.

Note that the time dimension of the climatology tpclim_daily computed above now has length one. The date information, together with its attributes, defines the climatological period:

str(tpclim_daily$Data)
## num [1, 1, 1:7, 1:9] 2.99 3.11 3.18 3.5 4.16 ...
##  - attr(*, "dimensions")= chr [1:4] "member" "time" "lat" "lon"
##  - attr(*, "climatology:fun")= chr "mean"

tpclim_daily$Dates
## $start
## [1] "1990-12-01 GMT"
##
## $end
## [1] "2000-03-01 GMT"
##
## attr(,"season")
## [1] 12  1  2

A special attribute indicates that the grid is a climatology, and the type of climatology it represents:

attr(tpclim_daily$Data, "climatology:fun") 
## [1] "mean"
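
Conceptually, a climatology collapses the time dimension to length one while keeping the spatial dimensions. The following base R sketch mimics that on a small toy array. The array and its values are invented, and the attribute handling only imitates the structure shown above; it is not the code used by climatology.

```r
# Toy data array with dimensions time (4) x lat (2) x lon (3)
arr <- array(seq_len(4 * 2 * 3), dim = c(4, 2, 3))
attr(arr, "dimensions") <- c("time", "lat", "lon")

# Average over time, keeping lat and lon (result: 2 x 3 matrix)
clim <- apply(arr, MARGIN = c(2, 3), FUN = mean)

# Restore a time dimension of length one and record what was done
clim <- array(clim, dim = c(1, dim(clim)))
attr(clim, "dimensions") <- c("time", "lat", "lon")
attr(clim, "climatology:fun") <- "mean"

dim(clim)                       # 1 2 3
attr(clim, "climatology:fun")   # "mean"
```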

When dealing with multi-member grids, the logical argument by.member determines whether we want a different climatology for each member, or a single climatology (computed upon the ensemble mean):

data(tasmax_forecast)
# by.member set to FALSE
clim_hindcast <- climatology(tasmax_forecast,
                             clim.fun = list(FUN = "mean"),
                             by.member = FALSE)
str(clim_hindcast$Data)
## num [1, 1:32, 1:44] 22.3 21.8 21 20.9 21.1 ...
## - attr(*, "dimensions")= chr [1:3] "time" "lat" "lon"
## - attr(*, "climatology:fun")= chr "mean"

# by.member set to TRUE (the default)
clim_hindcast_bymember <- climatology(tasmax_forecast,
                                      clim.fun = list(FUN = "mean"),
                                      by.member = TRUE)
str(clim_hindcast_bymember$Data)
## num [1:9, 1, 1:32, 1:44] 22.2 21.7 21.8 22 22.1 ...
##  - attr(*, "dimensions")= chr [1:4] "member" "time" "lat" "lon"
##  - attr(*, "climatology:fun")= chr "mean"
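
The difference between the two settings can be pictured with plain arrays. The sketch below uses an invented toy ensemble and base R only. The order of operations for the single climatology (ensemble mean first, then time average) is an assumption based on the description above, not taken from the package source.

```r
# Toy ensemble: 2 members x 3 time steps x 2 lat x 2 lon (arbitrary values)
ens <- array(seq_len(2 * 3 * 2 * 2), dim = c(2, 3, 2, 2))

# One climatology per member: average over time, keep member, lat and lon
clim_by_member <- apply(ens, c(1, 3, 4), mean)   # dim: 2 x 2 x 2

# Single climatology: ensemble mean first, then average over time
ens_mean <- apply(ens, c(2, 3, 4), mean)         # dim: 3 x 2 x 2
clim_single <- apply(ens_mean, c(2, 3), mean)    # dim: 2 x 2

# For the mean, averaging the per-member climatologies gives the same field
same_field <- isTRUE(all.equal(clim_single,
                               apply(clim_by_member, c(2, 3), mean)))
```

This equivalence is specific to the mean; for functions such as quantiles the two routes generally give different results.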

Spatial aggregation

In a similar vein, spatial aggregation is done using the same interface. Two arguments, aggr.lon and aggr.lat, allow the longitude and latitude dimensions to be aggregated separately. For instance, the following request calculates the accumulated winter precipitation averaged along longitude, giving one value per latitude:

tp_lat <- aggregateGrid(NCEP_Iberia_tp, 
                        aggr.y = list(FUN = "sum"),
                        aggr.lon = list(FUN = "mean"))

## [2016-05-09 11:27:43] Performing annual aggregation...
## [2016-05-09 11:27:43] Done.
## [2016-05-09 11:27:43] - Aggregating lon dimension...
## [2016-05-09 11:27:43] - Done.

This is a plot summarizing the information:

library(fields) # provides image.plot, used below for the legend
# jet color palette (see help grDevices::colorRamp), to mimic image.plot
jet.colors <- colorRampPalette(rev(c("#00007F", "blue", "#007FFF",
                                 "cyan", "#7FFF7F", "yellow",
                                 "#FF7F00", "red", "#7F0000")))
mat <- tp_lat[["Data"]]
yrs <- substr(tp_lat$Dates$start,1,4) # year label as a character string
lats <- format(tp_lat$xyCoords$y, digits = 3) # latitude label as a character string (rounded to first decimal)

image(mat, col = jet.colors(21), axes = FALSE, xlab = "year", ylab = "latitude")
mtext(text = yrs, side = 1, line = 0.3, at = seq(0, 1, 0.1), las = 2)
mtext(text = lats, side = 2, line = 0.3, at = seq(0, 1, 0.165), las = 1)
image.plot(mat, col = jet.colors(21), legend.only = TRUE, legend.lab = "DJF precip (mm)")

In order to obtain a complete spatial aggregation (i.e. a single time series for the whole domain, and for each member if any), both aggr.lon and aggr.lat are specified. For instance, the following computes the spatial mean of the example dataset (mean winter precipitation over the Iberian Peninsula):

tp_latlonMean <- aggregateGrid(NCEP_Iberia_tp,
                               aggr.lon = list(FUN = "mean"),
                               aggr.lat = list(FUN = "mean"))
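
The two spatial steps can be sketched on a toy array in base R. The array below is invented, and the code illustrates the idea only; it is not what aggregateGrid does internally.

```r
# Toy data array with dimensions time (2) x lat (3) x lon (4)
arr <- array(seq_len(2 * 3 * 4), dim = c(2, 3, 4))

# Aggregate the lon dimension: one value per time step and latitude
by_lat <- apply(arr, c(1, 2), mean)     # dim: 2 x 3

# Aggregate the remaining lat dimension: one value per time step
domain_mean <- apply(by_lat, 1, mean)   # 12 13

# On a complete regular grid this equals the plain mean over all cells
direct_mean <- apply(arr, 1, mean)      # 12 13
```

Note that this is an unweighted mean over grid cells, as in the sketch's toy setting; no area weighting is applied here.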

Member aggregation

Another typical operation is aggregation along the member dimension. The ensemble mean is the most common case, but other functions may be needed, such as a particular percentile, the maximum or the minimum:

data(S4_Iberia_tas)
# Ensemble mean
mn <- aggregateGrid(grid = S4_Iberia_tas, aggr.mem = list(FUN = "mean", na.rm = TRUE))
# Ensemble median
ens50 <- aggregateGrid(grid = S4_Iberia_tas,
                       aggr.mem = list(FUN = "quantile", probs = 0.5, na.rm = TRUE))
# Plot the ensemble median
plotClimatology(climatology(ens50, clim.fun = list(FUN = "mean")), 
                backdrop.theme = "countries", 
                scales = list(draw = TRUE))
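
As a minimal base R sketch of what aggregating along the member dimension means, the example below collapses the first dimension of an invented toy ensemble with quantile and with mean. It is an illustration only and does not use S4_Iberia_tas or any transformeR function.

```r
# Toy ensemble: 3 members x 2 time steps x 2 lat x 2 lon (arbitrary values)
ens <- array(seq_len(3 * 2 * 2 * 2), dim = c(3, 2, 2, 2))

# Ensemble median via quantile, with extra arguments passed through apply
ens50 <- apply(ens, c(2, 3, 4), quantile, probs = 0.5, na.rm = TRUE)

# Ensemble mean
ens_mn <- apply(ens, c(2, 3, 4), mean, na.rm = TRUE)

# With probs = 0.5, quantile gives the same values as median
same_as_median <- isTRUE(all.equal(as.vector(ens50),
                                   as.vector(apply(ens, c(2, 3, 4), median))))

dim(ens50)   # 2 2 2: the member dimension has been collapsed
```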

