-
Notifications
You must be signed in to change notification settings - Fork 16
Optimal interpolation
Optimal interpolation (OI) can be used to create gridded analysis by combining a gridded background field (e.g. from an NWP model) with observations at points. OI weights the background and the observations based on the uncertainty of each data source.
Gridpp's OI method does not perform any quality control (QC) on the observations. Check out Titanlib, which contains suitable QC functions for this purpose. For a complete step-by-step guide on creating gridded analysis with titanlib and gridpp, checkout https://tnipen.github.io/2020/06/15/titanlib-gridpp.html.
The OI function signature looks like this:
gridpp.optimal_interpolation(bgrid, background, points, pobs, pratios,
pbackground, structure, max_points)
where bgrid
is the background grid, background
is the 2D background field, points
is the locations of the observations, pobs
is a vector of observations, pratios
is ratio of observation error variance to background error variance, pbackground
is the background interpolated to the observation points, structure
is a structure function object, and max_points
specifies the maximum number of observations to use for a gridpoint.
The structure function specifies how analysis increments are spread spatially. Gridpp supports several structure functions, but typically a gaussian function is used (Barnes 1973):
structure = gridpp.BarnesStructure(100000, 200)
The first argument is the horizontal decorrelation length scale (in meters) and the second is the vertical decorrelation length scale (in meters). Optionally, a third argument specifying the decorrelation length across land area fraction (units 1), and a fourth argumnet specifying the maximum length that an observations will have an effect (in meters, also called the localization radius). If the third is unspecified, land area fractions are ignored in the structure function., If the fourth is unspecified, 3.64 times the horizontal scale is used.
A Cressman structure function is also available, where correlations decrease linearly to 0 at the length scales:
structure = gridpp.CressmanStructure(100000, 200)
Applying the observation operator to the background field results in a vector of values at the observation points. There are several methods to do this, but a typical one is nearest neighbour.
pbackground = gridpp.nearest(bgrid, points, background)
Any of gridpp's downscalers can be used for this, or the users can compute the values using their own preferred method.
OI only requires the ratio of the error variance of the observations and background. Since observations are more trustworth than the background, values less than 1 are typically used. The error variance ratios are passed in as vectors, meaning that different observations can be assigned diferent ratios.
The computation time for OI is dependent on how many observations are available within the localization radius. If the domain has areas with very high density of stations, the calculations in this region can dominate the total computation time. max_points
limits the number of observations that can be used for any particular gridpoint. If more observations are available, only the max_points
stations with the highest correlation are used.
pbackground = gridpp.bilinear(igrid, points, temp_analysis[:, :, 0])
pratios = np.full(points.size(), 0.1)
h = 100000
v = 200
structure = gridpp.BarnesStructure(h, v)
max_points = 50
analysis = gridpp.optimal_interpolation(igrid, temp_analysis[:, :, 0], points,
temp_obs, pratios, pbackground, structure, max_points)
OI can also be applied to transformed variables, by using the gridpp.optimal_interpolation_transform
function. This requires a transform object as well as the specification of the background and observation standard errors (i.e., instead of the error variance ratio as in the untransformed OI). Currently a Box-Cox and logarithmic transformations are available:
transform = gridpp.BoxCox(0.1)
transform = gridpp.Log()
The Box-Cox transformation takes an input argument that specifies the exponent in the transformation.
Here is an example:
threshold = 0.1
transform = gridpp.BoxCox(threshold)
bsigma = 2
psigma = np.full(points.size(), 0.5)
analysis = gridpp.optimal_interpolation_transform(igrid, temp_analysis[:, :, 0], bsigma, points,
temp_obs, psigma, pbackground, structure, max_points, transform)
Gridpp also supports an Ensemble-based Statistical Interpolation (EnSI; Lussana et al. 2019) scheme that uses spatial structure information from an ensemble of NWP model runs. The gridpp.optimal_interpolation_ensi
function can be used for this. It takes a 3D input field (with the ensemble dimension being last) and produces a 3D analysis field. Here, only the observation standard error is needed, since the error variance of the background is deduced from the spread of the ensemble.
pbackground = np.zeros(temp_analysis.shape)
for e in range(temp_analysis.shape[2]):
pbackground[:, :, e] = gridpp.bilinear(igrid, points, temp_analysis[:, :, e])
psigma = np.full(points.size(), 0.5)
analysis = gridpp.optimal_interpolation_ensi(igrid, temp_analysis, points,
temp_obs, psigma, pbackground, structure, max_points)
In verification experiements, it is unecessary to run the OI on a full high resolution grid. INstead one wishes to only output the analysis at points where the analysis will be evaluated at. Gridpp provides OI functions that perform OI only for a set of output points.
analysis = gridpp.optimal_interpolation(points, pbackground, points,
temp_obs, pratios, pbackground, structure, max_points)
Such a setup will generally produce very accurate analyses at the observation points since the same observations have been used as input. Gridpp can compute the analysis in a "leave-one-out" cross-validation fashion. For each output point, the observation at that point is left out of the interpolation. This can be done by defining a cross-validation structure function as follows:
structure = gridpp.Barnes
structure_cv = gridpp.CrossValidation(structure, 2000)
The structure_cv
object can then be passed into the OI as before. This will force the correlation of all observations within a 2000
m radius of an output point to be 0, effectively discarding the observation. This approach is highly efficient for finding the optimal parameters in the OI since the computational cost is very low.