@ingroup data_override_mod
- YAML Data Table format
- How to use it?
- Converting legacy data_table to data_table.yaml
- Examples
- External Weight File Structure
- Ensemble and Nest Support
Each entry in the data_table has the following key values:
- grid_name: Name of the grid to interpolate the data to. The acceptable values are "ICE", "OCN", "ATM", and "LND"
- fieldname_in_model: Name of the field as it is in the code to interpolate.
- override_file: Optional subsection with key/value pairs defining how to override from a netcdf file.
- file_name: Name of the file where the variable is located, including the directory
- fieldname_in_file: Name of the field as it is writen in the file
- interp_method: Method used to interpolate the field. The acceptable values are "bilinear", "bicubic", and "none". "none" implies that the field in the file is already in the model grid. The LIMA format is no longer supported
- multi_file: Optional subsection with key/value pairs to use multiple(3) input netcdf files instead of 1. Note that file_name must be the second file in the set when using multiple input netcdf files
- prev_file_name: The name of the first file in the set
- next_file_name: The name of the third file in the set
- external_weights: Optional subsection with key/value pairs defining the external weights file to used for the interpolation.
- file_name: Name of the file where the external weights are located, including the directory
- source: Name of the source that generated the external weights. The only acceptable value is "fregrid"
- factor: Factor that will be multiplied after the data is interpolated
- subregion: Optional subsection with key/value pairs that define a subregion of the model grid to interpolate the data to.
- type: The region type. The acceptable values are "inside_region" and "outside_region"
- lon_start: The starting latitude in the same units as the grid data in the file
- lon_end: The ending latitude in the same units as the grid data in the file
- lat_start: The starting longitude in the same units as the grid data in the file
- lon_end: The ending longitude in the same units as the grid data in the file
In order to use the yaml data format, libyaml needs to be installed and linked with FMS. Additionally, FMS must be compiled with -Duse_yaml macro. If using autotools, you can add --with-yaml
, which will add the macro for you and check that libyaml is linked correctly.
./configure --with-yaml
If using cmake, you can use -DWITH_YAML
.
cmake -DWITH_YAML=on;
To convert legacy data_table to data_table.yaml, the python script, data_table_to_yaml.py
, can be used. Additionally, the python script, is_valid_data_table_yaml.py
, can be used to check if your data_table.yaml file is valid.
4.1 The following example is going to bilnearly interpolate sic_obs
to the ICE
grid using the sic
data from the file: INPUT/hadisst_ice.data.nc
, multiply by a factor of 0.01
, and interpolate to the model time.
In the legacy format, the data_table will look like:
"ICE", "sic_obs", "sic", "INPUT/hadisst_ice.data.nc", "bilinear", 0.01
In the yaml format, the data_table will look like
data_table:
- grid_name : ICE
fieldname_in_model : sic_obs
override_file:
- file_name : INPUT/hadisst_ice.data.nc
fieldname_in_file : sic
interp_method : bilinear
factor : 0.01
Which corresponds to the following model code:
call data_override('ICE', 'sic_obs', icec, Spec_Time)
where:
ICE
is the component domain for which the variable is being interpolated and corresponds to the grid_name in the data_tablesic_obs
corresponds to the fieldname_in_model in the data_tableicec
is the storage array that holds the interpolated dataSpec_Time
is the time to interpolate the data to.
Additionally, it is required to call data_override_init (in this case with the ICE domain). The grid_spec.nc file must also contain the coordinate information for the domain being used.
call data_override_init(Ice_domain_in=Ice_domain)
4.2 The following example is going to simply multiply sic_obs
by a factor of 0.01.
In the legacy format, the data_table will look like:
"ICE", "sit_obs", "", "INPUT/hadisst_ice.data.nc", "none", 2.0
In the yaml format, the data_table will look like:
data_table:
- grid_name : ICE
fieldname_in_model : sit_obs
factor : 0.01
Which corresponds to the following model code:
call data_override('ICE', 'sit_obs', icec, Spec_Time)
where:
ICE
is the component domain for which the variable is being interpolated and corresponds to the grid_name in the data_tablesit_obs
corresponds to the fieldname_in_model in the data_tableicec
is the storage array that holds the interpolated dataSpec_Time
is the time to interpolate the data to.
Additionally, it is required to call data_override_init (in this case with the ICE domain). The grid_spec.nc file is still required to initialize data_override with the ICE domain.
call data_override_init(Ice_domain_in=Ice_domain)
4.3 The following example is an ongrid
case where it will simply read the runoff
data in the ./INPUT/runoff.daitren.clim.nc
file, multiply it by a factor of 1.0
and interpolate to the model time. It assumes that "runoff" in the netcdf file is in the ocean grid.
In the legacy format, the data_table will look like:
"OCN", "runoff", "runoff", "./INPUT/runoff.daitren.clim.nc", "none" , 1.0
In the yaml format, the data_table will look like:
data_table:
- grid_name : OCN
fieldname_in_model : runoff
override_file:
- file_name : INPUT/runoff.daitren.clim.nc
fieldname_in_file : runoff
interp_method : none
factor : 1.0
Which corresponds to the following model code:
call data_override('OCN', 'runoff', runoff_data, Spec_Time)
where:
OCN
is the component domain for which the variable is being interpolated and corresponds to the grid_name in the data_tablerunoff
corresponds to the fieldname_in_model in the data_tablerunoff_data
is the storage array that holds the interpolated dataSpec_Time
is the time to interpolate the data to.
Additionally, it is required to call data_override_init (in this case with the ocean domain). The grid_spec.nc file is still required to initialize data_override with the ocean domain and to determine if the data in the file is in the same grid as the ocean.
call data_override_init(Ocn_domain_in=Ocn_domain)
4.4 The following example uses the multi-file capability
data_table:
- grid_name : ICE
fieldname_in_model : sic_obs
override_file:
- file_name : INPUT/hadisst_ice.data_yr1.nc
fieldname_in_file : sic
interp_method : bilinear
multi_file:
- next_file_name: INPUT/hadisst_ice.data_yr2.nc
prev_file_name: INPUT/hadisst_ice.data_yr0.nc
factor : 0.01
Data override determines which file to use depending on the model time. This is to prevent having to combine the 3 yearly files into one, since the end of the previous file and the beginning of the next file are needed for yearly simulations.
4.5 The following example uses the external weight file capability
data_table:
- grid_name : ICE
fieldname_in_model : sic_obs
override_file:
- file_name : INPUT/hadisst_ice.data.nc
fieldname_in_file : sic
interp_method : bilinear
external_weights:
- file_name: INPUT/remamp_file.nc
source: fregrid
factor : 0.01
5.1 Bilinear weight file example from fregrid
dimensions:
nlon = 5 ;
nlat = 6 ;
three = 3 ;
four = 4 ;
variables:
int index(three, nlat, nlon) ;
double weight(four, nlat, nlon) ;
nlon
andnlat
must be equal to the size of the global domain.index(1,:,:)
corresponds to the index (i) of the longitudes point in the data file, closest to each model lon, latindex(2,:,:)
corresponds to the index (j) of the lattidude point in the data file, closest to each model lon, latindex(3,:,:)
corresponds to the tile (it should be 1 since data_override does not support interpolation from cubesphere grids)- From there the four corners are (i,j), (i,j+1) (i+1) (i+1,j+1)
- The weights for the four corners
- weight(:,:,1) -> (i,j)
- weight(:,:,2) -> (i,j+1)
- weight(:,:,3) -> (i+1,j)
- weight(:,:,4) -> (i+1,j+1)
It may be desired to have each member of an ensemble use a different forcing file. In other to support this, FMS allows for each ensemble member to have its own data_table.yaml. For example, for a run with 2 ensemble members, fms will search for data_table_ens_01.yaml and data_table_ens_02.yaml. However, if both the data_table.yaml and the data_table_ens_* files are present, the code will crash as only 1 option is allowed. Similary, each nest can have its own data_table (data_table_nest_01.yaml), but in this case FMS will not crash if both data_table_nest_01.yaml and data_table.yaml are present. The main grid will use the data_table.yaml and the first nest will use the data_table_nest_01.yaml file.