DISCLAIMER: Work in progress
FV3core is a Python version, using GridTools GT4Py with CPU and GPU backend options, of the FV3 dynamical core (fv3gfs-fortran repo).
The code here includes regression test data of computation units coming from serialized output from the Fortran model generated using the GridTools/serialbox
framework.
As of January 10, 2021 this documentation is outdated in that it was written when we had fv3core as its own single repository. Some functionality, such as linting, has been moved to the top level but may still be described in this document as occurring inside the fv3core folder.
WARNING This repo is under active development and relies on code and data that is not publicly available at this point.
- Ensure that Docker is installed and available for building and running images, and that it has access to the VCM cloud.
Be sure to complete any required post-installation instructions (e.g. for Linux). Also authorize Docker to pull from GCR. Your user will need read access to the us.gcr.io/vcm-ml repository.
- You can build the image, download the data, and run the tests using:
$ make tests savepoint_tests savepoint_tests_mpi
If you want to develop code, you should also install the linting requirements and git hooks locally
$ pip install -c constraints.txt -r requirements/requirements_lint.txt
$ pre-commit install
## Getting started, in more detail
If you want to build the main fv3core docker image, run
```shell
$ make build
```
If you want to download test data run
$ make get_test_data
And the c12_6ranks_standard data will download into the test_data
directory.
If you do not have a GCP account, you can download basic test data from a public FTP server and skip the GCP authentication step above. To download test data from the FTP server, use make USE_FTP=yes get_test_data instead; this avoids fetching from a GCP storage bucket. You will need a valid installation of the lftp command.
MPI parallel tests (which run on multiple ranks to exercise halo updates in the model) can also be run with:
$ make savepoint_tests_mpi
The environment image that the fv3core container uses is prebuilt and lives in the GCR. The above commands will by default pull this image before building the fv3core image and running the tests. To build the environment from scratch (including GT4py) before running tests, either run
make build_environment
or
$ PULL=False make savepoint_tests
which will execute the target build_environment
for you before running the tests.
There are push_environment and rebuild_environment targets, but these should normally not be run manually. Updating the install image should only be done by Jenkins after the tests pass using a new environment.
If you want to run different test data, discover the possible options with
$ make list_test_data_options
This will list the storage buckets in the cloud. Then to run one of them, set EXPERIMENT to the folder name of the data you'd like to use:
e.g.
$ EXPERIMENT=c48_6ranks_standard make tests
If you choose an experiment with a different number of ranks than 6, also set NUM_RANKS=<num ranks>
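As a rough illustration of how EXPERIMENT and NUM_RANKS relate, the sketch below derives both from an experiment folder name. It assumes the folder names follow the pattern c<resolution>_<ranks>ranks_<label> seen in c12_6ranks_standard and c48_6ranks_standard; the helper names and the c12_54ranks_standard example are hypothetical and not part of fv3core.

```python
import re
from typing import NamedTuple


class Experiment(NamedTuple):
    resolution: int  # cubed-sphere resolution, e.g. 48 for "c48"
    num_ranks: int   # number of MPI ranks the data was generated with
    label: str       # trailing description, e.g. "standard"


# Assumed naming pattern: c<resolution>_<ranks>ranks_<label>
_PATTERN = re.compile(r"^c(\d+)_(\d+)ranks_(.+)$")


def parse_experiment(name: str) -> Experiment:
    """Split an experiment folder name into its parts."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized experiment name: {name!r}")
    return Experiment(int(match.group(1)), int(match.group(2)), match.group(3))


def make_environment(name: str) -> dict:
    """Environment variables to set before calling make for this experiment."""
    experiment = parse_experiment(name)
    env = {"EXPERIMENT": name}
    # NUM_RANKS only needs to be set when it differs from the default of 6.
    if experiment.num_ranks != 6:
        env["NUM_RANKS"] = str(experiment.num_ranks)
    return env


print(make_environment("c48_6ranks_standard"))
print(make_environment("c12_54ranks_standard"))  # hypothetical folder name
```

The resulting dictionary could be merged into os.environ before invoking make from a script.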
After make savepoint_tests
has been run at least once (or you have data in test_data and the docker image fv3core exists because make build
has been run), you can iterate on code changes using
$ DEV=y make savepoint_tests
or for the parallel or non-savepoint tests:
$ DEV=y make tests savepoint_tests_mpi
These will mount your current code into the fv3core container and run it rather than the code that was built when make build
ran.
If you prefer to work interactively inside the fv3core container, get the test data and build the docker image (see above if you do not have a GCP account and want to get test data):
$ make get_test_data
$ make build
Testing can be run with this data from /port_dev
inside the container:
$ make dev
Then in the container:
$ pytest -v -s --data_path=/test_data/ /port_dev/tests --which_modules=<stencil name>
The 'stencil name' can be determined from the associated Translate class, e.g. TranslateXPPM is a test class that translates data serialized from a run of the Fortran model, and 'XPPM' is the name you can use with --which_modules.
All of the make endpoints involved in running tests can be prefixed with the TEST_ARGS environment variable to set test options or pytest CLI args (see below) when running inside the container.
- --which_modules <modules to run tests for> - comma separated list of which modules to test (defaults to running all of them).
- --print_failures - if your test fails, it will only report the first datapoint. If you want all the non-matching regression data to print out (so you can see if there are patterns, e.g. just incorrect for the first 'i' or whatever), this will print out all the non-matching data for every failing test.
- --failure_stride - when printing failures, print every n failures only.
- --data_path - path to where you have the Generator*.dat and *.json serialization regression data. Defaults to current directory.
- --backend - which backend to use for the computation. Options: [numpy, gt:cpu_ifirst, gt:cpu_first, gt:gpu, cuda]. Defaults to numpy.
- --python_regression - run the tests that have Python based regression data. Only applies to running parallel tests (savepoint_tests_mpi).
Pytest provides a lot of options, which you can see by running pytest --help. Here are some common options for our tests, which you can add to TEST_ARGS:
- -r - is used to report test types other than failure. It can be provided s for skipped (e.g. tests which were not run because earlier tests of the same stencil failed), x for xfail or "expected to fail" tests (like tests with no translate class), or p for pass. For example, to report skipped and xfail tests you would use -rsx.
- --disable-warnings - will stop all warnings from being printed at the end of the tests, for example warnings that translate classes are not yet implemented.
- -v - will increase test verbosity, while -q will decrease it.
- -s - will let stdout print directly to console instead of capturing the output and printing it only when a test fails. Note that logger lines will always be printed both during (by setting log_cli in our pytest.ini file) and after tests.
- -m - will let you run only certain groups of tests. For example, -m=parallel will run only parallel stencils, while -m=sequential will run only stencils that operate on one rank at a time.
- --threshold_overrides_file - will read a yaml file with error thresholds specified for specific backend and platform (docker or metal) configurations, overriding the max_error thresholds defined in the Translate classes. Format of the yaml file is described here.
- --dperiodic - run tests on a doubly-periodic domain. Will look for only one tile's worth of test data and parallel tests will be run with a TileCommunicator instead of a CubedSphereCommunicator.
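When several of these options are combined, it can help to assemble the TEST_ARGS string programmatically. The following is a minimal sketch that only uses flags listed above; the function build_test_args and its parameter names are hypothetical and are not provided by fv3core.

```python
import shlex
from typing import Optional, Sequence


def build_test_args(
    which_modules: Sequence[str] = (),
    backend: Optional[str] = None,
    print_failures: bool = False,
    failure_stride: Optional[int] = None,
    pytest_flags: Sequence[str] = (),
) -> str:
    """Assemble a value for the TEST_ARGS environment variable."""
    args = list(pytest_flags)  # generic pytest flags such as -v or -rsx
    if which_modules:
        # --which_modules takes a comma separated list of stencil names
        args.append("--which_modules=" + ",".join(which_modules))
    if backend is not None:
        args.append(f"--backend={backend}")
    if print_failures:
        args.append("--print_failures")
    if failure_stride is not None:
        args.append(f"--failure_stride={failure_stride}")
    # Quote each argument so the string survives being passed through a shell.
    return " ".join(shlex.quote(arg) for arg in args)


print(build_test_args(which_modules=["XPPM", "C_SW"], backend="numpy",
                      print_failures=True, pytest_flags=["-v", "-rsx"]))
```

The returned string can then be used as in TEST_ARGS="<string>" make savepoint_tests.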
NOTE: FV3 is currently assumed to be in a "development mode" by default, where stencils are checked for code changes each time they execute (which can trigger regeneration). This process is somewhat expensive, so there is an option to put FV3 in a performance mode by telling it that stencils should not automatically be rebuilt:
$ export FV3_STENCIL_REBUILD_FLAG=False
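For scripts that need to respect the same switch, the flag can be read from the environment. The sketch below is only an assumption about how such a flag might be interpreted (any of "false", "0", "no" or "off" disables rebuilding); it is not the parsing code used inside fv3core, and stencil_rebuild_enabled is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

# Assumed spellings that switch rebuilding off (not taken from fv3core).
_FALSE_VALUES = {"false", "0", "no", "off"}


def stencil_rebuild_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True unless FV3_STENCIL_REBUILD_FLAG is set to a false-like value."""
    env = os.environ if environ is None else environ
    # Development mode (rebuild checks enabled) is the documented default.
    raw = env.get("FV3_STENCIL_REBUILD_FLAG", "True")
    return raw.strip().lower() not in _FALSE_VALUES


print(stencil_rebuild_enabled({"FV3_STENCIL_REBUILD_FLAG": "False"}))
```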
- Find the location in the fv3gfs-fortran repo code where the save-point is to be added, e.g. using
$ git grep <stencil_name> <checkout of fv3gfs-fortran>
- Create a translate class from the serialized save-point data to a call to the stencil or function that calls the relevant stencil(s).
These are usually named tests/savepoint/translate/translate_<lowercase name>
Import this class in the tests/savepoint/translate/__init__.py
file
- Write a Python function wrapper that the translate function (created above) calls.
By convention, we name these fv3core/stencils/<lower case stencil name>.py
- Run the test, either with one name or a comma-separated list
$ make dev_tests TEST_ARGS="--which_modules=<stencil name(s)>"
Please also review the Porting conventions section for additional explanation.
To build the us.gcr.io/vcm-ml/fv3core
image with required dependencies for running the Python code, run
$ make build
Add PULL=False
to build from scratch without running docker pull
:
PULL=False make build
- https://github.com/GridTools/serialbox - Serialbox generates serialized data when the Fortran model runs and has bindings to manage data from Python
- https://github.com/VulcanClimateModeling/fv3gfs-fortran - This is the existing Fortran model decorated with serialization statements from which the test data is generated
- https://github.com/GridTools/gt4py - Python package for the DSL language
- https://github.com/VulcanClimateModeling/util - Python specific model functionality, such as halo updates.
- https://github.com/VulcanClimateModeling/fv3gfs-wrapper - A Python based wrapper for running the Fortran version of the FV3GFS model.
Some of these are submodules. While tests can work without these, it may be necessary for development to have these as well. To add these to the local repository, run
$ git submodule update --init
The submodules include:
- external/util - git@github.com:VulcanClimateModeling/util.git
- external/daint_venv - git@github.com:VulcanClimateModeling/daint_venv.git
There are two main docker files:
- docker/dependencies.Dockerfile - defines dependency images such as for mpi, serialbox, and GT4py
- docker/Dockerfile - uses the dependencies to define the final fv3core images.
The dependencies are separated out into their own images to expedite rebuilding the docker image without having to rebuild dependencies, especially on CI.
For the commands below using make -C docker, you can alternatively run make from within the docker directory.
These dependencies can be updated, pushed, and pulled with make -C docker build_deps, make -C docker push_deps, and make -C docker pull_deps. The tag of the dependencies is based on the tag of the current build in the Makefile, which we will expand on below.
Building from scratch requires both a deps and build command, such as make -C docker pull_deps fv3core_image.
If any example fails for "pulled dependencies", it means the dependencies have never been built. You can build them and push them to GCR with:
$ make -C docker build_deps push_deps
fv3core image with pulled dependencies:
$ make -C docker pull_deps fv3core_image
CUDA-enabled fv3core image with pulled dependencies:
$ CUDA=y make -C docker pull_deps fv3core_image
fv3core image with locally-built dependencies:
$ make -C docker build_deps fv3core_image
If you need to install an updated version of Serialbox, you must first install cmake into the development environment. To install an updated version of Serialbox from within the container run
$ wget https://github.com/Kitware/CMake/releases/download/v3.17.3/cmake-3.17.3.tar.gz && \
tar xzf cmake-3.17.3.tar.gz && \
cd cmake-3.17.3 && \
./bootstrap && make -j4 && make install
$ git clone -b v2.6.1 --depth 1 https://github.com/GridTools/serialbox.git /tmp/serialbox
$ cd /tmp/serialbox
$ cmake -B build -S /tmp/serialbox -DSERIALBOX_USE_NETCDF=ON -DSERIALBOX_TESTING=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/serialbox
$ cmake --build build/ -j $(nproc) --target install
$ cd -
$ rm -rf build /tmp/serialbox
Dependencies are pinned using constraints.txt. This is auto-generated by pip-compile from the pip-tools package, which reads requirements.txt and requirements/requirements_lint.txt, determines the latest versions of all dependencies (including recursive dependencies) compatible with those files, and writes pinned versions for all dependencies. This can be updated using:
$ make constraints.txt
This file is committed to the repository, and gives more reproducible tests if an old commit of the repository is checked out in the future. The constraints are followed when creating the fv3core
docker images. To ensure consistency this should ideally be run from inside a docker development environment, but you can also run it on your local system with an appropriate Python 3 environment.
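A quick sanity check on a regenerated constraints file is to confirm that every entry is pinned to an exact version. The following is a minimal sketch under the assumption that each requirement occupies one line of the form package==version with optional trailing comments; the helper name and the sample package versions are illustrative only.

```python
def unpinned_requirements(constraints_text: str) -> list:
    """Return the requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw_line in constraints_text.splitlines():
        # Drop comments such as "# via -r requirements.txt".
        # Simplification: this would also truncate URLs containing '#'.
        line = raw_line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # skip blank lines and pip options like --index-url
        if "==" not in line:
            unpinned.append(line)
    return unpinned


# Illustrative content only; these are not the pins used by fv3core.
sample = """\
# This file is autogenerated by pip-compile
numpy==1.19.5    # via -r requirements.txt
pytest>=6
"""
print(unpinned_requirements(sample))
```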
To develop fv3core, you need to install the linting requirements in requirements/requirements_lint.txt
. To install the pinned versions, use:
$ pip install -c constraints.txt -r requirements/requirements_lint.txt
This adds pre-commit
, which we use to lint and enforce style on the code. The first time you install pre-commit
, install its git hooks using:
$ pre-commit install
pre-commit installed at .git/hooks/pre-commit
As a convenience, the lint
target of the top-level makefile executes pre-commit run --all-files
.
Linting, which formats files and checks for some style conventions, is required, as the same checks are the first step in the continuous integration testing that happens when creating a pull request.
Linting locally saves time and literal energy, since CI tests do not have to be launched so many times!
Please see the 'Development Guidelines' below for more information on the structure of the code to align your new code with the current conventions, as well as the CONTRIBUTING.md document for style guidelines.
FV3Core does not actually use the GridTools/gt4py main branch; it instead uses a Vulcan Climate Modeling development branch. This branch is publicly available at VCM/gt4py.
Situation: There is a new stable feature in a gt4py PR, but it is not yet merged into the GridTools/gt4py main branch. branches.cfg lists these features. Steps:
- Add any new branches to branches.cfg
- Rebuild the develop branch, either by:
  a. Running make_develop gt4py-dev path/to/branches.cfg (you may have to resolve conflicts...)
  b. Adding new commits on top of the existing develop branch (e.g. merge or cherry-pick)
- Force push to the develop branch: git push -f upstream develop
The last step will launch Jenkins tests. If these pass:
- Create a git tag: git tag v-$(git rev-parse --short HEAD)
- Push the tag: git push upstream --tags
- Make a PR to VCM/gt4py that updates the version in docker/Makefile to the new tag.
FV3Core is provided under the terms of the GPLv3 license.
The main functionality of the FV3 dynamical core, which has been ported from the Fortran version in the fv3gfs-fortran repo, is defined using GT4py stencils and python 'compute' functions in fv3core/stencils. The core is comprised of units of calculations defined for regression testing. These were initially generally separated into distinct files in fv3core/stencils with corresponding files in tests/savepoint/translate/translate_.py defining the translation of variables from Fortran to Python. Exceptions exist in cases where topical and logical grouping allowed for code reuse. As refactors optimize the model, these units may be merged to occupy the same files and even methods/stencils, but the units should still be tested separately, unless determined to be redundant.
The core has most of its calculations happening in GT4py stencils, but there are still several instances of operations happening in Python directly, which will need to be replaced with GT4py code for optimal performance.
The namelist and grid are global variables defined in fv3core/_config.py. The namelist is 'flattened' so that the grouping name of the option is not required to access the data (we may want to change this).
The grid variables are mostly 2d variables and are 'global' to the model thread per mpi rank. The grid object also contains domain and layout information relevant to the current rank being operated on.
Utility functions in fv3core/utils/ include:
- gt4py_utils.py:
  - default gt4py and model settings
  - methods for generating gt4py storages
  - methods for using numpy and cupy arrays in python functions that have not been put into GT4py
  - methods for handling complex patterns that did not immediately map to gt4py, and will mostly be removed with future refactors (e.g. k_split_run)
  - some general model math computations (e.g. great_circle_dist), that will eventually be put into gt4py with a future refactor
- grid.py:
  - A Grid class definition that provides information about the grid layout, current tile information, access to grid variables used globally, and convenience methods related to tile indexing, origins and domains commonly used
  - A grid is defined for each MPI rank (minimum 6 ranks, 1 for each tile face of the cubed sphere grid representing the whole Earth)
  - Also provides functionality for generating a Quantity object used for halo updates and other utilities
- corners: port of corner calculations, initially direct Python calculations, being replaced with GT4py gtscript functions as the GT4py regions feature is implemented
- mpi.py: a wrapper for importing mpi4py when available
- global_constants.py: constants for use throughout the model
- typing.py: Clean names for common types we use in the model. This is new and hasn't been adopted throughout the model yet, but will eventually be our standard. A shorthand 'sd' has been used in the initial version.
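To make the "one grid per MPI rank, at least one rank per tile face" statement concrete, here is a small sketch of how a rank could be mapped to one of the six cubed-sphere tiles. It assumes that ranks are split evenly across tiles in contiguous blocks; the function names are hypothetical and this is not the Grid class API.

```python
NUM_TILES = 6  # faces of the cubed sphere


def ranks_per_tile(total_ranks: int) -> int:
    """Number of MPI ranks that share one tile face."""
    if total_ranks < NUM_TILES or total_ranks % NUM_TILES != 0:
        raise ValueError("the rank count must be a multiple of 6 (at least 6)")
    return total_ranks // NUM_TILES


def tile_index(rank: int, total_ranks: int) -> int:
    """Tile (0 to 5) that a rank works on, assuming contiguous rank blocks."""
    if not 0 <= rank < total_ranks:
        raise ValueError(f"rank {rank} is outside 0..{total_ranks - 1}")
    return rank // ranks_per_tile(total_ranks)


# With 6 ranks every rank owns a whole tile; with 24 ranks each tile is
# shared by 4 ranks, so rank 9 would work on tile 2 under this assumption.
print(tile_index(5, 6), tile_index(9, 24))
```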
The tests/
directory currently includes a framework for translating fields serialized (using
Serialbox from GridTools) from a Fortran run into gt4py storages that can be inputs to
fv3core unit computations, and compares the results of the ported code to serialized
data following a unit computation.
The docker/ directory provides Dockerfiles for building a repeatable environment in which to run the core.
The external/ directory is for submoduled repos that provide essential functionality.
The build system uses Makefiles following the convention of other repos within VulcanClimateModeling.
The top level functions fv_dynamics and fv_subgridz can currently only be run in parallel using MPI with a minimum of 6 ranks (there are a few other units that also require this, e.g. whenever there is a halo update involved in a unit). These are the interface to the rest of the model and currently have different conventions than the rest of the model.
- A 'state' object (currently a SimpleNamespace) stores pointers to the allocated data fields
- Most functions within dyn_core can be run sequentially per rank
- Currently a list of ArgSpecs must decorate an interface function, where each ArgSpec provides useful information about the argument, e.g.:
@state_inputs(ArgSpec("qvapor", "specific_humidity", "kg/kg", intent="inout"))
- The format is (fortran_name, long_name, units, intent)
- We currently provide a duplicate of most of the metadata in the specification of the unit test, but that may be removed eventually.
- Then the function itself, e.g. fv_dynamics, has arguments of 'state', 'comm' (the communicator) and all of the scalar parameters being provided.
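The interface convention above can be illustrated with a self-contained toy version. This sketch is not the fv3core implementation: the ArgSpec tuple, the state_inputs decorator body, and toy_dynamics are stand-ins written for illustration, and the only behavior shown is checking that the state carries every declared field.

```python
import functools
from types import SimpleNamespace
from typing import NamedTuple


class ArgSpec(NamedTuple):
    """Stand-in for the (fortran_name, long_name, units, intent) format."""
    fortran_name: str
    long_name: str
    units: str
    intent: str  # e.g. "in", "out" or "inout"


def state_inputs(*arg_specs):
    """Toy decorator: verify that `state` has every declared field."""
    def decorator(func):
        @functools.wraps(func)
        def wrapped(state, *args, **kwargs):
            missing = [spec.fortran_name for spec in arg_specs
                       if not hasattr(state, spec.fortran_name)]
            if missing:
                raise AttributeError(f"state is missing fields: {missing}")
            return func(state, *args, **kwargs)

        wrapped.arg_specs = arg_specs  # keep the metadata inspectable
        return wrapped

    return decorator


@state_inputs(ArgSpec("qvapor", "specific_humidity", "kg/kg", intent="inout"))
def toy_dynamics(state, comm, timestep):
    """Hypothetical interface function: clip negative humidity in place."""
    state.qvapor = [max(q, 0.0) for q in state.qvapor]
    return timestep


state = SimpleNamespace(qvapor=[0.01, -1e-9])
toy_dynamics(state, comm=None, timestep=225.0)
print(state.qvapor)
```

Attaching the specs to the wrapped function keeps the metadata available for tooling such as the unit tests, which is one plausible reason to declare it in a decorator.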
Generation of regression data occurs in the fv3gfs-fortran repo (https://github.com/VulcanClimateModeling/fv3gfs-fortran) with serialization statements and a build procedure defined in tests/serialized_test_data_generation
. The version of data this repo currently tests against is defined in FORTRAN_SERIALIZED_DATA_VERSION
in this repo's docker/Makefile.image_names
. Fields serialized are defined in Fortran code with serialization comment statements such as:
!$ser savepoint C_SW-In
!$ser data delpcd=delpc delpd=delp ptcd=ptc
where the name being assigned is the name the fv3core uses to identify the variable in the test code. When this name is not equal to the name of the variable, this was usually done to avoid conflicts with other parts of the code where the same name is used to reference a differently sized field.
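The name mapping in a data directive can be read off mechanically. The sketch below parses a single `!$ser data` line into a dictionary from the serialized name to the Fortran variable name. It assumes simple whitespace-separated name=variable pairs as in the example above, and does not handle directives whose right-hand sides contain spaces; parse_ser_data is a hypothetical helper, not part of Serialbox.

```python
def parse_ser_data(line: str) -> dict:
    """Map serialized names to Fortran variable names for one directive."""
    tokens = line.split()
    if tokens[:2] != ["!$ser", "data"]:
        raise ValueError("not a '!$ser data' directive")
    mapping = {}
    for token in tokens[2:]:
        # Each token is expected to look like <serialized name>=<Fortran variable>.
        serial_name, separator, fortran_name = token.partition("=")
        if not (serial_name and separator and fortran_name):
            raise ValueError(f"cannot parse field assignment: {token!r}")
        mapping[serial_name] = fortran_name
    return mapping


print(parse_ser_data("!$ser data delpcd=delpc delpd=delp ptcd=ptc"))
```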
The majority of the logic for translating from data serialized from Fortran to something that can be used by Python, and the comparison of the results, is encompassed by the main Translate class in the tests/savepoint/translate/translate.py file. Any units not involving a halo update can be run using this framework, while those that need to be run in parallel can look to the ParallelTranslate class as the parent class in tests/savepoint/translate/parallel_translate.py. These parent classes provide generally useful operations for translating serialized data between Fortran and Python specifications, and for applying regression tests.
A new unit test can be defined as a new child class of one of these, with a naming convention of Translate<Savepoint Name>, where Savepoint Name is the name used in the serialization statements in the Fortran code, without the -In and -Out part of the name. A translate class can usually just specify the input and output fields. Then, in cases where the parent compute function is insufficient to handle the complexity of either the data translation or the compute function, the appropriate methods can be overridden.
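The naming convention can be written down as two small helpers, one from a savepoint name to the expected class name and one from a class name to the value accepted by --which_modules. Both function names are hypothetical and only encode the convention described in this document.

```python
def translate_class_name(savepoint: str) -> str:
    """Class name expected for a savepoint such as 'C_SW-In'."""
    for suffix in ("-In", "-Out"):
        if savepoint.endswith(suffix):
            savepoint = savepoint[: -len(suffix)]
            break
    return "Translate" + savepoint


def which_modules_name(class_name: str) -> str:
    """Name to pass to --which_modules for a Translate class."""
    prefix = "Translate"
    if not class_name.startswith(prefix):
        raise ValueError(f"{class_name!r} does not follow the Translate<Name> convention")
    return class_name[len(prefix):]


print(translate_class_name("C_SW-In"), which_modules_name("TranslateXPPM"))
```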
For Translate objects:
- The init function establishes the assumed translation setup for the class, which can be dynamically overridden as needed.
- The parent compute function:
  - Makes gt4py storages of the max shape (grid.npx+1, grid.npy+1, grid.npz+1), aligning the data based on the start indices specified. (gt4py requires data fields to have the same shape, so in this model we have buffer points so that all calculations can be done easily without worrying about shape matching.)
  - Runs the compute function (defined in self.compute_func) on the input data storages
  - Slices the computed Python fields to be compared to Fortran regression data
- The unit test then uses a modified relative error metric to determine whether the unit passes
- The init method for a Translate class:
  - The input (self.in_vars["data_vars"]) and output (self.out_vars) variables are specified in dictionaries, where the keys are the names of the variables used in the model and the values are dictionaries specifying metadata for translation of serialized data to gt4py storages. The metadata that can be specified to override defaults are:
    - Indices to line up data arrays into gt4py storages (which all get created as the max possible size needed by all operations, for simplicity): "istart", "iend", "jstart", "jend", "kstart", "kend". These should be set using the 'grid' object available to the Translate object, using equivalent index names as in the declaration of variables in the Fortran code, e.g. real:: cx(bd%is:bd%ie+1,bd%jsd:bd%jed ) means we should assign:
      self.in_vars["data_vars"]["cx"] = {"istart": self.is_, "iend": self.ie + 1,
      "jstart": self.jsd, "jend": self.jed,}
    - There is only a limited set of Fortran shapes declared, so abstractions defined in the grid can also be used, e.g.: self.out_vars["cx"] = self.grid.x3d_compute_domain_y_dict(). Note that the variables, e.g. grid.is_ and grid.ie, specify the 'compute' domain in the x direction of the current tile, equivalent to bd%is and bd%ie in the Fortran model EXCEPT that the Python variables are local to the current MPI rank (a subset of the tile face), while the Fortran values are global to the tile face. This is because these indices are used to slice into fields, which in Python is 0-based, and in Fortran is based on however the variables are declared. But, for the purposes of aligning data for computations and comparisons, we can match them in this framework. Shapes need to be defined in a dictionary per variable including "istart", "iend", "jstart", "jend", "kstart", "kend" that represent the shape of that variable as defined in the Fortran code. The default shape assumed if a variable is specified with an empty dictionary is isd:ied, jsd:jed, 0:npz - 1 inclusive, and variables that aren't that shape in the Fortran code need to have the 'start' indices specified for the in_vars dictionary, and 'start' and 'end' for the out_vars.
    - "serialname" can be used to specify a name used in the Fortran code declaration if we'd like the model to use a different name
    - "kaxis": which dimension is the vertical direction. For most variables this is '2' and does not need to be specified. For Fortran variables that assign the vertical dimension to a different axis, this can be set to ensure we end up with 3d storages that have the vertical dimension where it is expected by GT4py.
    - "dummy_axes": If set, the storage will have singleton dimensions in the axes defined. This is to enable testing stencils where the full 3d data has not been collected and we want to run stencil tests on the data for a particular slice.
    - "names_4d": If a 4d variable is being serialized, this can be set to specify the names of each 3d field. By default this is the list of tracers.
  - Input variables that are scalars should be added to self.in_vars["parameters"]
  - self.compute_func is the name of the model function that should be run by the compute method in the translate class
  - self.max_error overrides the parent class's relative error threshold. This should only be changed when the reasons for non-bit reproducibility are understood.
  - self.max_shape sets the size of the gt4py storage created for testing
  - self.ignore_near_zero_errors[<varname>] = True: This is an option to let some fields pass with higher relative error if the absolute error is very small
  - self.skip_test: This is an option to jump over the test case, to be used in the override file for temporary deactivation of tests.
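The roles of max_error and ignore_near_zero_errors can be illustrated with a plain relative error check. This sketch is not the "modified relative error metric" used by the test framework, whose exact definition is not given here; the threshold values and the function names are assumptions chosen for illustration.

```python
def relative_error(computed: float, reference: float) -> float:
    """Plain relative error of a computed value against a reference."""
    if computed == reference:
        return 0.0
    if reference == 0.0:
        return float("inf")
    return abs(computed - reference) / abs(reference)


def find_failures(computed, reference, max_error=1e-14,
                  ignore_near_zero=False, near_zero=1e-18):
    """Indices where the computed data does not match the reference data.

    max_error and near_zero are illustrative values, not fv3core defaults.
    """
    failures = []
    for index, (value, expected) in enumerate(zip(computed, reference)):
        # Optionally accept points whose absolute error is negligible, even
        # though their relative error may be large.
        if ignore_near_zero and abs(value - expected) < near_zero:
            continue
        if relative_error(value, expected) > max_error:
            failures.append(index)
    return failures


print(find_failures([1.0, 2.1], [1.0, 2.0]))
print(find_failures([1e-30], [2e-30], ignore_near_zero=True))
```

The second call shows why an absolute-error escape hatch is useful: two values near zero can differ by 50% in relative terms while being indistinguishable for practical purposes.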
For ParallelTranslate objects:
- Inputs and outputs are defined at the class level, and these include metadata such as the "name" (e.g. understandable name for the symbol), dimensions, units and n_halo (number of halo lines)
- Both compute_sequential and compute_parallel methods may be defined, where a mock communicator is used in the compute_sequential case
- The parent assumes a state object for tracking fields, and methods exist for translating from inputs to a state object and extracting the output variables from the state. It is assumed that Quantity objects are needed in the model method in order to do halo updates.
- ParallelTranslate2Py is a slight variation of this used for many of the parallel units that do not yet utilize a state object and relies on the specification of the same index metadata of the Translate classes
- ParallelTranslateBaseSlicing makes use of the state but relies on the Translate object of self._base, a Translate class object, to align the data before making quantities, computing and comparing.
Pytest can be configured to give you a pdb session when a test fails. To route this properly through docker, you can run:
TEST_ARGS="-v -s --pdb" RUN_FLAGS="--rm -it" make tests
This can be done with any pytest target, such as make savepoint_tests
and make savepoint_tests_mpi
.
The GeosDycoreWrapper class provides an API to run the dynamical core in a Python component of a GEOS model run. A GeosDycoreWrapper object is initialized with a namelist, communicator, and backend, which creates the communicators, partitioners, dycore state, and dycore object required to run the Pace dycore. A wrapper object takes numpy arrays of u, v, w, delz, pt, delp, q, ps, pe, pk, peln, pkz, phis, q_con, omga, ua, va, uc, vc, mfxd, mfyd, cxd, cyd, and diss_estd and returns a dictionary containing numpy arrays of those same variables. Wrapper objects contain a timer attribute that tracks the amount of time spent moving input data to the dycore state, running the dynamical core, and retrieving the data from the state.
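The three timed phases can be pictured with a generic timer built on a context manager. This is a standalone sketch, not the timer class used by GeosDycoreWrapper: the Timer class, its clock method, and the phase names below are all hypothetical, and the phase bodies are placeholders.

```python
import time
from contextlib import contextmanager


class Timer:
    """Accumulates wall-clock time per named phase (illustrative only)."""

    def __init__(self):
        self.times = {}

    @contextmanager
    def clock(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            # Add to any time already recorded under the same name.
            self.times[name] = self.times.get(name, 0.0) + elapsed


timer = Timer()
with timer.clock("move_to_state"):
    inputs = {"u": [0.0] * 1000}   # placeholder for copying input arrays
with timer.clock("dycore"):
    total = sum(inputs["u"])       # placeholder for running the dycore
with timer.clock("move_from_state"):
    outputs = dict(inputs)         # placeholder for extracting the outputs

print(sorted(timer.times))
```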