Benchmark with DaCe cpu and gpu backends #50

Merged
merged 19 commits into from
Jan 30, 2024
Changes from 8 commits
4 changes: 2 additions & 2 deletions .gitmodules
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
 [submodule "external/gt4py"]
 path = external/gt4py
 url = https://github.com/gridtools/gt4py.git
-[submodule "external/dace"]
+[submodule "dacecpufix"]
 path = external/dace
-url = https://github.com/spcl/dace.git
+url = https://github.com/FlorianDeconinck/dace.git
Contributor:
I really do not love doing this; I cannot wait to have it resolved.

4 changes: 2 additions & 2 deletions constraints.txt
@@ -81,7 +81,7 @@ coverage==5.5
 # pytest-cov
 cytoolz==0.12.1
 # via gt4py
-dace==0.14.4
+dace==0.15.1
 # via
 # -r requirements_dev.txt
 # pace-dsl
@@ -184,7 +184,7 @@ googleapis-common-protos==1.53.0
 # via google-api-core
 gprof2dot==2021.2.21
 # via pytest-profiling
-gridtools-cpp==2.3.0
+gridtools-cpp==2.3.1
 # via gt4py
 h5netcdf==0.11.0
 # via -r util/requirements.txt
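The two pin bumps above can be checked mechanically. The sketch below is illustrative only: `parse_pins` is a hypothetical helper, not part of this repository, and it assumes the `name==version` line format visible in the hunk, with `#` lines treated as comments.

```python
from typing import Dict


def parse_pins(text: str) -> Dict[str, str]:
    """Return {package: version} for each `name==version` line, skipping comments."""
    pins: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


# The two pins as they read after this change (copied from the hunks above).
NEW_PINS = """\
dace==0.15.1
# via
# -r requirements_dev.txt
# pace-dsl
gridtools-cpp==2.3.1
# via gt4py
"""

pins = parse_pins(NEW_PINS)
```

In an installed environment, each parsed version could then be compared against `importlib.metadata.version(name)` to confirm that the environment matches the constraints file.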
26 changes: 23 additions & 3 deletions driver/examples/configs/baroclinic_c12_orch_cpu.yaml
@@ -14,10 +14,30 @@ performance_config:
 nx_tile: 12
 nz: 79
 dt_atmos: 225
-minutes: 5
+seconds: 675
 layout:
-  - 1
-  - 1
+  - 2
+  - 2
+diagnostics_config:
+  path: output
+  output_format: netcdf
+  names:
+    - u
+    - v
+    - ua
+    - va
+    - pt
+    - delp
+    - qvapor
+    - qliquid
+    - qice
+    - qrain
+    - qsnow
+    - qgraupel
+  z_select:
+    - level: 65
+      names:
+        - pt
 dycore_config:
   a_imp: 1.0
   beta: 0.
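To see what the new run length and layout imply, the arithmetic can be sketched as follows. This is not driver code: `run_summary` is a hypothetical helper, and the rank count assumes a six-tile cubed-sphere grid with `layout[0] * layout[1]` ranks per tile.

```python
from datetime import timedelta

TILES = 6  # assumption: cubed-sphere grid with six tiles


def run_summary(nx_tile, dt_atmos, layout, duration):
    """Derive step count and decomposition size from driver-config values."""
    total_seconds = int(duration.total_seconds())
    return {
        "timesteps": total_seconds // dt_atmos,
        "leftover_seconds": total_seconds % dt_atmos,
        "ranks": TILES * layout[0] * layout[1],
        "cells_per_rank": (nx_tile // layout[0], nx_tile // layout[1]),
    }


# Values before and after this change to baroclinic_c12_orch_cpu.yaml.
old = run_summary(12, 225, (1, 1), timedelta(minutes=5))
new = run_summary(12, 225, (2, 2), timedelta(seconds=675))
```

Under these assumptions the old run length (300 s) is not a whole multiple of `dt_atmos`, while the new one (675 s) is exactly three steps, and the 2x2 layout gives each rank a 6x6 block of the 12x12 tile.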
9 changes: 4 additions & 5 deletions driver/examples/configs/baroclinic_c384_cpu.yaml
@@ -15,11 +15,10 @@ performance_config:
 nx_tile: 384
 nz: 79
 dt_atmos: 450
-minutes: 7
-seconds: 30
+days: 9
 layout:
-  - 1
-  - 1
+  - 16
+  - 16
 diagnostics_config:
   path: output
   output_format: netcdf
@@ -72,7 +71,7 @@ dycore_config:
   nwat: 6
   p_fac: 0.1
   rf_cutoff: 800.
-  rf_fast: false
+  rf_fast: true
Contributor:
Has rf_fast=True been implemented?

Contributor Author:
It has not, but with rf_fast set to false the run throws a NotImplementedError when tau != 0. Should the option be removed instead, or should tau be set to 0?

Contributor:
You could cherry-pick the not-implemented PR from Florian and remove the options from the yaml config.

Contributor:
I think I'd rather change tau than set something in the config that we haven't actually implemented in the model, at least for now. I'll create an issue for it as well though so we can implement a config that's internally consistent.

   tau: 5.
   vtdm4: 0.06
   z_tracer: true
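The constraint discussed in the review thread can be written down as a small check. This is only a sketch of the behavior the thread reports (a NotImplementedError when rf_fast is false and tau != 0). `RayleighOptions` and `check_rayleigh_options` are hypothetical names, not Pace classes or functions.

```python
from dataclasses import dataclass


@dataclass
class RayleighOptions:
    """Hypothetical stand-in for the two dycore_config fields under discussion."""

    rf_fast: bool
    tau: float


def check_rayleigh_options(opts: RayleighOptions) -> None:
    """Reject the combination the thread reports as raising at run time."""
    if not opts.rf_fast and opts.tau != 0:
        raise NotImplementedError("rf_fast=False with tau != 0 is not implemented")


def is_accepted(opts: RayleighOptions) -> bool:
    """Return True when the check raises nothing for these options."""
    try:
        check_rayleigh_options(opts)
    except NotImplementedError:
        return False
    return True
```

Note that `rf_fast=True` passes this check only because the thread reports no error for it, not because the feature exists. That gap is why the reviewers prefer changing tau over keeping an unimplemented option in the config.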
2 changes: 1 addition & 1 deletion driver/examples/configs/baroclinic_c384_gpu.yaml
@@ -71,7 +71,7 @@ dycore_config:
   nwat: 6
   p_fac: 0.1
   rf_cutoff: 800.
-  rf_fast: false
+  rf_fast: true
   tau: 5.
   vtdm4: 0.06
   z_tracer: true
2 changes: 1 addition & 1 deletion external/dace
Submodule dace updated 51 files
+2 −0 .github/workflows/fpga-ci.yml
+2 −0 .github/workflows/general-ci.yml
+2 −0 .github/workflows/gpu-ci.yml
+2 −0 .github/workflows/heterogeneous-ci.yml
+75 −0 .github/workflows/pace-build-ci.yml
+38 −48 dace/codegen/CMakeLists.txt
+11 −3 dace/codegen/codegen.py
+142 −86 dace/codegen/compiled_sdfg.py
+2 −2 dace/codegen/cppunparse.py
+3 −0 dace/codegen/targets/cpp.py
+47 −18 dace/codegen/targets/cpu.py
+18 −6 dace/codegen/targets/cuda.py
+10 −5 dace/codegen/tools/get_cuda_arch.cpp
+10 −2 dace/config_schema.yml
+50 −29 dace/dtypes.py
+4 −4 dace/frontend/python/newast.py
+19 −19 dace/frontend/python/replacements.py
+4 −4 dace/libraries/blas/nodes/gemm.py
+2 −2 dace/libraries/sparse/nodes/csrmm.py
+2 −2 dace/libraries/sparse/nodes/csrmv.py
+4 −7 dace/memlet.py
+10 −0 dace/properties.py
+35 −18 dace/runtime/include/dace/cuda/copy.cuh
+12 −4 dace/runtime/include/dace/math.h
+113 −0 dace/runtime/include/dace/nan.h
+7 −1 dace/sdfg/analysis/schedule_tree/sdfg_to_tree.py
+4 −3 dace/sdfg/analysis/schedule_tree/treenodes.py
+5 −3 dace/sdfg/nodes.py
+9 −3 dace/sdfg/sdfg.py
+7 −1 dace/sdfg/state.py
+14 −7 dace/sdfg/validation.py
+15 −5 dace/serialize.py
+3 −2 dace/sourcemap.py
+1 −1 dace/symbolic.py
+79 −1 dace/transformation/passes/analysis.py
+9 −0 dace/transformation/passes/dead_dataflow_elimination.py
+248 −0 dace/transformation/passes/reference_reduction.py
+12 −5 dace/transformation/passes/simplify.py
+9 −7 dace/transformation/transformation.py
+1 −1 dace/version.py
+2 −2 doc/sdfg/ir.rst
+0 −5 requirements.txt
+1 −1 setup.py
+47 −0 tests/codegen/codegen_used_symbols_test.py
+84 −0 tests/codegen/cuda_memcopy_test.py
+11 −3 tests/numpy/common.py
+26 −0 tests/numpy/gpu_test.py
+6 −4 tests/openmp_test.py
+61 −0 tests/passes/access_ranges_test.py
+621 −2 tests/sdfg/reference_test.py
+3 −3 tests/transformations/local_storage_test.py
2 changes: 1 addition & 1 deletion external/gt4py
Submodule gt4py updated 87 files
+1 −1 .gitpod.yml
+38 −0 CODING_GUIDELINES.md
+8 −8 docs/user/next/QuickstartGuide.md
+1 −0 pyproject.toml
+15 −7 src/gt4py/_core/definitions.py
+5 −3 src/gt4py/next/allocators.py
+197 −112 src/gt4py/next/common.py
+8 −7 src/gt4py/next/constructors.py
+22 −2 src/gt4py/next/embedded/common.py
+2 −2 src/gt4py/next/embedded/context.py
+39 −31 src/gt4py/next/embedded/nd_array_field.py
+174 −0 src/gt4py/next/embedded/operators.py
+2 −0 src/gt4py/next/errors/__init__.py
+17 −5 src/gt4py/next/errors/exceptions.py
+1 −1 src/gt4py/next/ffront/ast_passes/simple_assign.py
+1 −1 src/gt4py/next/ffront/ast_passes/single_static_assign.py
+79 −91 src/gt4py/next/ffront/decorator.py
+6 −3 src/gt4py/next/ffront/fbuiltins.py
+1 −1 src/gt4py/next/ffront/foast_introspection.py
+1 −1 src/gt4py/next/ffront/foast_passes/closure_var_folding.py
+67 −66 src/gt4py/next/ffront/foast_passes/type_deduction.py
+1 −1 src/gt4py/next/ffront/foast_pretty_printer.py
+6 −4 src/gt4py/next/ffront/foast_to_itir.py
+12 −11 src/gt4py/next/ffront/func_to_foast.py
+2 −2 src/gt4py/next/ffront/func_to_past.py
+20 −22 src/gt4py/next/ffront/past_passes/type_deduction.py
+20 −16 src/gt4py/next/ffront/past_to_itir.py
+4 −4 src/gt4py/next/ffront/source_utils.py
+1 −1 src/gt4py/next/ffront/type_info.py
+22 −0 src/gt4py/next/field_utils.py
+1 −1 src/gt4py/next/iterator/dispatcher.py
+25 −22 src/gt4py/next/iterator/embedded.py
+4 −4 src/gt4py/next/iterator/ir.py
+1 −1 src/gt4py/next/iterator/ir_utils/ir_makers.py
+1 −1 src/gt4py/next/iterator/runtime.py
+3 −3 src/gt4py/next/iterator/tracing.py
+3 −0 src/gt4py/next/iterator/transforms/collapse_tuple.py
+2 −2 src/gt4py/next/iterator/transforms/cse.py
+2 −2 src/gt4py/next/iterator/transforms/pass_manager.py
+3 −3 src/gt4py/next/iterator/transforms/unroll_reduce.py
+21 −14 src/gt4py/next/iterator/type_inference.py
+1 −1 src/gt4py/next/otf/binding/nanobind.py
+2 −2 src/gt4py/next/otf/compilation/build_systems/cmake_lists.py
+1 −1 src/gt4py/next/otf/compilation/compiler.py
+1 −1 src/gt4py/next/otf/stages.py
+4 −2 src/gt4py/next/otf/workflow.py
+3 −3 src/gt4py/next/program_processors/codegens/gtfn/gtfn_ir_to_gtfn_im_ir.py
+3 −3 src/gt4py/next/program_processors/codegens/gtfn/gtfn_module.py
+10 −10 src/gt4py/next/program_processors/codegens/gtfn/itir_to_gtfn_ir.py
+60 −28 src/gt4py/next/program_processors/processor_interface.py
+152 −97 src/gt4py/next/program_processors/runners/dace_iterator/__init__.py
+5 −0 src/gt4py/next/program_processors/runners/dace_iterator/itir_to_sdfg.py
+162 −262 src/gt4py/next/program_processors/runners/dace_iterator/itir_to_tasklet.py
+1 −1 src/gt4py/next/program_processors/runners/dace_iterator/utility.py
+3 −3 src/gt4py/next/program_processors/runners/gtfn.py
+25 −23 src/gt4py/next/type_system/type_info.py
+17 −17 src/gt4py/next/type_system/type_translation.py
+10 −12 src/gt4py/next/utils.py
+3 −1 tests/next_tests/exclusion_matrices.py
+11 −11 tests/next_tests/integration_tests/cases.py
+2 −0 tests/next_tests/integration_tests/feature_tests/ffront_tests/ffront_test_utils.py
+4 −4 tests/next_tests/integration_tests/feature_tests/ffront_tests/test_arg_call_interface.py
+85 −3 tests/next_tests/integration_tests/feature_tests/ffront_tests/test_execution.py
+19 −0 tests/next_tests/integration_tests/feature_tests/ffront_tests/test_external_local_field.py
+30 −0 tests/next_tests/integration_tests/feature_tests/ffront_tests/test_gt4py_builtins.py
+7 −7 tests/next_tests/integration_tests/feature_tests/ffront_tests/test_math_builtin_execution.py
+2 −2 tests/next_tests/integration_tests/feature_tests/ffront_tests/test_math_unary_builtins.py
+2 −2 tests/next_tests/integration_tests/feature_tests/ffront_tests/test_program.py
+2 −2 tests/next_tests/integration_tests/feature_tests/ffront_tests/test_scalar_if.py
+34 −34 tests/next_tests/integration_tests/feature_tests/ffront_tests/test_type_deduction.py
+1 −1 tests/next_tests/integration_tests/feature_tests/iterator_tests/test_builtins.py
+0 −1 tests/next_tests/integration_tests/feature_tests/iterator_tests/test_conditional.py
+137 −0 tests/next_tests/integration_tests/multi_feature_tests/ffront_tests/test_embedded_regression.py
+2 −2 tests/next_tests/integration_tests/multi_feature_tests/iterator_tests/test_column_stencil.py
+1 −1 tests/next_tests/unit_tests/conftest.py
+13 −1 tests/next_tests/unit_tests/embedded_tests/test_common.py
+5 −6 tests/next_tests/unit_tests/embedded_tests/test_nd_array_field.py
+6 −8 tests/next_tests/unit_tests/ffront_tests/test_func_to_foast.py
+9 −9 tests/next_tests/unit_tests/ffront_tests/test_func_to_past.py
+2 −2 tests/next_tests/unit_tests/ffront_tests/test_past_to_itir.py
+6 −2 tests/next_tests/unit_tests/iterator_tests/test_embedded_internals.py
+1 −1 tests/next_tests/unit_tests/iterator_tests/test_runtime_domain.py
+2 −2 tests/next_tests/unit_tests/program_processor_tests/test_processor_interface.py
+1 −1 tests/next_tests/unit_tests/test_allocators.py
+106 −32 tests/next_tests/unit_tests/test_common.py
+2 −2 tests/next_tests/unit_tests/test_constructors.py
+1 −1 tests/next_tests/unit_tests/type_system_tests/test_type_translation.py
3 changes: 2 additions & 1 deletion physics/pace/physics/stencils/microphysics.py
@@ -14,7 +14,7 @@
 import pace.physics.functions.microphysics_funcs as functions
 import pace.util
 import pace.util.constants as constants
-from pace.dsl.dace.orchestration import orchestrate
+from pace.dsl.dace.orchestration import dace_inhibitor, orchestrate
 from pace.dsl.stencil import StencilFactory
 from pace.dsl.typing import Float, FloatField, FloatFieldIJ, Int
 from pace.util import X_DIM, Y_DIM, Z_DIM
@@ -2227,6 +2227,7 @@ def setupm(self, dt_atmos: float):
         self._ces0 = constants.EPS * es0
         self._set_timestep(dt_atmos)
 
+    @dace_inhibitor
     def _update_timestep_if_needed(self, timestep: float):
         if timestep != self._timestep:
             self._set_timestep(timestep=timestep)
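The implementation of `dace_inhibitor` is not part of this diff. Assuming from its name and import path that it keeps the decorated method out of DaCe parsing, so that the branch on `self._timestep` runs as ordinary Python, the general pattern can be sketched with a marker decorator. Every name below (`inhibit_compilation`, `is_inhibited`, `ToyMicrophysics`) is hypothetical and none of this is Pace's code.

```python
def inhibit_compilation(func):
    """Hypothetical marker: tag a function so a toy orchestrator skips it."""
    func._skip_compilation = True
    return func


def is_inhibited(func) -> bool:
    """Return True when the function carries the marker set above."""
    return getattr(func, "_skip_compilation", False)


class ToyMicrophysics:
    """Minimal stand-in that mirrors the method touched in the hunk above."""

    def __init__(self, timestep: float):
        self._timestep = timestep
        self.recomputed = 0  # counts how often the timestep state is rebuilt

    def _set_timestep(self, timestep: float):
        self._timestep = timestep
        self.recomputed += 1

    @inhibit_compilation
    def _update_timestep_if_needed(self, timestep: float):
        # Data-dependent Python branch that mutates object state.
        if timestep != self._timestep:
            self._set_timestep(timestep=timestep)


mp = ToyMicrophysics(225.0)
mp._update_timestep_if_needed(225.0)  # same timestep: nothing is rebuilt
mp._update_timestep_if_needed(450.0)  # changed timestep: state is rebuilt once
```

The marker does not change the method's behavior. In this toy setup it only records that the method should stay a plain Python call.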
4 changes: 4 additions & 0 deletions tests/main/driver/test_example_configs.py
@@ -28,6 +28,10 @@
     "baroclinic_c12_orch_cpu.yaml",
     "tropical_read_restart_fortran.yml",
     "tropicalcyclone_c128.yaml",
+    "baroclinic_c384_cpu.yaml",
+    "baroclinic_c384_gpu.yaml",
+    "baroclinic_c3072_cpu.yaml",
+    "baroclinic_c3072_gpu.yaml",
bensonr marked this conversation as resolved.
 ]

 JENKINS_CONFIGS_DIR = os.path.join(dirname, "../../../.jenkins/driver_configs/")