Quirks
A collection of problems and unexpected behaviour. Pure bugs in DALES should be filed as GitHub issues instead.
30.7.2020, FJ. On SuperMUC, when building with SYST=lisa-intel, compiling modheterostats.f90 causes an internal compiler error. Changing -O3 to -O2 allows the compilation to finish.
/lrz/sys/intel/impi2019u6/impi/2019.6.154/intel64/bin/mpiifort \
-I/dss/dsshome1/lrz/sys/spack/release/19.2/opt/x86_avx512/netcdf-fortran/4.4.4-intel-ryxkupm/include \
-r8 -ftz -extend_source -g -traceback -O3 -xHost -module program_modules \
-c /dss/dsshome1/0F/di67mow/dales/src/modheterostats.f90 -o CMakeFiles/dales4.dir/modheterostats.f90.o
20000_28030
catastrophic error: **Internal compiler error: internal abort** Please report
this error along with the circumstances in which it occurred in a Software Problem Report.
Note: File and line given may not be explicit cause of this error.
compilation aborted for /dss/dsshome1/0F/di67mow/dales/src/modheterostats.f90 (code 1)
On Lisa, runs with 16 cores on one node frequently crashed when compiled with the foss/2018b module set.
This seems to be a bug in OpenMPI. With OpenMPI 3.1.3 the problem disappeared. The crash usually happened in transpose_b or transpose_binv in modpois.
Such a bug is mentioned in the OpenMPI release notes and in this bug report.
Cray MPI is by default not compatible with programs that launch other programs with system(). In at least some versions of DALES, system() is used to symlink init*latest files to the appropriate init* file. This caused crashes. A possible workaround is to call a function that creates the symlink directly. This is annoying because the syntax and support for such a call differ between Fortran compilers.
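The idea behind this workaround is to create the link with a direct library call instead of spawning ln -s through system(). Below is a minimal sketch of that idea, written in Python rather than Fortran for brevity. The helper name link_latest and the file names are hypothetical and are not taken from DALES. A Fortran version would need a compiler extension or a C-interoperable interface to the same POSIX symlink() call.

```python
import os
import tempfile


def link_latest(target, link_name):
    """Point link_name at target without starting a child process.

    os.symlink wraps the POSIX symlink() call, so no shell or ln
    process is launched (unlike os.system("ln -sf ...")).
    """
    # symlink() fails if link_name already exists, so drop a stale link first.
    if os.path.lexists(link_name):
        os.remove(link_name)
    os.symlink(target, link_name)


if __name__ == "__main__":
    # Hypothetical file names, used only for this demonstration.
    with tempfile.TemporaryDirectory() as d:
        open(os.path.join(d, "initd_example"), "w").close()
        link = os.path.join(d, "initd_latest")
        link_latest("initd_example", link)
        print(os.readlink(link))  # initd_example
```

Because the link is made inside the calling process, nothing is forked, and forking is the step that system() relies on.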
Some OpenMPI versions report an incorrect wallclock time because they do not take CPU frequency variations into account.
CMake versions before 3.0 are not compatible with the Intel Fortran compiler: they add the obsolete flag -i_dynamic. Using a newer CMake solves this issue.
The FFT Poisson solver may not be correct if the domain size is odd. With odd sizes, the elements in the FFT output are ordered differently, and the code does not seem to take this into account. Test on the rico case, divergence diagnostics:
divmax, divtot = 7.31E-10 1.60E-11 (rico, 145x144)
divmax, divtot = 3.70E-17 2.28E-11 (rico, 144x144)
divmax for the odd size is about seven orders of magnitude larger (7.31E-10 vs. 3.70E-17), which is a sign that the solution is less accurate. For now, odd domain sizes are not recommended. For future alternative Poisson solvers, we will aim to handle odd sizes correctly, rather than reproducing the current behavior.
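The wavenumber layout of a packed real FFT shows why odd sizes need separate handling. The sketch below assumes the FFTPACK rfftf output convention. Whether the DALES solver uses exactly this layout is an assumption, and the function name packed_fft_layout is hypothetical. For even n the last slot holds the real Nyquist mode. For odd n there is no Nyquist mode, and the last slot holds the imaginary part of the highest wavenumber.

```python
def packed_fft_layout(n):
    """Return (wavenumber, part) for each output slot of a packed real FFT.

    Assumed convention (FFTPACK rfftf):
    [mean, Re 1, Im 1, Re 2, Im 2, ...], with one extra real
    Nyquist slot at the end only when n is even.
    """
    layout = [(0, "re")]
    for k in range(1, (n + 1) // 2):
        layout.append((k, "re"))
        layout.append((k, "im"))
    if n % 2 == 0:
        layout.append((n // 2, "re"))  # Nyquist mode, even n only
    return layout


if __name__ == "__main__":
    for n in (144, 145):
        # The last slot differs between the even and the odd size.
        print(n, packed_fft_layout(n)[-1])
    # 144 (72, 're')
    # 145 (72, 'im')
```

A spectral solver scales each slot by a factor that depends on its wavenumber. If it treated the final slot as a Nyquist mode for every n, it would mis-handle that slot when n is odd. This is one way such a mismatch could arise. It is not a confirmed diagnosis of the DALES code.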