Skip to content

Commit

Permalink
Update and extend README
Browse files Browse the repository at this point in the history
  • Loading branch information
moebiusband73 committed Nov 13, 2024
1 parent 25efac7 commit 174489d
Showing 1 changed file with 191 additions and 60 deletions.
251 changes: 191 additions & 60 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,88 +4,219 @@
|-----------------------------------------|

MD-Bench is a toolbox for the performance engineering of short-range force
calculation kernels on molecular-dynamics applications. It aims at covering all
available state-of-the-art algorithms from different community codes such as
calculation kernels on molecular-dynamics applications. It aims at covering
state-of-the-art algorithms from different community codes such as
LAMMPS and GROMACS.

## Build instructions
## Getting started

Properly configure your building by changing `config.mk` file. The following
options are available:
Clone the repository from GitHub:

- **TOOLCHAIN:** Compiler toolchain (available options: GCC, CLANG, ICC, ONEAPI, NVCC).
- **ISA:** Instruction set (available options: ARM, X86). Only relevant with
SIMD other than NONE.
- **SIMD:** Instruction set (available options: NONE, SSE, AVX, AVX\_FMA, AVX2, AVX512).
- **MASK\_REGISTERS:** Use AVX512 mask registers (always true when ISA is set to AVX512).
- **OPT\_SCHEME:** Optimization algorithm (available options: verletlist, clusterpair).
- **ENABLE\_LIKWID:** Enable likwid to make use of HPM counters.
- **DATA\_TYPE:** Floating-point precision (available options: SP, DP).
- **DATA\_LAYOUT:** Data layout for atom vector properties (available options: AOS, SOA).
- **ASM\_SYNTAX:** Assembly syntax to use when generating assembly files (available options: ATT, INTEL).
- **DEBUG:** Toggle debug mode.
- **ONE\_ATOM\_TYPE:** Simulate only one atom type and do not perform table lookup for parameters.
- **MEM\_TRACER:** Trace memory addresses for cache simulator.
- **INDEX\_TRACER:** Trace indexes and distances for gather-md.
- **COMPUTE\_STATS:** Compute statistics.
```shell=
git clone https://github.com/RRZE-HPC/MD-Bench.git
```

Edit config.mk and configure the compiler toolchain

Configurations for LAMMPS Verlet Lists optimization scheme:
```makefile=
# Compiler tool chain (GCC/CLANG/ICC/ICX/ONEAPI/NVCC)
TOOLCHAIN ?= CLANG
```

- **ENABLE\_OMP\_SIMD:** Use omp simd pragma on half neighbor-lists kernels.
- **USE\_SIMD\_KERNEL:** Compile kernel with explicit SIMD intrinsics.
Best supported are ICC (deprecated legacy Intel compiler) and ICX (LLVM based
Intel compiler). Choose NVCC to enable CUDA GPU kernels.

Configurations for GROMACS MxN optimization scheme:
The toolchain settings are located in the `./make` directory. Review the
settings for the configured toolchain. You can configure different settings in
`config.mk`, for starters on a X86 based system the defaults are fine.

- **USE\_REFERENCE\_VERSION:** Use reference version (only for correction purposes).
- **XTC\_OUTPUT:** Enable XTC output.
To build the binary call, don't forget to load the compiler module on the
NHR@FAU clusters (e.g. `module load intel`):

Configurations for CUDA:
```shell=
make
```

- **USE\_CUDA\_HOST\_MEMORY:** Use CUDA host memory to optimize host-device transfers.
While the Makefile works with any version of GNU make, some features require GNU
make > v4.

You can run MD-Bench without any arguments:

```shell=
./MDBench-VL-ICX-X86-AVX2-DP
```

When done, just use `make` to compile the code.
You can clean intermediate build results with `make clean`, and all build results with `make distclean`.
You have to call `make clean` before `make` if you changed the build settings.
## Build system

## Usage
MD-Bench uses a Makefile with pattern rules and automatic dependency generation.
If you add source files you do not need to change the Makefile as long as the
sources are placed either in the `./src/verletlist/`, `./src/clusterpair/` or
`./src/common` directories. If you change a file, all object files that depend
on it are rebuild.

Use the following command to run a simulation:
All configuration variables can be overwritten from the command line, e.g. to
build with ICC without changing `./config.mk` build with:

```bash
./MD-Bench-<TAG>-<OPT_SCHEME> [OPTION]...
```shell=
make TOOLCHAIN=ICC
```

Where `TAG` and `OPT_SCHEME` correspond to the building options with the same
name. Without any options, a Copper FCC lattice system with size 32x32x32
(131072 atoms) over 200 time-steps using the Lennard-Jones potential (sigma=1.0,
epsilon=1.0) is simulated.

The default behavior and other options can be changed using the following parameters:

```sh
-p <string>: file to read parameters from (can be specified more than once)
-f <string>: force field (lj or eam), default lj
-i <string>: input file with atom positions (dump)
-e <string>: input file for EAM
-n / --nsteps <int>: set number of timesteps for simulation
-nx/-ny/-nz <int>: set linear dimension of systembox in x/y/z direction
-r / --radius <real>: set cutoff radius
-s / --skin <real>: set skin (verlet buffer)
--freq <real>: processor frequency (GHz)
--vtk <string>: VTK file for visualization
--xtc <string>: XTC file for visualization
Multiple configurations can be build at the same time. Every configuration has a
unique binary name `./MDBENCH-<build tag>`. Intermediate build results are
located in a `./build/build-<build tag>/` directory.

All make targets act on the current configuration set in `./config.mk`, but this
can be of course overwritten on the command line.

Supported make targets:

- `make`: Build the binary for current configuration.
- `make clean`: Remove intermediate build results.
- `make distclean`: Remove intermediate build results and binary. Also removes
generated tags and clangd files, more on that later.
- `make cleanall`: Remove all generated files. **Note**: This target applies to
all configurations.
- `make info` Output compiler version, useful for logging in automated benchmark
scripts.
- `make asm`: Generate assembly output of all source files. The assembly files
are placed in the intermediate build directory.
- `make format`: Reformat all source files with `clang-format` using the format
specification in `.clang-format`.

### Build time options

- `TOOLCHAIN`: Determines which toolchain makefile is included
- `ISA`: No usage apart from tag strings
- `SIMD`: Controls the generation of intrinsic kernels for clusterpair
- `OPT_SCHEME`: Algorithmic variant (verletlist or clusterpair), different
source directories and main routines are used
- `ENABLE_LIKWID`: Turn on LIKWID instrumentation, the LIKWID library has to be available
- `DATA_TYPE`: Switch between single precision and double precision floating
point. This is controlled by defines.
- `DATA_LAYOUT`: Switch between array-of-structure (AOS) and structure-of-array
(SOA) layout for atom positions and forces. Tradeoff between better cache
utilisation and easier SIMD vectorization.
- `DEBUG`: Enable additional debug output
- `SORT_ATOMS`: Resort atoms to ensure that atoms that are nearby are also close
to each other in the data structures
- `EXPLICIT_TYPES`: Default the atom properties are stored in scalar variables.
This option enables to support multiple atom types with different properties.
- `ENABLE_OMP_SIMD`: This enforces the use of `#pragma omp simd` for the
verletlist half-neighbour list force kernel. Without is the Intel compiler (at
least ICC) refuses to do SIMD vectorization.
- `USE_REFERENCE_VERSION`: Enforce usage of C implementation for clusterpair
algorithm for validation
- `USE_CUDA_HOST_MEMORY`: Enable pinned host memory for faster host-device transfers
- `ENABLE_MPI:` Turn on the MPI parallel version of the code

### Build for GPU targets

MD-Bench currently only supports Nvidia GPUs using CUDA kernels. To enable CUDA
kernels you need to specify `NVCC` as toolchain. The CUDA source code is in the
same source directories with Cuda suffix and `.cu` as file type ending. If
`NVCC` is set as toolchain, all supported kernels are automatically set to their
CUDA variants at build time. This means a binary either supports CPU kernels or
GPU kernels.

## Command line arguments

MD-Bench can be executed without any arguments, in this case the full neighbor
list testcase with LJ force will be computed for 200 steps and a size of
32x32x32 unit cells.

- `-p / --params <string>`: file to read parameters from (can be specified more
than once). Default initialization sets parameters for default LJ testcase.
*`-f <string>`: force field (lj, eam), default lj. For anything different than
lj you also need to provide spcific parameter file.
- `-i <string>`: input file with atom positions (dump). MD-Bench supports
Brookhaven protein data bank (.pdb), GROMACS GROMOS87 (.gro), and LAMMPS dump
(.dmp) file formats
- `-e <string>`: input file for EAM parameters
- `-n / --nsteps <int>`: set number of timesteps for simulation (default 200)
- `-nx/-ny/-nz <int>`: set linear dimension of systembox in x/y/z direction
(default 32 in every dimension)
- `-half <int>`: use half (1) or full (0) neighbor lists (default 0 - full
neighbor list)
- `-r / --radius <real>`: set cutoff radius (default 2.5)
- `-s / --skin <real>`: set skin (verlet buffer, default 0.3)
- `-w <file>`: write input atoms to file
- `--freq <real>`: processor frequency (GHz), used to calculate cycle metrics
(default 2.4)
- `--vtk <string>`: VTK output file for visualization

## Available testcases

For all variants you can switch between single precision and double precision
and between AOS versus SOA data layouts using build time options. You can use
the half neighbour list algorithm instead of the default full neighbour list by
setting `-half 1`. To enforce SIMD vectorization for the half neighbour list
algorithm you can set the option `ENABLE_OMP_SIMD=true`.

### Lennard-Jones potential for solid copper

Just start without any command line argument, this is the default testcase. You
may change the number of timesteps using the `-n` options and change the problem
size using the `-nz, -ny, -nz` options.

### EAM potential for solid copper

Call MD-Bench as follows:

```shell=
./MDBench-<TAG> -n 400 -f eam -e ./data/Cu_u3.eam
```

## Examples
Two different EAM variants are available: `Cu_u3.eam` and `Cu_u6.eam`. The EAM
potential is currently only available for verletlist.

### Lennard-Jones potential for melted copper

The melted copper testcase has only 32000 atoms in the default configuration.
Call MD-Bench as follows:

```shell=bash
./MDBench-<TAG> -n 400 -i ./data/copper_melting/input_lj_cu_one_atomtype_20x20x20.dmp
```

### Lennard-Jones potential for melted copper with explicit types

Compile MD-Bench with `EXPLICIT_TYPES=true` in `config.mk`.

TBD
Call MD-Bench as follows:

```shell=bash
./MDBench-<TAG> -n 400 -i ./data/copper_melting/input_lj_cu_one_atomtype_20x20x20.dmp
```

**This testcase currently segvaults!**

### EAM potential for melted copper

Call MD-Bench as follows:

```shell=bash
./MDBench-<TAG> -n 400 -f eam -e ./data/Cu_u3.eam -i ./data/copper_melting/input_eam_cu_one_atomtype_20x20x20.dmp
```

Two different EAM variants are available: `Cu_u3.eam` and `Cu_u6.eam`. The EAM
potential is currently only available for verletlist.t.

### Lennard-Jones potential for argon gas

Call MD-Bench as follows:

```shell=bash
./MDBench-<TAG> -i ./data/argon/input.gro -p ./data/argon/mdbench_params.conf
```

## Citations

Rafael Ravedutti Lucio Machado, Jan Eitzinger, Jan Laukemann, Georg Hager, Harald
Köstler and Gerhard Wellein: MD-Bench: A performance-focused prototyping harness for
state-of-the-art short-range molecular dynamics algorithms. Future Generation
Computer Systems ([FGCS](https://www.sciencedirect.com/journal/future-generation-computer-systems)), Volume 149, 2023, Pages 25-38, ISSN 0167-739X, DOI:
Rafael Ravedutti Lucio Machado, Jan Eitzinger, Jan Laukemann, Georg Hager,
Harald Köstler and Gerhard Wellein: MD-Bench: A performance-focused prototyping
harness for state-of-the-art short-range molecular dynamics algorithms. Future
Generation Computer Systems
([FGCS](https://www.sciencedirect.com/journal/future-generation-computer-systems)),
Volume 149, 2023, Pages 25-38, ISSN 0167-739X, DOI:
[https://doi.org/10.1016/j.future.2023.06.023](https://doi.org/10.1016/j.future.2023.06.023)

Rafael Ravedutti Lucio Machado, Jan Eitzinger, Harald Köstler, and Gerhard
Expand Down

0 comments on commit 174489d

Please sign in to comment.