mpi: Add LAMMPS example and do some renaming
simo-tuomisto committed Oct 10, 2024
1 parent 43c6c11 commit db97eaa
Showing 6 changed files with 209 additions and 18 deletions.
68 changes: 68 additions & 0 deletions content/examples/lammps-openmpi.def
@@ -0,0 +1,68 @@
Bootstrap: docker
From: ubuntu:latest

%arguments

NPROCS=4
OPENMPI_VERSION=4.1.6
LAMMPS_VERSION=29Aug2024

%post

### Install OpenMPI dependencies

apt-get update
apt-get install -y wget bash gcc gfortran g++ make file bzip2 ca-certificates libucx-dev

### Build OpenMPI

OPENMPI_VERSION_SHORT=$(echo {{ OPENMPI_VERSION }} | cut -f 1-2 -d '.')
cd /opt
mkdir ompi
wget -q https://download.open-mpi.org/release/open-mpi/v${OPENMPI_VERSION_SHORT}/openmpi-{{ OPENMPI_VERSION }}.tar.bz2
tar -xvf openmpi-{{ OPENMPI_VERSION }}.tar.bz2
# Compile and install
cd openmpi-{{ OPENMPI_VERSION }}
./configure --prefix=/opt/ompi --with-ucx=/usr
make -j{{ NPROCS }}
make install
cd ..
rm -rf openmpi-{{ OPENMPI_VERSION }} openmpi-{{ OPENMPI_VERSION }}.tar.bz2

### Build example application

# Install LAMMPS dependencies
apt-get install -y cmake

export OMPI_DIR=/opt/ompi
export PATH="$OMPI_DIR/bin:$PATH"
export LD_LIBRARY_PATH="$OMPI_DIR/lib:$LD_LIBRARY_PATH"
export CMAKE_PREFIX_PATH="$OMPI_DIR:$CMAKE_PREFIX_PATH"

# Build LAMMPS
cd /opt
wget -q https://download.lammps.org/tars/lammps-{{ LAMMPS_VERSION }}.tar.gz
tar xf lammps-{{ LAMMPS_VERSION }}.tar.gz
cd lammps-{{ LAMMPS_VERSION }}
cmake -S cmake -B build \
-DCMAKE_INSTALL_PREFIX=/opt/lammps \
-DBUILD_MPI=yes \
-DBUILD_OMP=yes
cmake --build build --parallel {{ NPROCS }} --target install
cp -r examples /opt/lammps/examples
cd ..
rm -rf lammps-{{ LAMMPS_VERSION }} lammps-{{ LAMMPS_VERSION }}.tar.gz

%environment
export OMPI_DIR=/opt/ompi
export PATH="$OMPI_DIR/bin:$PATH"
export LD_LIBRARY_PATH="$OMPI_DIR/lib:$LD_LIBRARY_PATH"
export MANPATH="$OMPI_DIR/share/man:$MANPATH"

export LAMMPS_DIR=/opt/lammps
export PATH="$LAMMPS_DIR/bin:$PATH"
export LD_LIBRARY_PATH="$LAMMPS_DIR/lib:$LD_LIBRARY_PATH"
export MANPATH="$LAMMPS_DIR/share/man:$MANPATH"

%runscript
exec /opt/lammps/bin/lmp "$@"
2 changes: 1 addition & 1 deletion content/examples/lumi-mpich.def
@@ -9,7 +9,7 @@ from: ubuntu:latest

%post

### Install OpenMPI dependencies
### Install MPICH dependencies

apt-get update
apt-get install -y file g++ gcc gfortran make gdb strace wget ca-certificates --no-install-recommends
File renamed without changes.
19 changes: 19 additions & 0 deletions content/examples/run_lammps_indent.sh
@@ -0,0 +1,19 @@
#!/bin/bash
#SBATCH --time=00:10:00
#SBATCH --mem=2G
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --output=lammps_indent.out

# Copy example from image
apptainer exec lammps-openmpi.sif cp -r /opt/lammps/examples/indent .

cd indent

# Load OpenMPI module
module load openmpi

export PMIX_MCA_gds=hash

# Run simulation
srun apptainer run ../lammps-openmpi.sif -in in.indent
File renamed without changes.
138 changes: 121 additions & 17 deletions content/mpi.rst
@@ -107,25 +107,26 @@ files differ as well. Pick a definition file for your site.

.. tab:: Triton (Aalto)

:download:`triton-ompi.def </examples/triton-ompi.def>`:
:download:`triton-openmpi.def </examples/triton-openmpi.def>`:

.. literalinclude:: /examples/triton-ompi.def
.. literalinclude:: /examples/triton-openmpi.def

:language: singularity

To build:

.. code-block:: console
srun --mem=16G --cpus-per-task=4 --time=01:00:00 apptainer build triton-ompi.sif triton-ompi.def
srun --mem=16G --cpus-per-task=4 --time=01:00:00 apptainer build triton-openmpi.sif triton-openmpi.def
To run (some extra parameters are needed to prevent launch errors):

.. code-block:: console
$ module load openmpi/4.1.6
$ export OMPI_MCA_orte_top_session_dir=/tmp/$USER/openmpi
$ export PMIX_MCA_gds=hash
$ srun --partition=batch-milan --mem=2G --nodes=2-2 --ntasks-per-node=1 --time=00:10:00 apptainer run ompi-triton.sif
$ export UCX_POSIX_USE_PROC_LINK=n
$ export OMPI_MCA_orte_top_session_dir=/tmp/$USER/openmpi
$ srun --partition=batch-milan --mem=2G --nodes=2-2 --ntasks-per-node=1 --time=00:10:00 apptainer run triton-openmpi.sif
srun: job 3521915 queued and waiting for resources
srun: job 3521915 has been allocated resources
@@ -159,24 +160,24 @@ files differ as well. Pick a definition file for your site.
.. tab:: Puhti (CSC)

:download:`puhti-ompi.def <examples/puhti-ompi.def>`:
:download:`puhti-openmpi.def <examples/puhti-openmpi.def>`:

.. literalinclude:: /examples/puhti-ompi.def
.. literalinclude:: /examples/puhti-openmpi.def

:language: singularity

To build:

.. code-block:: console
apptainer build puhti-ompi.sif puhti-ompi.def
apptainer build puhti-openmpi.sif puhti-openmpi.def
To run (some extra parameters are needed to prevent error messages):

.. code-block:: console
$ module load openmpi/4.1.4
$ export PMIX_MCA_gds=hash
$ srun --account=project_XXXXXXX --partition=large --mem=2G --nodes=2-2 --ntasks-per-node=1 --time=00:10:00 apptainer run puhti-ompi.sif
$ srun --account=project_XXXXXXX --partition=large --mem=2G --nodes=2-2 --ntasks-per-node=1 --time=00:10:00 apptainer run puhti-openmpi.sif
srun: job 23736111 queued and waiting for resources
srun: job 23736111 has been allocated resources
@@ -207,7 +208,6 @@ files differ as well. Pick a definition file for your site.
2097152 12050.35
4194304 12058.36
.. tab:: LUMI (CSC)

:download:`lumi-mpich.def <examples/lumi-mpich.def>`:
@@ -302,17 +302,17 @@ Below are explanations on how the interconnect libraries were provided.
The interconnect support was provided by the ``libucx-dev`` package,
which provides InfiniBand drivers.

:download:`triton-ompi.def <examples/triton-ompi.def>`, line 15:
:download:`triton-openmpi.def <examples/triton-openmpi.def>`, line 15:

.. literalinclude:: /examples/triton-ompi.def
:language: singularity
.. literalinclude:: /examples/triton-openmpi.def
:language: singularity
:lines: 15

The OpenMPI installation was then configured to use these drivers:

:download:`triton-ompi.def <examples/triton-ompi.def>`, line 26:
:download:`triton-openmpi.def <examples/triton-openmpi.def>`, line 26:

.. literalinclude:: /examples/triton-ompi.def
.. literalinclude:: /examples/triton-openmpi.def
:language: singularity
:lines: 26

@@ -321,9 +321,9 @@ Below are explanations on how the interconnect libraries were provided.
The interconnect support is provided by installing drivers from
Mellanox's Infiniband driver repository:

:download:`puhti-ompi.def <examples/puhti-ompi.def>`, lines 27-38:
:download:`puhti-openmpi.def <examples/puhti-openmpi.def>`, lines 27-38:

.. literalinclude:: /examples/puhti-ompi.def
.. literalinclude:: /examples/puhti-openmpi.def
:language: singularity
:lines: 27-38

@@ -373,6 +373,110 @@ that the program in the container can be built against. This
wrapper can then use different MPI implementations during
runtime.

Example on portability: LAMMPS
------------------------------

LAMMPS is a classical molecular dynamics simulation code with a focus
on materials modeling.

Let's build a container with LAMMPS in it:


.. literalinclude:: /examples/lammps-openmpi.def

:language: singularity
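
As a minimal sketch of what the ``%post`` section does with its version argument, the snippet below reproduces the ``cut`` call that shortens the OpenMPI version and assembles the download URL from it. The plain shell variable ``OPENMPI_VERSION`` stands in for the ``{{ OPENMPI_VERSION }}`` template argument, and ``OPENMPI_URL`` is a name introduced only for this illustration.

```shell
#!/bin/bash
# Stand-in for the {{ OPENMPI_VERSION }} build argument
# (same default as in lammps-openmpi.def).
OPENMPI_VERSION=4.1.6

# Keep only the first two dot-separated fields: 4.1.6 -> 4.1
OPENMPI_VERSION_SHORT=$(echo "$OPENMPI_VERSION" | cut -f 1-2 -d '.')

# The release directory uses the short version, the tarball name the full one.
# OPENMPI_URL is a hypothetical helper variable for this sketch.
OPENMPI_URL="https://download.open-mpi.org/release/open-mpi/v${OPENMPI_VERSION_SHORT}/openmpi-${OPENMPI_VERSION}.tar.bz2"

echo "$OPENMPI_VERSION_SHORT"
echo "$OPENMPI_URL"
```

Assuming an Apptainer version with definition-file templating, the defaults declared under ``%arguments`` should be overridable at build time, for example with ``apptainer build --build-arg NPROCS=8 lammps-openmpi.sif lammps-openmpi.def``.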

Let's also create a submission script that runs a LAMMPS example
where an indenter pushes against a material:

.. literalinclude:: /examples/run_lammps_indent.sh
:language: bash

This exact same container can now be run on both Triton and Puhti,
which have OpenMPI installed, because both clusters use Slurm and
InfiniBand interconnects.

.. tabs::

.. tab:: Triton (Aalto)

.. code-block:: console
$ export PMIX_MCA_gds=hash
$ export UCX_POSIX_USE_PROC_LINK=n
$ export OMPI_MCA_orte_top_session_dir=/tmp/$USER/openmpi
$ sbatch run_lammps_indent.sh
$ tail -n 27 lammps_indent.out
Loop time of 0.752293 on 4 procs for 30000 steps with 420 atoms
Performance: 10336396.152 tau/day, 39878.072 timesteps/s, 16.749 Matom-step/s
99.6% CPU use with 4 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 0.31927 | 0.37377 | 0.42578 | 7.5 | 49.68
Neigh | 0.016316 | 0.020162 | 0.023961 | 2.3 | 2.68
Comm | 0.19882 | 0.25882 | 0.31814 | 10.2 | 34.40
Output | 0.00033215 | 0.00038609 | 0.00054361 | 0.0 | 0.05
Modify | 0.044981 | 0.049941 | 0.054024 | 1.7 | 6.64
Other | | 0.04921 | | | 6.54
Nlocal: 105 ave 112 max 98 min
Histogram: 1 0 1 0 0 0 0 1 0 1
Nghost: 92.5 ave 96 max 89 min
Histogram: 1 0 1 0 0 0 0 1 0 1
Neighs: 892.25 ave 1003 max 788 min
Histogram: 2 0 0 0 0 0 0 0 1 1
Total # of neighbors = 3569
Ave neighs/atom = 8.497619
Neighbor list builds = 634
Dangerous builds = 0
Total wall time: 0:00:01
.. tab:: Puhti (CSC)

.. code-block:: console
$ export PMIX_MCA_gds=hash
$ sbatch --account=project_XXXXXXX --partition=large run_lammps_indent.sh
$ tail -n 27 lammps_indent.out
Loop time of 0.527178 on 4 procs for 30000 steps with 420 atoms
Performance: 14750222.558 tau/day, 56906.723 timesteps/s, 23.901 Matom-step/s
99.7% CPU use with 4 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 0.22956 | 0.26147 | 0.28984 | 5.5 | 49.60
Neigh | 0.012471 | 0.015646 | 0.018613 | 2.3 | 2.97
Comm | 0.13816 | 0.17729 | 0.2192 | 9.3 | 33.63
Output | 0.00023943 | 0.00024399 | 0.00025267 | 0.0 | 0.05
Modify | 0.03212 | 0.035031 | 0.037378 | 1.2 | 6.65
Other | | 0.0375 | | | 7.11
Nlocal: 105 ave 112 max 98 min
Histogram: 1 0 1 0 0 0 0 1 0 1
Nghost: 92.5 ave 96 max 89 min
Histogram: 1 0 1 0 0 0 0 1 0 1
Neighs: 892.25 ave 1003 max 788 min
Histogram: 2 0 0 0 0 0 0 0 1 1
Total # of neighbors = 3569
Ave neighs/atom = 8.497619
Neighbor list builds = 634
Dangerous builds = 0
Total wall time: 0:00:01
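
The two tabs differ only in the environment variables exported before submission and in the extra ``sbatch`` arguments. A small sketch of a wrapper that captures this difference could look like the following. The ``SITE`` variable and the ``SBATCH_ARGS`` array are hypothetical names introduced for illustration, and the script only prints the submission command instead of running it.

```shell
#!/bin/bash
# SITE is a hypothetical selector used only in this sketch.
SITE=${SITE:-triton}

# Exported on both clusters in the examples above.
export PMIX_MCA_gds=hash

case "$SITE" in
    triton)
        # Extra settings used in the Triton example.
        export UCX_POSIX_USE_PROC_LINK=n
        export OMPI_MCA_orte_top_session_dir="/tmp/$USER/openmpi"
        SBATCH_ARGS=()
        ;;
    puhti)
        # Puhti needs a project account and a partition.
        SBATCH_ARGS=(--account=project_XXXXXXX --partition=large)
        ;;
    *)
        echo "unknown site: $SITE" >&2
        exit 1
        ;;
esac

# Dry run: print the command rather than submitting a job.
echo sbatch "${SBATCH_ARGS[@]}" run_lammps_indent.sh
```

Keeping the site-specific settings in one place leaves the container image and ``run_lammps_indent.sh`` identical across clusters.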
Review of this session
----------------------

.. admonition:: Key points to remember

- The MPI version in the container should match the version installed on the cluster
- The cluster's MPI module should be loaded for maximum compatibility with job launching
- Care must be taken to make certain that the container utilizes
fast interconnects
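
The first key point can be checked with a few lines of shell. The sketch below assumes that "matching" means agreeing on the major and minor version numbers, which is an assumption made here and not a rule stated by OpenMPI. The function name ``same_major_minor`` is hypothetical, and the two version strings are example values taken from the OpenMPI versions mentioned in this material.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when two version strings share major.minor.
same_major_minor() {
    local a b
    a=$(echo "$1" | cut -f 1-2 -d '.')
    b=$(echo "$2" | cut -f 1-2 -d '.')
    [ "$a" = "$b" ]
}

# Example values only: the container default and a cluster module version.
CONTAINER_MPI=4.1.6
HOST_MPI=4.1.4

if same_major_minor "$CONTAINER_MPI" "$HOST_MPI"; then
    echo "compatible"
else
    echo "mismatch"
fi
```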
