Home
- Check out sailfish
git clone [email protected]:clemson-cal/sailfish.git
- Install dependency Python modules
pip3 install --user cffi # only needed for running in CPU mode
pip3 install --user cupy-cuda111 # only needed for running in GPU mode
- Get an interactive compute node
qsub -I -l select=1:ncpus=56:mem=200gb:ngpus=2:gpu_model=a100,walltime=24:00:00
- Load the CUDA module, and specify a reductions accelerator (see documentation):
module load cuda/11.1.1-gcc/9.5.0
export CUPY_ACCELERATORS=cub
- Go to the sailfish directory and run something
cd sailfish
bin/sailfish -g kitp-code-comparison
To redirect your data outputs to the zfs filesystem, add the flag -o /zfs/warpgate/your-username.
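For example, combining it with the run command above:
bin/sailfish -g kitp-code-comparison -o /zfs/warpgate/your-username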
Note: Palmetto has Python 3.6.8 installed to a system location. However, sailfish might soon require Python 3.7 or higher. If so, we will need to use Anaconda (which comes with Python 3.9.12), unless the Palmetto admins install a newer vanilla Python. Under Anaconda, the module loads and pip installs would be:
module load anaconda3/2022.05-gcc/9.5.0
module load cuda/11.6.2-gcc/9.5.0
pip3 install --user cupy-cuda116
There is no need to pip-install cffi, since Anaconda includes one that will work.
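A quick sanity check that the Anaconda toolchain is active (the version strings below are illustrative, not guaranteed):
python3 --version                # should report the Anaconda Python, e.g. 3.9.x
python3 -c "import cffi, cupy"   # both should import cleanly (cupy may warn if no GPU is visible on a login node)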
- Check out sailfish
git clone [email protected]:clemson-cal/sailfish.git
- Install dependency Python modules
Pleiades has a good installation of Python 3.9.5 that already includes cffi. The cluster is on CUDA 11.0; the matching cupy version is cupy-cuda110.
module load python3
pip3 install --upgrade --user cupy-cuda110 numpy loguru
Note: the final two dependencies are only needed for work on the experimental sailfish v0.6. NumPy needs to be upgraded from the system 1.20 to the newest 1.23, and sailfish v0.6 depends on the loguru module for logging.
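A hedged way to confirm that the upgraded packages are the ones being picked up:
python3 -c "import numpy; print(numpy.__version__)"     # expect 1.23.x rather than the system 1.20
python3 -c "import loguru; print(loguru.__version__)"   # only needed for sailfish v0.6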
- Get an interactive compute node (the command below will give you one of the nodes with 8 NVIDIA V100s)
qsub -I -l select=1:ncpus=1:model=sky_gpu:ngpus=8,walltime=0:30:00 -q devel@pbspl4
- Load modules
module load python3 cuda
- Go to the sailfish directory and run something
cd sailfish
bin/sailfish -g kitp-code-comparison
Here is a sample submission script for running on Pleiades (note the use of the v100@pbspl4 queue).
#PBS -N r10-n2k-e00-q05
#PBS -l select=1:ncpus=1:model=sky_gpu:ngpus=4,walltime=24:00:00
#PBS -q v100@pbspl4
module load python3 cuda
cd $PBS_O_WORKDIR
bin/sailfish kitp-code-comparison \
--model which_diagnostics=forces \
eccentricity=0.0 mass_ratio=0.5 \
sink_radius=0.03 softening_length=0.03 \
--new-timestep-cadence=10 --cfl=0.2 --patches=4 -g \
--resolution=2000 --checkpoint=50.0 --timeseries=0.01 -e 3000 \
-o data/r10-n2k-e00-q05 > r10-n2k-e00-q05.out
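Assuming the script above is saved as run.pbs (the filename is arbitrary), it would be submitted with:
qsub run.pbs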
To check on your runs on a GPU queue, you also need to specify the queue name for the qstat command, e.g.
qstat v100@pbspl4 -u your-username
- Check out sailfish
git clone [email protected]:clemson-cal/sailfish.git
- Create a Singularity overlay. Install dependency Python modules to the environment when instructed:
pip3 install cupy-cudaXXX # the cupy wheel must match the CUDA version of your Singularity image (e.g. cupy-cuda112 for CUDA 11.2)
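A minimal sketch of that install step, assuming the same overlay and image paths as the job script below, with the overlay mounted read-write (:rw) so the pip install persists:
singularity exec --nv \
    --overlay /scratch/<NetID>/my_singularity/my_example.ext3:rw \
    /scratch/work/public/singularity/cuda11.2.2-cudnn8-devel-ubuntu20.04.sif \
    /bin/bash -c "source /ext3/env.sh; pip3 install cupy-cuda112"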
- Write a SLURM job submission script that starts your Singularity image and runs Sailfish; e.g. run.sbatch:
#!/bin/bash
#SBATCH --job-name=my-job
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4
#SBATCH --time=48:00:00
#SBATCH --gres=gpu:4 # gres=gpu:v100:4 to request only v100's
#SBATCH --mem=2GB
module purge
singularity exec --nv \
--overlay /scratch/<NetID>/my_singularity/my_example.ext3:ro \
/scratch/work/public/singularity/cuda11.2.2-cudnn8-devel-ubuntu20.04.sif \
/bin/bash -c \
"source /ext3/env.sh; python ~/sailfish/bin/sailfish kitp-code-comparison --mode=gpu --patches=4"
# Note: in GPU mode (only), choose --patches=<num GPUs>
- Submit to the queue
sbatch run.sbatch
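To check on or cancel the job afterwards, the usual SLURM commands apply:
squeue -u $USER      # show the job's status
scancel <job-id>     # cancel it if needed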
- Log into the alice1 or alice2 login nodes
- Load the Python module
module load Python/3.9.5-GCCcore-10.3.0
- Create a virtual environment (e.g. named venv_sailfish) and activate it
python -m venv venv_sailfish
source venv_sailfish/bin/activate
- Install dependencies
pip install --upgrade pip
pip install --upgrade setuptools
pip install wheel
pip install numpy
pip install cffi # if you want to run on CPUs
pip install cupy # if you want to run on GPUs
pip install matplotlib # if you want to run the plotting script on the cluster
- Check out sailfish
git clone https://github.com/clemson-cal/sailfish.git
- Here is an example job submission script
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --mail-user="[email protected]"
#SBATCH --mail-type="ALL"
#SBATCH --time=00:01:00
#SBATCH --partition=gpu-short
#SBATCH --output=example%j.out
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=150gb
#SBATCH --gres=gpu:1
cd /home/your-username/
module load Python/3.9.5-GCCcore-10.3.0
source venv_sailfish/bin/activate
export CUPY_ACCELERATORS=cub
nvidia-smi
echo $SLURM_JOB_NODELIST   # report the node(s) allocated to this SLURM job
cd sailfish
bin/sailfish circumbinary-disk -g -c1.0 -e1.0 -o /home/your-username/data1 > example_output_file
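Assuming the script above is saved as example.sbatch (a hypothetical name), it is submitted with:
sbatch example.sbatch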
The primary difference when running on a workstation is installing the correct version of cupy for your machine.
On a workstation running CUDA version 11.7 with an NVIDIA GPU, one would run
pip3 install --user cupy-cuda117 # only needed for running in GPU mode
To check which CUDA version is installed, you can run
nvidia-smi
At the moment, cupy only seems to support versions 4.3 and 5.0 of ROCm. You can check which version is installed using
apt show rocm-libs # on a system with the apt package manager
yum info rocm-libs # on a system with the yum package manager
Then, install the version of cupy corresponding to your ROCm version, e.g.
pip3 install --user cupy-rocm-5-0
General cupy install for the ROCm runtime (requires a modern GCC to handle std::enable_if statements in the HIP libraries):
export HCC_AMDGPU_TARGET=<gfxid> # this is under the Name section when running rocminfo; MI50s give gfx906
export __HIP_PLATFORM_HCC__=1
export CUPY_INSTALL_USE_HIP=1
export ROCM_HOME=/opt/rocm # typical rocm install location, could vary from system to system
pip3 install cupy
Note: On ROCm, the cupy module redirects cupy.cuda.runtime calls to the ROCm runtime automatically, so pieces of sailfish source code that seem specialized to the CUDA runtime are actually agnostic to the hardware.
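As a hedged illustration, the same check confirms that cupy can see the device(s) on either runtime:
python3 -c "import cupy; print(cupy.cuda.runtime.getDeviceCount())"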