Home
- Check out sailfish
git clone git@github.com:clemson-cal/sailfish.git
- Install dependency Python modules
pip3 install --user cffi # only needed for running in CPU mode
pip3 install --user cupy-cuda111 # only needed for running in GPU mode
- Get an interactive compute node
qsub -I -l select=1:ncpus=56:mem=200gb:ngpus=2:gpu_model=a100,walltime=24:00:00
- Load the cuda module
module load cuda/11.1.1-gcc/9.5.0
- Go to the sailfish directory and run something
cd sailfish
./scripts/main.py -g kitp-code-comparison
To redirect your data outputs to the zfs filesystem, add the flag -o /zfs/warpgate/your-username.
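Before requesting a node, it can help to confirm which of the two dependency modules is actually importable. The following is a minimal sketch, assuming only what the install step states (cffi for CPU mode, cupy for GPU mode); the helper name `available_modes` is hypothetical and not part of sailfish.

```python
import importlib.util

# Run mode -> Python module it depends on, as listed in the install step.
REQUIRED_MODULE = {"cpu": "cffi", "gpu": "cupy"}


def available_modes(find_spec=importlib.util.find_spec):
    """Return the run modes whose dependency module can be located.

    find_spec is injectable so the check can be exercised without
    actually installing cffi or cupy.
    """
    return [
        mode
        for mode, module in REQUIRED_MODULE.items()
        if find_spec(module) is not None
    ]


if __name__ == "__main__":
    print("usable modes:", available_modes() or "none")
```

This only checks that the module can be found, not that a GPU is present or that the cupy wheel matches the loaded CUDA module.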
- Check out sailfish
git clone git@github.com:clemson-cal/sailfish.git
- Install dependency Python modules
Pleiades has a good installation of Python 3.9.5 that already includes cffi. The system uses CUDA 11.0; the matching cupy version is cupy-cuda110.
module load python3
pip3 install cupy-cuda110
- Get an interactive compute node (below will give you one of the nodes with 8 NVIDIA V100's)
qsub -I -l select=1:ncpus=1:model=sky_gpu:ngpus=8,walltime=0:30:00 -q devel@pbspl4
- Load modules
module load python3 cuda
- Go to the sailfish directory and run something
cd sailfish
./scripts/main.py -g kitp-code-comparison
Here is a sample submission script for running on Pleiades (note the use of the v100@pbspl4 queue).
#PBS -N r10-n2k-e00-q05
#PBS -l select=1:ncpus=1:model=sky_gpu:ngpus=4,walltime=24:00:00
#PBS -q v100@pbspl4
module load python3 cuda
cd $PBS_O_WORKDIR
./scripts/main.py kitp-code-comparison \
--model which_diagnostics=forces \
eccentricity=0.0 mass_ratio=0.5 \
sink_radius=0.03 softening_length=0.03 \
--new-timestep-cadence=10 --cfl=0.2 --patches=4 -g \
--resolution=2000 --checkpoint=50.0 --timeseries=0.01 -e 3000 \
-o data/r10-n2k-e00-q05 > r10-n2k-e00-q05.out
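When several runs differ only in a few model parameters, the script above can be generated rather than copied by hand. The sketch below is one way to do that, reusing only the flags and PBS directives shown in the sample; the function `pbs_script` and its arguments are hypothetical, and setting `--patches` equal to the GPU count is an assumption taken from the sample (4 GPUs, 4 patches).

```python
def pbs_script(name, eccentricity, mass_ratio, ngpus=4, walltime="24:00:00"):
    """Build a Pleiades submission script modeled on the sample above."""
    header = [
        f"#PBS -N {name}",
        f"#PBS -l select=1:ncpus=1:model=sky_gpu:ngpus={ngpus},walltime={walltime}",
        "#PBS -q v100@pbspl4",
        "module load python3 cuda",
        "cd $PBS_O_WORKDIR",
    ]
    # The sailfish command, joined onto one line instead of using
    # backslash continuations.
    command = " ".join([
        "./scripts/main.py kitp-code-comparison",
        "--model which_diagnostics=forces",
        f"eccentricity={eccentricity} mass_ratio={mass_ratio}",
        "sink_radius=0.03 softening_length=0.03",
        f"--new-timestep-cadence=10 --cfl=0.2 --patches={ngpus} -g",
        "--resolution=2000 --checkpoint=50.0 --timeseries=0.01 -e 3000",
        f"-o data/{name} > {name}.out",
    ])
    return "\n".join(header + [command]) + "\n"


if __name__ == "__main__":
    print(pbs_script("r10-n2k-e00-q05", eccentricity=0.0, mass_ratio=0.5), end="")
```

The output can be written to a file and passed to qsub as usual.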
To check on your runs on a GPU queue, you also need to pass the queue name to the qstat command, e.g.
qstat v100@pbspl4 -u your-username
- Check out sailfish
git clone git@github.com:clemson-cal/sailfish.git
- Create a Singularity overlay. Install dependency Python modules to the environment when instructed:
pip3 install cupy-cudaXXX # cupy wheel must match the Cuda version of your Singularity image (e.g. cupy-cuda112 for Cuda/11.2)
- Write a SLURM job submission script that starts your Singularity image and runs Sailfish; e.g. run.sbatch:
#!/bin/bash
#SBATCH --job-name=my-job
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4
#SBATCH --time=48:00:00
#SBATCH --gres=gpu:4 # gres=gpu:v100:4 to request only v100's
#SBATCH --mem=2GB
module purge
singularity exec --nv \
--overlay /scratch/<NetID>/my_singularity/my_example.ext3:ro \
/scratch/work/public/singularity/cuda11.2.2-cudnn8-devel-ubuntu20.04.sif \
/bin/bash -c \
"source /ext3/env.sh; python ~/sailfish/scripts/main.py kitp-code-comparison --mode gpu --patches 4" # patches match cpus-per-task/num gpus
- Submit to the queue
sbatch run.sbatch
The primary difference when running on a workstation is installing the correct version of cupy for your machine.
On a workstation running CUDA version 11.7 with an NVIDIA GPU, one would run
pip3 install --user cupy-cuda117 # only needed for running in GPU mode
To check which CUDA version is installed, you can run
nvidia-smi
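The wheel names used on this page follow one pattern: the CUDA major and minor version digits appended to `cupy-cuda` (11.0 gives cupy-cuda110, 11.1 gives cupy-cuda111, 11.7 gives cupy-cuda117). A minimal sketch of that mapping is below; the helper name `cupy_wheel_name` is hypothetical. Note that newer cupy releases publish combined wheels such as cupy-cuda11x and cupy-cuda12x instead, so check the cupy installation guide for the release you intend to use.

```python
def cupy_wheel_name(cuda_version):
    """Map a CUDA version string such as "11.7" to a per-version wheel name.

    Only the major and minor components are used; a patch component
    (as in "11.1.1") is ignored.
    """
    major, minor = cuda_version.split(".")[:2]
    return f"cupy-cuda{int(major)}{int(minor)}"


if __name__ == "__main__":
    # The version string would come from the nvidia-smi output.
    print("pip3 install --user", cupy_wheel_name("11.7"))
```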
At the moment, cupy only seems to support versions 4.3 and 5.0 of ROCm. You can check which version is installed using
apt show rocm-libs # on a system with the apt package manager
yum info rocm-libs # on a system with the yum package manager
Then, install the version of cupy corresponding to your ROCm version, e.g.
pip3 install --user cupy-rocm-5-0
General cupy install for the ROCm runtime (requires a modern GCC to handle the std::enable_if statements in the HIP libraries):
$ export HCC_AMDGPU_TARGET=<gfxid> # This is under the Name section when running rocminfo. MI50s give gfx906
$ export __HIP_PLATFORM_HCC__
$ export CUPY_INSTALL_USE_HIP=1
$ export ROCM_HOME=/opt/rocm # Typical rocm install location. Might vary from system to system
$ pip install cupy
The cupy module translates the various cupy.cuda.runtime calls to the corresponding ROCm runtime calls automatically, so the current implementation of sailfish works without changes on ROCm systems.
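To confirm which runtime an installed cupy was built against, one option is to inspect the `is_hip` flag that cupy exposes on `cupy.cuda.runtime`. The sketch below assumes that attribute is present in your cupy version; the helper name `gpu_runtime` is hypothetical and not part of sailfish.

```python
def gpu_runtime():
    """Report which GPU runtime the installed cupy targets.

    Returns "hip" for a ROCm build, "cuda" for a CUDA build, and
    "none" if cupy cannot be imported at all.
    """
    try:
        from cupy.cuda import runtime
    except ImportError:
        return "none"
    return "hip" if runtime.is_hip else "cuda"


if __name__ == "__main__":
    print("cupy runtime:", gpu_runtime())
```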