Home
ctiede edited this page Oct 14, 2022
- Check out sailfish
git clone git@github.com:clemson-cal/sailfish.git
- Install dependency Python modules
pip3 install --user cffi # only needed for running in CPU mode
pip3 install --user cupy-cuda111 # only needed for running in GPU mode
- Get a compute node
qsub -I -l select=1:ncpus=56:mem=200gb:ngpus=2:gpu_model=a100,walltime=24:00:00
- Load the CUDA module
module load cuda/11.1.1-gcc/9.5.0
- Go to the sailfish directory and run something
cd sailfish
./scripts/main.py -g kitp-code-comparison
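Before starting a long run in GPU mode, it can help to confirm that cupy imports and can see the GPUs on the compute node. The snippet below is a minimal sketch and is not part of sailfish. The helper name `visible_gpu_count` is hypothetical; the only cupy call it relies on is `cupy.cuda.runtime.getDeviceCount()`.

```python
import importlib.util


def visible_gpu_count() -> int:
    """Return how many CUDA devices cupy can see, or 0 if cupy is unusable.

    Hypothetical helper for illustration; not part of sailfish.
    """
    if importlib.util.find_spec("cupy") is None:
        # cupy is not installed, so only CPU mode (via cffi) is available.
        return 0
    try:
        import cupy

        return int(cupy.cuda.runtime.getDeviceCount())
    except Exception:
        # cupy is installed but could not load the CUDA runtime or find a
        # device, e.g. when the cuda module has not been loaded.
        return 0


if __name__ == "__main__":
    print(f"GPUs visible to cupy: {visible_gpu_count()}")
```

A result of 0 on a GPU node usually means the cupy wheel does not match the loaded CUDA module, or the module was not loaded at all.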
- Check out sailfish
git clone git@github.com:clemson-cal/sailfish.git
- Create a Singularity overlay. When instructed, install the Python dependencies into the environment:
pip3 install cupy-cudaXXX # the cupy wheel must match the CUDA version of your Singularity image (e.g. cupy-cuda112 for CUDA 11.2)
- Write a SLURM job submission script that starts your Singularity image and runs Sailfish; e.g. run.sbatch:
#!/bin/bash
#SBATCH --job-name=my-job
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4
#SBATCH --time=48:00:00
#SBATCH --gres=gpu:4 # use gres=gpu:v100:4 to request only V100s
#SBATCH --mem=2GB
module purge
singularity exec --nv \
--overlay /scratch/<NetID>/my_singularity/my_example.ext3:ro \
/scratch/work/public/singularity/cuda11.2.2-cudnn8-devel-ubuntu20.04.sif \
/bin/bash -c \
"source /ext3/env.sh; python ~/sailfish/scripts/main.py kitp-code-comparison --mode gpu --patches 4" # patches match cpus-per-task/num gpus
- Submit to the queue
sbatch run.sbatch
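The inline comment in run.sbatch ties `--patches` to the number of GPUs requested. A minimal sketch of that relationship follows, reading the comment as "one patch per GPU", which is an assumption. The helper `sailfish_gpu_command` is hypothetical; the script path, setup name, and flags are copied from the job script above.

```python
def sailfish_gpu_command(num_gpus: int, setup: str = "kitp-code-comparison") -> str:
    """Build the sailfish command line for a GPU run with one patch per GPU.

    Hypothetical helper for illustration; the one-patch-per-GPU rule is an
    assumption based on the comment in the job script.
    """
    if num_gpus < 1:
        raise ValueError("at least one GPU is required for GPU mode")
    args = [
        "python",
        "~/sailfish/scripts/main.py",
        setup,
        "--mode", "gpu",
        "--patches", str(num_gpus),
    ]
    # Plain join keeps the leading ~ unquoted so the shell can expand it.
    return " ".join(args)


if __name__ == "__main__":
    # Matches --gres=gpu:4 in the example job script.
    print(sailfish_gpu_command(4))
```

If you change `--gres=gpu:N` in the job script, changing `--patches` to the same N keeps the two in step.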
The primary difference when running on a workstation is installing the correct version of cupy for your machine.
On a workstation running CUDA version 11.7 with an NVIDIA GPU, one would run
pip3 install --user cupy-cuda117 # only needed for running in GPU mode
To check which CUDA version is installed, you can run
nvidia-smi
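The wheel names used on this page follow the pattern `cupy-cuda` plus the major and minor CUDA version digits. The sketch below pulls the version out of the `nvidia-smi` header and builds a wheel name on that pattern. Both helper names are hypothetical and the sample header line is illustrative only. Note that `nvidia-smi` reports the highest CUDA version the driver supports, and that newer cupy releases publish wheels under other names (such as `cupy-cuda11x`), so check PyPI before installing.

```python
import re
from typing import Optional


def cuda_version_from_smi(smi_output: str) -> Optional[str]:
    """Extract 'major.minor' from the 'CUDA Version:' field of nvidia-smi output."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


def cupy_wheel_for_cuda(version: str) -> str:
    """Build a wheel name using the cupy-cudaXYZ pattern shown on this page.

    Assumption: the pattern holds for your CUDA version; this is not
    guaranteed for every cupy release.
    """
    major, minor = version.split(".")[:2]
    return f"cupy-cuda{major}{minor}"


# Illustrative header line, not captured from a real machine.
sample = "| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |"

if __name__ == "__main__":
    version = cuda_version_from_smi(sample)
    if version is not None:
        print(version, cupy_wheel_for_cuda(version))  # 11.7 cupy-cuda117
```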
At the moment, cupy seems to support only ROCm versions 4.3 and 5.0. You can check which version is installed using
apt show rocm-libs # on a system with the apt package manager
yum info rocm-libs # on a system with the yum package manager
Then, install the version of cupy corresponding to your ROCm version, e.g.
pip3 install --user cupy-rocm-5-0
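The mapping from ROCm version to wheel name can be written as a small lookup. This is a sketch that covers only the two versions mentioned above; the helper name `cupy_wheel_for_rocm` is hypothetical, and the table may be out of date for newer cupy releases.

```python
# ROCm versions mentioned on this page and their cupy wheel names.
ROCM_WHEELS = {
    "4.3": "cupy-rocm-4-3",
    "5.0": "cupy-rocm-5-0",
}


def cupy_wheel_for_rocm(version: str) -> str:
    """Return the cupy wheel name for a ROCm version string such as '5.0.2'.

    Hypothetical helper for illustration; only major.minor is compared.
    """
    key = ".".join(version.split(".")[:2])
    if key not in ROCM_WHEELS:
        supported = ", ".join(sorted(ROCM_WHEELS))
        raise ValueError(f"ROCm {key} is not in the known list ({supported})")
    return ROCM_WHEELS[key]


if __name__ == "__main__":
    print(cupy_wheel_for_rocm("5.0.2"))  # cupy-rocm-5-0
```

Raising an error for unlisted versions is deliberate: installing a wheel built for a different ROCm version is likely to fail at import time.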