Computing resources
This document describes information for network computing resources in the Kitzes Lab: servers you can access remotely to run code or download/upload data.
General computing
- Use the command line
- Configure an SSH alias
- Configure an SSH keypair
- Mount network hard drive to personal computer
- Move default environment and package cache
- Transfer data using Globus
Cluster-specific info
- Use h2p, Pitt CRC's computing cluster
- Use Bridges-2 HPC, Pitt Supercomputing Center's computing cluster
- Use Snowy, the lab computational server
- Use robin, the lab Mac server
- Use phoebe, the lab Windows server
The command line is a text interface that is very important for programming.
Here are some common command line commands:
- `ssh kitzeslab@my.remote.pitt.edu` - SSH ("secure shell") into a remote server named `my.remote.pitt.edu` using the "kitzeslab" username
- `pwd` - print working directory. This is the directory, aka the "folder", that you're issuing commands from.
- `ls` - list contents of the current working directory
- `ls /Volumes/seagate1/` - list the contents of the directory `/Volumes/seagate1/`
- `ls -lh` - list the files with more information: `l` = long list, `h` = human-readable file sizes
- `cd` - change directories into your home directory
- `cd ..` - go up one directory, e.g. from `/Volumes/seagate1/` to `/Volumes/`.
- `cd seagate1/` - change directories into `seagate1/`. This is a relative path, i.e. the shell attempts to change into a directory contained within the current working directory.
- `cd /Volumes/seagate1` - change directories into `/Volumes/seagate1/`. This is an absolute path, i.e. you can do this from any working directory.
- `mkdir hello` - make a directory called `hello` within the current directory
- `mkdir /Users/tessa/Recordings/MSD-0001` - make the directory `MSD-0001` within the preexisting directory `/Users/tessa/Recordings/`.
- `scp kitzeslab@<robin's URL>:/Volumes/seagate1/data/field-data/<date folder>/<card folder>/<filename>.WAV .` - copy a file into your current working directory (specified by `.`).
- `scp kitzeslab@<robin's URL>:/Volumes/seagate1/data/field-data/<date folder>/<card folder>/<filename>.WAV /Users/<username>/Recordings` - copy a file into the directory `/Users/<username>/Recordings`.
- `cat myfiles.txt | zip myfiles.zip -@` - zip a list of files (myfiles.txt contains the absolute paths of the files, one per line)
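The difference between relative and absolute paths can be checked without touching the filesystem. This is a minimal Python sketch, not part of the lab's tooling: the helper name `cd` is made up for illustration, and it only does string arithmetic on paths (it ignores symlinks and does not check that the directories exist).

```python
import posixpath


def cd(cwd, target):
    """Return the directory that `cd target` would land in when run from `cwd`."""
    # posixpath.join discards `cwd` when `target` is absolute (starts with "/")
    return posixpath.normpath(posixpath.join(cwd, target))


relative = cd("/Volumes", "seagate1/")               # relative path: result depends on cwd
parent = cd("/Volumes/seagate1", "..")               # ".." goes up one directory
absolute = cd("/Users/tessa", "/Volumes/seagate1")   # absolute path: cwd is ignored
print(relative, parent, absolute)
```

The relative form only works when the current working directory is `/Volumes`, while the absolute form gives the same result from anywhere.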
Git is a program you can use to manage code. For more info on how to use it, see here: https://github.com/kitzeslab/lab-docs/wiki/Using-Git
Rsync lets you copy and sync files between two destinations, including between two computers.
Examples:
rsync /Source/directory/goes/here /Destination/directory/goes/here
rsync [email protected] /Local/place/I/want/files
- Helpful rsync flags
  - `-a` - preserve as much info about files as possible ("archive" mode), e.g. modification times and permissions
  - `-z` - compress files during transfer, e.g. when copying over a network connection
  - `--progress` - show progress for each file
  - `--ignore-existing` - skip files that already exist at the destination
- Example: `rsync -az --progress --ignore-existing [email protected] /Volumes/MyProjectData/`
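If you launch rsync from a Python script, the flags above can be assembled programmatically. This is a rough sketch under stated assumptions: `build_rsync_command` is a hypothetical helper (not an existing lab function), the paths are the placeholders used above, and `--dry-run` is rsync's own option for previewing a transfer without copying anything.

```python
import shlex


def build_rsync_command(source, destination, archive=True, compress=True,
                        progress=True, ignore_existing=True, dry_run=False):
    """Return an rsync command as an argument list suitable for subprocess.run."""
    cmd = ["rsync"]
    short_flags = ("a" if archive else "") + ("z" if compress else "")
    if short_flags:
        cmd.append("-" + short_flags)
    if progress:
        cmd.append("--progress")
    if ignore_existing:
        cmd.append("--ignore-existing")
    if dry_run:
        cmd.append("--dry-run")  # list what would be copied without copying
    cmd += [source, destination]
    return cmd


# Placeholder paths; substitute your own source and destination
cmd = build_rsync_command("/Source/directory/goes/here", "/Volumes/MyProjectData/")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would run the transfer; printing it first is an easy way to double-check the flags.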
tmux allows you to run programs in the background. Use it for long-running programs that you'd like to keep running even if you leave, exit your SSH session, etc.
- New tmux session:
tmux new -s <descriptive-session-name>
- Detach from session: ^B then D
- View sessions:
tmux ls
- Reattach to session:
tmux attach -t <descriptive-session-name>
If the idea of typing in a long address every time you SSH doesn't excite you, it's easy to set up a simple SSH alias so that you can log into Robin with the command `ssh robin`, or into the Center for Research Computing's `h2p` cluster, located at `h2p.crc.pitt.edu`, with `ssh h2p`. You need to be added to these machines before you can access them.
- Open (or create) an SSH config file, `~/.ssh/config`, on your local machine (i.e. the one you're sitting at). Use whatever text editor you're familiar with, or `nano` if you're not sure what to use.
  nano ~/.ssh/config
- Insert these lines with your own Pitt username after `User`, and the correct domain name for `HostName`. The lines under `Host` are indented by convention; OpenSSH accepts either tabs or spaces.
  # use "Host h2p" for the cluster
  Host robin
      # rest of domain name is crc.pitt.edu for the cluster
      HostName %h.<rest of domain name>
      ForwardX11 yes
      ForwardX11Trusted yes
      Compression yes
      # your Pitt username
      User <username>
- Now write your file and exit back to the terminal.
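If you maintain aliases for several machines, the stanza above can be generated rather than typed. This is only a sketch: `render_ssh_host` is a hypothetical helper, `abc123` stands in for your Pitt username, and the output still has to be pasted into `~/.ssh/config` by hand.

```python
def render_ssh_host(alias, domain, user):
    """Return an ssh config stanza for `alias`, mirroring the settings shown above."""
    lines = [
        f"Host {alias}",
        f"    HostName %h.{domain}",  # %h expands to the alias, e.g. h2p.crc.pitt.edu
        "    ForwardX11 yes",
        "    ForwardX11Trusted yes",
        "    Compression yes",
        f"    User {user}",
    ]
    return "\n".join(lines) + "\n"


# "abc123" is a placeholder username
stanza = render_ssh_host("h2p", "crc.pitt.edu", "abc123")
print(stanza)
```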
If you also don't want to type in a password every time you log in, you can set up an automatic login from your personal laptop using a keypair. (Only do this if you're certain your personal computer is secure!) These steps work for both `robin` and the cluster. These steps assume you have already configured an SSH alias (above).
- If you don't already have an SSH keypair, generate one by running this command on your personal computer, then pressing `return` 3x:
  ssh-keygen
- Install `ssh-copy-id` on your personal computer. Mac users: you may need to install Homebrew first.
- Connect to the VPN with the JMR-USER role.
- Use `ssh-copy-id` on your personal computer to copy the public key to your home directory on `robin` or `h2p`:
  brew install ssh-copy-id
  ssh-copy-id -i ~/.ssh/id_rsa.pub robin #replace with h2p for cluster; use the .pub file that ssh-keygen created
- Type your `robin` or `h2p` account password when prompted
If you have followed both the alias and SSH keypair steps above, you can log in without typing a password by simply typing `ssh robin` or `ssh h2p`.
You can mount a hard drive on the VPN (i.e., `rook`) so that you can access it as if it were a hard drive connected to your personal computer.
- Log into Pulse Secure
- Navigate to Finder > Go > Connect to Server (or cmd+K)
- Use `ifconfig` on the desired device to find the IP address or server name
- Enter smb://server.name or smb://ip_address... into the prompt to connect
- If necessary, enter your username and password for the remote disk/system
To allow remote access to a folder from a Mac, such as a folder on Robin, follow these instructions. Note that the path to a shared folder via smb is based on its "shared name" rather than the folder's path on the remote host. For example, use smb://robin.bio.pitt.edu/myfolder rather than smb://robin.bio.pitt.edu/Volumes/lacie/projects1/me/myfolder.
To mount the new storage, `emu`, follow the instructions above but replace smb with nfs in step 4. That is:
- Enter `nfs://snowy_ip_address/media/emu` into the prompt to connect
It will be mounted on your machine as a drive. It will show up in Finder and can also be accessed directly at `/Volumes/emu`.
You might want to change where conda envs and the pip cache are stored to save space in your home (`~`) directory.
Conda environment location:
Edit/create `~/.condarc` to contain the following:
envs_dirs:
  - /Users/nolan/newpath
- You might also want to move existing environments. Open question: do their paths need to be updated afterwards?
Change location of pip cache (where downloaded packages are stored so you don't have to re-download them):
(to do: add instructions)
Globus transfers data between two "endpoint" computers. Many servers, like Pitt's H2P cluster or Bridges, have a pre-created endpoint. For our personal computers or our lab servers, we have to set up our own endpoints. We recommend using Robin instead of a personal computer for these transfers, but you can also set up an endpoint on your personal computer.
First time using Globus: set up a web account
- This step is only needed if you're creating an endpoint on a personal computer
- Go to https://app.globus.org/, select the organization University of Pittsburgh, and log in using your regular Pitt credentials
First time using Globus: set up the endpoint.
- Log in to https://app.globus.org/activity with the correct Pitt account
  - If you are setting up an endpoint on a personal computer, then log in to your own Pitt account
  - If you are setting up an endpoint on a shared computer, then log in to the [email protected] account
- Use an administrator account to install Globus Connect Personal if it's not installed already
- If setting up the endpoint on a shared computer, complete the following steps while logged into the computer using a shared account that everyone can log into. For example, the Robin endpoint is accessible through the Robin `kitzeslab` account. So, you may have to complete the following steps by logging out of the administrator account and logging into the account that everyone can access, if those two accounts are different.
- Open Globus Connect Personal. You should see a dialog box with the option to click "Log In."
  - If not, this computer already has an endpoint associated with it.
  - Each personal endpoint can only be owned by one Globus web account. If you're creating an endpoint on a shared computer, make sure that the endpoint is owned by the [email protected] account. If the endpoint already exists, check that it is accessible by following the steps under "Prep the online transfer" below.
    - If you can open the endpoint on the Globus web app, no further actions need to be taken to set the endpoint up.
    - If you can't open the endpoint on the Globus web app, delete the old endpoint by going to "Preferences" > "General" > Click "Delete Globus Connect Personal configuration and exit", then re-open Globus Connect Personal and the dialog with the Log In button should appear.
- Click "Log In" on the dialog to open a webpage where you can set up the preferences for the endpoint, e.g. a description. Go through the process and exit. It helps to give the endpoint a descriptive name (e.g. indicate that it belongs to [email protected])
If transferring data from an external hard drive: prep the hard drive on your computer
- Log in to the endpoint computer
  - If the endpoint computer is Robin, log in to the `kitzeslab` account on Robin (this is how we access the Robin endpoint)
- Attach the hard drive to your endpoint computer
- Open the Globus Connect Personal app on the computer
- Click the icon in the top menu bar > "Preferences" > "Access"
- If the hard drive does not appear in the dialog box, click the "+" button and navigate to it, pressing Open to add it to the dialog
- Click the "Shareable" checkbox for the hard drive
Open the two endpoints on the Globus web app
- Open the Globus web app on your web browser (https://app.globus.org/activity)
- Log into the Globus app using the correct Pitt account
  - If transferring from Robin, use the `jaklab` Pitt account username and password
  - If transferring from your personal computer, use your personal Pitt account
- Go to File Manager on the webpage
- Open both of your connections in the two boxes labeled "Collection" as follows
- Click in the box and either search or go to "Your collections"
- Robin endpoint: (via the jaklab Globus web account) the collection is under "Your collections" and should be named "Robin (kitzeslab user account)"
- Personal computer endpoint: (via your Pitt Globus web account) the collection is under "Your collections" and should be whatever you named it
- Bridges2 endpoint (via any account): search for the collection named "PSC Bridges2 with XSEDE authentication"
- H2P.CRC endpoint (via any account): search for the collection named "pitt#dtn"
- If your collection doesn't show up, follow the instructions above under "First time using Globus: set up the endpoint"
- For some collections (H2P.CRC and Bridges2) you will have to log in:
- pitt#dtn - log in with either your Pitt username and password or the jaklab username and password
- PSC Bridges 2 - log in with your XSEDE username and password (different from the username and password you use to SSH into Bridges2)
- Navigate to the correct folder on each endpoint by typing in the path underneath the collection name
  - Robin or any Mac computer: `/Volumes/EXTERNAL_HARD_DRIVE_NAME`
  - Bridges2: `/ocean/projects/bio200037p/YOURUSERNAME/data`. Create the "data" folder for your Bridges2 account if it doesn't exist by clicking the "New folder" button
  - CRC: `/bgfs/jkitzes/jaklab/data` (if using the jaklab account to store data) or `/bgfs/jkitzes/YOURUSERNAME/data`. Create the "data" folder for your CRC account if it doesn't exist by clicking the "New folder" button
- Click on the folder of data that you want to transfer and click "Start"
- Check on progress by clicking on "Activity"
- Every 24 hours you will need to refresh your credentials by going to "Activity", clicking on the task, then pressing
- List this new location for the data in the "Location of in-use copies" column of the Datasets and Drives document
We have allocations on the h2p (Hail To Pitt) and Bridges-2 computing clusters.
Both use the SLURM workload management system, but there are some differences in how they are used. User guides are available for h2p and Bridges-2.
H2P is the Pitt CRC's cluster. Justin needs to request an account to give you access.
Get onto the cluster with
ssh h2p.crc.pitt.edu
On h2p, set up a storage directory for yourself. Make a personal storage directory named with your Pitt username within `/bgfs/jkitzes`:
mkdir /bgfs/jkitzes/abc123
- View apparent size of a folder (not compressed size):
du -sh --apparent-size <folder_name>
- Check our lab's storage quota:
crc-squota.py
- View cluster info about which groups of nodes are available: sinfo
  - `mix`: some space available on these nodes
  - `alloc`: all space allocated on these nodes
- View how many nodes and cores are idle on each cluster & partition
crc-idle.py
- General info about wait times: see https://weekly.aws.barrymoo.dev/
- Use JupyterHub: while logged into Pulse, type `hub.crc.pitt.edu` into your browser
  - For installing OPSO: `/ihome/sam/bmooreii/workspace/crc-wrappers/jupyter-kernel/`
module load
module list
module unload
module purge
which python
An interactive session can be used to run jobs directly on the command line without using the SLURM scheduler.
Start an interactive bash session with one core, for 2 hours (more info here): srun -n1 -t02:00:00 --pty bash
To submit jobs on the cluster, you need three things:
- The script you want to run
- The virtual environment to run the script in
- A slurm file that sets up the environment and runs the script
Typical slurm script for CRC:
#!/usr/bin/env bash
#SBATCH --job-name=train_bbcu_largeimage
#SBATCH --output=train_bbcu_largeimage.out
#SBATCH --error=train_bbcu_largeimage.err
#SBATCH --time=0-05:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cluster=gpu
#SBATCH --partition=a100
#SBATCH --cpus-per-task=12
#SBATCH --gres=gpu:2
echo "number of cpus:"
echo $SLURM_JOB_CPUS_PER_NODE
source activate opso_dev
python train.py
This is the code you want to run. Make sure to test out the code locally first. On the cluster, you should never run the script directly from the command line (e.g. don't call `python my_script.py`). When it comes time to use the script on the cluster, follow the next two steps to set it up using `slurm`. If your analysis is large, create a smaller version of it first and submit that to the cluster using `slurm`, just as you will for your larger script. This will let you test that your slurm pipeline works, as well as that the results, timing, etc. are what you expected.
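One way to get a "smaller version" of an analysis without maintaining a second script is a command-line option that caps the number of inputs. The sketch below is illustrative only: the `--limit` option, the function names, and the placeholder `analyze` are assumptions, not part of any lab script.

```python
import argparse
import time


def parse_args(argv=None):
    """Parse options; --limit keeps only the first N inputs for a quick test run."""
    parser = argparse.ArgumentParser(description="Run the analysis on all or some input files.")
    parser.add_argument("--limit", type=int, default=None,
                        help="only process the first N files (for small test runs)")
    return parser.parse_args(argv)


def analyze(path):
    """Placeholder analysis: replace with the real per-file work."""
    return len(path)


def run(files, limit=None):
    """Analyze `files` (or the first `limit` of them) and report a runtime estimate."""
    subset = files if limit is None else files[:limit]
    start = time.perf_counter()
    results = [analyze(f) for f in subset]
    elapsed = time.perf_counter() - start
    per_file = elapsed / len(subset) if subset else 0.0
    print(f"processed {len(subset)} of {len(files)} files in {elapsed:.2f} s; "
          f"estimated full run: {per_file * len(files):.2f} s")
    return results


# Equivalent to running: python my_script.py --limit 2
args = parse_args(["--limit", "2"])
small_run = run(["a.wav", "bb.wav", "ccc.wav"], limit=args.limit)
```

In a real script you would call `parse_args()` with no arguments so the options come from the command line, submit it through slurm with `--limit`, and then drop the option for the full run.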
You must create a virtual environment for your script to use. This consists of two steps: loading the modules for your programming language, then installing packages in a virtual environment.
To load modules, use the `module purge` and `module load` commands. For example, the following commands load the Python 3.7 module. The `venv/wrap` module also needs to be loaded in order to create a virtual environment.
module purge #Removes all previous modules
module load python/3.7.0 venv/wrap #Loads the Python module and the virtualenvwrapper module
Most Python module loads will be identical to the one above, but you can use search tools like `module spider` to search for other modules. See the Application Environment section of the CRC help page for more information.
Once you've loaded these modules, use the following commands to make a pip-based virtual environment:
mkvirtualenv my_env_name #create a new virtual environment named "my_env_name"
workon my_env_name #activate the virtual environment "my_env_name" (it will automatically be activated when you initially create it)
pip install my_packages_to_install #e.g. opensoundscape==0.4.1 #install packages inside of the activated environment
deactivate #leave the environment
The slurm script is the script submitted to the cluster to run your personal script. It contains information about where the output of your script should be stored, what the job should be named when you look it up on the cluster, how much time the script should be allowed to run, which computing resources should be used, etc.
Typically this script is named the same thing as your script, but with a `.slurm` extension. For instance, with `my_script.py` you might choose to name the slurm script `my_script.slurm`. This script is typically in the same directory as your Python script; strange things can happen if not.
The following is an example of a slurm script with comments to show the meaning of each of the settings
#!/usr/bin/env bash
# The line above should be the first line of every script, with nothing else on it
#SBATCH --job-name=my_script_20200101 # The name of the script when looking it up on the cluster
#SBATCH --output=my_script.out # Where to save outputs. Instead of printing outputs in the terminal, the print statements, errors, etc. generated by your script will be saved in this file.
#SBATCH --time=1-12:00:00 # days-hh:mm:ss - how long your script should run before cancelling. Select this by testing a small version of the script with "time" print statements to get a good estimate of runtime.
#SBATCH --nodes=1 # how many computing nodes to use
#SBATCH --ntasks-per-node=3 # how many tasks (processes) to launch on each node
#SBATCH --cpus-per-task=1 # how many CPUs each task can use
#SBATCH --cluster=gpu # which cluster to use -- different clusters have different features. Some are busier than others
#SBATCH --partition=v100 # which partition to use
#SBATCH --gres=gpu:1 # The number of GPUs to use, if you are using GPUs
#SBATCH --mem=100G # how much RAM the script should be given
# Run the same commands you would run to use your virtual environment
module purge
# Cuda is needed to run script on GPUs
module load python/3.7.0 venv/wrap cuda/10.0.130
# Activate the virtual environment you created above
workon opso_0.4.1
# Call your script
python my_script.py
# Output some stats about the job afterwards.
crc-job-stats.py
When customizing this file, pay particular attention to the SBATCH settings at the top of the file. The node configuration help on the CRC website will help you decide on settings like `cluster` and `partition`.
Try to set the "time" setting with a little bit of wiggle room, so that your script is sure to complete in the allotted time. If the job isn't complete before the timer is up, it will be cancelled, and your progress may be lost! To prevent losing much progress, it can help to set up your script to save incrementally (e.g., if the script does many analyses, it could write one analysis to a file before moving on to the next analysis). The amount of time a job takes on the cluster can be variable: for instance, a job that exceeds a time cutoff one day can run two hours under the cutoff the next day. It's unclear why there is this variability.
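Incremental saving can be as simple as writing one output file per analysis and skipping any that already exist when the job is resubmitted. The following is a minimal sketch under stated assumptions: `analyze`, `run_incrementally`, and the one-JSON-file-per-input layout are made up for illustration and would need adapting to your own script.

```python
import json
import tempfile
from pathlib import Path


def analyze(name):
    """Placeholder analysis: replace with the real work for one input."""
    return {"name": name, "length": len(name)}


def run_incrementally(inputs, out_dir):
    """Write one result file per input, skipping inputs finished in a previous run."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    n_new = 0
    for name in inputs:
        out_file = out_dir / f"{name}.json"
        if out_file.exists():
            continue  # already done before the job was cancelled or resubmitted
        tmp_file = out_file.with_suffix(".tmp")
        tmp_file.write_text(json.dumps(analyze(name)))
        tmp_file.replace(out_file)  # rename last so a partial file never looks finished
        n_new += 1
    return n_new


# Demo in a throwaway directory: the second call only processes the new input "c"
with tempfile.TemporaryDirectory() as demo_dir:
    first = run_incrementally(["a", "b"], demo_dir)
    second = run_incrementally(["a", "b", "c"], demo_dir)
print(first, second)
```

If a job is cancelled at the time limit, resubmitting the same slurm script then resumes from the first unfinished input instead of starting over.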
After you have customized this script, submit the job to the cluster using `sbatch ./my_script.slurm`.
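When filling in `--time` before submitting, the wiggle room discussed above can be made explicit by padding a runtime estimate and converting it to the `days-hh:mm:ss` format. This is a small sketch; `slurm_time` and the default 25% margin are assumptions chosen for illustration, not a lab convention.

```python
import math


def slurm_time(estimated_seconds, margin=1.25):
    """Format a padded runtime estimate as days-hh:mm:ss for `#SBATCH --time`."""
    total = math.ceil(estimated_seconds * margin)  # pad the estimate, round up to whole seconds
    days, remainder = divmod(total, 86400)
    hours, remainder = divmod(remainder, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


# A job estimated at 4 hours, padded by 25%
print(slurm_time(4 * 3600))
# A job estimated at 1 day, padded by 50%
print(slurm_time(24 * 3600, margin=1.5))
```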
Some useful commands allow you to check on the status of the cluster and your submitted scripts. In particular, `crc-squeue.py` will show you whether your submitted jobs have started running, and if so, how long they have been running. You can use `sinfo` to see how busy each cluster/partition is, as it will take longer for resources to be freed on busier partitions (e.g. those with "mix"). More information on these commands can be found here.
Another way to check the status of your job is to inspect the output, e.g. `cat my_script.out`. The output file will not be created until the job is running on the cluster. In some cases, we have found that the output file is not created or populated until the script runs--this can be confusing if your job times out before the output file is created.
Bridges-2 User Guide/Docs: https://www.psc.edu/resources/bridges-2/user-guide/
Submit a help ticket: [email protected]
To use Bridges-2 you will need to create an XSEDE account at https://portal.xsede.org/. Then ask Justin to add your account to the group.
First, make sure you have SSH set up by logging into the XSEDE website and setting up dual authentication.
Set up a password for the PSC at https://apr.psc.edu/
Log in to `bridges2.psc.edu` via SSH directly (default port 22) instead of through XSEDE.
sinfo
sacctmgr show cluster
check a node: scontrol show node r007
Check our allocations: projects
find available software
module spider
module avail python
load python using Anaconda:
module load anaconda3/2020.07
We will typically use these partition types:
- RM-shared (regular memory CPUs, 128 per node, request up to 128)
- GPU-shared (node with 8 GPUs, 5 CPUs per GPU, request up to 4)
- RM (request full 128-CPU nodes)
- GPU (request full 8-GPU nodes)
Places to store files:
Home space: $HOME - for small stuff like scripts
$PROJECT - storage (10 TB)
Node-local ($LOCAL): disk that can be used from within a node during a job (temporary! cleared when the job ends)
Put shared data in /ocean/projects/bio200037p/shared/
Currently bridges2 only allows people in the same group to read each others' files in this shared folder.
To add read permissions to a file in the shared folder, chmod +r [file]
(can't add write permissions)
Can use Globus to transfer files (see "Transfer data using Globus")
Put shared data in the shared locations described above
Don't run jobs on login nodes (the first thing you get to when you log in). Use either an interactive job, slurm, or OnDemand to run code.
Access OnDemand at ondemand.bridges2.psc.edu; it can be used to start JupyterHub / RStudio servers
Easily start an interactive job with 1 cpu: interact
Use `interact --help` for more options
Make a slurm script, probably in your home directory. it might look like this:
#!/usr/bin/env bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=RM
#SBATCH --time=00:05:00
#SBATCH --output=log.out
source ~/.bashrc #so that we can use workon
module load anaconda3/2019.10 #2020.11 seems to be working fine as well
module load ffmpeg/4.3.1 #necessary for loading mp3 files in opso
module load cuda #if you use gpu with pytorch you'll need cuda
workon opensoundscape-QPriMHYU-py3.7 #activate your virtualenv
#do stuff
python script.py
#print some useful job stats
python /jet/home/sammlapp/scripts/jobstats.py
Don't run `module purge`. If you do, you'll need to get the sbatch command back with `module load slurm`
Submit the slurm job
sbatch script.slurm
view current jobs
squeue -u [USER]
cancel job (get job number from squeue)
scancel [job#]
https://www.psc.edu/resources/bridges-2/user-guide-2/
Use the `GPU-shared` partition to request part of a node.
- Max number of GPUs you can request on GPU-shared is 4 (each node has 8 GPUs)
- Each gpu has 5 CPUs; request the correct number
Add ‘module load cuda’ to your slurm script
Example slurm script:
#!/usr/bin/env bash
#SBATCH --job-name=train_feat
#SBATCH --output=train.out
#SBATCH --time=1:00:00
#SBATCH --nodes=1
#SBATCH --partition=GPU-shared
#SBATCH --ntasks-per-node=5 #Number of CPUs to use; 5xGPUs
#SBATCH --gpus=1 #8 per node, max=4 on GPU-shared
source ~/.bashrc
module load anaconda3/2020.07
module load ffmpeg/4.3.1
#module load cuda
workon opensoundscape-pSZCvE5P-py3.8
python train.py
python ~/scripts/jobstats.py
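The CPU-to-GPU arithmetic behind the `--ntasks-per-node` and `--gpus` lines above can be sketched as follows. The limits (5 CPUs per GPU, at most 4 GPUs on GPU-shared) are the ones stated in this section; the helper name `gpu_shared_directives` is hypothetical.

```python
CPUS_PER_GPU = 5        # each GPU comes with 5 CPUs (from the notes above)
MAX_GPUS_SHARED = 4     # GPU-shared allows at most 4 of a node's 8 GPUs


def gpu_shared_directives(n_gpus):
    """Return the SBATCH lines for a GPU-shared request with `n_gpus` GPUs."""
    if not 1 <= n_gpus <= MAX_GPUS_SHARED:
        raise ValueError(f"GPU-shared accepts 1 to {MAX_GPUS_SHARED} GPUs, got {n_gpus}")
    return [
        "#SBATCH --partition=GPU-shared",
        f"#SBATCH --gpus={n_gpus}",
        f"#SBATCH --ntasks-per-node={n_gpus * CPUS_PER_GPU}",
    ]


directives = gpu_shared_directives(2)
print("\n".join(directives))
```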
First, you'll need an environment with ray
module load anaconda3/2020.07
mkvirtualenv ray
workon ray #if not already activated
pip install opensoundscape ray
in your bash script, use
module load anaconda3/2020.07
workon ray
and in python,
import ray
import os
from glob import glob
# define a function that you want to run in parallel using ray
# add the @ray.remote "decorator" immediately before the function like this
@ray.remote
def analyze(f):
    return f
# initialize ray - it should figure out how many cores are available on its own
ray.init()
# list of inputs for the parallelized function
inputs = glob("./*.wav")
# spin off a bunch of ray tasks and keep their nametags in a list
# later, we'll use the nametags to retrieve the results of each task
# the .remote() call returns the nametag immediately, and queues up the task for its "workers" ie cpus
ray_nametags = [analyze.remote(f) for f in inputs]
# get the results back from the ray tasks, in the order they were submitted
# ray.get(nametag) blocks until the task with that nametag completes, then returns its value
for idx, nametag in enumerate(ray_nametags):
    result = ray.get(nametag)
    print(idx)
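The same submit-then-collect pattern can be tried without installing ray. The sketch below uses Python's standard-library `concurrent.futures` as a stand-in (it is not ray, and it uses threads rather than ray workers); the futures play the role of the "nametags" above, and `analyze` is a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def analyze(f):
    """Placeholder analysis: replace with the real per-file work."""
    return f.upper()


inputs = ["a.wav", "b.wav", "c.wav"]

with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns a future immediately, like .remote() returning a nametag
    futures = [pool.submit(analyze, f) for f in inputs]

    # collect in submission order: result() blocks until that task is done
    in_order = [future.result() for future in futures]

    # or handle tasks in whatever order they finish
    as_finished = [future.result() for future in as_completed(futures)]

print(in_order)
```

Collecting in submission order keeps results aligned with the input list, while `as_completed` is handy when you want to write out each result as soon as it is ready.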
Abbreviated instructions; see opensoundscape.org for full instructions
load additional packages
pip3 install --user virtualenvwrapper
pip3 install --user poetry
Now run `which virtualenvwrapper_lazy.sh` to check that it is in ~/.local/bin/
Then open your ~/.bashrc and add these lines:
export PATH=$PATH:~/.local/bin #or wherever virtualenvwrapper_lazy.sh is
source virtualenvwrapper_lazy.sh
export WORKON_HOME=~/.cache/pypoetry/virtualenvs
Just this time: run `source ~/.bashrc`, since it isn't sourced yet
use virtualenvwrapper as normal. For instance, create an environment:
mkvirtualenv testenv
to deactivate the environment:
deactivate
Navigate to your home directory, since that's where we want to keep opso. Use poetry to build dependencies in a new environment.
Note: you must load the right Python version using `module load <anaconda_name>` before you create the poetry environment (see more info below)
cd ~ #go to home
git clone https://github.com/kitzeslab/opensoundscape.git #get opso
cd opensoundscape #enter directory
#if necessary, check out a branch, eg 'git checkout develop'
module load anaconda3/2020.11 #to get python >= 3.7
poetry install #create an environment with all dependencies and opensoundscape
workon <tab to get env name> #activate environment
Make sure to use the command above to load the correct Python version before running `poetry install`.
- Once you create a poetry environment, its Python version is fixed
- If you need to change the Python version:
  - delete the environment, e.g. `poetry env remove opensoundscape-pSZCvE5P-py3.8`
  - load a module with the correct Python version, e.g. `module load anaconda3/2020.11`
  - run `poetry install`
To monitor the GPU/CPU usage of a job on the cluster, you can use the same `nvidia-smi` and `htop` command line utilities as you would use on a local machine like Snowy. To monitor usage of a currently-running job:
- Find the node on which the job is running. You can view the names, run time and nodes of your currently running jobs using `squeue -u YOURUSERNAME`.
- When you first ssh into bridges, you are on a login node. From a login node you can ssh into the node your job is running on, e.g. if your job is running on node v001, from the login node: `ssh v001.ib.bridges2.psc.edu`.
- Run `htop` to look at CPU usage or `nvidia-smi` to look at GPU usage.
Snowy is a local GPU machine that we use as a server and workhorse for GPU and CPU heavy tasks.
You can access Snowy via SSH. Look at the lab's IP address list if you need snowy's IP address.
Ask Sam if you need login credentials for snowy.
We use the ~/projects folder as a home directory for projects. Create a subfolder with your username, and then further subfolders for each project. For instance, ~/projects/sml161/SONG50_i_now_speak_raven.
Python environments are managed with conda. Always install pip in a new environment. Name environments starting with your Pitt username or initials, e.g. sml161_py37env.
First, install Miniconda using the following steps.
- Find the latest installer for the needed Python version here
- Use `wget` in the terminal and copy and paste the link to download the installer, e.g. `wget https://repo.anaconda.com/miniconda/Miniconda3-py310_23.1.0-1-Linux-x86_64.sh`
- After the download is complete, use `bash` to start the installer and follow the instructions, e.g. `bash Miniconda3-py310_23.1.0-1-Linux-x86_64.sh`. Make sure to select the option to initialize Miniconda.
- Log out and log back in again, or source your bashrc (`source ~/.bashrc`), to load the newly installed `conda` command
Now create your own virtual environment.
# To exit an environment if you're already in one
conda deactivate
# To create your environment
conda create -n username_environmentName python=3.8 pip
# Activate your environment to install packages
conda activate username_environmentName
#if desired, install opensoundscape:
pip install opensoundscape==0.10.0
# to use Snowy's GPUs you need to install PyTorch in a very specific way
# (last updated October 2023)
# this often changes, and we need to update the strategy regularly
# as of Oct 2024, we can actually just use the most up to date package versions, so you might get the
# right version when you just `pip install opensoundscape`. If you get cuda errors try this:
pip uninstall torch torchvision # remove previous versions
pip3 install torch torchvision torchaudio
# See this page for possible updates or the correct version for different hardware:
# https://pytorch.org/get-started/locally/
# To check the current CUDA version, run `nvidia-smi` and look at the upper-right of the panel for CUDA Version.
# As of Oct 2024 we have 12.2 but it works fine with the current torch packages designed for 12.4.
#if desired, install jupyter-lab
conda install -c conda-forge jupyterlab
Note that you need to install pip in the new environment, otherwise, "pip install"-ed packages will be installed in the base environment!
Note 2: if your environment is using the wrong ffmpeg (you might get "no backend" errors, for instance), it may be that in the above command, conda installed a local ffmpeg but you want to use the system one. It worked for me to simply delete the one in the conda environment, though this is very hacky (`rm -r /home/kitzeslab/miniconda3/envs/opso060/bin/ffmpeg`).
- SSH into snowy
- If jupyter-lab is not installed, install it
conda install -c conda-forge jupyterlab
- Start JupyterLab on port 8080 or any open port. Take note of the URL that is printed
jupyter-lab --no-browser --port=8080
- In a new terminal window on your local machine, use SSH to connect a local port (for instance, 8080) to the port Jupyter is running on (from the command above, by default also 8080). Use snowy's username and IP address, and provide the password if prompted.
ssh -L 8080:localhost:8080 username@remote_ip_address
- Now, open a web browser (Chrome) and visit the URL where Jupyter is running (it's printed in the terminal when you start Jupyter), e.g. http://localhost:8080/?token=....
When you are finished, use Ctrl+C to cancel the jupyter lab process from the command line.
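The port forwarding step is easy to get backwards: the first number in `-L` is the port on your own computer and the second is the port Jupyter uses on snowy. Here is a minimal sketch that builds the command and the matching local URL; `ssh_tunnel` is a hypothetical helper, and `username` and `remote_ip_address` are the same placeholders as above.

```python
import shlex


def ssh_tunnel(user, host, remote_port=8080, local_port=None):
    """Return (ssh command, local URL) for forwarding local_port to remote_port on host."""
    if local_port is None:
        local_port = remote_port  # default: use the same port number on both ends
    command = ["ssh", "-L", f"{local_port}:localhost:{remote_port}", f"{user}@{host}"]
    url = f"http://localhost:{local_port}/"
    return command, url


# Placeholders: substitute your username and snowy's IP address
command, url = ssh_tunnel("username", "remote_ip_address")
print(shlex.join(command))
print(url)

# If local port 8080 is already taken, forward a different local port instead
alt_command, alt_url = ssh_tunnel("username", "remote_ip_address", local_port=9000)
```

With a different local port, the token printed by Jupyter stays the same but the port in the browser URL must be the local one (9000 in the second example).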
To make a conda environment available in JupyterLab, use
ipython kernel install --user --name=environment_name
replacing `environment_name` with the name of your conda environment.
To remove a conda environment:
conda env remove -n env_name
GPU/cuda errors such as this one are often resolved by rebooting the machine (ask other users before rebooting):
Failed to initialize NVML: Driver/library version mismatch
Our lab Mac server, `robin`, can be used for data storage and small analyses. If you don't already have an account, create one by logging into your Pitt computing account while physically sitting at `robin` (not via SSH).
We will also have to ask for you to be added to the JMR-USER VPN role.
To SSH,
- Get onto the JMR-USER VPN using Pulse Secure. (You'll have to select the JMR-USER role.)
  - Download Pulse Secure
  - Make a new connection named "Pitt VPN" with server `sremote.pitt.edu`.
  - Connect to the connection. Read the instructions on which secondary password to use for Duo authentication
  - Enter your Pitt username and password.
  - Select the JMR-USER Lab role after completing the Duo login
- SSH into `robin` with the lab username: `ssh <username>@robin.<rest of domain name>`
- Enter your password for your account on `robin` when prompted.
You can log into the graphical interface of our Mac computer, Robin, remotely. On a Mac machine, e.g. your laptop:
- Log into Pulse Secure
- Open Finder
- Open "Connect to Server" dialogue (press CMD+K or navigate menus: Go > Connect to Server)
- Enter `vnc://robin.<rest of domain name>` and press "Connect"
- In the dialog box that appears, enter the name and password of whatever account you want to use
- This will open up a program called "Screen Sharing" where you can use your mouse and keyboard to interact with Robin as if you are sitting in front of it.
Some things to keep in mind:
- This will allow you to control the screen on the Robin desktop in 103 Clapp. If the monitor is turned on it will look like a ghost is using the computer--and anyone in 103 Clapp could see what you are doing!
- If one user account is using the screen on the desktop and another account tries to log in, the screen sharing program will notify you. In that case, select the “log in as yourself” option in the notification box, and a new screen will be created that doesn’t show up on the desktop.
- For security, it is good practice to log out when you are finished with the Robin GUI. To log out, literally log out on the shared screen itself, e.g. click on the Apple menu on the screen share > Log out. It's not enough to just close the screen sharing program. For some tasks it may not be possible or desirable to log out (e.g. running a program in the background). In that case, just lock your screen using the "Lock screen" option in the Apple menu.
You can log into the graphical interface of our Windows computer, Phoebe, remotely. On a Mac machine, e.g. your laptop:
- Open the Mac App Store and search for Microsoft Remote Desktop
- Install Microsoft Remote Desktop 10
- Connect to your JMR-USER Pulse Secure role
- Open Microsoft Remote Desktop
- Open Connections > Add PC and use the following settings:
- PC name: the IP address of Robin
- Friendly name: anything (e.g. Phoebe)
- Click "add" then double click on the computer icon to connect. Ignore any warnings that might pop up.
- Log out of your account when you are finished to close the window