Merge pull request #55 from NAG-DevOps/openISS-reid-update
Updating README and OpenISS script
smokhov authored Oct 21, 2024
2 parents f2a5a37 + a6159ae commit 4f02bc2
Showing 2 changed files with 38 additions and 34 deletions.
56 changes: 30 additions & 26 deletions src/README.md
@@ -373,49 +373,53 @@ Time is in minutes, run Yolo with different hardware configurations GPU types V1


<!-- TOC --><a name="openiss-reid-tfk"></a>
## OpenISS-reid-tfk
## OpenISS Person Re-Identification Baseline

The following steps will provide the information required to execute the *OpenISS Person Re-Identification Baseline* Project (https://github.com/NAG-DevOps/openiss-reid-tfk) on *SPEED*
The following are the steps required to run the *OpenISS Person Re-Identification Baseline* Project (https://github.com/NAG-DevOps/openiss-reid-tfk) on the *Speed* cluster. This implementation is based on TensorFlow and Keras.

<!-- TOC --><a name="environment"></a>
### Environment
<!-- TOC --><a name="Prerequisites"></a>
### Prerequisites

The pre-requisites to prepare the environment are located in `environment.yml` (https://github.com/NAG-DevOps/openiss-reid-tfk).
#### Dataset
This example uses the Market1501 dataset, which consists of:
- Train images: 12,936
- Query images: 3,368
- Gallery images: 15,913

Using a test dataset (Market1501) and 120 epochs as an example, we ran the script and the results were the following:
Running for 10 epochs as an example, the results for different Speed configurations were:
- Using GPU: 29 minutes
- Using CPUs (32 cores): 6 hours and 49 minutes
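
As a rough illustration of the gap between these two timings, the shell sketch below converts both 10-epoch durations to minutes and computes their ratio with integer arithmetic. The variable names are illustrative only; the figures are the ones quoted above, and the ratio is truncated to one decimal place rather than rounded.

```shell
#!/bin/sh
# Timings quoted above for 10 epochs on Market1501.
gpu_minutes=29
cpu_minutes=$(( 6 * 60 + 49 ))   # 6 hours and 49 minutes

# Integer arithmetic only: scale by 10 to keep one decimal place.
ratio_x10=$(( cpu_minutes * 10 / gpu_minutes ))
speedup="$(( ratio_x10 / 10 )).$(( ratio_x10 % 10 ))"

echo "CPU: ${cpu_minutes} min, GPU: ${gpu_minutes} min, ratio: about ${speedup}x"
```

On these numbers the CPU run takes roughly 14 times as long as the GPU run.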

Speed 1 GPU: 5hrs 25min
#### Environment Setup
The environment setup instructions are located in `environment.yml` (https://github.com/NAG-DevOps/openiss-reid-tfk). Ensure all dependencies are correctly installed.

Speed CPU - 32 cores: 2 days 22 hours
<!-- TOC --><a name="configuration-and-execution"></a>
### Configuration and execution

TEST DATASET: Market1501
- Log into Speed and navigate to your speed-scratch directory:

ssh $[email protected]
cd /speed-scratch/$USER/

---- Train images: 12936
- Clone the GitHub repo from https://github.com/NAG-DevOps/openiss-reid-tfk

---- Query images: 3368
- Download the dataset: Navigate to the `datasets/` directory, make the script executable, and run `get_dataset_market1501.sh`:

---- Gallery images: 15913
chmod u+x *.sh && ./get_dataset_market1501.sh

<!-- TOC --><a name="configuration-and-execution"></a>
### Configuration and execution
- Download the `openiss-reid-speed.sh` execution script from this repository.

- Log into Speed, go to your speed-scratch directory: `cd /speed-scratch/$USER/`
- Clone the repo from https://github.com/NAG-DevOps/openiss-reid-tfk
- Download the dataset: go to `datasets/` and run `get_dataset_market1501.sh`
- In `reid.py` set the epochs (`g_epochs=120` by default)
- Download `openiss-reid-speed.sh` from this repository
- On `environment.yml` comment or uncomment tensorflow accordingly (for CPU or GPU, GPU is default)
- On `openiss-reid-speed.sh` comment or uncomment the resource allocation section accordingly (GPU is default), make sure you only request CPU or GPU but not both
- Submit the job:
- In `reid.py` set the number of epochs (`g_epochs=120` by default)

On CPUs nodes: `sbatch ./openiss-reid-speed.sh`
- In `environment.yml` comment/uncomment the TensorFlow section depending on whether you are running on CPU or GPU. GPU is enabled by default.

On GPUs nodes: `sbatch -p pg ./openiss-reid-speed.sh`
- In `openiss-reid-speed.sh` comment/uncomment the resource allocation lines for either CPU or GPU, depending on the target node (GPU is default). Ensure that only one type (CPU or GPU) is requested.

**IMPORTANT**
- Submit the job:

Modify the script `openiss-reid-speed.sh` to setup the job to be ready for CPUs or GPUs nodes; `--mem=` and `gpus=` in particular, see more information about these parameters on https://github.com/NAG-DevOps/speed-hpc/blob/master/doc/speed-manual.pdf
For CPU nodes: `sbatch ./openiss-reid-speed.sh`

For GPU nodes: `sbatch -p pg ./openiss-reid-speed.sh`
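
The last two steps can be wrapped in small helpers so that the CPU and GPU variants are not mixed up. The sketch below is hypothetical: the function names `submit_cmd`, `count_active`, and `check_single_target` are not part of the repository, and the `-n`/`-c`/`--gpus` patterns assume the `#SBATCH` layout of `openiss-reid-speed.sh` shown in this commit. It only prints the `sbatch` command and does not submit anything.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not from the repository.

SCRIPT=./openiss-reid-speed.sh

# Print the sbatch command for the requested target (cpu or gpu),
# using the two command forms given in the README.
submit_cmd() {
    case "$1" in
        cpu) echo "sbatch $SCRIPT" ;;
        gpu) echo "sbatch -p pg $SCRIPT" ;;
        *)   echo "usage: submit_cmd cpu|gpu" >&2; return 1 ;;
    esac
}

# Count active (single '#') SBATCH lines in file $2 whose option matches $1.
# Lines starting with '##SBATCH' are commented out and are not counted.
count_active() {
    grep -c -E -e "^#SBATCH[[:space:]]+$1" "$2" || true
}

# Succeed only if exactly one of CPU or GPU resources is requested in file $1.
check_single_target() {
    cpu=$(count_active '(-n|-c)[[:space:]]' "$1")
    gpu=$(count_active '--gpus' "$1")
    [ $(( (cpu > 0) + (gpu > 0) )) -eq 1 ]
}

# Example: print the GPU submission command.
submit_cmd gpu
```

A possible use is to run `check_single_target ./openiss-reid-speed.sh` before submitting, and only then execute the command printed by `submit_cmd`.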

<!-- TOC --><a name="cuda"></a>
## CUDA
16 changes: 8 additions & 8 deletions src/openiss-reid-speed.sh
@@ -1,22 +1,22 @@
#!/encs/bin/tcsh

# Give job a name
#SBATCH -J openiss-reid
# Job name
#SBATCH --job-name openiss-reid

# Send an email when the job starts, finishes or if it is aborted.
# Receive email notifications when the job starts, finishes or fails.
#SBATCH --mail-type=ALL

# Specify the output file name
#SBATCH -o openiss-reid-tfk.log

# Set output directory to current
#SBATCH --chdir=./

# Specify the output file name
#SBATCH -o openiss-reid-output-%A.log

# Request Memory
#SBATCH --mem=32G
#SBATCH --mem=20G

# Request CPU - comment this section if the job needs GPUs
##SBATCH -n 32
##SBATCH -c 32

# Request GPU - comment this section if the job needs CPUs and uncomment the previous section
#SBATCH --gpus=1