Trim trailing whitespace (#4206)

This PR trims trailing whitespace and normalizes file endings in all files, and adds pre-commit hooks to enforce this for future contributions. These pre-commit hooks already exist in most other RAPIDS repositories, so it would be good to standardize.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Brad Rees (https://github.com/BradReesWork)
  - Jake Awe (https://github.com/AyodeAwe)
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: #4206
rapidsai · Mar 8, 2024 · 47119c3
1 parent ab9e445
commit 47119c3
Showing 174 changed files with 706 additions and 740 deletions.
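
For readers skimming the diff, most of the whitespace changes come from the two newly enabled hooks, `trailing-whitespace` and `end-of-file-fixer`. The snippet below is a simplified sketch of their combined effect on a file's text, not the hooks' actual implementation; the function names are hypothetical and chosen only for illustration.

```python
def trim_trailing_whitespace(text: str) -> str:
    """Remove spaces and tabs at the end of every line.

    Simplified: line endings are left alone here (the separate
    mixed-line-ending hook is responsible for those).
    """
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


def fix_end_of_file(text: str) -> str:
    """Make non-empty text end in exactly one newline."""
    stripped = text.rstrip("\n")
    return stripped + "\n" if stripped else ""


def normalize(text: str) -> str:
    """Apply both fixes, mirroring what this commit does to each file."""
    return fix_end_of_file(trim_trailing_whitespace(text))


if __name__ == "__main__":
    # Trailing space after the first line, no newline at end of file.
    before = "echo $SLURM_NODEID \nfi"
    print(repr(normalize(before)))
```

This covers the three patterns that recur in the diff: a trailing space removed from a line, a missing final newline added, and surplus blank lines at the end of a file collapsed.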
diff --git a/.dockerignore b/.dockerignore
@@ -1,2 +1,2 @@
 # Ignore cmake builds from local machine that might have occured before attempting Docker build. Including these files will cause CMake cache conflict issues
-/cpp/build
+/cpp/build
diff --git a/.github/workflows/add-to-project.yml b/.github/workflows/add-to-project.yml
@@ -4,7 +4,7 @@ on:
   issues:
     types:
       - opened
-      
+
   pull_request_target:
     types:
       - opened

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -5,11 +5,13 @@
 exclude: '^thirdparty'
 repos:
   - repo: https://github.com/pre-commit/pre-commit-hooks
-    rev: v4.4.0
+    rev: v4.5.0
     hooks:
       - id: check-added-large-files
       - id: debug-statements
+      - id: end-of-file-fixer
       - id: mixed-line-ending
+      - id: trailing-whitespace
   - repo: https://github.com/psf/black
     rev: 22.10.0
     hooks:

diff --git a/benchmarks/cugraph-dgl/pytest-based/README.MD b/benchmarks/cugraph-dgl/pytest-based/README.MD
@@ -1,10 +1,10 @@
-## Run Benchmarks 
+## Run Benchmarks
 
-#### SG 
+#### SG
 ```
 pytest bench_cugraph_dgl_uniform_neighbor_sample.py -k "SG and fanout_10_25 and rmat_24_4" --benchmark-save='1_rmat_24_4.json'
 ```
-#### MG 
+#### MG
 
 ```
 DASK_NUM_WORKERS=2 pytest bench_cugraph_dgl_uniform_neighbor_sample.py -k "MG and fanout_10_25 and rmat_24_16" --benchmark-save='2_rmat_24_8.json'

diff --git a/benchmarks/cugraph/standalone/bulk_sampling/README.md b/benchmarks/cugraph/standalone/bulk_sampling/README.md
@@ -16,21 +16,21 @@ Required:
         the samples will be written to a new folder in /home/samples that
         contains information about the sampling run as well as the time
         of the run.
-    
+
     --dataset_root
         The folder where datasets are stored.  Uses the format described
         in the input format section.
-    
+
     --datasets
         Comma-separated list of datasets; can specify ogb or rmat (i.e. ogb_papers100M[2],rmat_22_16).
         For ogb datasets, can provide replication factor using brackets.
         Will attempt to read from dataset_root/<datset_name>.
-    
+
 Optional:
     --fanouts
         Comma-separated list of fanout values (i.e. [10, 25]).
         The default fanout is [10, 25].
-    
+
     --batch_sizes
         Comma-separated list of batch sizes (i.e. 500, 1000).
         Defaults to "512,1024"
@@ -39,7 +39,7 @@ Optional:
         Comma-separated list of seeds per call.  Controls the number of input seed vertices processed
         in a single sampling call.
         Defaults to 524288
-    
+
     --reverse_edges
         Whether to reverse the edges of the input edgelist. Should be set to False for PyG and True for DGL.
         Defaults to False (PyG).
@@ -52,8 +52,8 @@ Optional:
     --random_seed
         Seed for random number generation.
         Defaults to '62'
-    
-    
+
+
 ### Input Format
 The script expects its input data in the following format:
 ```
@@ -159,4 +159,4 @@ GPUs per node is currently unsupported by this script but should be possible in
 
 ### Output
 The results of training will be outputted to the logs directory with an `output.txt` file for each worker.
-These will be overwritten upon each run.  Accuracy is only reported on rank 0.
+These will be overwritten upon each run.  Accuracy is only reported on rank 0.
diff --git a/benchmarks/cugraph/standalone/bulk_sampling/run_sampling.sh b/benchmarks/cugraph/standalone/bulk_sampling/run_sampling.sh
@@ -67,7 +67,7 @@ handleTimeout 120 python ${MG_UTILS_DIR}/wait_for_workers.py \
 
 DASK_STARTUP_ERRORCODE=$LAST_EXITCODE
 
-echo $SLURM_NODEID 
+echo $SLURM_NODEID
 if [[ $SLURM_NODEID == 0 ]]; then
     echo "Launching Python Script"
     python ${SCRIPTS_DIR}/cugraph_bulk_sampling.py \
@@ -78,7 +78,7 @@ if [[ $SLURM_NODEID == 0 ]]; then
         --batch_sizes $BATCH_SIZE \
         --seeds_per_call_opts "524288" \
         --num_epochs $NUM_EPOCHS \
-        --random_seed 42 
+        --random_seed 42
 
     echo "DONE" > ${SAMPLES_DIR}/status.txt
 fi
@@ -108,4 +108,4 @@ sleep 2
 
 if [[ $SLURM_NODEID == 0 ]]; then
     rm ${SAMPLES_DIR}/status.txt
-fi
+fi
diff --git a/benchmarks/cugraph/standalone/bulk_sampling/run_train_job.sh b/benchmarks/cugraph/standalone/bulk_sampling/run_train_job.sh
@@ -16,7 +16,7 @@
 #SBATCH -p luna
 #SBATCH -J datascience_rapids_cugraphgnn-papers:bulkSamplingPyG
 #SBATCH -N 1
-#SBATCH -t 00:25:00 
+#SBATCH -t 00:25:00
 
 CONTAINER_IMAGE=${CONTAINER_IMAGE:="please_specify_container"}
 SCRIPTS_DIR=$(pwd)
@@ -81,4 +81,3 @@ srun \
             --fanout $FANOUT \
             --replication_factor $REPLICATION_FACTOR \
             --num_epochs $NUM_EPOCHS
-
diff --git a/benchmarks/dgl/README.md b/benchmarks/dgl/README.md
@@ -13,4 +13,4 @@ pytest dgl_benchmark.py::bench_dgl_pure_gpu
 ## For UVA Benchmarks
 ```
 pytest dgl_benchmark.py::bench_dgl_uva
-```
+```
diff --git a/benchmarks/shared/build_cugraph_ucx/README.MD b/benchmarks/shared/build_cugraph_ucx/README.MD
@@ -6,10 +6,10 @@ docker build -f cugraph_ucx.dockerfile . -t cugraph_ucx
 docker run --privileged -it --gpus=all --net=host cugraph_ucx /bin/bash
 
 #### Client Bandwidth Test
-python3 test_client_bandwidth.py 
+python3 test_client_bandwidth.py
 
 ```bash
-(base) root@exp02:/home# python3 test_client_bandwidth.py 
+(base) root@exp02:/home# python3 test_client_bandwidth.py
 2022-12-19 13:31:30,867 - distributed.preloading - INFO - Creating preload: dask_cuda.initialize
 2022-12-19 13:31:30,867 - distributed.preloading - INFO - Import preload module: dask_cuda.initialize
 2022-12-19 13:31:30,891 - distributed.preloading - INFO - Creating preload: dask_cuda.initialize
@@ -30,8 +30,8 @@ Bandwidth = 5.2037 gb/s
 #### Sampling Test
 python3 test_cugraph_sampling.py
 ```bash
-test_client_bandwidth.py  test_cugraph_sampling.py  
-(base) root@exp02:/home# python3 test_cugraph_sampling.py 
+test_client_bandwidth.py  test_cugraph_sampling.py
+(base) root@exp02:/home# python3 test_cugraph_sampling.py
 [1671456769.722931] [exp02:93   :0]          parser.c:1989 UCX  WARN  unused environment variable: UCX_MEMTYPE_CACHE (maybe: UCX_MEMTYPE_CACHE?)
 [1671456769.722931] [exp02:93   :0]          parser.c:1989 UCX  WARN  (set UCX_WARN_UNUSED_ENV_VARS=n to suppress this warning)
 2022-12-19 13:32:56,228 - distributed.preloading - INFO - Creating preload: dask_cuda.initialize
@@ -54,4 +54,4 @@ Sampling 1,000 took = 69.15879249572754 ms
 Sampling 10,000 took = 89.63620662689209 ms
 Sampling 100,000 took = 135.9888792037964 ms
 ----------------------------------------Completed Test----------------------------------------
-```
+```
diff --git a/benchmarks/shared/build_cugraph_ucx/build-ucx.sh b/benchmarks/shared/build_cugraph_ucx/build-ucx.sh
@@ -1,5 +1,5 @@
 #!/bin/bash
-# Copyright (c) 2023, NVIDIA CORPORATION.
+# Copyright (c) 2023-2024, NVIDIA CORPORATION.
 # SPDX-License-Identifier: Apache-2.0
 set -ex
 
@@ -16,4 +16,4 @@ mkdir build-linux && cd build-linux
     --enable-mt --enable-numa --with-gnu-ld --with-rdmacm --with-verbs \
     --with-cuda=${CUDA_HOME} \
     ${CONFIGURE_ARGS}
-make -j install
+make -j install
diff --git a/benchmarks/shared/build_cugraph_ucx/cugraph_ucx.dockerfile b/benchmarks/shared/build_cugraph_ucx/cugraph_ucx.dockerfile
@@ -55,7 +55,7 @@ RUN gpuci_mamba_retry install -y -c pytorch -c rapidsai-nightly -c rapidsai -c c
     tqdm
 
 
-# Build ucx from source with IB support 
+# Build ucx from source with IB support
 # on 1.14.x
 RUN conda remove --force -y ucx ucx-proc
 

diff --git a/ci/test.sh b/ci/test.sh
@@ -1,5 +1,5 @@
 #!/bin/bash
-# Copyright (c) 2019-2023, NVIDIA CORPORATION.
+# Copyright (c) 2019-2024, NVIDIA CORPORATION.
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
@@ -105,7 +105,7 @@ if hasArg "--run-python-tests"; then
     # rmat is not tested because of MG testing
     pytest --cache-clear --junitxml=${CUGRAPH_ROOT}/junit-cugraph-pytests.xml -v --cov-config=.coveragerc --cov=cugraph_pyg --cov-report=xml:${WORKSPACE}/python/cugraph_pyg/cugraph-coverage.xml --cov-report term --ignore=raft --ignore=tests/mg --ignore=tests/int --ignore=tests/generators --benchmark-disable
     echo "Ran Python pytest for cugraph_pyg : return code was: $?, test script exit code is now: $EXITCODE"
-    
+
     echo "Python pytest for cugraph-service (single-GPU only)..."
     cd ${CUGRAPH_ROOT}/python/cugraph-service
     pytest -sv --cache-clear --junitxml=${CUGRAPH_ROOT}/junit-cugraph-service-pytests.xml --benchmark-disable -k "not mg" ./tests

diff --git a/cpp/cmake/thirdparty/get_nccl.cmake b/cpp/cmake/thirdparty/get_nccl.cmake
@@ -1,5 +1,5 @@
 #=============================================================================
-# Copyright (c) 2021, NVIDIA CORPORATION.
+# Copyright (c) 2021-2024, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -32,7 +32,3 @@ function(find_and_configure_nccl)
 endfunction()
 
 find_and_configure_nccl()
-
-
-
-