Building GridPACK on Cori
First, copy the gpbuild folder into your home directory:
rsync -rl --info=progress2 /global/common/software/m3363/gpbuild ~
There shouldn't be any need to edit any of the files in ~/gpbuild after copying; they are already set up to install into $HOME/gpbuild.
Next, configure the environment:
cd ~/gpbuild
source module_csh
module list
should print something like this:
1) modules/3.2.11.4 9) cray-libsci/19.06.1 17) dvs/2.12_2.2.167-7.0.1.1_17.11__ge473d3a2
2) altd/2.0 10) udreg/2.3.2-7.0.1.1_3.61__g8175d3d.ari 18) alps/6.6.58-7.0.1.1_6.30__g437d88db.ari
3) darshan/3.2.1 11) ugni/6.0.14.0-7.0.1.1_7.63__ge78e5b0.ari 19) rca/2.2.20-7.0.1.1_4.74__g8e3fb5b.ari
4) craype-network-aries 12) pmi/5.0.14 20) atp/2.1.3
5) craype-haswell 13) dmapp/7.1.1-7.0.1.1_4.72__g38cf134.ari 21) PrgEnv-gnu/6.0.5
6) cray-mpich/7.7.10 14) gni-headers/5.0.12.0-7.0.1.1_6.46__g3b1768f.ari 22) boost/1.69.0
7) gcc/8.3.0 15) xpmem/2.2.20-7.0.1.1_4.28__g0475745.ari 23) cmake/3.21.3
8) craype/2.6.2 16) job/2.2.4-7.0.1.1_3.55__g36b56f4.ari 24) git/2.21.0
Next, build Global Arrays (GA):
cd ~/gpbuild/ga-5.8.1/
mkdir GA_shared
./configure --with-mpi-ts --disable-f77 --without-blas --enable-cxx --enable-i4 --prefix="$HOME/gpbuild/GA_shared" --enable-shared
make
make install
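As a quick sanity check (exact file names may vary), confirm that the GA libraries and headers landed in the prefix:
ls $HOME/gpbuild/GA_shared/lib      # expect libga.so among others, since --enable-shared was used
ls $HOME/gpbuild/GA_shared/include  # expect ga.h and related headers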
Then build PETSc:
cd ~/gpbuild/petsc-3.7.6
make clean
chmod +x ./build_csh
./build_csh
sbatch --wait build.job # Produces reconfigure-cori-gnu-cxx-complex-opt.py
python2 reconfigure-cori-gnu-cxx-complex-opt.py
make all
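Optionally, run PETSc's built-in smoke test to verify the build (this is the standard PETSc makefile target; since it launches small MPI runs, it may need to be done from an interactive allocation on Cori):
make PETSC_DIR=$HOME/gpbuild/petsc-3.7.6 PETSC_ARCH=cori-gnu-cxx-complex-opt test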
Next, configure GridPACK. From ~/gpbuild/GridPACK/src/build, clear any stale CMake cache, point the build at the Cori Boost install, and invoke cmake:
rm -rf CMake*
export BOOST_DIR=/usr/common/software/boost/1.69.0/gnu/haswell
export LD_LIBRARY_PATH=$BOOST_DIR/lib:$LD_LIBRARY_PATH
export CMAKE_PREFIX_PATH=$BOOST_DIR/:$CMAKE_PREFIX_PATH
export BOOST_ROOT=$BOOST_DIR
export BOOST_INC=$BOOST_DIR/include
export BOOST_LIB=$BOOST_DIR/lib
CFLAGS='-L/opt/cray/xpmem/default/lib64 -lxpmem -L/opt/cray/ugni/default/lib64 -lugni -L/opt/cray/udreg/default/lib64 -ludreg -L/opt/cray/pe/pmi/default/lib64 -lpmi' \
cmake -Wdev \
-D BOOST_ROOT:STRING='/usr/common/software/boost/1.69.0/gnu/haswell' \
-D CMAKE_TOOLCHAIN_FILE:STRING=$HOME/gpbuild/GridPACK/src/build/ToolChain.cmake \
-D PETSC_DIR:STRING="$HOME/gpbuild/petsc-3.7.6" \
-D PETSC_ARCH:STRING='cori-gnu-cxx-complex-opt' \
-D BUILD_GA:BOOL=ON \
-D GA_INFINIBAND:BOOL=ON \
-D CMAKE_INSTALL_PREFIX:PATH="$HOME/gpbuild/GridPACK/src/gridpack-install" \
-D BUILD_SHARED_LIBS:BOOL=ON \
-D MPI_CXX_COMPILER:STRING='CC' \
-D MPI_C_COMPILER:STRING='cc' \
-D MPIEXEC:STRING='srun' \
-D CHECK_COMPILATION_ONLY:BOOL=true \
-D ENABLE_CRAY_BUILD:BOOL=true \
-D USE_PROGRESS_RANKS:BOOL=false \
-D CMAKE_BUILD_TYPE:STRING='RELWITHDEBINFO' \
-D CMAKE_VERBOSE_MAKEFILE:STRING=TRUE \
..
Then build and install:
cd $HOME/gpbuild/GridPACK/src/build
rm -rf ../gridpack-install
mkdir ../gridpack-install
make clean
rm -rf CMake*
./build_csh
make && make install
This installs GridPACK into $HOME/gpbuild/GridPACK/src/gridpack-install.
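A quick look at the install tree should show the expected layout (directory names here follow the usual CMake install conventions):
ls $HOME/gpbuild/GridPACK/src/gridpack-install  # expect include/ and lib/ subdirectories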
Next, build the GridPACK Python bindings:
cd ~/gpbuild/GridPACK/python
GRIDPACK_DIR=$HOME/gpbuild/GridPACK/src/gridpack-install
export GRIDPACK_DIR
unset RHEL_OPENMPI_HACK
rm -rf build/*
rm -rf *.so
rm -rf gridpack_hadrec.egg-info/
python setup.py build
mkdir -p $GRIDPACK_DIR/lib/python
PYTHONPATH="${GRIDPACK_DIR}/lib/python:${PYTHONPATH}"
export PYTHONPATH
python setup.py install --home="$GRIDPACK_DIR"
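A quick import check can confirm the bindings are usable (the module is named gridpack, matching the test scripts below; if the import fails, add the egg's own path to PYTHONPATH explicitly, as done in the testing steps later on):
python -c 'import gridpack; print(gridpack.__file__)'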
To test the PETSc build, start an interactive job and build two of the SNES tutorial examples:
salloc --nodes 1 --qos interactive --time 01:00:00 --constraint haswell -A m3363
cd ~/gpbuild/petsc-3.7.6/src/snes/examples/tutorials
export PETSC_DIR=$HOME/gpbuild/petsc-3.7.6
export PETSC_ARCH=cori-gnu-cxx-complex-opt
make ex1
make ex3
Output of srun -n 1 ./ex1 -ksp_gmres_cgs_refinement_type refine_always -snes_monitor_short:
0 SNES Function norm 6.04152
1 SNES Function norm 4.78676
2 SNES Function norm 2.98646
3 SNES Function norm 0.230624
4 SNES Function norm 0.00193631
5 SNES Function norm 1.43559e-07
6 SNES Function norm < 1.e-11
Number of SNES iterations = 6
Output of srun -n 3 ./ex3 -nox -pc_type asm -mat_type mpiaij -snes_monitor_cancel -snes_monitor_short -ksp_gmres_cgs_refinement_type refine_always:
atol=1e-50, rtol=1e-08, stol=1e-08, maxit=50, maxf=10000
0 SNES Function norm 5.41468
1 SNES Function norm 0.295258
2 SNES Function norm 0.000450229
3 SNES Function norm 1.38967e-09
Number of SNES iterations = 3
Norm of error 1.49751e-10 Iterations 3
To test the powerflow application, start another interactive job:
salloc --nodes 1 --qos interactive --time 01:00:00 --constraint haswell -A m3363
cd ~/gpbuild/GridPACK/src/build/applications/powerflow
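Run the powerflow example under srun. The executable and input names below are assumptions based on typical GridPACK powerflow builds; use whatever this build directory actually contains:
srun -n 3 ./pf.x  # assumed executable name; reads an input.xml from the current directory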
Output:
### GridPACK math module configured on 3 processors
Maximum number of iterations: 50
Convergence tolerance: 0.000001
0: I have 46 buses and 59 branches
1: I have 49 buses and 68 branches
2: I have 45 buses and 63 branches
repeat time = 1
----------test Iteration 0, before PF solve, Tol: 9.638903e-02
Residual norms for nrs_ solve.
0 KSP Residual norm 7.190983016757e-03
1 KSP Residual norm 1.347944633023e-03
2 KSP Residual norm 4.535626750145e-04
3 KSP Residual norm 2.917341728681e-04
4 KSP Residual norm 1.647921238149e-04
5 KSP Residual norm 7.897452726535e-05
6 KSP Residual norm 6.607054072768e-05
7 KSP Residual norm 5.371399518110e-05
8 KSP Residual norm 2.813124046284e-05
9 KSP Residual norm 1.594051825451e-05
10 KSP Residual norm 1.064548544847e-05
11 KSP Residual norm 8.490960299617e-06
12 KSP Residual norm 6.180701866024e-06
13 KSP Residual norm 4.020683912845e-06
14 KSP Residual norm 1.398606474076e-06
15 KSP Residual norm 8.626097626373e-07
16 KSP Residual norm 7.364346571727e-07
17 KSP Residual norm 6.848053732285e-07
18 KSP Residual norm 6.630520485578e-07
19 KSP Residual norm 6.560101273664e-07
20 KSP Residual norm 6.408068771526e-07
21 KSP Residual norm 5.938572278848e-07
22 KSP Residual norm 4.248234917688e-07
23 KSP Residual norm 2.594885010890e-07
24 KSP Residual norm 1.456851302217e-07
25 KSP Residual norm 7.026809525820e-08
26 KSP Residual norm 3.712644895932e-08
27 KSP Residual norm 1.436436895894e-08
28 KSP Residual norm 7.357363353615e-09
KSP Object:(nrs_) 3 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=50
tolerances: relative=1e-12, absolute=1e-08, divergence=10000.
left preconditioning
using nonzero initial guess
using PRECONDITIONED norm type for convergence test
PC Object:(nrs_) 3 MPI processes
type: bjacobi
block Jacobi: number of blocks = 3
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (nrs_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (nrs_sub_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=71, cols=71
package used to perform factorization: petsc
total: nonzeros=453, allocated nonzeros=453
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 40 nodes, limit used is 5
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=71, cols=71
total: nonzeros=453, allocated nonzeros=1065
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 40 nodes, limit used is 5
linear system matrix = precond matrix:
Mat Object: 3 MPI processes
type: mpiaij
rows=201, cols=201
total: nonzeros=1347, allocated nonzeros=5908
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 40 nodes, limit used is 5
Iteration 1 Tol: 9.638903e-02
Residual norms for nrs_ solve.
0 KSP Residual norm 3.313863268360e-05
1 KSP Residual norm 1.604771965118e-05
2 KSP Residual norm 1.199185979790e-05
3 KSP Residual norm 7.960609461082e-06
4 KSP Residual norm 6.033927451473e-06
5 KSP Residual norm 4.482037948986e-06
6 KSP Residual norm 1.262973288967e-06
7 KSP Residual norm 9.394684496620e-07
8 KSP Residual norm 6.955684736309e-07
9 KSP Residual norm 4.966270714777e-07
10 KSP Residual norm 3.801378639361e-07
11 KSP Residual norm 3.159321420023e-07
To test the Python bindings, start another interactive job and set up the environment:
salloc --nodes 1 --qos interactive --time 01:00:00 --constraint haswell -A m3363
cd ~/gpbuild/GridPACK/python
GRIDPACK_DIR=$HOME/gpbuild/GridPACK/src/gridpack-install
export GRIDPACK_DIR
PYTHONPATH="${GRIDPACK_DIR}/lib/python/gridpack_hadrec-0.0.1-py3.9-linux-x86_64.egg:${PYTHONPATH}"  # add the egg's path to PYTHONPATH
export PYTHONPATH
export PATH=$HOME/gpbuild/GridPACK/src/gridpack-install/bin:$PATH
Output of srun -n 3 python src/hello.py (seems to be working):
hello.py: hello from process 0 of 3
hello.py: hello from process 1 of 3
hello.py: hello from process 2 of 3
GridPACK math module configured on 3 processors
Output of srun -n 3 python src/task_manager.py (seems to be working):
process 0 of 3 executing task 0
process 0 of 3 executing task 3
process 0 of 3 executing task 6
process 0 of 3 executing task 9
process 0 of 3 executing task 12
process 0 of 3 executing task 15
process 0 of 3 executing task 18
process 0 of 3 executing task 21
process 0 of 3 executing task 24
process 0 of 3 executing task 27
process 0 of 3 executing task 30
process 0 of 3 executing task 33
process 0 of 3 executing task 36
process 0 of 3 executing task 39
process 0 of 3 executing task 42
process 0 of 3 executing task 45
process 0 of 3 executing task 48
process 0 of 3 executing task 51
process 0 of 3 executing task 54
process 0 of 3 executing task 57
process 0 of 3 executing task 60
process 0 of 3 executing task 63
process 0 of 3 executing task 66
process 0 of 3 executing task 69
process 0 of 3 executing task 72
process 0 of 3 executing task 75
process 0 of 3 executing task 78
process 0 of 3 executing task 81
process 0 of 3 executing task 84
process 0 of 3 executing task 87
process 0 of 3 executing task 90
process 0 of 3 executing task 93
process 0 of 3 executing task 96
process 0 of 3 executing task 99
process 1 of 3 executing task 2
process 1 of 3 executing task 4
process 1 of 3 executing task 8
process 1 of 3 executing task 11
process 1 of 3 executing task 14
process 1 of 3 executing task 17
process 1 of 3 executing task 20
process 1 of 3 executing task 23
process 1 of 3 executing task 26
process 1 of 3 executing task 29
process 1 of 3 executing task 32
process 1 of 3 executing task 35
process 1 of 3 executing task 38
process 1 of 3 executing task 41
process 1 of 3 executing task 44
process 1 of 3 executing task 47
process 1 of 3 executing task 49
process 1 of 3 executing task 52
process 1 of 3 executing task 55
process 1 of 3 executing task 58
process 1 of 3 executing task 61
process 1 of 3 executing task 64
process 1 of 3 executing task 67
process 1 of 3 executing task 70
process 1 of 3 executing task 73
process 1 of 3 executing task 76
process 1 of 3 executing task 80
process 1 of 3 executing task 83
process 1 of 3 executing task 86
process 1 of 3 executing task 89
process 1 of 3 executing task 92
process 1 of 3 executing task 95
process 1 of 3 executing task 98
process 2 of 3 executing task 1
process 2 of 3 executing task 5
process 2 of 3 executing task 7
process 2 of 3 executing task 10
process 2 of 3 executing task 13
process 2 of 3 executing task 16
process 2 of 3 executing task 19
process 2 of 3 executing task 22
process 2 of 3 executing task 25
process 2 of 3 executing task 28
process 2 of 3 executing task 31
process 2 of 3 executing task 34
process 2 of 3 executing task 37
process 2 of 3 executing task 40
process 2 of 3 executing task 43
process 2 of 3 executing task 46
process 2 of 3 executing task 50
process 2 of 3 executing task 53
process 2 of 3 executing task 56
process 2 of 3 executing task 59
process 2 of 3 executing task 62
process 2 of 3 executing task 65
process 2 of 3 executing task 68
process 2 of 3 executing task 71
process 2 of 3 executing task 74
process 2 of 3 executing task 77
process 2 of 3 executing task 79
process 2 of 3 executing task 82
process 2 of 3 executing task 85
process 2 of 3 executing task 88
process 2 of 3 executing task 91
process 2 of 3 executing task 94
process 2 of 3 executing task 97
GridPACK math module configured on 3 processors
Output of srun python setup.py test (seems to be working):
running test
Searching for nose
Best match: nose 1.3.7
Processing nose-1.3.7-py3.7.egg
Using /global/u2/t/tflynn/gpbuild/GridPACK/python/.eggs/nose-1.3.7-py3.7.egg
running egg_info
writing gridpack_hadrec.egg-info/PKG-INFO
writing dependency_links to gridpack_hadrec.egg-info/dependency_links.txt
writing top-level names to gridpack_hadrec.egg-info/top_level.txt
reading manifest file 'gridpack_hadrec.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'gridpack_hadrec.egg-info/SOURCES.txt'
running build_ext
-- Cray Programming Environment 2.6.2 C
-- NOTE: LOADEDMODULES changed since initial config!
-- NOTE: this may cause unexpected build errors.
-- Cray Programming Environment 2.6.2 CXX
-- pybind11 v2.4.3
statusGRIDPACK_HAVE_GOSS: OFF
statusGRIDPACK_GOSS_LIBRARY:
-- Configuring done
-- Generating done
-- Build files have been written to: /global/u2/t/tflynn/gpbuild/GridPACK/python/build/temp.linux-x86_64-3.7
[ 0%] Built target parallel_scripts
Consolidate compiler generated dependencies of target gridpack
[ 50%] Linking CXX shared module ../../../gridpack.cpython-37m-x86_64-linux-gnu.so
[100%] Built target gridpack
hadrec_test (tests.gridpack_test.GridPACKTester) ... ok
hello_test (tests.gridpack_test.GridPACKTester) ... ok
task_test (tests.gridpack_test.GridPACKTester) ... ok
----------------------------------------------------------------------
Ran 3 tests in 4.924s
OK
GridPACK math module configured on 1 processors
Next, make a new conda environment for running the powerGridEnv scripts:
conda create --name gpenv --clone base
conda activate gpenv
conda install -c conda-forge gym
pip install xmltodict
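A quick check that the new packages import cleanly (the reported version depends on what conda-forge resolved):
python -c 'import gym, xmltodict; print(gym.__version__)'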
Then start an interactive job and run the environment tests:
salloc --nodes 1 --qos interactive --time 01:00:00 --constraint haswell -A m3363
cd ~/powerGridEnv/src
conda activate gpenv
Output of srun python test_gridpack_env_39bus.py (seems to be working):
/global/homes/t/tflynn/.conda/envs/gpenv/lib/python3.7/site-packages/gym/spaces/box.py:142: UserWarning: WARN: Casting input x to numpy array.
logger.warn("Casting input x to numpy array.")
[(0, 0, 1.0, 0.1), (0, 1, 1.0, 0.1), (0, 2, 1.0, 0.1), (0, 3, 1.0, 0.1), (0, 4, 1.0, 0.1), (0, 5, 1.0, 0.1), (0, 6, 1.0, 0.1), (0, 7, 1.0, 0.1), (0, 8, 1.0, 0.1)]
-----------------root path of the rlgc: /global/u2/t/tflynn/powerGridEnv
!!!!!!!!!-----------------start the env
(7,)
------------------- Total steps: 50, Episode total reward without any load shedding actions: -4601.344304438557
------------------- Total steps: 80, Episode total reward with manually provided load shedding actions: -1472.2004908672507
volt_ob_noact.shape: (51, 4)
--------- GridPACK HADREC APP MODULE deallocated ----------
!!!!!!!!!-----------------finished gridpack env testing
Output of srun python test_gridpack_env_300bus.py (seems to be working):
/global/homes/t/tflynn/.conda/envs/gpenv/lib/python3.7/site-packages/gym/spaces/box.py:142: UserWarning: WARN: Casting input x to numpy array.
logger.warn("Casting input x to numpy array.")
test_gridpack_env_300bus.py:238: MatplotlibDeprecationWarning: Adding an axes using the same arguments as a previous axes currently reuses the earlier instance. In a future version, a new instance will always be created and returned. Meanwhile, this warning can be suppressed, and the future behavior ensured, by passing a unique label to each axes instance.
plt.subplot(121)
test_gridpack_env_300bus.py:247: MatplotlibDeprecationWarning: Adding an axes using the same arguments as a previous axes currently reuses the earlier instance. In a future version, a new instance will always be created and returned. Meanwhile, this warning can be suppressed, and the future behavior ensured, by passing a unique label to each axes instance.
plt.subplot(122)
[(0, 0, 1.0, 0.1), (0, 1, 1.0, 0.1), (0, 2, 1.0, 0.1), (0, 3, 1.0, 0.1), (0, 4, 1.0, 0.1), (0, 5, 1.0, 0.1), (0, 6, 1.0, 0.1), (0, 7, 1.0, 0.1), (0, 8, 1.0, 0.1)]
-----------------root path of the rlgc: /global/u2/t/tflynn/powerGridEnv
!!!!!!!!!-----------------start the env
finished loading npz file
ob_dim: 142 ac_dim: 34
finished loading the weights to the policy network
Fault tuple is (0, 0, 1.0, 0.08)
-----one episode testing finished without AI-provided actions, total steps: 80 total reward: -3623.717146067013
------------------- Episode total reward with AI provided actions: -3623.717146067013
Fault tuple is (0, 0, 1.0, 0.08)
-----one episode testing finished without any load shedding, total steps: 61 total reward: -14496.355926569628
------------------- Episode total reward without any action: -14496.355926569628
--------- GridPACK HADREC APP MODULE deallocated ----------
!!!!!!!!!-----------------finished gridpack env testing
Note that to get the script ars_rand_faultandpf_cases_LSTM_gridpack_general.py working on Cori, we had to make a few changes; see the branch Cori/powerGridEnv of this repo for a working version. These changes appear to have been needed for compatibility with the version of Ray that was available on Cori at the time this code was run.
salloc --nodes 1 --qos interactive --time 01:00:00 --constraint haswell -A m3363
conda activate gpenv
srun python ars_rand_faultandpf_cases_LSTM_gridpack_general.py --cores 6 --n_iter 20
Partial output:
2022-01-27 19:41:13,564 INFO services.py:1173 -- View the Ray dashboard at http://127.0.0.1:8265
-----------------root path of the rlgc: /global/u2/t/tflynn/powerGridEnv
Trying to init with ip None and pw None...
Did the init!
This cluster consists of
1 nodes in total
64.0 CPU resources in total
Logging data to outputs_training/ars_39_bussys_1_pf_5_faultbus_1_dur_lstm_gridpack_v6/log.txt
Creating deltas table.
Created deltas table.
Initializing multidirection workers.
just set self.num_workers to 1
------------!!!! workers allocation:
------------!!!! total cores: 6 , total directions: 32 , onedirection_numofcasestorun: 5
------------!!!! num_workers: 1 , repeat: 32 , remain: 0
------------!!!!
Initializing policy.
Initializing optimizer.
Initialization of ARS complete.
Total Time Initialize ARS: 11.715039491653442
select_faultbuses_id: [4 3 0 2 1]
select_pfcases_id: [0]
select_fault_cases_tuples: [(0, 4, 1.0, 0.08), (0, 3, 1.0, 0.08), (0, 0, 1.0, 0.08), (0, 2, 1.0, 0.08), (0, 1, 1.0, 0.08)]
rollout_rewards shape: (32, 2)
deltas_idx shape: (32,)
Maximum reward of collected rollouts: -1674.5928014215574
---Time to print rollouts results: 0.00016307830810546875
---Full Time to generate rollouts: 25.364495038986206
----time to aggregate rollouts: 0.018156766891479492
Euclidean norm of update step: 33.69602337952102
g_hat shape, w_policy shape: (6112,) (6112,)
total time of one step 25.386450052261353
iter 0 done
[('cores', 6), ('decay', 0.9985), ('delta_std', 2), ('deltas_used', 16), ('dir_path', 'outputs_training/ars_39_bussys_1_pf_5_faultbus_1_dur_lstm_gridpack'), ('n_directions', 32), ('n_iter', 20), ('onedirection_numofcasestorun', 5), ('policy_file', ''), ('policy_network_size', [32, 32]), ('policy_type', 'LSTM'), ('rollout_length', 90), ('save_per_iter', 10), ('seed', 589), ('step_size', 1), ('tol_p', 0.001), ('tol_steps', 100)]
-------------------------------------
| Time | 25.6 |
| Iteration | 1 |
| AverageReward | -3.87e+03 |
| reward 0: | -3.94e+03 |
| reward 1: | -3.77e+03 |
| reward 2: | -3.93e+03 |
| reward 3: | -3.93e+03 |
| reward 4: | -3.76e+03 |
| timesteps | 5.12e+03 |
-------------------------------------
total time of save: 0.20161938667297363
select_faultbuses_id: [2 3 4 0 1]
select_pfcases_id: [0]
select_fault_cases_tuples: [(0, 2, 1.0, 0.08), (0, 3, 1.0, 0.08), (0, 4, 1.0, 0.08), (0, 0, 1.0, 0.08), (0, 1, 1.0, 0.08)]
rollout_rewards shape: (32, 2)
deltas_idx shape: (32,)
Maximum reward of collected rollouts: -1380.3128771101492
---Time to print rollouts results: 0.00010728836059570312
---Full Time to generate rollouts: 11.247551679611206
----time to aggregate rollouts: 0.0005881786346435547
Euclidean norm of update step: 35.93837568283732
g_hat shape, w_policy shape: (6112,) (6112,)
total time of one step 11.251289129257202
iter 1 done
total time of save: 1.430511474609375e-06
select_faultbuses_id: [3 1 0 4 2]
select_pfcases_id: [0]
select_fault_cases_tuples: [(0, 3, 1.0, 0.08), (0, 1, 1.0, 0.08), (0, 0, 1.0, 0.08), (0, 4, 1.0, 0.08), (0, 2, 1.0, 0.08)]
rollout_rewards shape: (32, 2)
deltas_idx shape: (32,)
Maximum reward of collected rollouts: -1046.8336988165634
---Time to print rollouts results: 0.00015091896057128906
---Full Time to generate rollouts: 12.794855833053589
----time to aggregate rollouts: 0.0006389617919921875
Euclidean norm of update step: 36.19825363812717
g_hat shape, w_policy shape: (6112,) (6112,)
total time of one step 12.798492431640625
iter 2 done
total time of save: 9.5367431640625e-07
select_faultbuses_id: [3 4 2 1 0]
select_pfcases_id: [0]
select_fault_cases_tuples: [(0, 3, 1.0, 0.08), (0, 4, 1.0, 0.08), (0, 2, 1.0, 0.08), (0, 1, 1.0, 0.08), (0, 0, 1.0, 0.08)]
rollout_rewards shape: (32, 2)
deltas_idx shape: (32,)
Maximum reward of collected rollouts: -1800.67744519864
---Time to print rollouts results: 0.0001068115234375
---Full Time to generate rollouts: 11.743767976760864
----time to aggregate rollouts: 0.0005624294281005859
Euclidean norm of update step: 34.23786550032807
g_hat shape, w_policy shape: (6112,) (6112,)
total time of one step 11.747586965560913
iter 3 done
total time of save: 9.5367431640625e-07
select_faultbuses_id: [0 3 2 4 1]
select_pfcases_id: [0]
select_fault_cases_tuples: [(0, 0, 1.0, 0.08), (0, 3, 1.0, 0.08), (0, 2, 1.0, 0.08), (0, 4, 1.0, 0.08), (0, 1, 1.0, 0.08)]
rollout_rewards shape: (32, 2)
...
Our scripts for running on multiple nodes are in the ./cori folder of this repo. They are based on the example Ray submission scripts at https://github.com/NERSC/slurm-ray-cluster. The only change to the start-worker.sh and start-head.sh scripts from that example is that we need to run conda activate gpenv before executing the ray commands. See commit 66bcbcc7a2f1777dcc308d966cc6d35677e09efd of this repository for a working configuration.
To submit a multi-node run:
cd ~/powerGridEnv/cori
sbatch test_32.sh
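For orientation, a submission script following this pattern looks roughly like the sketch below. This is an illustrative sketch only, assuming the structure of the NERSC slurm-ray-cluster example (node discovery via scontrol, one Ray head plus one worker per remaining node, then the training run); the working script is ./cori/test_32.sh.

#!/bin/bash
#SBATCH --constraint haswell
#SBATCH --nodes 2
#SBATCH --qos regular
#SBATCH --time 01:00:00
#SBATCH -A m3363

# Sketch only: discover the allocated node names from Slurm.
nodes=$(scontrol show hostnames "$SLURM_JOB_NODELIST")
nodes_array=($nodes)
head_node=${nodes_array[0]}

# start-head.sh and start-worker.sh are the slurm-ray-cluster scripts,
# modified to run `conda activate gpenv` before the ray commands
# (the real scripts also take the head IP and a Redis password as arguments).
srun --nodes=1 --ntasks=1 -w "$head_node" start-head.sh &
sleep 10
for ((i = 1; i < SLURM_JOB_NUM_NODES; i++)); do
  srun --nodes=1 --ntasks=1 -w "${nodes_array[$i]}" start-worker.sh &
done
sleep 5

# Run the training script against the Ray cluster (arguments assumed).
python -u ars_rand_faultandpf_cases_LSTM_gridpack_general.py --cores 32 --n_iter 20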
- Official GridPACK docs: https://www.gridpack.org/wiki/index.php/How_to_Build_GridPACK
- Software Dependencies: https://www.gridpack.org/wiki/index.php/Software_Required_to_Build_GridPACK