The choice of MPI process and OpenMP thread. #71

LYZphy · 2024-12-10T10:40:30Z

Dear mVMC developers:
I have some questions about choosing the numbers of MPI processes and OpenMP threads.
For example, I want to calculate a square lattice model with a 2x2 sublattice on one node with 24 cores.
(1) If I do not use translation symmetry, there is no quantum number projection, so I set "NMPTrans=1" and "NVMCSample=1000" in the modpara.def file and run expert mode with "mpiexec -np 24 vmc.out -e namelist.def". Does that mean the total number of Monte Carlo samples is 24 x 1000 = 24000?
(2) If I use translation symmetry, there are four quantum number projections for a 2x2 sublattice. Is it a good choice to use 4 OpenMP threads and 6 MPI processes (one OpenMP thread per core) by setting "export OMP_NUM_THREADS=4" and running "mpiexec -np 6 vmc.out -e namelist.def"? If so, is the total number of Monte Carlo samples 6 x 1000 = 6000?
(3) For a larger number of quantum number projections (a larger sublattice, or translation plus rotation symmetry), how should I set the numbers of MPI processes and OpenMP threads to improve the efficiency of the mVMC calculation?

Thanks a lot!

k-ido · 2024-12-11T13:27:51Z

Thank you for using mVMC!
Here are my answers, based on my experience.

(1) If you do not use spin projection (NSPGaussLeg=1) and you set NSplitSize=1, the number of Monte Carlo samples should be 24000 in your case. In most cases, the total number of samples is (number of MPI processes / NSplitSize) x NVMCSample. (mVMC skips a sample if its overlap with the wavefunction is not finite; please see l. 157 in vmccal.c.) I recommend setting NSplitSize to 1 unless you can use a sufficiently large number of cores.
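The formula above can be checked with a few lines of Python. This is only a sketch of the arithmetic, not mVMC code: the function name `total_samples` is hypothetical, and it assumes that NSplitSize divides the number of MPI processes evenly. The result is the nominal count, since mVMC may skip samples whose overlap is not finite.

```python
def total_samples(n_mpi: int, n_split_size: int, n_vmc_sample: int) -> int:
    """Nominal sample count: (MPI processes / NSplitSize) x NVMCSample."""
    if n_mpi <= 0 or n_split_size <= 0 or n_vmc_sample <= 0:
        raise ValueError("all arguments must be positive")
    # Assumption of this sketch: NSplitSize divides the MPI process count.
    if n_mpi % n_split_size != 0:
        raise ValueError("n_mpi must be a multiple of n_split_size")
    return (n_mpi // n_split_size) * n_vmc_sample


if __name__ == "__main__":
    # Case (1): 24 MPI processes, NSplitSize=1, NVMCSample=1000
    print(total_samples(24, 1, 1000))
    # Case (2): 6 MPI processes, NSplitSize=1, NVMCSample=1000
    print(total_samples(6, 1, 1000))
```

With NSplitSize=1 the count is simply the number of MPI processes times NVMCSample, so moving cores from MPI processes to OpenMP threads lowers the total unless NVMCSample is raised to compensate.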

(2)

> Is it a good choice to use 4 OpenMP threads and 6 MPI processes (one OpenMP thread per core) by setting "export OMP_NUM_THREADS=4" and running "mpiexec -np 6 vmc.out -e namelist.def"?

It depends on the Hamiltonian you want to solve, but in general, you're right. I recommend choosing the number of OpenMP threads to be a divisor of the number of quantum number projections (NSPGaussLeg x NMPTrans).
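As an illustration of the divisor rule, the following Python sketch lists the (OpenMP threads, MPI processes) pairs that satisfy it on a given node. The function name `thread_process_layouts` is hypothetical, and the extra requirement that the threads fill the node exactly (threads x processes = cores, one thread per core) is an assumption taken from the question, not something mVMC enforces.

```python
def thread_process_layouts(n_cores: int, n_sp_gauss_leg: int, n_mp_trans: int):
    """List (threads, processes) pairs where the thread count divides
    the number of quantum number projections and fills the node."""
    n_proj = n_sp_gauss_leg * n_mp_trans  # NSPGaussLeg x NMPTrans
    layouts = []
    for threads in range(1, n_cores + 1):
        # Keep thread counts that divide both the projection count and
        # the core count (the latter gives one thread per core).
        if n_proj % threads == 0 and n_cores % threads == 0:
            layouts.append((threads, n_cores // threads))
    return layouts


if __name__ == "__main__":
    # 24 cores, NSPGaussLeg=1, NMPTrans=4 (2x2 sublattice with translation)
    for threads, procs in thread_process_layouts(24, 1, 4):
        print(f"OMP_NUM_THREADS={threads}  mpiexec -np {procs}")
```

For the 24-core example this gives three candidates, including the 4-thread, 6-process layout from the question. Which one is fastest still depends on the Hamiltonian.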

> If so, is the total number of Monte Carlo samples 6 x 1000 = 6000?

Under the same conditions as in (1), you're right.

(3) The efficiency depends on the problem you want to solve. To check it, I recommend setting NSROptItrStep=2 in modpara.def and looking at zvo_CalcTimer.dat in the output directory after running mVMC. By running several mVMC jobs with different hyperparameters, you can find an appropriate parameter set.
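Such a comparison could be scripted roughly as follows. This is a sketch under assumptions, not an mVMC tool: the helper names `build_run` and `run_layout` are hypothetical, each work directory is assumed to already contain a copy of the input files with NSROptItrStep=2 set in modpara.def, and `mpiexec` and `vmc.out` are assumed to be reachable exactly as in the commands above. The timing files are left for manual inspection.

```python
import os
import subprocess


def build_run(n_threads: int, n_procs: int, namelist: str = "namelist.def"):
    """Return (env, argv) for one run with the given thread/process counts."""
    # Copy the current environment and set the OpenMP thread count.
    env = dict(os.environ, OMP_NUM_THREADS=str(n_threads))
    # Same command line as in the question, with -np taken from n_procs.
    argv = ["mpiexec", "-np", str(n_procs), "vmc.out", "-e", namelist]
    return env, argv


def run_layout(n_threads: int, n_procs: int, workdir: str) -> None:
    """Run mVMC once in workdir; raises CalledProcessError on failure."""
    env, argv = build_run(n_threads, n_procs)
    subprocess.run(argv, env=env, cwd=workdir, check=True)


if __name__ == "__main__":
    # Hypothetical directory names, one prepared copy of the inputs each.
    for threads, procs in [(1, 24), (2, 12), (4, 6)]:
        run_layout(threads, procs, workdir=f"run_omp{threads}_np{procs}")
```

Running each layout in its own directory keeps the output of one job from overwriting another, so the zvo_CalcTimer.dat files can be compared afterwards.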

LYZphy · 2024-12-11T15:58:02Z

Dear Ido, thanks for your suggestions! I now understand how to set the MPI processes and OpenMP threads.
Best wishes!
