The choice of MPI process and OpenMP thread. #71

Open
LYZphy opened this issue Dec 10, 2024 · 2 comments

Comments

@LYZphy

LYZphy commented Dec 10, 2024

Dear mVMC developers,
I have some questions about choosing the number of MPI processes and OpenMP threads.
As an example, suppose I want to calculate a square-lattice model with a 2×2 sublattice on one node with 24 cores.
(1) If I do not use translational symmetry, there is no quantum number projection, so I set “NMPTrans=1” and “NVMCSample=1000” in the modpara.def file and run expert mode with “mpiexec -np 24 vmc.out -e namelist.def”. Does that mean the total number of Monte Carlo samples is 24×1000=24000?
(2) If I use translational symmetry, there are four quantum number projections for a 2×2 sublattice. Is it a good choice to set OpenMP threads =4 and MPI process =6 (one OpenMP thread per core) by “export OMP_NUM_THREADS=4” and “mpiexec -np 6 vmc.out -e namelist.def”? If so, is the total number of Monte Carlo samples 6x1000=6000?
(3) For a larger number of quantum number projections (a larger sublattice, or translational plus rotational symmetry), how should I choose the numbers of MPI processes and OpenMP threads to make the mVMC calculation more efficient?

Thanks a lot!

@k-ido
Collaborator

k-ido commented Dec 11, 2024

Thank you for using mVMC!
Here are my answers, based on my experience.

(1) If you do not use spin projection (NSPGaussLeg=1) and you set NSplitSize=1, the number of Monte Carlo samples should be 24000 in your case. In most cases, the total number of samples is (number of MPI processes / NSplitSize) × NVMCSample. (mVMC skips a sample if its overlap with the wavefunction is not finite; please see l. 157 in vmccal.c.) I recommend setting NSplitSize to 1 unless you can use a sufficiently large number of cores.
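
As a quick check of the arithmetic above, here is a minimal Python sketch of the sample-count formula. The function name `total_samples` and the rule that the process count must be a multiple of NSplitSize are assumptions made for illustration only and are not taken from mVMC itself. The result is a nominal count: as noted above, mVMC may skip samples whose overlap is not finite.

```python
def total_samples(n_mpi, nvmcsample, nsplitsize=1):
    """Nominal sample count: (MPI processes / NSplitSize) * NVMCSample."""
    if n_mpi % nsplitsize != 0:
        # Assumption of this sketch, not a documented mVMC rule.
        raise ValueError("n_mpi should be a multiple of nsplitsize")
    return (n_mpi // nsplitsize) * nvmcsample


# Case (1): 24 MPI processes, NVMCSample=1000, NSplitSize=1
print(total_samples(24, 1000))  # 24000
# Case (2): 6 MPI processes with 4 OpenMP threads each
print(total_samples(6, 1000))   # 6000
```

The number of OpenMP threads does not enter the formula; it only changes how many MPI processes fit on the node.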

(2)

> Is it a good choice to set OpenMP threads =4 and MPI process =6 (one OpenMP thread per core) by “export OMP_NUM_THREADS=4” and “mpiexec -np 6 vmc.out -e namelist.def”?

It depends on the Hamiltonian you want to solve, but in general, you're right. I recommend choosing the number of OpenMP threads to be a divisor of the number of quantum number projections (NSPGaussLeg × NMPTrans).

> If so, is the total number of Monte Carlo samples 6x1000=6000?

Under the same conditions as in (1), you're right.
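
The divisor recommendation can be turned into a short list of candidate layouts. The following Python sketch is only an illustration: the helper name `candidate_layouts` is hypothetical, and it assumes one OpenMP thread per core with all cores in use, as in the question.

```python
def candidate_layouts(n_cores, n_projections):
    """Return (MPI processes, OpenMP threads) pairs that fill n_cores,
    keeping only thread counts that divide the number of quantum
    number projections (NSPGaussLeg * NMPTrans)."""
    layouts = []
    for n_omp in range(1, n_projections + 1):
        divides_projections = n_projections % n_omp == 0
        fills_node = n_cores % n_omp == 0  # one thread per core, no idle cores
        if divides_projections and fills_node:
            layouts.append((n_cores // n_omp, n_omp))
    return layouts


# 24 cores, 2x2 sublattice without spin projection: 1 * 4 = 4 projections
print(candidate_layouts(24, 4))  # [(24, 1), (12, 2), (6, 4)]
```

The layout from the question, 6 MPI processes with 4 threads each, appears in the list. Which candidate is actually fastest still depends on the Hamiltonian.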

(3) The efficiency depends on the problem you want to solve. A recommended way to check it is to set NSROptItrStep=2 in modpara.def, run mVMC, and then look at zvo_CalcTimer.dat in the output directory. By running several mVMC jobs with different hyperparameters, you will find an appropriate parameter set.
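
One possible way to organize such a comparison is to generate the launch line for each layout you want to time. The Python sketch below only builds the command strings, reusing the `export OMP_NUM_THREADS=...` and `mpiexec -np ... vmc.out -e namelist.def` forms from the question. The helper `launch_command` and the example layout list are hypothetical, and the sketch does not parse zvo_CalcTimer.dat, since its format is not described in this thread.

```python
def launch_command(n_mpi, n_omp, namelist="namelist.def"):
    """Build the shell line for one trial run (string only, nothing is executed)."""
    return (
        f"export OMP_NUM_THREADS={n_omp}; "
        f"mpiexec -np {n_mpi} vmc.out -e {namelist}"
    )


# Example layouts for a 24-core node (hypothetical choice for illustration).
layouts = [(24, 1), (12, 2), (6, 4)]

for n_mpi, n_omp in layouts:
    # Run each line with NSROptItrStep=2 set in modpara.def, then compare
    # the zvo_CalcTimer.dat files written by the different runs.
    print(launch_command(n_mpi, n_omp))
```

Running each trial in its own directory keeps the timing files from overwriting one another.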

@LYZphy
Author

LYZphy commented Dec 11, 2024

Dear Ido, thanks for your suggestions! I now understand how to set the MPI processes and OpenMP threads.
Best wishes!
