This repository has been archived by the owner on Mar 16, 2022. It is now read-only.

slurm command does not use MB, NPROC from config file #56

Open
hughesdj opened this issue Jan 21, 2020 · 0 comments


We successfully ran the small 200 kb test case through fc_run and fc_unzip under Slurm using srun, with falcon-kit 1.4.4 and pypeflow 2.3.0.

A larger data set, however, fails with "Out Of Memory". The first srun command is submitted with --mem-per-cpu=4000M --cpus-per-task=1 instead of the 'NPROC': '6', 'MB': '30000' specified in the cfg file.

How do we make the first srun call use 30000 MB? The cfg file does not mention 4000 MB anywhere.

"job.defaults": {
"JOB_QUEUE": "standard",
"MB": "30000",
"NPROC": "6",
"job_type": "slurm",
"njobs": "8",
"pwatcher_type": "blocking",
"submit": "srun --wait=0 -p ${JOB_QUEUE} -J ${JOB_NAME} -o ${JOB_STDOUT} -e ${JOB_STDERR} --mem-per-cpu=${MB}M --cpus-per-task=${NPROC} ${JOB_SCRIPT}",
"use_tmpdir": false
},
"job.step.asm": {},
"job.step.cns": {},
"job.step.da": {},
"job.step.dust": {},
"job.step.la": {},
"job.step.pda": {},
"job.step.pla": {}
}
[INFO]In simple_pwatcher_bridge, pwatcher_impl=<module 'pwatcher.blocking' from '/home/data/bioinf_resources/programming_tools/miniconda3/envs/denovo_asm5/lib/python3.7/site-packages/pwatcher/blocking.py'>
[INFO]job_type='slurm', (default)job_defaults={'job_type': 'slurm', 'pwatcher_type': 'blocking', 'JOB_QUEUE': 'standard', 'njobs': '8', 'NPROC': '6', 'MB': '30000', 'submit': 'srun --wait=0 -p ${JOB_QUEUE} -J ${JOB_NAME} -o ${JOB_STDOUT} -e ${JOB_STDERR} --mem-per-cpu=${MB}M --cpus-per-task=${NPROC} ${JOB_SCRIPT}', 'use_tmpdir': False}, use_tmpdir=False, squash=False, job_name_style=0
[INFO]Setting max_jobs to 8; was None
[INFO]Num unsatisfied: 2, graph: 2
[INFO]About to submit: Node(0-rawreads/build)
[INFO]Popen: 'srun --wait=0 -p standard -J P26a7bf2afdd410 -o /home/data/pest_genomics/DH_test/falcon_example5/out/0-rawreads/build/run-P26a7bf2afdd410.bash.stdout -e /home/data/pest_genomics/DH_test/falcon_example5/out/0-rawreads/build/run-P26a7bf2afdd410.bash.stderr --mem-per-cpu=4000M --cpus-per-task=1 /home/data/bioinf_resources/programming_tools/miniconda3/envs/denovo_asm5/lib/python3.7/site-packages/pwatcher/mains/job_start.sh'
[INFO](slept for another 0.0s -- another 1 loop iterations)
[INFO](slept for another 0.30000000000000004s -- another 2 loop iterations)
[...]
[...]

[INFO](slept for another 180.0s -- another 18 loop iterations)
[INFO](slept for another 190.0s -- another 19 loop iterations)
[INFO](slept for another 200.0s -- another 20 loop iterations)
srun: error: rothhpc402: task 0: Out Of Memory
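
To make the mismatch explicit, here is a minimal Python sketch that fills in the "submit" template with the job.defaults values and compares the resulting resource flags with those in the logged Popen line. It uses Python's standard string.Template, which accepts the same ${NAME} placeholder syntax; this is only an assumption for illustration, not a statement about how pypeflow performs the substitution. The helper names render_expected and resource_flags are hypothetical, and the observed command is abbreviated from the log above (paths and job name omitted).

```python
import re
from string import Template

# Values copied from the "job.defaults" section of the log above.
job_defaults = {
    "JOB_QUEUE": "standard",
    "MB": "30000",
    "NPROC": "6",
}

# The "submit" string from the same section.
submit = (
    "srun --wait=0 -p ${JOB_QUEUE} -J ${JOB_NAME} -o ${JOB_STDOUT} "
    "-e ${JOB_STDERR} --mem-per-cpu=${MB}M --cpus-per-task=${NPROC} ${JOB_SCRIPT}"
)


def render_expected(submit_template, values):
    """Hypothetical helper: fill in the placeholders we know.

    safe_substitute leaves unknown placeholders (JOB_NAME, JOB_SCRIPT, ...)
    untouched instead of raising KeyError.
    """
    return Template(submit_template).safe_substitute(values)


def resource_flags(command):
    """Hypothetical helper: pull the memory and CPU flags out of an srun command."""
    mem = re.search(r"--mem-per-cpu=(\d+)M", command)
    cpus = re.search(r"--cpus-per-task=(\d+)", command)
    return {
        "MB": mem.group(1) if mem else None,
        "NPROC": cpus.group(1) if cpus else None,
    }


# What the cfg file asks for.
expected = resource_flags(render_expected(submit, job_defaults))

# What the log shows (abbreviated from the Popen line for 0-rawreads/build).
observed = resource_flags(
    "srun --wait=0 -p standard --mem-per-cpu=4000M --cpus-per-task=1 job_start.sh"
)

# Map each differing key to (expected, observed).
mismatches = {
    key: (expected[key], observed[key])
    for key in expected
    if expected[key] != observed[key]
}
print(mismatches)  # {'MB': ('30000', '4000'), 'NPROC': ('6', '1')}
```

Both flags differ for the very first task, so the values 4000 and 1 must come from somewhere other than the job.defaults section shown in the log.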
