SnpEff config file is passed verbatim (not linked in work directory) #102

Open
DOH-JDJ0303 opened this issue Jun 27, 2023 · 3 comments

@DOH-JDJ0303

Describe the bug
SNPEFF_ANN fails because SnpEff cannot read the config file (snpEff.config). I believe this is because the config path is passed in $args rather than as a process input, so the path is passed verbatim and the file is never localized in the work directory (see log below). This was run using MycoSNP v1.5 (pulled from GitHub, not a local installation) on Nextflow Tower with an AWS Batch compute environment and an AWS S3 bucket. The command from nf-tower is below:

nextflow run 'https://github.com/CDCgov/mycosnp-nf' \
    -name welsh_validation_dataset \
    -params-file 'https://api.tower.nf/ephemeral/Z6BIZKQ-zmTOPiOCrs2Upg.json' \
    -with-tower \
    -r v1.5

Impact
Unable to run MycoSNP with the SnpEff option, which means no detection/reporting of variants.

To Reproduce
Steps to reproduce the behavior:

  1. Run MycoSNP v1.5 on Nextflow Tower using the Welsh et al., 2021 validation dataset from GitHub.
  2. Supplying the config file from an AWS S3 bucket instead gives the same error (which makes sense given the cause described above).

Logs
Note: AWS bucket paths have been modified, but the general format remains the same.
'''
The exit status of the task that caused the workflow execution to fail was: 255

Error executing process > 'NFCORE_MYCOSNP:MYCOSNP:SNPEFF:SNPEFF_ANN (combined)'

Caused by:
Essential container in task exited

Command executed:

snpEff
-Xmx16g
-config s3://bucket-name/nextflow/mycosnp/snpEff.config
-v candida_auris_gca_016772135.1
-csvStats combined.csv
finalfiltered.vcf.gz
> combined.ann.vcf

cat <<-END_VERSIONS > versions.yml
"NFCORE_MYCOSNP:MYCOSNP:SNPEFF:SNPEFF_ANN":
snpeff: $(echo $(snpEff -version 2>&1) | cut -f 2 -d ' ')
END_VERSIONS

Command exit status:
255

Command output:
(empty)

Command error:
Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=./
00:00:00 SnpEff version SnpEff 4.3t (build 2017-11-24 10:18), by Pablo Cingolani
00:00:00 Command: 'ann'
00:00:00 Reading configuration file 's3://bucket-name/nextflow/mycosnp/snpEff.config'. Genome: 'candida_auris_gca_016772135.1'
00:00:00 Reading config file: /tmp/nxf.XXXXhKEuMs/s3:/bucket-name/nextflow/mycosnp/snpEff.config
00:00:00 Reading config file: /usr/local/share/snpeff-4.3.1t-5/s3:/bucket-name/nextflow/mycosnp/snpEff.config
java.lang.RuntimeException: Cannot read config file 's3://bucket-name/nextflow/mycosnp/snpEff.config'
at org.snpeff.snpEffect.Config.readProperties(Config.java:704)
at org.snpeff.snpEffect.Config.readProperties(Config.java:716)
at org.snpeff.snpEffect.Config.readConfig(Config.java:592)
at org.snpeff.snpEffect.Config.init(Config.java:480)
at org.snpeff.snpEffect.Config.<init>(Config.java:117)
at org.snpeff.SnpEff.loadConfig(SnpEff.java:451)
at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:1000)
at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:984)
at org.snpeff.SnpEff.run(SnpEff.java:1183)
at org.snpeff.SnpEff.main(SnpEff.java:162)
00:00:00 Logging
00:00:01 Done.

Work dir:
s3://bucket-name/nextflow/e6/e4bbc87323a4033e379f740e24c897

Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line
'''
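
For illustration, the two "Reading config file" paths in the log are what you get when the S3 URI is treated as a relative filesystem path and joined onto a base directory. The Python sketch below only mimics that join (SnpEff itself is Java, and its actual lookup code is not shown here). `resolve_like_relative` is a hypothetical helper, and the base directories are copied from the log.

```python
import posixpath


def resolve_like_relative(base_dir: str, config_arg: str) -> str:
    """Join config_arg onto base_dir as if it were a relative local path."""
    # The URI does not start with '/', so join() appends it to base_dir.
    # normpath() then collapses the '//' after 's3:' into a single '/'.
    return posixpath.normpath(posixpath.join(base_dir, config_arg))


# Value passed verbatim to -config in the failing task
config_arg = "s3://bucket-name/nextflow/mycosnp/snpEff.config"

# Base directories taken from the two "Reading config file" log lines
for base in ("/tmp/nxf.XXXXhKEuMs", "/usr/local/share/snpeff-4.3.1t-5"):
    print(resolve_like_relative(base, config_arg))
```

Neither resulting path exists inside the container, which is consistent with the "Cannot read config file" exception.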

@urbagal
Collaborator

urbagal commented Aug 29, 2023

Thanks for your comment. Does it also fail when you run it on a cluster? On our end, snpEff runs properly on the cluster. Unfortunately, we don't have cloud access to test it.

@DOH-JDJ0303
Author

Hi @urbagal, this issue only occurs when running MycoSNP on the cloud and might be specific to Nextflow Tower and/or AWS Batch. As described above, the problem is that the SnpEff config files are passed via the $args variable rather than as inputs to the process. This means that Nextflow does not stage the files, which is necessary when using Nextflow Tower or AWS Batch. I submitted a pull request (#103) that addresses this issue.
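
A rough way to spot this class of problem is to scan an args string for remote URIs, since a file that appears only in args is passed as plain text and is not staged. The sketch below is a hypothetical Python helper, not part of MycoSNP or Nextflow. The function name and the list of URI schemes are assumptions made for illustration.

```python
import re
import shlex
from typing import List

# Assumed set of remote schemes; adjust for the storage backends in use.
REMOTE_URI = re.compile(r"^(s3|gs|az|https?|ftp)://")


def unstaged_remote_args(args: str) -> List[str]:
    """Return tokens in an args string that look like remote file URIs."""
    return [tok for tok in shlex.split(args) if REMOTE_URI.match(tok)]


# Options as they appear in the failing SNPEFF_ANN command
args = (
    "-config s3://bucket-name/nextflow/mycosnp/snpEff.config "
    "-v candida_auris_gca_016772135.1"
)
print(unstaged_remote_args(args))
```

Any token this returns is a candidate for being declared as a process input instead, so that Nextflow localizes it before the command runs.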

@urbagal
Collaborator

urbagal commented Aug 29, 2023

Oh, sorry I missed it. Thanks.
