Valid --mirtrace_species is required even with --mirgenedb when --mirgenedb_species is given #348

Closed
tdanhorn opened this issue May 1, 2024 · 7 comments

@tdanhorn

tdanhorn commented May 1, 2024

Description of the bug

According to the documentation, it appears that the miRNA databases miRBase (used by default) and MirGeneDB are alternative sources of reference files, and either can be used. However, the parameter --mirtrace_species, which sets the species of the miRBase reference files, is required and there is no way to only use MirGeneDB.

Command used and terminal output

nextflow run smrnaseq -profile "$profile" \
        --input "$samplefile" \
        --protocol qiaseq \
        --outdir "$pipeoutdir" \
        --igenomes_ignore \
        --genome null \
        --fasta "$seqref" \
        --mirgenedb \
        --mirgenedb_species Hsa \
        --mirgenedb_gff ... // other mirgene parameters

Error message:
Reference species for miRTrace is not defined via the --mirtrace_species parameter.

When using "--mirtrace_species null", the pipeline continues until MIRTRACE_RUN and crashes there, since "null" is not a valid genome name.

Relevant files

No response

System information

Nextflow version: 23.04.0
Hardware: HPC
Executor: slurm
Container engine: Apptainer
OS: RedHat Linux 8
Version of nf-core/smrnaseq: 2.3.0
@tdanhorn tdanhorn added the bug Something isn't working label May 1, 2024
@tdanhorn
Author

tdanhorn commented May 1, 2024

I believe this is a structural issue. The statements using MirGeneDB are in an if clause governed by the --mirgenedb parameter, but nothing seems to control the use of miRBase ...
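
To illustrate the structure described above, here is a minimal sketch in Python (the pipeline itself is Nextflow/Groovy). The function name `select_reference_as_reported` is hypothetical, and the control flow is only inferred from the reported behaviour, not taken from the pipeline source. The error text is the one quoted in the report.

```python
# Illustrative sketch only; the pipeline itself is Nextflow/Groovy.
# `select_reference_as_reported` is a hypothetical name, and the control flow
# is inferred from the reported behaviour, not copied from the pipeline source.

MIRTRACE_ERROR = (
    "Reference species for miRTrace is not defined via the "
    "--mirtrace_species parameter."
)


def select_reference_as_reported(params: dict) -> str:
    """Mimic the reported behaviour: the miRTrace check is unconditional."""
    # This check is not guarded by --mirgenedb, so it fires in MirGeneDB mode too.
    if not params.get("mirtrace_species"):
        raise ValueError(MIRTRACE_ERROR)

    # Only the MirGeneDB-specific handling sits behind the flag.
    if params.get("mirgenedb"):
        return "MirGeneDB"
    return "miRBase"
```

With `{"mirgenedb": True, "mirgenedb_species": "Hsa"}` this sketch raises the miRTrace error before the MirGeneDB branch is ever reached, which matches the failure in the report.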

@apeltzer
Member

apeltzer commented May 4, 2024

It shouldn't break functionality per se, but I agree this should be fixed.

@lpantano
Contributor

Is this still a problem in dev?

I ran with these options, and mirtop ran with mirgenedb and mirtrace worked fine.

    config_profile_name        = 'Test profile'
    config_profile_description = 'Minimal test dataset to check pipeline function'

    // Limit resources so that this can run on GitHub Actions
    max_cpus   = 2
    max_memory = '6.GB'
    max_time   = '6.h'

    // Input data
    input             = 'samplesheet.csv'
    mirgenedb_mature  = 'hsa.fas'
    mirgenedb_hairpin = 'https://mirgenedb.org/static/data/hsa/hsa-pre.fas'
    mirgenedb_gff     = 'hsa.gff'
    mirgenedb         = true
    mirgenedb_species = 'Hsa'
    mirtrace_species  = 'hsa'
    skip_mirdeep      = true
    protocol          = 'illumina'

If I misunderstood, I'm happy to learn more so I can help fix this.

@lpantano
Contributor

Just to add more context: mirtrace_species is required to run mirtrace, but mirtop will quantify with the mirgenedb files if they are supplied.

@lpantano lpantano moved this to Ready in smrnaseq Jun 28, 2024
@apeltzer apeltzer added this to the 2.4.0 milestone Aug 8, 2024
@atrigila atrigila self-assigned this Aug 15, 2024
@atrigila
Contributor

I was able to reproduce the error.

nextflow run smrnaseq/ -profile illumina,docker --outdir mirdb --fasta 'https://github.com/nf-core/test-datasets/raw/smrnaseq/reference/genome.fa' --mirgenedb true --mirgenedb_species Hsa  --input https://github.com/nf-core/test-datasets/raw/smrnaseq/samplesheet/v2.0/samplesheet.csv

Instead of exiting with the error "MirGeneDB gff file not found", which would be expected because we are in MirGeneDB "mode", it exits with "ERROR ~ Reference species for miRTrace is not defined via the --mirtrace_species parameter."

The fix is to make this check conditional, based on whether mirtrace_species is being used. In particular, the pipeline should only check for --mirtrace_species if --mirgenedb is not set.

I am working on this.
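
The conditional check proposed above could look roughly like the following. This is a sketch in Python for illustration, not the actual Groovy change: `validate_reference_params` is a hypothetical name and the parameter handling is an assumption. Both error messages are the ones quoted in this thread.

```python
# Illustrative sketch only; the real fix lives in the Nextflow/Groovy pipeline.
# `validate_reference_params` is a hypothetical helper, not a pipeline function.


def validate_reference_params(params: dict) -> None:
    """Raise ValueError when the reference parameters are incomplete."""
    if params.get("mirgenedb"):
        # MirGeneDB mode: check the MirGeneDB inputs and skip the
        # --mirtrace_species requirement entirely.
        if not params.get("mirgenedb_gff"):
            raise ValueError("MirGeneDB gff file not found")
    elif not params.get("mirtrace_species"):
        # miRBase mode: the species is still mandatory.
        raise ValueError(
            "Reference species for miRTrace is not defined via the "
            "--mirtrace_species parameter."
        )
```

In this sketch, `--mirgenedb --mirgenedb_species Hsa` without a gff fails with the MirGeneDB message rather than the miRTrace one, which is the behaviour expected in MirGeneDB "mode".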

@atrigila
Contributor

Adding #131 here, as it has the same source of error.

@atrigila
Contributor

Closed via #378
