nf-cavalier

Nextflow Pipeline for singleton and family based candidate variant prioritisation based on gene lists using the Cavalier R package. This pipeline is a work in progress.

Installation

Clone this repositoty

Usage

Create and navigate to run working directory

Create configuration file in run directory named nextflow.config:

  params {
    // output directory
    outdir = 'output'
    
    // inputs
    snp_vcf = 'my_cohort.SNPs.vcf.gz'
    sv_vcf = 'my_cohort.SVs.vcf.gz'
    ped = 'families.ped'
    bams = 'bams.tsv'
    lists = 'my_list_file.tsv,HP:0001250'
    
    // filtering
    maf_dom = 0.0001
    maf_rec = 0.01
    maf_comp_het = 0.01
    maf_de_novo = 0.0001
    max_cohort_af = 1.0
    max_cohort_ac = 'Inf'
    min_impact = 'LOW'
    exclude_benign_missense = false
    include_sv_csv = true

    // reference config
    ref_hg38 = true
    ref_fasta = '/PATH/TO/GRCh38.fasta'
    pop_sv = '/PATH/TO/gnomad-sv.vcf.gz'
    ref_gene = '/PATH/TO/RefSeqGene.hg38.UCSC.txt'
    vep_cache = '/PATH/T0/vep-cache'
    vep_cache_ver = '104'
  }

Note: Bahlo Lab members should use this config as a starting point.
Params
- outdir - Output directory
- snp_vcf - (Optional) Input VCF file with SNP variant calls for all samples.
- sv_vcf - (Optional) Input VCF file with SV variant calls for all samples.
- ped - A Ped format file describing familial relationships, with 1/2 coding for unaffected/affected phenotypes (missing phenotype not supported).
- bams - TSV file with first column containing individual ID, second column containing path to indexed BAM file (no header row/ column names).
- lists - Comma separated list of gene lists to use for filtering. This may be a local TSV file, e.g. 'my_list_file.tsv' or a web based gene list, e.g. PAA:289.
  - Local TSV file
    - Path to a TSV file with mandatory named column. The file should have at least one of the following column names: ensembl_gene_id, hgnc_id, entrez_id or symbol. Optional column names are list_id, list_name, list_version and inheritance. Note that all gene IDs are ultimately converted to Ensembl Gene IDs using HGNC to match VEP annotation.
      
      e.g.
      list_id list_name list_version symbol inheritance PAA:202 Genetic Epilepsy 1.26 AARS1 AR PAA:202 Genetic Epilepsy 1.26 ABAT AR PAA:202 Genetic Epilepsy 1.26 ABCA2 AR
      or:
      list_id list_name list_version ensembl_gene_id inheritance PAA:202 Genetic Epilepsy 1.26 ENSG00000090861 AR PAA:202 Genetic Epilepsy 1.26 ENSG00000183044 AR PAA:202 Genetic Epilepsy 1.26 ENSG00000107331 AR
  - Web List - Cavalier will automatically retrieve the latest version of these web lists
    - PanelApp: PanelApp Australia or PanelApp Genomics England lists may be specified with "PAA:" or "PAE:" prefix respectively. e.g. PAA:289
    - HPO: Human phenotype ontology terms may be specified with the "HP:" prefex, e.g. HP:0001250
    - Genes4Epilepsy: Genes4Epilepsy lists may be specified with the "G4E:" prexif, e.g. "G4E:ALL" for All Epilepsy genes, or "G4E:Focal" for Focal epilepsy genes only
    - HGNC: Gene subsets by locus group can be extracted from HGNC, for example "HGNC:protein-coding" will give a list of all protein coding genes
- exclude_benign_missense - exclude missense variants that are predicted/annotated as benign by all of Sift, Polyphen and ClinVar. Missing annotations are ignored.
- include_sv_csv - Include "coding_sequence_variant" SVs regardless of VEP Impact.
First run:
nextflow run /PATH/TO/nf-cavalier
Resume run:
nextflow run /PATH/TO/nf-cavalier -resume
Recommended to run workflow in a screen session

Name		Name	Last commit message	Last commit date
Latest commit History 229 Commits
bin		bin
containers		containers
data		data
nf		nf
CHANGELOG.md		CHANGELOG.md
README.md		README.md
main.nf		main.nf
nextflow.config		nextflow.config
nextflow_schema.json		nextflow_schema.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

nf-cavalier

Installation

Usage

About

Releases 4

Packages

Contributors 2

Languages

bahlolab/nf-cavalier

Folders and files

Latest commit

History

Repository files navigation

nf-cavalier

Installation

Usage

About

Resources

Stars

Watchers

Forks

Releases 4

Packages 0

Contributors 2

Languages

Packages