pbRUGD-workflow

Workflow for the comprehensive detection and prioritization of variants in human genomes with PacBio HiFi reads

Note: Workflow is committed. Web app code to come.

Authors

William Rowell (@williamrowell)
Aaron Wenger (@amwenger)

Description

This repo consists of three Snakemake workflows:

process_smrtcells
process_sample
process_cohort

`process_smrtcells`

find new HiFi BAMs or FASTQs under smrtcells/ready/*/
align HiFi reads to reference (GRCh38 by default) with pbmm2
calculate aligned coverage depth with mosdepth
calculate depth ratios (chrX:chrY, chrX:chr2) from mosdepth summary to check for sample swaps
calculate depth ratio (chrM:chr2) from mosdepth summary to check for consistency between runs
count kmers in HiFi reads using jellyfish, dump and export modimers
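
The depth-ratio checks above can be sketched as follows. This is a minimal illustration, not the workflow's own script: it assumes the tab-separated layout of a mosdepth summary file (`chrom`, `length`, `bases`, `mean`, `min`, `max`), and the function name `depth_ratios` and the example depths are hypothetical.

```python
import csv
import io


def depth_ratios(summary_text):
    """Return chrX:chrY, chrX:chr2 and chrM:chr2 mean-depth ratios.

    summary_text is the content of a mosdepth summary file
    (tab-separated, with a header row that includes 'chrom' and 'mean').
    """
    reader = csv.DictReader(io.StringIO(summary_text), delimiter="\t")
    mean = {row["chrom"]: float(row["mean"]) for row in reader}

    def ratio(numerator, denominator):
        # None marks an undefined ratio (missing contig or zero depth).
        if numerator not in mean or denominator not in mean:
            return None
        if mean[denominator] == 0:
            return None
        return mean[numerator] / mean[denominator]

    return {
        "chrX:chrY": ratio("chrX", "chrY"),
        "chrX:chr2": ratio("chrX", "chr2"),
        "chrM:chr2": ratio("chrM", "chr2"),
    }


# Hypothetical summary for illustration only (not real data).
example_summary = (
    "chrom\tlength\tbases\tmean\tmin\tmax\n"
    "chr2\t1000\t30000\t30.0\t0\t60\n"
    "chrX\t1000\t15000\t15.0\t0\t40\n"
    "chrY\t1000\t15000\t15.0\t0\t40\n"
    "chrM\t100\t300000\t3000.0\t0\t5000\n"
)
ratios = depth_ratios(example_summary)
print(ratios)
```

A chrX:chr2 ratio of roughly 0.5 is what one would expect for a single X chromosome and roughly 1 for two, so comparing these ratios against the recorded sex of the sample is one way to flag a possible swap.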

`process_sample`

launched once the sample has been sequenced to sufficient depth
discover and call structural variants with pbsv
call small variants with DeepVariant
phase small variants with WhatsHap
merge per-SMRT Cell BAMs and tag the merged BAM with haplotype based on WhatsHap-phased DeepVariant variant calls
merge jellyfish kmer counts
assemble reads with hifiasm and calculate stats with calN50.js
align assembly to reference with minimap2
check for sample swaps by calculating consistency of kmers between sequencing runs
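
As a reference for the assembly statistics step above, the N50 statistic can be sketched in a few lines. This is an illustrative Python version of the standard definition, not the calN50.js code; the function name `n50` and the contig lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L of the contig at which, walking contigs from
    longest to shortest, the running total first reaches at least half
    of the summed assembly length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


# Hypothetical contig lengths for illustration only.
contigs = [100, 80, 50, 30, 20, 20]
print(n50(contigs))
```

For the example lengths, the total is 300, and the two longest contigs (100 + 80) are the first to cover at least 150, so the N50 is 80.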

`process_cohort`

launched once all samples in cohort have been processed
if multi-sample cohort
- jointly call structural variants with pbsv
- jointly call small variants with GLnexus
using slivar
- annotate variant calls with population frequency from gnomAD and HPRC variant databases
- filter variant calls according to population frequency and inheritance patterns
- detect possible compound heterozygotes, and filter to remove cis-combinations
- assign a phenotype rank (Phrank) score, based on Jagadeesh KA, et al. 2019. Genet Med.
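
The idea behind removing cis-combinations can be sketched as below. This is a simplified illustration and not slivar's implementation: it assumes each variant carries a gene name, a VCF-style genotype string, and a phase set identifier, and all field names, function names, and example variants are hypothetical. A candidate pair is dropped only when both variants are phased in the same phase set and their alternate alleles sit on the same haplotype.

```python
from collections import defaultdict
from itertools import combinations

HET_GENOTYPES = {"0/1", "0|1", "1|0"}


def alt_haplotype(genotype):
    """Return 0 or 1 for the haplotype carrying ALT in a phased het, else None."""
    if genotype == "1|0":
        return 0
    if genotype == "0|1":
        return 1
    return None


def is_cis(a, b):
    """True only when both variants are phased together on the same haplotype."""
    hap_a, hap_b = alt_haplotype(a["gt"]), alt_haplotype(b["gt"])
    if hap_a is None or hap_b is None:
        return False  # unphased: phase relationship unknown
    if a["phase_set"] != b["phase_set"]:
        return False  # different phase blocks: phase relationship unknown
    return hap_a == hap_b


def candidate_compound_hets(variants):
    """Pair heterozygous variants within each gene, dropping confirmed cis pairs."""
    by_gene = defaultdict(list)
    for variant in variants:
        if variant["gt"] in HET_GENOTYPES:
            by_gene[variant["gene"]].append(variant)
    pairs = []
    for group in by_gene.values():
        for a, b in combinations(group, 2):
            if not is_cis(a, b):
                pairs.append((a["id"], b["id"]))
    return pairs


# Hypothetical variants for illustration only.
variants = [
    {"id": "v1", "gene": "GENE1", "gt": "0|1", "phase_set": 100},
    {"id": "v2", "gene": "GENE1", "gt": "1|0", "phase_set": 100},
    {"id": "v3", "gene": "GENE1", "gt": "0|1", "phase_set": 100},
    {"id": "v4", "gene": "GENE2", "gt": "0|1", "phase_set": 200},
]
pairs = candidate_compound_hets(variants)
print(pairs)
```

Here `v1` and `v3` share a haplotype and are removed as a cis pair, while `v1`/`v2` and `v2`/`v3` remain as candidates. Keeping pairs whose phase is unknown is a design choice of this sketch that favors sensitivity.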

Dependencies

some tools (e.g. pbsv) require Linux
conda
singularity >= 3.5.3 installed by root
environment.yaml

Configuration

config.yaml contains file paths and version numbers for Docker images
reference.yaml contains file paths and names related to the reference
*.cluster.yaml contains an example cluster configuration for a Slurm cluster with a compute queue for general compute and an ml queue for GPU jobs

