ref_map

Billy Rowell edited this page Aug 9, 2024 · 1 revision

Reference Map File Specification

| Type | Key | Description | Notes |
| --- | --- | --- | --- |
| String | name | Short name for the reference | Alphanumeric characters, underscores, and dashes only. Will be used in file names. |
| File | fasta | Reference genome FASTA | |
| File | fasta_index | Reference genome FASTA index | |
| File | pbsv_splits | Regions for pbsv parallelization | See pbsv_splits below |
| File | pbsv_tandem_repeat_bed | Tandem repeat BED used by pbsv to normalize SVs within TRs | |
| File | trgt_tandem_repeat_bed | Tandem repeat catalog (BED) for TRGT genotyping | |
| File | hificnv_exclude_bed | Regions to be excluded by HiFiCNV, in gzipped BED format | |
| File | hificnv_exclude_bed_index | Index for the HiFiCNV exclude BED | |
| File | hificnv_expected_bed_male | Expected allosome copy number BED for XY samples | |
| File | hificnv_expected_bed_female | Expected allosome copy number BED for XX samples | |
| File | pharmcat_positions_vcf | PharmCAT positions VCF | |
| File | pharmcat_positions_vcf_index | PharmCAT positions VCF index | |
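
As a rough illustration of the table above, the sketch below checks an already-parsed reference map for the listed keys and for the `name` character restriction. It assumes that every key in the table is required and that the map has been loaded into a Python dict by some other means; the on-disk format of the map is not specified on this page, and `check_ref_map` is a hypothetical helper, not part of the pipeline.

```python
import re

# Keys taken from the specification table above.
# Assumption: all of them are required.
REQUIRED_KEYS = [
    "name",
    "fasta",
    "fasta_index",
    "pbsv_splits",
    "pbsv_tandem_repeat_bed",
    "trgt_tandem_repeat_bed",
    "hificnv_exclude_bed",
    "hificnv_exclude_bed_index",
    "hificnv_expected_bed_male",
    "hificnv_expected_bed_female",
    "pharmcat_positions_vcf",
    "pharmcat_positions_vcf_index",
]

# "Alphanumeric characters, underscores, and dashes only."
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def check_ref_map(ref_map):
    """Return a list of human-readable problems; an empty list means no problems found."""
    problems = []
    for key in REQUIRED_KEYS:
        if not ref_map.get(key):
            problems.append(f"missing or empty key: {key}")
    name = ref_map.get("name", "")
    if name and not NAME_PATTERN.fullmatch(name):
        problems.append(f"invalid name: {name!r}")
    return problems
```

The check only looks at keys and the `name` string; it does not confirm that the referenced files exist or are well-formed.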

pbsv_splits

The pbsv_splits file is a JSON array of arrays of strings. Each inner array lists one or more chromosome names, grouped so that the total sequence length of each inner array is roughly equal in base pairs. The inner arrays are processed in parallel. For example:

[
  ...
    [
        "chr10",
        "chr11"
    ],
    [
        "chr12",
        "chr13"
    ],
    [
        "chr14",
        "chr15"
    ],
  ...
]
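
One way to produce groups of roughly equal size is a greedy assignment: take chromosomes from longest to shortest and place each into whichever group currently has the smallest total. The sketch below is only an illustration under that assumption. It is not necessarily how the distributed pbsv_splits files were generated (the example above pairs neighboring chromosomes), and `make_pbsv_splits`, its parameters, and the lengths in the usage example are hypothetical. In practice the lengths could be read from the first two columns of the FASTA index.

```python
import json


def make_pbsv_splits(chrom_lengths, n_groups):
    """Group chromosome names into up to n_groups lists of similar total length.

    chrom_lengths: dict mapping chromosome name -> length in base pairs.
    Returns a list of lists of names, the shape described for pbsv_splits.
    """
    groups = [[] for _ in range(n_groups)]
    totals = [0] * n_groups
    # Longest first, so large chromosomes are spread out before small ones fill gaps.
    ordered = sorted(chrom_lengths.items(), key=lambda kv: kv[1], reverse=True)
    for chrom, length in ordered:
        smallest = totals.index(min(totals))  # group with the least sequence so far
        groups[smallest].append(chrom)
        totals[smallest] += length
    # Drop empty groups (possible when n_groups exceeds the number of chromosomes).
    return [group for group in groups if group]


# Usage with made-up lengths, for illustration only.
example_lengths = {"chrA": 100, "chrB": 60, "chrC": 50, "chrD": 10}
splits = make_pbsv_splits(example_lengths, n_groups=2)
splits_json = json.dumps(splits, indent=4)
```

Writing `splits_json` to a file yields a JSON array of arrays of strings in the shape described above.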
