Traversome

Genomic structure frequency estimation from genome assembly graphs and long reads.

Installation

Install dependencies using conda. I recommend using the mamba version of conda.

mamba create -n traversome_env
mamba activate traversome_env
mamba install python numpy scipy sympy python-symengine dill typer loguru pyyaml
[Optional] Install dependencies for running Bayesian MCMC. To run Bayesian MCMC with Traversome, you must install pymc and pytensor. Because pymc evolves quickly, its installation may sometimes fail, and the failure may go unnoticed during installation.
mamba install pytensor pymc
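
Because a failed pymc installation may go unnoticed, it can help to confirm that both packages actually import inside traversome_env before starting an MCMC run. The following is a minimal sketch, not part of Traversome; the helper name check_optional_backends is hypothetical.

```python
import importlib


def check_optional_backends(names=("pymc", "pytensor")):
    """Try importing each optional package; map name -> version string or None.

    Hypothetical helper for illustration only; it is not part of Traversome.
    """
    status = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # ImportError, or any error raised while the package initializes.
            status[name] = None
        else:
            status[name] = getattr(module, "__version__", "unknown")
    return status


if __name__ == "__main__":
    for name, version in check_optional_backends().items():
        print(f"{name}: {version if version else 'NOT importable'}")
```

If either package is reported as not importable, reinstalling it in a fresh environment is usually the simplest fix.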

Install Traversome using pip.

git clone --depth=1 https://github.com/JianjunJin/Traversome
pip install ./Traversome --no-deps

Command line interface (CLI)

traversome thorough -g graph.gfa -a align.gaf -o outdir --topo circular
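
When running many samples, the same command can be assembled from a script. The sketch below only builds the argument list from the flags documented in this README; the function name build_thorough_command is hypothetical, and it assumes the traversome executable is on PATH when the command is eventually run.

```python
import shlex


def build_thorough_command(graph, alignment, outdir, topo="circular",
                           min_read_id=None, min_record_id=None,
                           min_align_len=None):
    """Assemble a `traversome thorough` argument list (illustrative helper)."""
    cmd = ["traversome", "thorough",
           "-g", graph, "-a", alignment, "-o", outdir, "--topo", topo]
    # Only pass a threshold when it was set, so Traversome's own defaults apply.
    optional = {
        "--min-read-id": min_read_id,
        "--min-record-id": min_record_id,
        "--min-align-len": min_align_len,
    }
    for flag, value in optional.items():
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


if __name__ == "__main__":
    cmd = build_thorough_command("graph.gfa", "align.gaf", "outdir",
                                 min_align_len=5000)
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) to launch the run.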

Important optional flags to tune for a valid result (high bootstrap support):

--min-read-id         Minimum alignment identity of a read; alignments of reads below this threshold are discarded. [default: 0.992]
--min-record-id       Minimum alignment identity of a single alignment record of a read; records below this threshold are discarded. [default: 0.99]
--min-align-len       Minimum continuous alignment length of a read; alignments shorter than this are discarded. [default: 5000]
--min-align-counts    Minimum alignment count per path; alignments of paths with fewer counts are discarded. The default automatic selection (-1) does not guarantee the best performance (good bootstrap support). [default: auto]

Run traversome thorough -h for details on these and other flags.
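
To build intuition for how identity and length thresholds interact, the sketch below applies two such cutoffs to GAF records. It is an illustration only and does not reproduce Traversome's internal filtering: it assumes identity is approximated as residue matches divided by alignment block length (GAF columns 10 and 11) and that aligned length is query end minus query start (columns 4 and 3). The function names and example records are hypothetical.

```python
def parse_gaf_line(line):
    """Extract the mandatory GAF columns needed for threshold checks."""
    fields = line.rstrip("\n").split("\t")
    return {
        "read": fields[0],
        "query_start": int(fields[2]),
        "query_end": int(fields[3]),
        "path": fields[5],
        "matches": int(fields[9]),     # number of residue matches
        "block_len": int(fields[10]),  # alignment block length
    }


def passes_thresholds(record, min_identity=0.99, min_align_len=5000):
    """Assumed criteria: matches/block_len identity and aligned query span."""
    aligned_len = record["query_end"] - record["query_start"]
    if record["block_len"] == 0:
        return False
    identity = record["matches"] / record["block_len"]
    return identity >= min_identity and aligned_len >= min_align_len


# Made-up records for illustration.
EXAMPLE_GAF = [
    "read1\t12000\t0\t12000\t+\t>s1>s2\t15000\t100\t12100\t11950\t12000\t60",
    "read2\t8000\t0\t3000\t+\t>s3\t9000\t0\t3000\t2990\t3000\t60",
    "read3\t9000\t0\t9000\t+\t>s1<s4\t20000\t0\t9000\t8700\t9000\t60",
]

if __name__ == "__main__":
    kept = [r["read"] for r in map(parse_gaf_line, EXAMPLE_GAF)
            if passes_thresholds(r)]
    print(kept)  # read2 is too short, read3 has too low identity
```

Counting how many records survive at a few candidate thresholds is a quick way to see whether stricter settings would leave enough alignments for estimation.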

Interpreting results

|-- output_dir
    |-- traversome.log.txt          running log
    |-- variants.info.tab           information on variants that survive model selection and bootstrap
    |-- bootstrap.replicates.tab    bootstrap results
    |-- final.result.tab            summary of pangenome solutions
    |-- variant.*.fasta             sequence of each variant in the best supported result
    |-- pangenome.gfa               pangenome graph of the best supported result
    |-- options.yaml                information of options
    |-- readpath.information.tab    read path index -> alignment record indices
    |-- readpath.record_ids.tab     information of read paths and their congruent variant
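
After a run, a quick script can report which of the files listed above are present. This is a minimal sketch using only the file names from this README; the helper name summarize_output_dir is hypothetical, and a missing file is not necessarily an error, since this README does not state which outputs are produced under every setting.

```python
from pathlib import Path

# Fixed file names taken from the listing above.
EXPECTED_FILES = (
    "traversome.log.txt",
    "variants.info.tab",
    "bootstrap.replicates.tab",
    "final.result.tab",
    "pangenome.gfa",
    "options.yaml",
    "readpath.information.tab",
    "readpath.record_ids.tab",
)


def summarize_output_dir(outdir):
    """Report listed files that are absent and the variant FASTA files found."""
    outdir = Path(outdir)
    missing = [name for name in EXPECTED_FILES if not (outdir / name).is_file()]
    variant_fastas = sorted(p.name for p in outdir.glob("variant.*.fasta"))
    return {"missing": missing, "variant_fastas": variant_fastas}


if __name__ == "__main__":
    summary = summarize_output_dir("outdir")
    print("missing:", summary["missing"])
    print("variant FASTA files:", summary["variant_fastas"])
```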