Update README.md
asylvz authored Aug 24, 2023
1 parent 6e96410 commit 5ea4fba
Showing 1 changed file with 41 additions and 33 deletions.
74 changes: 41 additions & 33 deletions README.md
@@ -82,32 +82,6 @@ To test, simply type:
sh ./test_tardis.sh


SONIC file (annotations container)
==================================

SONIC files for some human and mouse reference genome versions are available in an external repository: https://github.com/BilkentCompGen/sonic-prebuilt

* human_g1k_v37.sonic: SONIC file for the human reference genome GRCh37 (1000 Genomes Project version).
  * Also download the reference genome at: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/human_g1k_v37.fasta.gz.
* ucsc_hg19.sonic: SONIC file for the human reference genome, UCSC version build hg19.
  * Also download the reference genome at: http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz. Extract the tar archive and concatenate all chromosomes into a single FASTA file.
* ucsc_hg38.sonic: SONIC file for the human reference genome build 38.
  * Also download the reference genome at: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chromFa.tar.gz. Extract the tar archive and concatenate all chromosomes into a single FASTA file.
* mm9.sonic: SONIC file for the mouse reference genome version mm9.
  * Also download the reference genome at: http://hgdownload.cse.ucsc.edu/goldenPath/mm9/bigZips/chromFa.tar.gz. Extract the tar archive and concatenate all chromosomes into a single FASTA file.
* mm10.sonic: SONIC file for the mouse reference genome version mm10.
  * Also download the reference genome at: http://hgdownload.cse.ucsc.edu/goldenPath/mm10/bigZips/chromFa.tar.gz. Extract the tar archive and concatenate all chromosomes into a single FASTA file.

Make sure that the same reference was used to align the reads beforehand (BAM file) and to create the SONIC file. The SONIC files and the reference FASTA files linked above are compatible.

Building the SONIC file
=======================

Please refer to the SONIC development repository: https://github.com/calkan/sonic/

The README.md file includes documentation on how to obtain the necessary files for different genomes from the UCSC Genome Browser.


Running TARDIS
===========================

@@ -135,33 +109,67 @@ There are three different ways of passing multiple input files to TARDIS:
--out multiplesamples


SONIC file (annotations container)
==================================

SONIC files for some human and mouse reference genome versions are available in an external repository: https://github.com/BilkentCompGen/sonic-prebuilt

* human_g1k_v37.sonic: SONIC file for the human reference genome GRCh37 (1000 Genomes Project version).
  * Also download the reference genome at: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/human_g1k_v37.fasta.gz.
* ucsc_hg19.sonic: SONIC file for the human reference genome, UCSC version build hg19.
  * Also download the reference genome at: http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz. Extract the tar archive and concatenate all chromosomes into a single FASTA file.
* ucsc_hg38.sonic: SONIC file for the human reference genome build 38.
  * Also download the reference genome at: http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.chromFa.tar.gz. Extract the tar archive and concatenate all chromosomes into a single FASTA file.
* mm9.sonic: SONIC file for the mouse reference genome version mm9.
  * Also download the reference genome at: http://hgdownload.cse.ucsc.edu/goldenPath/mm9/bigZips/chromFa.tar.gz. Extract the tar archive and concatenate all chromosomes into a single FASTA file.
* mm10.sonic: SONIC file for the mouse reference genome version mm10.
  * Also download the reference genome at: http://hgdownload.cse.ucsc.edu/goldenPath/mm10/bigZips/chromFa.tar.gz. Extract the tar archive and concatenate all chromosomes into a single FASTA file.

Make sure that the same reference was used to align the reads beforehand (BAM file) and to create the SONIC file. The SONIC files and the reference FASTA files linked above are compatible.
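
The unpack-and-concatenate step described for the UCSC archives can be sketched as follows. This is a minimal sketch and not part of TARDIS: the function name `build_reference` and all file names are placeholders, and it assumes the archive holds per-chromosome `*.fa` files, possibly inside a subdirectory. Files are concatenated in lexicographic order of their paths, which may differ from the sequence order of the reference your BAM files were aligned to, so compare the result with the BAM header before use.

```shell
# Sketch: unpack a chromFa tarball and merge the per-chromosome FASTA files.
# Usage: build_reference <archive.tar.gz> <output.fa>
build_reference() {
    archive="$1"    # placeholder, e.g. chromFa.tar.gz
    output="$2"     # placeholder, e.g. hg19.fa
    workdir=$(mktemp -d) || return 1
    tar -xzf "$archive" -C "$workdir" || return 1
    # Search recursively in case the archive unpacks into a subdirectory,
    # then concatenate the files in lexicographic path order.
    find "$workdir" -type f -name '*.fa' | sort | xargs cat > "$output"
    rm -rf "$workdir"
}
```

For example, `build_reference chromFa.tar.gz hg19.fa` would write the merged sequences to `hg19.fa`.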


Building the SONIC file
=======================

Please refer to the SONIC development repository: https://github.com/calkan/sonic/

The README.md file includes documentation on how to obtain the necessary files for different genomes from the UCSC Genome Browser.

*Please note that you can also generate the SONIC file while running TARDIS:*

tardis -i myinput.bam --ref GRCh38.fa --gaps hg38.gap.bed --reps hg38.repeats.out --dups hg38.dups.bed \
--make-sonic my_sonic.sonic --out myoutput


All parameters
==============

Basic Parameters:
Basic Parameters:

--bamlist [bamlist file] : A text file that lists input BAM files one file per line.
--input/-i [BAM files] : Input files in sorted and indexed BAM format. You can pass multiple BAMs using multiple --input parameters.
--out [output prefix] : Prefix for the output file names.
--ref [reference genome] : Reference genome in FASTA format.
--sonic [sonic file] : SONIC file that contains assembly annotations.
--hist-only : Generate fragment size histograms only, then quit.

Advanced Parameters:
Advanced Parameters:

--interdup : Run interspersed duplication clustering.
--read-cluster [int] : # of clusters that a specific read can be involved in (Default is 20).
--rp [int] : Minimum number of supporting read pairs in initial clustering (Default is 5).
--mei ["Alu:L1:SVA"] : List of mobile element names.
--mei [string] : Colon-separated list of mobile element names (Default is "Alu:L1:SVA").
--no-soft-clip : Skip soft clip remapping.
--no-interdup : Skip interspersed duplication clustering.
--no-mei : Skip mobile element insertion (MEI) clustering.
--resolved : Output sequence resolved vcf calls.
--xa : Look for the alternative mapping locations in BWA.
--first-chr [chr_index] : Start running from a specific chromosome [0-based index in reference file]
--last-chr [chr_index] : Run up to a specific chromosome [0-based index in reference file]
--first-chr [int] : Start running from a specific chromosome [0-based index in reference file]
--last-chr [int] : Run up to a specific chromosome [0-based index in reference file]

Additional parameters to build SONIC file within TARDIS:

--make-sonic [sonic file] : SONIC file that will contain the assembly annotations.
--sonic-info [\"string\"] : SONIC information string to be used as the reference genome name.
--sonic-info [string] : SONIC information string to be used as the reference genome name, e.g., hg19.
--gaps [gaps file] : Assembly gap coordinates in BED3 format.
--dups [dups file] : Segmental duplication coordinates in BED3 format.
--reps [reps file] : RepeatMasker annotation coordinates in RepeatMasker format. See manual for details.
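
`--first-chr` and `--last-chr` take 0-based indices rather than chromosome names. The helper below is a small sketch for looking up such an index. `chr_index` is a hypothetical name, not a TARDIS command, and the sketch assumes the index follows the order of the `>` header lines in the reference FASTA file.

```shell
# Print the 0-based position of a sequence among the FASTA header lines.
# Usage: chr_index <reference.fa> <sequence name>
# Prints nothing if the name is not found.
chr_index() {
    grep '^>' "$1" | awk -v name=">$2" '$1 == name { print NR - 1; exit }'
}
```

With it, `idx=$(chr_index GRCh38.fa chr20)` followed by `tardis -i myinput.bam --ref GRCh38.fa --sonic my_sonic.sonic --first-chr "$idx" --out myoutput` would start the run at that sequence; the file names are placeholders.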