Tassel4-Poly

Modified version of Tassel4 (v.4.3.7) for running the Tassel-GBS pipeline modified for polyploid species with high read depths

Overview

Tassel is a well-known Java program that has also implemented a genotyping-by-sequencing (GBS) analysis pipeline called Tassel-GBS. This pipeline has been broadly used by plant geneticists who want to use GBS-based single nucleotide polymorphism (SNP) in their genetic studies. However, one limitation for its use in polyploid species is that the pipeline provides an approximate or truncated number of read counts (up to 127), which are ultimately used to call diploidized genotypes. We modified Tassel-GBS pipeline so that it could store and return the actual read counts per individual per locus.

Running the pipeline

In the same way as Tassel3, Tassel4 is run from the terminal with command lines such as those found in this tutorial. Only two plugins have been directly affected by the modification: FastqToTBT and DiscoverySNPCaller. In order to get the exact read counts, one should change the flag -y to -sh so that a maximum number of 32,767 reads will be recorded (instead of up to 127 with the -y). This number should suffice the GBS experiments with higher depths demanded by polyploid species studies.

Example commands (adapted from the tutorial):

FastqToTBTPlugin

./tassel4-poly/run_pipeline.pl -fork1 -FastqToTBTPlugin -i fastq -k myGBSProject_key.txt -e ApeKI -o tbt -sh –t mergedTagCounts/myMasterTags.cnt -endPlugin -runfork1

DiscoverySNPCallerPlugin

./tassel4-poly/run_pipeline.pl -fork1 -DiscoverySNPCallerPlugin -i mergedTBT/myStudy.tbt.shrt -sh -m topm/myMasterTags.topm -mUpd topm/myMasterTagsWithVariants.topm -vcf -o vcf/raw/myGBSGenos_chr+ -mnF 0.8 -p myPedigreeFile.ped -mnMAF 0.02 -mnMAC 100000 -ref MyReferenceGenome.fa -sC 1 -eC 10 -endPlugin -runfork1

Remember to change the file extension *.byte to *.shrt where needed in subsequent plugins. Also, notice that VCF files returned by Tassel4-Poly still contains genotype calls as in a diploid species. In order to get quantitative (dosage) genotype calls for polyploid species as provided by SuperMASSA software, you may want to use the VCF files as input data in the VCF2SM pipeline.

Cite

If you end up using Tassel4-Poly in your study, please make sure to cite the publications below in your paper:

Glaubitz JC, Casstevens TM, Lu F, Harriman J, Elshire RJ, Sun Q, Buckler ES. (2014) TASSEL-GBS: A High Capacity Genotyping by Sequencing Analysis Pipeline. PLoS ONE 9(2): e90346. https://doi.org/10.1371/journal.pone.0090346
Pereira GS, Garcia AAF, Margarido GRA. (2018) A fully automated pipeline for quantitative genotype calling from next generation sequencing data in autopolyploids. BMC Bioinformatics 19:398. https://doi.org/10.1186/s12859-018-2433-6.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
build/net/maizegenetics		build/net/maizegenetics
common		common
dist		dist
lib		lib
src/net/maizegenetics		src/net/maizegenetics
tassel4-poly		tassel4-poly
LICENSE		LICENSE
README.md		README.md
build.xml		build.xml
cp.bat		cp.bat
run_pipeline.bat		run_pipeline.bat
run_pipeline.pl		run_pipeline.pl
start_tassel.bat		start_tassel.bat
start_tassel.pl		start_tassel.pl

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Tassel4-Poly

Overview

Running the pipeline

Cite

About

Releases

Packages

Languages

License

guilherme-pereira/tassel4-poly

Folders and files

Latest commit

History

Repository files navigation

Tassel4-Poly

Overview

Running the pipeline

Cite

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages