06 Sort BAM files using PICARD
Neranjan Perera edited this page Dec 6, 2018
Once the singletons are removed, the BAM files are sorted using Picard tools. For GATK analysis, the BAM files must also be correctly formatted. Each BAM file must meet the following requirements:
- It must be aligned
- It must be sorted in coordinate order
- It must list the read groups with sample names in the header
- Every read must belong to a read group.
- The BAM file must pass Picard ValidateSamFile validation
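The last requirement can be checked directly with Picard. The following is a minimal sketch, assuming `$PICARD` holds the path to the Picard jar. The function name `validate_bam` and the `RUN` hook (set `RUN=echo` to print the command instead of running it) are hypothetical helpers introduced here for illustration and are not part of the original workflow.

```shell
# Sketch: run Picard ValidateSamFile on one BAM file and print a summary.
# Assumes $PICARD is the path to the Picard jar.
# RUN is a hypothetical hook: RUN=echo prints the command without running it.
validate_bam() {
    ${RUN:-} java -jar "$PICARD" ValidateSamFile \
        I="${1}" \
        MODE=SUMMARY
}
```

For example, `validate_bam SRR1517848_filtered_sort.bam` would report a summary of any validation errors and warnings for that file.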
In this step, we pass the files to Picard and sort the aligned reads in coordinate order.
The command is as follows:
module load picard/2.9.2
export _JAVA_OPTIONS=-Djava.io.tmpdir=/scratch
java -jar $PICARD SortSam \
    INPUT=${INPUT_FILE_NAME}_filtered.bam \
    OUTPUT=${INPUT_FILE_NAME}_filtered_sort.bam \
    SORT_ORDER=coordinate \
    CREATE_INDEX=True
This will create sorted BAM format files:
align/
├── SRR1517848_filtered_sort.bam
├── SRR1517878_filtered_sort.bam
├── SRR1517884_filtered_sort.bam
├── SRR1517906_filtered_sort.bam
├── SRR1517991_filtered_sort.bam
├── SRR1518011_filtered_sort.bam
├── SRR1518158_filtered_sort.bam
└── SRR1518253_filtered_sort.bam
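The listing above contains one sorted file per sample, so the SortSam command has to be run once for each sample. One way to do that is sketched below, assuming `$PICARD` is set (for example by `module load picard/2.9.2`) and that each `<sample>_filtered.bam` file is in the current directory. The function names `sort_bam` and `sort_all` and the `RUN` hook are hypothetical helpers introduced here for illustration; the original page does not say how the per-sample runs were scripted.

```shell
# Sketch: sort one sample with the same SortSam options used above.
# Assumes $PICARD is the path to the Picard jar.
# RUN is a hypothetical hook: RUN=echo prints the command without running it.
sort_bam() {
    ${RUN:-} java -jar "$PICARD" SortSam \
        INPUT="${1}_filtered.bam" \
        OUTPUT="${1}_filtered_sort.bam" \
        SORT_ORDER=coordinate \
        CREATE_INDEX=True
}

# Sort every sample ID given as an argument; stop at the first failure.
sort_all() {
    for sample in "$@"; do
        sort_bam "$sample" || return 1
    done
}
```

A call such as `sort_all SRR1517848 SRR1517878 SRR1517884` would then process those samples in turn. Running with `RUN=echo` first is a cheap way to inspect the generated commands before committing compute time.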
After sorting, you can use samtools to confirm that the reads are sorted and that the header carries the SO:coordinate tag, which GATK requires. The command to view the header is:
samtools view -H SRR1517848_filtered_sort.bam
which will produce:
@HD VN:1.5 SO:coordinate
@SQ SN:chr1 LN:249250621
@SQ SN:chr2 LN:243199373
@SQ SN:chr3 LN:198022430
@SQ SN:chr4 LN:191154276
@SQ SN:chr5 LN:180915260
@SQ SN:chr6 LN:171115067
@SQ SN:chr7 LN:159138663
@SQ SN:chrX LN:155270560
@SQ SN:chr8 LN:146364022
@SQ SN:chr9 LN:141213431
@SQ SN:chr10 LN:135534747
@SQ SN:chr11 LN:135006516
@SQ SN:chr12 LN:133851895
@SQ SN:chr13 LN:115169878
@SQ SN:chr14 LN:107349540
@SQ SN:chr15 LN:102531392
@SQ SN:chr16 LN:90354753
@SQ SN:chr17 LN:81195210
@SQ SN:chr18 LN:78077248
@SQ SN:chr20 LN:63025520
@SQ SN:chrY LN:59373566
@SQ SN:chr19 LN:59128983
@SQ SN:chr22 LN:51304566
@SQ SN:chr21 LN:48129895
@SQ SN:chr6_ssto_hap7 LN:4928567
@SQ SN:chr6_mcf_hap5 LN:4833398
.
.
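Reading the header by eye works for one file, but the same check can be scripted for many. The sketch below assumes the header is tab-delimited, as samtools prints it. The function names `header_sort_order` and `is_coordinate_sorted` are hypothetical helpers introduced here for illustration.

```shell
# Sketch: print the SO (sort order) value from the @HD line of a SAM header
# read on stdin. Header fields are assumed to be tab-separated.
header_sort_order() {
    awk -F'\t' '$1 == "@HD" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SO:/) print substr($i, 4)
    }'
}

# Succeed only if the BAM file's header declares coordinate sort order.
is_coordinate_sorted() {
    [ "$(samtools view -H "$1" | header_sort_order)" = "coordinate" ]
}
```

For example, `is_coordinate_sorted SRR1517848_filtered_sort.bam && echo sorted` would print `sorted` only when the header carries `SO:coordinate`. Note that this checks what the header declares, not the actual order of the records.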