06 Sort BAM files using PICARD

Neranjan Perera edited this page Dec 6, 2018 · 4 revisions

Once the singletons are removed, the BAM files are sorted using Picard tools. GATK analysis also requires the BAM files to be correctly formatted. A correctly formatted BAM file meets the following requirements:

  • It must be aligned
  • It must be sorted in coordinate order
  • It must list the read groups with sample names in the header
  • Every read must belong to a read group
  • The BAM file must pass Picard ValidateSamFile validation
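
The last requirement in the list can be checked directly. The following is a minimal sketch, assuming `$PICARD` points at the Picard jar (as the SortSam command on this page also assumes). The helper name `validate_bam` is hypothetical. `MODE=SUMMARY` asks ValidateSamFile for a table of error and warning counts instead of one line per problem.

```shell
# Hypothetical helper: run Picard ValidateSamFile on a single BAM file.
# Assumes PICARD holds the path to picard.jar.
validate_bam() {
    java -jar "$PICARD" ValidateSamFile \
        INPUT="$1" \
        MODE=SUMMARY
}
```

For example, `validate_bam SRR1517848_filtered_sort.bam` would report on one of the sorted files produced in this step.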

To prepare the files for further Picard processing, this step sorts the aligned reads in coordinate order.

The command is as follows:

module load picard/2.9.2
export _JAVA_OPTIONS=-Djava.io.tmpdir=/scratch

java -jar $PICARD SortSam \
        INPUT=${INPUT_FILE_NAME}_filtered.bam \
        OUTPUT=${INPUT_FILE_NAME}_filtered_sort.bam \
        SORT_ORDER=coordinate \
        CREATE_INDEX=True

This will create coordinate-sorted BAM files (with CREATE_INDEX=True, Picard also writes a BAM index alongside each one):

align/
├── SRR1517848_filtered_sort.bam
├── SRR1517878_filtered_sort.bam
├── SRR1517884_filtered_sort.bam
├── SRR1517906_filtered_sort.bam
├── SRR1517991_filtered_sort.bam
├── SRR1518011_filtered_sort.bam
├── SRR1518158_filtered_sort.bam
└── SRR1518253_filtered_sort.bam
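
The SortSam command above handles one sample at a time through `${INPUT_FILE_NAME}`. To produce all of the files listed, one option is a loop over the filtered BAM files. This is only a sketch under stated assumptions: it is run from the directory that holds the `*_filtered.bam` files, `$PICARD` and the temporary-directory setting are already in place as shown earlier, and `sorted_name` is a hypothetical helper.

```shell
# Hypothetical helper: derive the output name used on this page
# (X_filtered.bam -> X_filtered_sort.bam).
sorted_name() {
    printf '%s\n' "${1%_filtered.bam}_filtered_sort.bam"
}

# Sort every filtered BAM file in the current directory.
for bam in ./*_filtered.bam; do
    [ -e "$bam" ] || continue    # skip when the glob matches nothing
    java -jar "$PICARD" SortSam \
        INPUT="$bam" \
        OUTPUT="$(sorted_name "$bam")" \
        SORT_ORDER=coordinate \
        CREATE_INDEX=True
done
```

Files that already end in `_filtered_sort.bam` do not match the `*_filtered.bam` pattern, so rerunning the loop does not sort the outputs a second time.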

Tip

After sorting, you can confirm that the reads are sorted as expected and that the header carries the SO:coordinate tag required by GATK by printing the header with samtools:
samtools view -H SRR1517848_filtered_sort.bam

which will produce:

@HD	VN:1.5	SO:coordinate
@SQ	SN:chr1	LN:249250621
@SQ	SN:chr2	LN:243199373
@SQ	SN:chr3	LN:198022430
@SQ	SN:chr4	LN:191154276
@SQ	SN:chr5	LN:180915260
@SQ	SN:chr6	LN:171115067
@SQ	SN:chr7	LN:159138663
@SQ	SN:chrX	LN:155270560
@SQ	SN:chr8	LN:146364022
@SQ	SN:chr9	LN:141213431
@SQ	SN:chr10	LN:135534747
@SQ	SN:chr11	LN:135006516
@SQ	SN:chr12	LN:133851895
@SQ	SN:chr13	LN:115169878
@SQ	SN:chr14	LN:107349540
@SQ	SN:chr15	LN:102531392
@SQ	SN:chr16	LN:90354753
@SQ	SN:chr17	LN:81195210
@SQ	SN:chr18	LN:78077248
@SQ	SN:chr20	LN:63025520
@SQ	SN:chrY	LN:59373566
@SQ	SN:chr19	LN:59128983
@SQ	SN:chr22	LN:51304566
@SQ	SN:chr21	LN:48129895
@SQ	SN:chr6_ssto_hap7	LN:4928567
@SQ	SN:chr6_mcf_hap5	LN:4833398
. 
.
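
When checking many files, reading the full header by eye becomes tedious. A small sketch of one way to pull out just the sort order: the hypothetical helper `header_sort_order` reads a SAM header on standard input and prints the value of the `SO:` field from the `@HD` line, or nothing if the field is absent.

```shell
# Hypothetical helper: print the SO (sort order) value from the @HD line
# of a SAM header supplied on standard input.
header_sort_order() {
    awk -F '\t' '$1 == "@HD" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SO:/) { print substr($i, 4); exit }
    }'
}

# Usage (requires samtools on the PATH):
#   samtools view -H SRR1517848_filtered_sort.bam | header_sort_order
```

For the header shown above, the usage line would be expected to print `coordinate`.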