07 Remove PCR Duplicates using PICARD
Neranjan Perera edited this page Dec 6, 2018
·
2 revisions
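During sequencing, the same DNA molecule can be sequenced multiple times, producing duplicate reads. These reads carry no independent information and should not be counted as additional evidence in variant detection. In this step we mark the duplicate reads and remove them.
The following command removes the duplicate reads from each sample file. With REMOVE_DUPLICATES=True the duplicates are dropped from the output rather than only flagged, METRICS_FILE receives the duplication statistics, and CREATE_INDEX=True writes a .bai index next to each output BAM.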
module load picard/2.9.2
export _JAVA_OPTIONS=-Djava.io.tmpdir=/scratch
cd ../${d3}/
java -jar $PICARD MarkDuplicates \
        INPUT=../${d2}/${INPUT_FILE_NAME}_filtered_sort.bam \
        OUTPUT=${INPUT_FILE_NAME}_nodup.bam \
        REMOVE_DUPLICATES=True \
        METRICS_FILE=${INPUT_FILE_NAME}_metrics.txt \
        CREATE_INDEX=True
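Since the command is run once per sample, it can help to preview what will be executed before submitting the job. The sketch below is only a dry run: it prints the MarkDuplicates command for each sample instead of running it. The function name `print_markdup_cmd`, the input directory `../filtered` (standing in for `../${d2}`), and the `picard.jar` fallback are assumptions made for illustration; on the cluster, `$PICARD` is expected to be set by `module load picard/2.9.2`.

```shell
# Hypothetical helper: print (do not run) the MarkDuplicates command for one sample.
# $1 = sample name, $2 = directory holding the *_filtered_sort.bam inputs.
print_markdup_cmd() {
    printf 'java -jar %s MarkDuplicates INPUT=%s/%s_filtered_sort.bam OUTPUT=%s_nodup.bam REMOVE_DUPLICATES=True METRICS_FILE=%s_metrics.txt CREATE_INDEX=True\n' \
        "${PICARD:-picard.jar}" "$2" "$1" "$1" "$1"
}

# Dry run over a few of the sample IDs used in this tutorial.
# "../filtered" is a placeholder for ../${d2}.
for sample in SRR1517848 SRR1517878 SRR1517884; do
    print_markdup_cmd "$sample" "../filtered"
done
```

Once the printed commands look right, the same loop can call the real command in place of the helper.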
This produces a duplicate-removed BAM file for each sample, together with its index and metrics file:
noduplicates/
├── SRR1517848_metrics.txt
├── SRR1517848_nodup.bai
├── SRR1517848_nodup.bam
├── SRR1517878_metrics.txt
├── SRR1517878_nodup.bai
├── SRR1517878_nodup.bam
├── SRR1517884_metrics.txt
├── SRR1517884_nodup.bai
├── SRR1517884_nodup.bam
├── SRR1517906_metrics.txt
├── SRR1517906_nodup.bai
├── SRR1517906_nodup.bam
├── SRR1517991_metrics.txt
├── SRR1517991_nodup.bai
├── SRR1517991_nodup.bam
├── SRR1518011_metrics.txt
├── SRR1518011_nodup.bai
├── SRR1518011_nodup.bam
├── SRR1518158_metrics.txt
├── SRR1518158_nodup.bai
├── SRR1518158_nodup.bam
├── SRR1518253_metrics.txt
├── SRR1518253_nodup.bai
└── SRR1518253_nodup.bam
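Each sample should end up with three files: the BAM, its index, and the metrics file. A quick way to confirm that nothing was skipped is a small check like the one below. This is a minimal sketch; the function name `check_nodup_outputs` is hypothetical, and it only tests that each file exists and is non-empty, not that the BAM itself is valid.

```shell
# Hypothetical helper: report any sample missing one of its three expected outputs.
# $1 = output directory, remaining arguments = sample names.
# Returns 0 if every file exists and is non-empty, 1 otherwise.
check_nodup_outputs() {
    out_dir="$1"
    shift
    missing=0
    for sample in "$@"; do
        for suffix in _nodup.bam _nodup.bai _metrics.txt; do
            if [ ! -s "${out_dir}/${sample}${suffix}" ]; then
                echo "missing or empty: ${out_dir}/${sample}${suffix}" >&2
                missing=1
            fi
        done
    done
    return "$missing"
}

# Example call, run from the directory that contains noduplicates/:
#   check_nodup_outputs noduplicates SRR1517848 SRR1517878 SRR1517884
```

The function prints one line to standard error for each missing file, so a silent run with exit status 0 means all listed samples are complete.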