From 5cc846f75b309cc8ad5e554c17a7f33842eecdd0 Mon Sep 17 00:00:00 2001
From: Mike Jackson <michaelj@epcc.ed.ac.uk>
Date: Thu, 31 Oct 2019 05:52:24 -0700
Subject: [PATCH] Clarify "Selecting the representative read" documentation

A user asked whether read sequencing quality affects which read is
kept upon deduplication (#261):

> your selection procedure does not seem to take into
> account read sequencing quality (only mapping quality).
>
> In other words, if 2 reads have the same high score
> mapping quality (i.e. unique mappers), one being long and
> with good base scores, the other short with errors, it will
> select randomly among these, too, right?

The response was:

> Yes, that's correct.

Updated the "Selecting the representative read" comment section in
umi_tools/dedup.py to clarify that read sequencing quality is not used.
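
For illustration only, the documented behaviour can be sketched as
below. This is not the umi_tools implementation: the Read record, its
field names and select_representative are hypothetical, and the sketch
assumes the criteria are applied in the listed order. The point is that
the sequencing quality field is never consulted.

```python
import random
from collections import namedtuple

# Hypothetical record; field names are illustrative, not umi_tools names.
Read = namedtuple("Read", ["name", "n_coords", "mapq", "mean_base_quality"])


def select_representative(reads, rng=random):
    """Pick one read: fewest mapping coordinates, then highest mapping
    quality, then a random choice among whatever is still tied."""
    fewest = min(r.n_coords for r in reads)
    candidates = [r for r in reads if r.n_coords == fewest]
    best_mapq = max(r.mapq for r in candidates)
    candidates = [r for r in candidates if r.mapq == best_mapq]
    # mean_base_quality is deliberately ignored, mirroring the documentation.
    return rng.choice(candidates)


reads = [
    Read("long_clean", 1, 60, 38.0),
    Read("short_noisy", 1, 60, 12.0),  # same MAPQ, worse base qualities
    Read("low_mapq", 1, 20, 40.0),
    Read("multi", 3, 60, 40.0),
]
# Repeat with different seeds: only the two tied reads can ever be chosen.
picked = {select_representative(reads, random.Random(s)).name for s in range(50)}
```

Under these assumptions "long_clean" and "short_noisy" are equally
likely to be kept, which is the case raised in #261.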
---
 umi_tools/dedup.py | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/umi_tools/dedup.py b/umi_tools/dedup.py
index bb48fd1e..cd3fb78f 100644
--- a/umi_tools/dedup.py
+++ b/umi_tools/dedup.py
@@ -23,7 +23,10 @@
 1. The read with the lowest number of mapping coordinates (see
 ``--multimapping-detection-method`` option)
 
-2. The read with the highest mapping quality
+2. The read with the highest mapping quality. Note that this is not
+the read sequencing quality and that if two reads have the same
+mapping quality then one will be picked at random regardless of the
+read quality.
 
 Otherwise a read is chosen at random.