-
-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add scala implementations for some SamRecord methods #809
base: main
Are you sure you want to change the base?
Conversation
nh13
commented
Mar 10, 2022
- refPosAtReadPos
- readPosAtRefPos
@@ -236,9 +236,83 @@ trait SamRecord { | |||
// transient attributes | |||
@inline final val transientAttrs: TransientAttrs = new TransientAttrs(this) | |||
|
|||
// TODO long-term: replace these two methods with methods on [[Cigar]] to save creating alignment blocks in memory | |||
@inline final def refPosAtReadPos(pos: Int) = getReferencePositionAtReadPosition(pos) | |||
@inline final def readPosAtRefPos(pos: Int, returnLastBaseIfDeleted: Boolean) = getReadPositionAtReferencePosition(pos, returnLastBaseIfDeleted) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are breaking changes to the API. I like returning Option[Int]
better, so I can return None
instead of 0
* @param returnLastBaseIfInserted if the reference is an insertion, true to returned the previous reference base, | ||
* false to return None | ||
* */ | ||
@inline final def refPosAtReadPos(pos: Int, returnLastBaseIfInserted: Boolean = false): Option[Int] = if (pos <= 0) None else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
returnLastBaseIfInserted
is a new parameter, which htsjdk doesn't even have!
@@ -289,4 +289,48 @@ class SamRecordTest extends UnitSpec with OptionValues { | |||
rec1.matesOverlap shouldBe None // Mate's start is not enclosed by rec, and mate's end cannot be determined | |||
rec2.matesOverlap.value shouldBe true // Mate's start is enclosed by rec, regardless of where mate end is | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests copied from htsjdk, with a few added
* @param pos the 1-based read position to query | ||
* @param returnLastBaseIfInserted if the reference is an insertion, true to returned the previous reference base, | ||
* false to return None | ||
* */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to think carefully about, and document here, what the behaviour should be if:
a) The read position given is > read.length (probably a require() to throw an exception?)
b) The read position given is within soft-clipping
c) Hard-clipping is present
For (b) I would generally say to return None
, and I think that's probably still the best route, but it does mean that if returnLastBaseIfInserted
is false, it'll be hard to distinguish between an insertion and clipping.
For (c) I suspect we just want to ignore hard clipping? But make sure we do the right thing if it's there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly you return None
if pos < 0
and I think that should probably throw an exception.
* @param returnLastBaseIfDeleted if the reference is a deletion, true to returned the previous read base, false to | ||
* return None | ||
* */ | ||
@inline final def readPosAtRefPos(pos: Int, returnLastBaseIfDeleted: Boolean = false): Option[Int] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here and in the other method we should rename the parameter to returnPriorBaseIf[Deleted]
as I think prior
is clearer that we're talking about the base prior to the one requested, vs. last
could imply the last base in the read.
@inline final def readPosAtRefPos(pos: Int, returnLastBaseIfDeleted: Boolean = false): Option[Int] = { | ||
if (this.unmapped || pos < this.start || this.end < pos) None | ||
else { // overlaps | ||
var readPos = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not 1 since we're returning 1-based?
} | ||
false | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this implementation has the same problem with clipping. If I have a read that has rec.start = 100
and cigar 10S90M
and I request readPosAtRefPos(100)
I think this will erroneously return 1
and not 11
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wont since we check pos
against CoordMath.getEnd(refPos, elem.lengthOnTarget)
. Since a soft-clip has 0
for elem.lengthOnTarget
, then CoordMath.getEnd
will return refPos-1
, which is one before pos
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added tests
* refPosAtReadPos * readPosAtRefPos
1c7826f
to
6908161
Compare
6908161
to
5232ac9
Compare
@tfenne back to you |
Codecov Report
@@ Coverage Diff @@
## main #809 +/- ##
==========================================
- Coverage 95.58% 95.57% -0.02%
==========================================
Files 122 122
Lines 7068 7114 +46
Branches 511 503 -8
==========================================
+ Hits 6756 6799 +43
- Misses 312 315 +3
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|