AlignmentRecord.mateAlignmentEnd never set #1290

Closed
ryan-williams opened this issue Nov 24, 2016 · 3 comments

Comments

@ryan-williams
Member

Gotcha I just ran into: AlignmentRecord.mateAlignmentEnd is never set in ADAM.

SAMRecord has no such field, and as far as I can tell the mate's end cannot be inferred in e.g. SAMRecordConverter, where the mate's start is set.

My use-case was mimicking samtools view's region-filtering behavior where unmapped reads will be included if their mate is mapped to an included region.

Having dug into it further, samtools view only counts unmapped reads as existing at the one-locus position indicated by mateAlignmentStart.

I have some code I will maybe make into a PR that constructs a ReferenceRegion representing an unmapped AlignmentRecord's mate, setting the end to one more than the start position, to be consistent with samtools view's behavior described above.
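A minimal sketch of that idea, not the actual ADAM code: `Region` and `Read` are hypothetical stand-ins for ReferenceRegion and AlignmentRecord, modeling only the fields needed here, and coordinates are assumed to be 0-based and half-open. An unmapped read is placed at the single locus where its mate starts, then filtered by overlap with the query region.

```scala
// Hypothetical stand-ins for ADAM's ReferenceRegion and AlignmentRecord;
// only the fields this sketch needs are modeled.
final case class Region(contig: String, start: Long, end: Long) {
  // Half-open intervals [start, end) overlap when each starts before the other ends.
  def overlaps(other: Region): Boolean =
    contig == other.contig && start < other.end && other.start < end
}

final case class Read(
  mapped: Boolean,
  contig: Option[String],
  start: Option[Long],
  end: Option[Long],
  mateMapped: Boolean,
  mateContig: Option[String],
  mateStart: Option[Long]
)

object MateRegions {
  // Region covered by the read itself, if it is mapped.
  def readRegion(r: Read): Option[Region] =
    if (!r.mapped) None
    else for { c <- r.contig; s <- r.start; e <- r.end } yield Region(c, s, e)

  // One-locus region at the mate's start. The mate's end is unknown,
  // so end = start + 1 is a placeholder, not the mate's true end.
  def mateRegion(r: Read): Option[Region] =
    if (!r.mateMapped) None
    else for { c <- r.mateContig; s <- r.mateStart } yield Region(c, s, s + 1)

  // Mapped reads are tested by their own region; unmapped reads are
  // tested by the single locus where their mate starts.
  def keep(r: Read, query: Region): Boolean = {
    val region = if (r.mapped) readRegion(r) else mateRegion(r)
    region.exists(_.overlaps(query))
  }
}
```

Keeping the `start + 1` choice inside `mateRegion` means the placeholder end never leaks into the record itself; it only exists where a region is explicitly requested.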

I considered calling .setMateAlignmentEnd(mateStart + 1) in SAMRecordConverter (on unmapped reads only?), but that is technically not a correct value for the mate's end. I think it makes more sense to leave that decision higher in the stack, e.g. at the point where a ReferenceRegion representing an unmapped read's mate is explicitly constructed, as described above.

In that case there may be nothing actionable here, though it would make sense to document somewhere that mateAlignmentEnd is never set. Alternatively, the field could be removed, or at least documented as having different set/null contracts than the similarly named mateAlignmentStart and mateContigName.

@fnothaft
Member

Yeah, I ran into this several months back in bigdatagenomics/quinine#38 as well. I think your ReferenceRegion addition makes sense. WRT SAMRecordConverter, my preference would be to nix the mateAlignmentEnd field upstream in bdg-formats.

@heuermh
Member

heuermh commented Nov 24, 2016

nice catch! +1 to removing mateAlignmentEnd field

@ryan-williams
Member Author

Filed bigdatagenomics/bdg-formats#115, closing.
