Multiple mapping #431

Open
yazhinia opened this issue Jul 23, 2024 · 4 comments

Comments

@yazhinia

Hello,
Thank you for developing this tool. I have a question about the following point in the README:
> The abundance is the number of bases mapped to a contig, divided by the length of the contig. Reads mapping to n different locations are weighted 1/n.

In my test run (even with the -N option set to a value greater than 1), I only see secondary alignments, which are not necessarily mapped to the reference as well as the primary alignment. How is multiple mapping defined here?

Also, is it possible to get an MD field (as Bowtie2 provides) to help compute the sequence identity of an alignment?

Thank you.

@ksahlin
Owner

ksahlin commented Jul 23, 2024

Hi @yazhinia,

> The abundance is the number of bases mapped to a contig, divided by the length of the contig. Reads mapping to n different locations are weighted 1/n.

The text above applies only when using the command-line option --aemb, which outputs coverage estimates only. How are you running strobealign?
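To make the quoted README sentence concrete, here is a minimal Python sketch of one way to read it: each read contributes its mapped bases to each of its n locations with weight 1/n, and the per-contig total is divided by the contig length. The function name, the input layout, and the numbers are hypothetical; this illustrates the arithmetic only and is not strobealign's implementation.

```python
from collections import defaultdict


def abundances(read_placements, contig_lengths):
    """Estimate per-contig abundance from per-read placements.

    read_placements: one list per read, holding (contig, mapped_bases)
        tuples for the n locations that read is assigned to (hypothetical layout).
    contig_lengths: dict mapping contig name to its length in bases.
    """
    weighted_bases = defaultdict(float)
    for placements in read_placements:
        n = len(placements)
        for contig, mapped_bases in placements:
            # A read placed at n locations counts 1/n at each of them.
            weighted_bases[contig] += mapped_bases / n
    return {
        contig: weighted_bases.get(contig, 0.0) / length
        for contig, length in contig_lengths.items()
    }


example = abundances(
    [
        [("A", 100)],              # read with a single location
        [("A", 100), ("B", 100)],  # read with two locations, weight 1/2 each
    ],
    {"A": 100, "B": 200},
)
```

In this toy input, contig A receives 100 + 50 weighted bases over a length of 100, and contig B receives 50 weighted bases over a length of 200.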

> Also, is it possible to get an MD field (as Bowtie2 provides) to help compute the sequence identity of an alignment?

Strobealign does not support an MD field, but with the option --eqx it outputs CIGAR strings that use = for matching bases and X for mismatching bases. Such CIGAR strings allow you to compute the identity between a read and the reference.
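As a rough illustration of that identity computation, the Python sketch below parses a =/X-style CIGAR string. The identity definition used here (matches divided by matches plus mismatches plus inserted and deleted bases, ignoring clipped bases) is an assumption; other definitions exist, and the function name is hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation character.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def identity_from_eqx_cigar(cigar):
    """Return matches / (matches + mismatches + indel bases), or None.

    Soft/hard clips (S/H) and other operations are ignored. An M operation
    raises an error because it hides whether bases match or mismatch.
    """
    matches = mismatches = gap_bases = 0
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "=":
            matches += length
        elif op == "X":
            mismatches += length
        elif op in ("I", "D"):
            gap_bases += length
        elif op == "M":
            raise ValueError("CIGAR contains M; =/X operations are required")
    aligned = matches + mismatches + gap_bases
    if aligned == 0:
        return None  # e.g. CIGAR "*" for an unmapped read
    return matches / aligned
```

For example, `identity_from_eqx_cigar("5S90=2X3I")` counts 90 matches out of 95 aligned bases, while `"150="` gives an identity of 1.0.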

@yazhinia
Author

Thank you for the prompt reply. I would like to understand how the aligner chooses the n alignments for a read, if they exist, in --aemb mode. When you say "Reads mapping to n different locations", does it mean that the alignments at those locations are all equally perfect? If I want to get the n perfect alignments for every read, how do I get them in the output?

Is the -N option in strobealign the same as the -k option in Bowtie2?

Great, the --eqx option is very useful.

@ksahlin
Owner

ksahlin commented Jul 24, 2024

> I would like to understand how the aligner chooses the n alignments for a read, if they exist, in --aemb mode.

In --aemb mode it chooses based on seed hits and their scores, specifically the NAM score described in the strobealign paper. Aemb uses only the mapping location(s) with the best NAM score. If there are n locations with identical best scores, it uses those n locations. No extension alignment is performed.
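The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the candidate list, location labels, and scores are hypothetical stand-ins for NAMs and their scores, and the code is not taken from strobealign.

```python
def best_scoring_locations(candidates):
    """Keep only the locations that share the highest score.

    candidates: list of (location, score) pairs for one read, where
    score stands in for the NAM score (hypothetical values).
    """
    if not candidates:
        return []
    best = max(score for _, score in candidates)
    # All ties for the best score are kept; everything else is dropped.
    return [location for location, score in candidates if score == best]


locations = best_scoring_locations(
    [("contig1:10", 42), ("contig2:5", 42), ("contig3:7", 30)]
)
weight = 1 / len(locations)  # each of the n tied locations is weighted 1/n
```

Here two locations tie for the best score, so n is 2 and each is weighted 1/2; the lower-scoring third location is not used.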

> Is the -N option in strobealign the same as the -k option in Bowtie2?

Yes, it reports N alignments in decreasing order of score. The alignment with the best score is the primary alignment.

> If I want to get the n perfect alignments for every read, how do I get them in the output?

You can set -N n --eqx (with n an integer of your choice) and keep only the alignments whose CIGAR string consists of a single match operation, e.g. 150= for a read of length 150. This will give you all "perfect" alignments, i.e., those with no differences to the reference.
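A minimal Python sketch of that filtering step follows, assuming the SAM output was produced with --eqx and is read as plain text lines. It accepts a record when its CIGAR is a single = operation, which implies no mismatches, indels, or clipping; the function name is hypothetical.

```python
import re

# A CIGAR made of exactly one "=" operation, such as "150=".
PERFECT_CIGAR = re.compile(r"^\d+=$")


def perfect_alignments(sam_lines):
    """Yield (read name, reference name, position, CIGAR) for perfect hits."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        flag = int(fields[1])
        if flag & 0x4:
            continue  # read is unmapped
        if PERFECT_CIGAR.match(fields[5]):
            yield fields[0], fields[2], int(fields[3]), fields[5]
```

Usage could look like `with open("out.sam") as f: hits = list(perfect_alignments(f))`, where `out.sam` is a placeholder file name. Primary and secondary records are treated alike, so a read with several perfect locations appears once per location.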

@yazhinia
Author

Thank you very much for the explanation.
