Multiple mapping #431

yazhinia · 2024-07-23T17:15:53Z

Hello,
Thank you developing this tool. I have a question related to the following point mentioned in the readme.
The abundance is the number of bases mapped to a contig, divided by the length of the contig. Reads mapping to n different locations are weighted 1/n.

In my test run (even with -N option set to >1), I only see secondary alignments which need not be equally well mapped to reference like the primary alignment. How is multiple mapping defined here?

Also, is it possible to get MD flag field (like Bowtie2) to help compute sequence identity for the alignment?

Thank you.

ksahlin · 2024-07-23T17:25:45Z

Hi @yazhinia,

The abundance is the number of bases mapped to a contig, divided by the length of the contig. Reads mapping to n different locations are weighted 1/n.

The text above refers only when using the command-line option --aemb which outputs coverage estimates only. How are you running strobealign?

Also, is it possible to get MD flag field (like Bowtie2) to help compute sequence identity for the alignment?

Strobealign does not support an MD field, but with option --eqx it outputs CIGAR strings with =/X for matching and mismatching bases. Such CIGAR strings allow identity computation between the reads and the reference.

yazhinia · 2024-07-23T19:08:06Z

Thank you for the prompt reply. I wanted to understand how the aligner chooses n alignments for a read, if exist, in the --aemb mode? When you say Reads mapping to n different locations, does it mean mapping to multiple locations are equally perfect alignment? If I wanted to get n alignments that are perfect for every read, how to get it in output?

Is -N option in strobealign same as -k option in bowtie2?

Great, --eqx option is very useful.

ksahlin · 2024-07-24T07:41:32Z

I wanted to understand how the aligner chooses n alignments for a read, if exist, in the --aemb mode?

In --aemb mode it chooses based on seed hits and their scores, specifically the NAM score described in the strobealign paper. Aemb uses only the mapping location(s) with the best NAM score. If n identical best scoring locations it uses those n locations. No extension alignment is performed.

Is -N option in strobealign same as -k option in bowtie2?

Yes, it reports N alignment in decreasing order of score. The best score is the primary alignment.

If I wanted to get n alignments that are perfect for every read, how to get it in output?

You can set -N n --eqx (n your integer of choice) and parse cigar strings that only look like e.g. 150= if the read is of length 150. This will give you all "perfect" alignments, i.e., no difference to reference.

yazhinia · 2024-07-24T08:35:01Z

Thank you very much for the explanation.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multiple mapping #431

Multiple mapping #431

yazhinia commented Jul 23, 2024

ksahlin commented Jul 23, 2024

yazhinia commented Jul 23, 2024

ksahlin commented Jul 24, 2024

yazhinia commented Jul 24, 2024

Multiple mapping #431

Multiple mapping #431

Comments

yazhinia commented Jul 23, 2024

ksahlin commented Jul 23, 2024

yazhinia commented Jul 23, 2024

ksahlin commented Jul 24, 2024

yazhinia commented Jul 24, 2024