Hey it's me again :) I briefly corresponded with @ksahlin on a refined version of closed syncmers with 25-30% lower density for realistic values of k. This might be useful for strobealign.
Thanks, Daniel, very interesting work! This will be interesting to try.
Context to Daniel's post:
A closed syncmer is a k-mer that is sampled when its first or last s-mer is the smallest of all s-mers in the k-mer. We currently sample a k-mer when a middle s-mer is the smallest (open syncmer).
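To make the two definitions concrete, here is a minimal Python sketch. It is an illustration only and not strobealign's code: the function names are made up, s-mers are compared lexicographically (real implementations typically compare hash values instead), ties go to the leftmost s-mer, and the offset `t` is 0-based.

```python
def smallest_smer_offset(kmer: str, s: int) -> int:
    """Return the 0-based offset of the smallest s-mer in kmer.

    Assumption: plain lexicographic order, leftmost s-mer wins ties.
    """
    smers = [kmer[i:i + s] for i in range(len(kmer) - s + 1)]
    return smers.index(min(smers))


def is_closed_syncmer(kmer: str, s: int) -> bool:
    """Closed syncmer: the smallest s-mer is the first or the last one."""
    return smallest_smer_offset(kmer, s) in (0, len(kmer) - s)


def is_open_syncmer(kmer: str, s: int, t: int) -> bool:
    """Open syncmer: the smallest s-mer sits at the fixed offset t."""
    return smallest_smer_offset(kmer, s) == t


def sample_positions(seq: str, k: int, s: int, predicate) -> list:
    """Start positions of all k-mers in seq accepted by predicate(kmer, s)."""
    return [i for i in range(len(seq) - k + 1) if predicate(seq[i:i + k], s)]
```

For example, `sample_positions(seq, 5, 2, is_closed_syncmer)` lists the closed syncmers of `seq`, and `sample_positions(seq, 5, 2, lambda km, s: is_open_syncmer(km, s, 2))` lists the open syncmers that use the middle offset.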
I have tested closed syncmers several times in strobealign, and they never perform quite as well as open syncmers that sample on a middle s-mer (we use the third s-mer when the density is 1/5). This is expected because open syncmers have better spread (a guaranteed lower bound of 3 on the distance between sampled k-mers when the density is 1/5), as shown by Shaw & Yu, 2021. Many traditional closed syncmers (upper panel in Daniel's plot) are sampled at distance 1 from each other (i.e., not a good spread).
However, open syncmers come at the cost of not having a window guarantee, so some regions might be sparsely sampled. Daniel's plot shows that we can possibly get both a good spread and the window guarantee to ensure that all regions have enough seeds.
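The trade-off between spread and window guarantee can be checked empirically with a small Python sketch. Everything here is an assumption made for illustration: the values `k = 20` and `s = 16` are chosen only so that k - s + 1 = 5, the sequence is random, s-mers are ordered lexicographically with leftmost tie-breaking rather than by hash, and the open syncmers use the third s-mer (0-based offset 2). Note that with the same `k` and `s` the closed syncmers are sampled more densely than the open ones, so this is not a comparison at equal density.

```python
import random


def min_smer_offset(kmer, s):
    """0-based offset of the smallest s-mer (lexicographic, leftmost on ties)."""
    smers = [kmer[i:i + s] for i in range(len(kmer) - s + 1)]
    return smers.index(min(smers))


def sampled_positions(seq, k, s, offsets):
    """Start positions of k-mers whose smallest s-mer lies at one of `offsets`."""
    return [i for i in range(len(seq) - k + 1)
            if min_smer_offset(seq[i:i + k], s) in offsets]


def gaps(positions):
    """Distances between consecutive sampled positions."""
    return [b - a for a, b in zip(positions, positions[1:])]


random.seed(0)  # fixed seed so the run is reproducible
seq = "".join(random.choice("ACGT") for _ in range(2000))
k, s = 20, 16  # hypothetical parameters with k - s + 1 = 5

closed_gaps = gaps(sampled_positions(seq, k, s, {0, k - s}))  # first or last s-mer
open_gaps = gaps(sampled_positions(seq, k, s, {2}))           # third s-mer

# Closed syncmers: the largest gap is bounded (window guarantee),
# but the smallest gap can be as low as 1.
print("closed: min gap", min(closed_gaps), "max gap", max(closed_gaps))
# Open syncmers at the middle offset: the smallest gap is at least 3,
# but the largest gap has no fixed upper bound.
print("open:   min gap", min(open_gaps), "max gap", max(open_gaps))
```

Under these assumptions the closed syncmers never leave a gap larger than k - s + 1, while the open syncmers never sit closer than 3 positions apart, which are the two properties the refined scheme aims to combine.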
My code for generating these asymptotically optimal density closed syncmers is here: https://github.com/Daniel-Liu-c0deb0t/dlb-kmer-sampling/blob/main/src/lib.rs#L30 and I can explain the algorithm in more detail if needed.