Improve Hash2Fragment by using a map to validate allowed sequence characters #402
I saw the small alphabet size in the PR and was curious where the threshold lies at which a map starts to perform better than a list, since that is relevant to another issue I'm working on. I created some benchmarks to see what would happen with sequences drawn from alphabets of varying length. These were the results: Map:
Contains:
Contains performed significantly better in these benchmarks. I wonder if caching starts to come into play with all the different data involved. Including the benchmark code and cpu profile in the comment. Also, Contains seems to be more performant in the Small/Complete alphabet with no mistakes case, which seems like it would be the most common case. *Note: updated because I modified the code to call something that looks like
cpu profile (the benchmarks were run together for the profile; they were run separately for the results above)
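For readers following the thread, here is a minimal sketch of the two validation strategies being compared. It is not the benchmark code from this comment or the code in Hash2Fragment: the function names (`allowedSet`, `validWithMap`, `validWithContains`), the `ATGC` alphabet, and the choice of `strings.ContainsRune` for the "Contains" variant are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedSet builds a lookup set from an alphabet string.
// Hypothetical helper, not part of the project.
func allowedSet(alphabet string) map[rune]struct{} {
	set := make(map[rune]struct{}, len(alphabet))
	for _, r := range alphabet {
		set[r] = struct{}{}
	}
	return set
}

// validWithMap reports whether every character of seq is in the set.
// Each check is a hash lookup.
func validWithMap(seq string, set map[rune]struct{}) bool {
	for _, r := range seq {
		if _, ok := set[r]; !ok {
			return false
		}
	}
	return true
}

// validWithContains performs the same check with a linear scan of the
// alphabet for each character. For a very small alphabet the scan
// touches only a few contiguous bytes.
func validWithContains(seq, alphabet string) bool {
	for _, r := range seq {
		if !strings.ContainsRune(alphabet, r) {
			return false
		}
	}
	return true
}

func main() {
	const alphabet = "ATGC" // assumed alphabet for illustration
	set := allowedSet(alphabet)
	for _, seq := range []string{"GATTACA", "GATXACA"} {
		fmt.Println(seq, validWithMap(seq, set), validWithContains(seq, alphabet))
	}
}
```

Both functions return the same answer for any input; they differ only in how each character is checked, which is the cost the benchmarks in this thread measure.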
Interesting! @matiasinsaurralde Do you have any thoughts on this? The no-mistakes use case is definitely the most common by an order of magnitude or more.
@Koeng101 Agree with that, feel free to discard/ignore my suggestion.
Closing pull request, because the benchmarks show that Contains is more efficient for our particular use case.
Changes in this PR
Improve Hash2Fragment by using a map for more efficient validation of allowed sequence characters, discussion here.
Why are you making these changes?
General improvement.
Are any changes breaking? (IMPORTANT)
No
Pre-merge checklist
All of these must be satisfied before this PR is considered
ready for merging. Mergeable PRs will be prioritized for review.
primers/primers_test.go for what this might look like.
CHANGELOG.md in the [Unreleased] section.