Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MRG: fix sketch downsampling across the plugin, incl fastmultigather, manysearch, and multisearch #504

Merged
merged 26 commits into from
Nov 11, 2024

Conversation

ctb
Copy link
Collaborator

@ctb ctb commented Nov 10, 2024

Tackles #505, and maybe fixes #483.

This PR adds tests and short-term fixes for a myriad of downsampling problems, largely caused by sourmash-bio/sourmash#3384, in which Collection:sig_from_record does not downsample the loaded sketch.

Specifically this PR:

  • fixes fastmultigather to downsample the query properly;
  • fixes fasmultigather on a RocksDB to properly handle scaled, includig downsampling the query, and also checking the max scaled on the RocksDB;
  • fixes manysearch to downsample both the queries (via MultiCollection) and database sketches, as well as output scaled, ksize, and moltype;
  • fixes manysearch against RocksDB to downsample the query, as well as check the max scaled on the RocksDB and output scaled, ksize, and moltype;
  • fixes multisearch and pairwise to downsample queries and against, as well as output scaled, ksize, and moltype;
  • adjusts MultiCollection::load_sketches to downsample them to the provided selection (as used in manysearch;
  • adds tests for the above.

I'm doing a patchfix release with this because it's returning incorrect results ;(. At some point we will want to revisit the code and clean it up more - see #510.

@ctb ctb changed the title WIP: fix scaled stuff WIP: fix sketch downsampling in both fastmultigather implementations Nov 10, 2024
@ctb ctb changed the title WIP: fix sketch downsampling in both fastmultigather implementations WIP: fix sketch downsampling across the plugin, incl fastmultigather, manysearch, and multisearch Nov 11, 2024
@ctb ctb changed the title WIP: fix sketch downsampling across the plugin, incl fastmultigather, manysearch, and multisearch MRG: fix sketch downsampling across the plugin, incl fastmultigather, manysearch, and multisearch Nov 11, 2024
@ctb
Copy link
Collaborator Author

ctb commented Nov 11, 2024

Merging without review so I can cut a new release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant