Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add kilosort4 wrapper tests #3085

Merged
merged 31 commits into from
Aug 21, 2024

Conversation

JoeZiminski
Copy link
Collaborator

@JoeZiminski JoeZiminski commented Jun 26, 2024

This PR makes some refactorings and small changes to kilosort4.py (Kilosort4 Wrapper) and adds tests for the wrapper across all released versions.

Changes to kilosort4.py

  • Changes the _default_params based on version to always match kilosort's DEFAULT_SETTINGS.
  • All KS4 functions are called via kwargs rather than position arguments.
  • Removed keep_good_only and scaleproc which I think were left over from earlier KS versions but are not implemented in KS4.
  • KS4 .__version__ attribute is 4 for all versions < 4.0.10 so I used importlib.version for versioning here.

CI tests

  • The CI tests use a python script to grab the versions of all kilosort released on pypi and set them in a matrix to run tests over, inspired by @tabedzki related PR on auto-building the docker images.
  • Tests for running KS4 natively vs. through SpikeInterface wrapper are added (details on these in the module header). These are stored in .github/scripts so are not part of the test suite. The CI script can be adjusted to run on cron job only. I ran into problems with the module tagging in conftest.py because the script is not a submodule of the package src, so added a try/except. Maybe there is a better solution (e.g. running the script from a different location) happy to adjust this.

Some points for discussion

  • Possible kwargs to fully expose: shift, scale, tmin, tmax, save_preprocesed_copy.
  • docker is not tested, but I think as the docker code is not related to how failthfully the wrapper is, I think thats okay not to test it?
  • save_extra_vars (KS4) is save_extra_kwargs in SpikeInterface, should it be renamed for consistency?

KIlosort versions <= 4.0.4

For some reason on the tests in CI, there are completely wrong results between SI wrapper vs. KS4 in 4.0.4. Also, versions <4.0.4 seem to be susceptible to a bug ValueError: negative axis 0 index: -2147483648 which looks like a KS4-side overflow error that's since fixed. Also, as all KS4 DEFAULT_SETTINGS changes that are a little tricky to handle (see below comment to Alessio) are <=4.0.4 I suggest we just don't support versions earlier than 4.0.5?

Some TODOs:

  • KS4 next version will deprecate duplicate_spike_bins for duplicate_spike_ms, also mention save_preprocesed_copy typo.
  • could test a multi-segment recording to check concatenation is OK.

@JoeZiminski JoeZiminski force-pushed the add_kilosort4_wrapper_tests branch 5 times, most recently from 0a822f5 to 6f9dfc4 Compare June 26, 2024 12:20

Parameters
----------
repo : str, default: "https://gin.g-node.org/NeuralEnsemble/ephy_testing_data"
The repository to download the dataset from
remote_path : str, default: "mearec/mearec_test_10s.h5"
A specific subdirectory in the repository to download (e.g. Mearec, SpikeGLX, etc)
local_folder : str, default: None
local_folder : str, optional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we use default: None elsewhere ;)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @alejoe91!! Though please ignore most of this PR ATM, it contains a lot of stray commits from random other PRs that are screwing everything up and I that need to drop! 😅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok! I'll refrain myself ;)

settings=settings,
probe=probe,
data_dtype=recording.get_dtype(),
do_CAR=do_CAR,
Copy link
Collaborator Author

@JoeZiminski JoeZiminski Jun 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ported from #3073

Chris

So, at the moment, if a user specifies skip_kilosort_preprocessing = True, then do_CAR is still True by default and is still applied. I guess: is this the expected behaviour?

And then what do we want to do if someone explicitly specifies skip_kilosort_preprocessing = True and do_CAR=True ?

So, at the moment, if a user specifies skip_kilosort_preprocessing = True, then do_CAR is still True by default and is still applied. I guess: is this the expected behaviour?

And then what do we want to do if someone explicitly specifies skip_kilosort_preprocessing = True and do_CAR=True ?

Alessio

So, at the moment, if a user specifies skip_kilosort_preprocessing = True, then do_CAR is still True by default and is still applied. I guess: is this the expected behaviour?

Chris

And then what do we want to do if someone explicitly specifies skip_kilosort_preprocessing = True and do_CAR=True ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think because do_CAR is the default True it is not possible to determine if the user wanted to apply CAR or just left the default on. In this case I think most users would expect skip_kilosort_preprocessing to skip all preprocessing and would be surprised if CAR was still performed by default. I would suggest to raise an error if skip_kilosort_preprocessing=False and do_CAR=True. It will be annoying for users first time because by default it will raise an error, but at least it is an easy fix and super explicit. If we get issues raised from people who want to skip KS preprocessing but keep CAR (I would imagine this is very small % of users) we could reconsider.

Further on this, I believe that the KS drift correction is performed in the code block that also performs whitening. So when whitening is turned off as part of skip_kilosort_preprocessing then drift correct is also turned off. This makes it even weirder for skip_kilosort_preprocessing to allow only CAR. Also (if this is correct) this behaviour can be added to the dosctring and I'd suggest raise an error if do_correction=True while skip_kilosort_preprocessing=True. Even though it will raise a lot of errors by default when trying to use skip_kilosort_preprocessing I feel it is quite a complex / confusing case and so best to be super explicit.

@JoeZiminski
Copy link
Collaborator Author

JoeZiminski commented Jun 27, 2024

(In quotes, Alessio original message ported from #3073)

[On shift and scale adding to BinaryFiltered] Honestly, I was a bit torn about including these two, simply because in > SI we have better scaling management, so why would someone that launch KS4 from SI want to use scale and shift there? >Maybe, let's just remove them!

Yes I can definitely see the advantage in not including everything, on balance though as a philosophy I think SpikeInterface, when it comes to sorters, is most effective as as a wrapper that exposes all available sorter options. In some ways it's also easier to maintain because avoids having to decide what is / isn't worthwhile to expose, and avoids potential future issues users might raise wanting things included. As such I'd probably advocate towards exposing shift, scale, tmin, tmax, save_preprocessed_copy as part of this PR.

[On setting default params in SpikeInterface side _default_params vs. KS4 DEFAULT_SETTINGS. That is potentially >a good idea, but I think that having the dictionaries "spelled" out is better for readibility and design and it complies with the >pattern with other sorters. Since we discussed about having automated CI for KS4 versions, we could throw in parameter >checks there ;)

Agree, for some reason I thought this was leading to some large differences in results but its not. They are the same for all versions except two cases where the defaults changed in early KS versions. I made the _param_defaults condition on the version in these cases, but this duplicates the cls.get_sorter_version() but I can see no way around this as the class is not instantiated yet. Anyways, glad its easy to handle!

[EDIT] Turns out it wasn't as easy to handle as I thought as the kilosort import is no longer wrapped at this stage and is causing general test failures. I am also getting weird errors on kilosort4 tests versions <= 4.0.4 (which is also where the default changes are). So I'm thinking just drop support for these versions, which also removes the different-defaults problem. If default change in future we could handle with a parameters getter on the Kilosort4Sorter class.

 On branch add_kilosort4_wrapper_tests
@JoeZiminski JoeZiminski force-pushed the add_kilosort4_wrapper_tests branch from f451240 to 9bc1897 Compare June 27, 2024 11:04
@JoeZiminski JoeZiminski marked this pull request as ready for review June 27, 2024 11:22
@alejoe91 alejoe91 added sorters Related to sorters module testing Related to test routines labels Jun 27, 2024
@carlacodes
Copy link

is there any easy way to change save_preprocessed_copy to true within the existing spikeinterface framework?

@JoeZiminski
Copy link
Collaborator Author

Hi @carlacodes unfortunately not, although I think it would be quick to add. @alejoe91 would you be happy to expose save_preprocessed_copy in the kilosort4 wrapper?

@alejoe91
Copy link
Member

@JoeZiminski sure!

Comment on lines 7 to 10
pull_request:
types: [synchronize, opened, reopened]
branches:
- main
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JoeZiminski I would remove this, otherwise it will run on any open PR...no?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alejoe91 oh yes good point, should this be a weekly cron job like here?

so just

on:
  workflow_dispatch:
  schedule:
    - cron: "0 12 * * 0"  # Weekly on Sunday at noon UTC

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay cheers @alejoe91, done!

Copy link
Collaborator Author

@JoeZiminski JoeZiminski Aug 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, there is a conflict since exposing save_preprocesed_copy, will fix now

@JoeZiminski
Copy link
Collaborator Author

Hey @alejoe91 I think this is good to go now, I added a reminder based on the suggested plan here. The test will error out for spikeinterface >= 0.101.1, as a reminder to update it for supporting the new versions.

@alejoe91 alejoe91 merged commit 73f6151 into SpikeInterface:main Aug 21, 2024
15 checks passed
Copy link
Collaborator

@h-mayorquin h-mayorquin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sorters Related to sorters module testing Related to test routines
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants