Refactor quality metrics tests to use fixture #3249
Conversation
@@ -570,27 +540,36 @@ def test_calculate_sd_ratio(sorting_analyzer_simple):

if __name__ == "__main__":
I agree, I think this is essential for certain Windows runs, but it is super inconsistent which ones. (For example, I haven't needed to do `if __name__ == "__main__"` yet, but others have.) So weird.
no way! 🤯 so it depends on the Windows version?
I think it is less "version" and more like version + sub-version + Python version + CPU type + IDE. Who knows what magic determines this... But one of my co-workers has to use the `if __name__ == "__main__"` guard, and I don't on my workstation.
I had responded but it somehow disappeared: "Haha, the mysteries of multiprocessing, very interesting!"
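For context on why the guard can matter on some machines and not others: on Windows (and on macOS since Python 3.8) the default multiprocessing start method is `spawn`, which re-imports the main module in every worker, so any module-level code that launches workers would recurse without the guard. A minimal illustrative sketch (not code from this PR):

```python
# Illustration of the "spawn" start method issue: each worker re-imports the
# main module, so worker-launching code must live under the __main__ guard.
import multiprocessing


def square(x):
    return x * x


if __name__ == "__main__":
    # Force "spawn" to mimic Windows behaviour on any platform.
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))  # [1, 4, 9]
```

Under the `fork` start method (the Linux default), workers inherit the parent process's memory instead of re-importing the module, which is one reason the symptom shows up so inconsistently across setups.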
src/spikeinterface/qualitymetrics/tests/test_quality_metric_calculator.py
Hey @chrishalcrow, this is really nice, great to centralise these fixtures, and to find out some interesting things about pytest and multiprocessing along the way!
Why are we dropping the PC calculation? I need to look at what the actual test is checking, but without PCs calculated we wouldn't be able to test PC metrics. I will say that in general, on my Windows machine, doing the PCA is the slowest step. So maybe the scipy implementation on Windows is the thing that is so variable, and removing the PC calculation got rid of that variability, but I wonder if we need that calculation.
We do still test them elsewhere. The tests are a bit of a hodgepodge, but the aim of this PR was just to try to get the times consistent across OSs and to simplify the structure.
Then my guess is that PCA is causing the time variability. If we aren't testing it here, then removing it gets us the speed-up (again, my guess). You could add it back in to see if we slow down, and then remove it before the actual merge.
Hey @chrishalcrow @zm711, are we almost done here? (Sorry @zm711, I accidentally clicked to re-request your review when I meant to click myself.) I guess we can continue the code-comment discussion, but maybe it is worth porting it to an issue. Otherwise, the only thing left is to see whether reintroducing the PC metric calculation in the sorting analyzer fixture brings the test time back up, to check if that was the culprit?
What is the actual change here? Deleting Sam's/Alessio's testing is +/-, and the removal of PCA. I think I would actually prefer to see whether PCA is the culprit here, rather than removing it and never really understanding why the tests got shorter.
Yes, good point, worth summarising, as the debug code changes can be reverted. To summarise (let me know @chrishalcrow if I missed anything):
```python
if __name__ == "__main__":
    sorting_analyzer = _sorting_analyzer_simple()
    print(sorting_analyzer)

    test_unit_structure_in_output(_small_sorting_analyzer())

    # test_calculate_firing_rate_num_spikes(sorting_analyzer)
```
@chrishalcrow can we add this back so @samuelgarcia is happy? :)
Hoping this is ready if tests pass
Looks good @chrishalcrow! One last question from above: would you mind adding PC metrics back into the test, just to see if it accounts for the change in test times (out of interest)? Then it can be removed again, and I think this is good to merge. @zm711 may want a final pass.
This looks good to me. I would love it if you would just test adding PCA back so we can assess whether that is the timing issue, but I could also do that in a test PR if it is too annoying. Otherwise this is fine by me. I would recommend changing the title of the PR so it won't be misleading in the changelog, but otherwise good by me.
Adding that back took the Windows tests from 2-4 minutes to 16-17 minutes. So my guess is that the PCA stuff is not optimized for Windows. That's really good to know, because my lab hates running PCA because it takes forever, and this confirms it on Windows.
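To isolate whether the PCA step itself is the slow part on a given platform, one could time it directly on synthetic, waveform-sized data. A hypothetical micro-benchmark (not from this PR; it uses a NumPy SVD-based PCA as a stand-in for whatever implementation spikeinterface actually uses):

```python
# Hypothetical micro-benchmark: time a PCA-like projection on synthetic data
# shaped like spike waveforms, to compare the cost across platforms.
import time

import numpy as np


def time_pca(n_spikes=5000, n_features=300, n_components=5):
    rng = np.random.default_rng(0)
    waveforms = rng.standard_normal((n_spikes, n_features))
    t0 = time.perf_counter()
    centered = waveforms - waveforms.mean(axis=0)
    # SVD-based PCA as a stand-in for the real implementation.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = centered @ vt[:n_components].T
    elapsed = time.perf_counter() - t0
    return components, elapsed


if __name__ == "__main__":
    comps, secs = time_pca()
    print(f"PCA-like projection for {comps.shape[0]} spikes took {secs:.3f}s")
```

Running something like this on a Windows and a Linux machine with identical inputs would separate "the linear algebra backend is slow on Windows" from "the test harness or multiprocessing is slow on Windows".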
Good by me :)
Simplify the quality metrics tests so that they mostly share two `sorting_analyzer`s, and wrap all the tests in `main()`, which may be needed for CI multiprocessing (?).

This PR moves `sorting_analyzer_simple` to `conftest.py`, which means it can be shared by all the quality metric test files (so it only has to be computed once). Moreover, we weren't using the PCs computed for this `sorting_analyzer`, so I got rid of that.

On my machine, the old tests took 3 min 17 s and the new ones 2 min 5 s. Note: the tests are basically the same.
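The `conftest.py` pattern described here can be sketched as follows (illustrative names and values, not spikeinterface's actual fixture body): a session-scoped fixture defined in `conftest.py` is discovered by pytest and injected by name into any test in the same directory, and its setup runs once per session rather than once per test file.

```python
# conftest.py (hypothetical sketch): a session-scoped fixture shared by
# every test module in the directory, computed only once per pytest session.
import pytest


@pytest.fixture(scope="session")
def sorting_analyzer_simple():
    # Stand-in for the expensive analyzer computation; with scope="session"
    # this body runs once, however many test files request the fixture.
    return {"num_units": 10, "extensions": ["templates", "spike_amplitudes"]}


# In any test file in the same directory, the fixture is injected by name:
def test_unit_count(sorting_analyzer_simple):
    assert sorting_analyzer_simple["num_units"] == 10
```

Moving a fixture out of an individual test module into `conftest.py` is what lets all the quality metric test files share one computed analyzer instead of each building their own.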
Results from GitHub Actions with `n_jobs=2` (https://github.com/SpikeInterface/spikeinterface/actions/runs/10075290248):

mac3.9: 2:22
mac3.12: 2:17
win3.9: 2:09
win3.12: 2:10
ubu3.9: 2:24
ubu3.12: 2:32
Compared to the old timings (https://github.com/SpikeInterface/spikeinterface/actions/runs/10063489408):
mac3.9: 2:39
mac3.12: 2:46
win3.9: 9:17
win3.12: 9:09
ubu3.9: 3:23
ubu3.12: 3:08
So: this definitely helps the Windows runs! We should keep checking to see whether this test time stays consistent across many runs.