
Commit

Merge branch 'SpikeInterface:main' into meta_merging_sc2
yger authored Oct 8, 2024
2 parents fb27a6b + 298e57a commit 6ec9940
Showing 77 changed files with 2,816 additions and 2,480 deletions.
1 change: 1 addition & 0 deletions doc/development/development.rst
@@ -192,6 +192,7 @@ Miscellaneous Stylistic Conventions
#. Avoid using abbreviations in variable names (e.g. use :code:`recording` instead of :code:`rec`). It is especially important to avoid single letter variables.
#. Use index as singular and indices as plural, following the NumPy convention. Avoid idx or indexes. Also, id and ids are reserved for identifiers (e.g. channel_ids).
#. We use file_path and folder_path (instead of file_name and folder_name) for clarity; these naming conventions are illustrated in the short snippet after this list.
#. For the titles of documentation pages, only capitalize the first letter of the first word and classes or software packages. For example, "How to use a SortingAnalyzer in SpikeInterface".
#. For creating headers to divide sections of code we use the following convention (see issue `#3019 <https://github.com/SpikeInterface/spikeinterface/issues/3019>`_):


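As a purely illustrative sketch of the variable-naming conventions earlier in this list (not of the header convention referenced in the last item), using :code:`generate_recording` and :code:`get_channel_ids` from :code:`spikeinterface.core` purely for demonstration; the variable values are arbitrary:

.. code-block:: python

    from spikeinterface.core import generate_recording

    # full words instead of abbreviations
    recording = generate_recording(durations=[10.0])   # not: rec

    # NumPy-style singular/plural, never idx or indexes
    channel_index = 0
    channel_indices = [0, 1]

    # id / ids are reserved for identifiers
    channel_ids = recording.get_channel_ids()

    # *_path rather than *_name
    folder_path = "results/session1"
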
2 changes: 1 addition & 1 deletion doc/how_to/combine_recordings.rst
@@ -1,4 +1,4 @@
Combine Recordings in SpikeInterface
Combine recordings in SpikeInterface
====================================

In this tutorial we will walk through combining multiple recording objects. Sometimes this occurs due to hardware
2 changes: 1 addition & 1 deletion doc/how_to/load_matlab_data.rst
@@ -1,4 +1,4 @@
Export MATLAB Data to Binary & Load in SpikeInterface
Export MATLAB data to binary & load in SpikeInterface
========================================================

In this tutorial, we will walk through the process of exporting data from MATLAB in a binary format and subsequently loading it using SpikeInterface in Python.
4 changes: 2 additions & 2 deletions doc/how_to/load_your_data_into_sorting.rst
@@ -1,5 +1,5 @@
Load Your Own Data into a Sorting
=================================
Load your own data into a Sorting object
========================================

Why make a :code:`Sorting`?

2 changes: 1 addition & 1 deletion doc/how_to/process_by_channel_group.rst
@@ -1,4 +1,4 @@
Process a Recording by Channel Group
Process a recording by channel group
====================================

In this tutorial, we will walk through how to preprocess and sort a recording
2 changes: 1 addition & 1 deletion doc/how_to/viewers.rst
@@ -1,4 +1,4 @@
Visualize Data
Visualize data
==============

There are several ways to plot signals (raw, preprocessed) and spikes.
Binary file modified doc/images/overview.png
141 changes: 141 additions & 0 deletions doc/modules/benchmark.rst
@@ -0,0 +1,141 @@
Benchmark module
================

This module contains machinery to compare some sorters against ground truth in many different situations.


.. note::

    In version 0.102.0, the previous :py:func:`~spikeinterface.comparison.GroundTruthStudy()` has been replaced by
    :py:func:`~spikeinterface.benchmark.SorterStudy()`.


This module also aims to benchmark sorting components (detection, clustering, motion, template matching) using the
same base class :py:func:`~spikeinterface.benchmark.BenchmarkStudy()` but specialized to a targeted component.

By design, the main class handles the concept of "levels": this allows comparing several sources of complexity at the same time.
For instance, comparing kilosort4 vs kilosort2.5 (level 0) for different noise amplitudes (level 1) combined with
several motion vectors (level 2).

**Example: compare many sorters: a ground truth study**

We have a high-level class to compare many sorters against ground truth: :py:func:`~spikeinterface.benchmark.SorterStudy()`.


A study is a systematic performance comparison of several ground truth recordings with several sorters or several cases,
such as different parameter sets.

The study class provides high-level tool functions to run many ground truth comparisons with many "cases"
on many recordings, and then collect and aggregate the results in an easy way.

The whole mechanism is based on an intrinsic organization into a "study_folder" with several subfolders:

* datasets: contains ground truth datasets
* sorters: contains the outputs of the sorters
* sortings: contains a light copy of all sortings
* metrics: contains metrics
* ...


.. code-block:: python

    import matplotlib.pyplot as plt
    import seaborn as sns

    import spikeinterface.extractors as se
    import spikeinterface.widgets as sw
    from spikeinterface.core import generate_ground_truth_recording
    from spikeinterface.benchmark import SorterStudy

    # generate 2 simulated datasets (could also be MEArec files)
    rec0, gt_sorting0 = generate_ground_truth_recording(num_channels=4, durations=[30.], seed=42)
    rec1, gt_sorting1 = generate_ground_truth_recording(num_channels=4, durations=[30.], seed=91)

    datasets = {
        "toy0": (rec0, gt_sorting0),
        "toy1": (rec1, gt_sorting1),
    }

    # define some "cases": here we want to test tridesclous2 on 2 datasets and spykingcircus2 on one dataset,
    # so it is a two-level study (sorter_name, dataset);
    # this could be more complicated like (sorter_name, dataset, params)
    cases = {
        ("tdc2", "toy0"): {
            "label": "tridesclous2 on tetrode0",
            "dataset": "toy0",
            "params": {"sorter_name": "tridesclous2"},
        },
        ("tdc2", "toy1"): {
            "label": "tridesclous2 on tetrode1",
            "dataset": "toy1",
            "params": {"sorter_name": "tridesclous2"},
        },
        ("sc", "toy0"): {
            "label": "spykingcircus2 on tetrode0",
            "dataset": "toy0",
            "params": {
                "sorter_name": "spykingcircus",
                "docker_image": True,
            },
        },
    }

    # this initializes the study folder (study_folder is any path of your choice)
    study = SorterStudy.create(study_folder=study_folder, datasets=datasets, cases=cases,
                               levels=["sorter_name", "dataset"])

    # this internally runs run_sorter() for all cases in one function
    study.run()

    # run the benchmark: this internally runs compare_sorter_to_ground_truth() for all cases
    study.compute_results()

    # collect comparisons one by one
    for case_key in study.cases:
        print('*' * 10)
        print(case_key)
        # raw counting of tp/fp/...
        comp = study.get_result(case_key)["gt_comparison"]
        # summary
        comp.print_summary()
        perf_unit = comp.get_performance(method='by_unit')
        perf_avg = comp.get_performance(method='pooled_with_average')
        # some plots
        m = comp.get_confusion_matrix()
        w_comp = sw.plot_agreement_matrix(sorting_comparison=comp)

    # Collect synthetic dataframes and display.
    # As shown previously, the performance is returned as a pandas dataframe.
    # study.get_performance_by_unit() gathers all the outputs in the study folder
    # and merges them into a single dataframe. Same idea for study.get_count_units().

    # this is a dataframe
    perfs = study.get_performance_by_unit()

    # this is a dataframe
    unit_counts = study.get_count_units()

    # the study also has several plotting methods to plot the results
    study.plot_agreement_matrix()
    study.plot_unit_counts()
    study.plot_performances(mode="ordered")
    study.plot_performances(mode="snr")
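
As a possible follow-up (a sketch only, under the assumption that a :py:func:`~spikeinterface.benchmark.SorterStudy` can be re-opened from its folder, mirroring the former :code:`GroundTruthStudy(study_folder)` behaviour), the study might be reloaded later for further inspection:

.. code-block:: python

    from spikeinterface.benchmark import SorterStudy

    # assumption: an existing study folder can be re-opened directly
    study = SorterStudy(study_folder)
    study.plot_performances(mode="ordered")
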
Benchmark spike collisions
--------------------------

SpikeInterface also has a specific toolset to benchmark how well sorters recover spikes in "collision".

We have two classes to handle collision-specific comparisons, and also to quantify the effects on correlogram
estimation:

* :py:class:`~spikeinterface.comparison.CollisionGTComparison`
* :py:class:`~spikeinterface.comparison.CorrelogramGTComparison`

For more details, check out the following paper:

`Samuel Garcia, Alessio P. Buccino and Pierre Yger. "How Do Spike Collisions Affect Spike Sorting Performance?" <https://doi.org/10.1523/ENEURO.0105-22.2022>`_
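
A minimal, hedged sketch of how these classes might be used, assuming :py:class:`~spikeinterface.comparison.CollisionGTComparison` accepts a ground-truth sorting and a tested sorting positionally, like :code:`GroundTruthComparison`; :code:`gt_sorting` and :code:`tested_sorting` are placeholders for objects obtained from a previous run:

.. code-block:: python

    import spikeinterface.comparison as sc

    # gt_sorting / tested_sorting are placeholder sortings from an earlier run;
    # the positional signature is assumed to follow GroundTruthComparison
    collision_comp = sc.CollisionGTComparison(gt_sorting, tested_sorting)
    collision_comp.print_summary()
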
165 changes: 4 additions & 161 deletions doc/modules/comparison.rst
@@ -5,6 +5,10 @@ Comparison module
SpikeInterface has a :py:mod:`~spikeinterface.comparison` module, which contains functions and tools to compare
spike trains and templates (useful for tracking units over multiple sessions).

.. note::

    In version 0.102.0, the benchmark part of the comparison module has moved to the new
    :py:mod:`~spikeinterface.benchmark` module.
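
A minimal illustration of the import change described in this note (only the module locations below are taken from the documentation; no other behaviour is implied):

.. code-block:: python

    # before version 0.102.0
    from spikeinterface.comparison import GroundTruthStudy

    # from version 0.102.0 on
    from spikeinterface.benchmark import SorterStudy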

In addition, the :py:mod:`~spikeinterface.comparison` module contains advanced benchmarking tools to evaluate
the effects of spike collisions on spike sorting results, and to construct hybrid recordings for comparison.

@@ -242,135 +246,6 @@ An **over-merged** unit has a relatively high agreement (>= 0.2 by default) for
cmp_gt_HS.get_redundant_units(redundant_score=0.2)
**Example: compare many sorters with a Ground Truth Study**

We also have a high level class to compare many sorters against ground truth:
:py:func:`~spikeinterface.comparison.GroundTruthStudy()`

A study is a systematic performance comparison of several ground truth recordings with several sorters or several cases
like the different parameter sets.

The study class proposes high-level tool functions to run many ground truth comparisons with many "cases"
on many recordings and then collect and aggregate results in an easy way.

The whole mechanism is based on an intrinsic organization into a "study_folder" with several subfolders:

* datasets: contains ground truth datasets
* sorters: contains the outputs of the sorters
* sortings: contains a light copy of all sortings
* metrics: contains metrics
* ...


.. code-block:: python

    import matplotlib.pyplot as plt
    import seaborn as sns

    import spikeinterface.extractors as se
    import spikeinterface.widgets as sw
    from spikeinterface.comparison import GroundTruthStudy

    # generate 2 simulated datasets (could also be MEArec files)
    rec0, gt_sorting0 = generate_ground_truth_recording(num_channels=4, durations=[30.], seed=42)
    rec1, gt_sorting1 = generate_ground_truth_recording(num_channels=4, durations=[30.], seed=91)

    datasets = {
        "toy0": (rec0, gt_sorting0),
        "toy1": (rec1, gt_sorting1),
    }

    # define some "cases": here we want to test tridesclous2 on 2 datasets and spykingcircus2 on one dataset,
    # so it is a two-level study (sorter_name, dataset);
    # this could be more complicated like (sorter_name, dataset, params)
    cases = {
        ("tdc2", "toy0"): {
            "label": "tridesclous2 on tetrode0",
            "dataset": "toy0",
            "run_sorter_params": {
                "sorter_name": "tridesclous2",
            },
        },
        ("tdc2", "toy1"): {
            "label": "tridesclous2 on tetrode1",
            "dataset": "toy1",
            "run_sorter_params": {
                "sorter_name": "tridesclous2",
            },
        },
        ("sc", "toy0"): {
            "label": "spykingcircus2 on tetrode0",
            "dataset": "toy0",
            "run_sorter_params": {
                "sorter_name": "spykingcircus",
                "docker_image": True
            },
        },
    }

    # this initializes a folder
    study = GroundTruthStudy.create(study_folder=study_folder, datasets=datasets, cases=cases,
                                    levels=["sorter_name", "dataset"])

    # run all cases in one function
    study.run_sorters()

    # Collect comparisons
    #
    # You can collect in one shot all results and run the
    # GroundTruthComparison on them.
    # So you can have fine access to all individual results.
    #
    # Note: use exhaustive_gt=True when you know exactly how many
    # units are in the ground truth (for synthetic datasets)

    # run all comparisons and loop over the results
    study.run_comparisons(exhaustive_gt=True)

    for key, comp in study.comparisons.items():
        print('*' * 10)
        print(key)
        # raw counting of tp/fp/...
        print(comp.count_score)
        # summary
        comp.print_summary()
        perf_unit = comp.get_performance(method='by_unit')
        perf_avg = comp.get_performance(method='pooled_with_average')
        # some plots
        m = comp.get_confusion_matrix()
        w_comp = sw.plot_agreement_matrix(sorting_comparison=comp)

    # Collect synthetic dataframes and display
    # As shown previously, the performance is returned as a pandas dataframe.
    # The spikeinterface.comparison.get_performance_by_unit() function
    # gathers all the outputs in the study folder and merges them into a single dataframe.
    # Same idea for spikeinterface.comparison.get_count_units()

    # this is a dataframe
    perfs = study.get_performance_by_unit()

    # this is a dataframe
    unit_counts = study.get_count_units()

    # we can also access run times
    run_times = study.get_run_times()
    print(run_times)

    # Easy plotting with seaborn
    fig1, ax1 = plt.subplots()
    sns.barplot(data=run_times, x='rec_name', y='run_time', hue='sorter_name', ax=ax1)
    ax1.set_title('Run times')

    ##############################################################################

    fig2, ax2 = plt.subplots()
    sns.swarmplot(data=perfs, x='sorter_name', y='recall', hue='rec_name', ax=ax2)
    ax2.set_title('Recall')
    ax2.set_ylim(-0.1, 1.1)

.. _symmetric:

2. Compare the output of two spike sorters (symmetric comparison)
@@ -537,35 +412,3 @@ sorting analyzers from day 1 (:code:`analyzer_day1`) to day 5 (:code:`analyzer_day5`)
# match all
m_tcmp = sc.compare_multiple_templates(waveform_list=analyzer_list,
name_list=["D1", "D2", "D3", "D4", "D5"])
Benchmark spike collisions
--------------------------

SpikeInterface also has a specific toolset to benchmark how well sorters are at recovering spikes in "collision".

We have four classes to handle collision-specific comparisons, and also to quantify the effects on correlogram
estimation:

* :py:class:`~spikeinterface.comparison.CollisionGTComparison`
* :py:class:`~spikeinterface.comparison.CorrelogramGTComparison`
* :py:class:`~spikeinterface.comparison.CollisionGTStudy`
* :py:class:`~spikeinterface.comparison.CorrelogramGTStudy`

For more details, check out the following paper:

`Samuel Garcia, Alessio P. Buccino and Pierre Yger. "How Do Spike Collisions Affect Spike Sorting Performance?" <https://doi.org/10.1523/ENEURO.0105-22.2022>`_


Hybrid recording
----------------

To benchmark spike sorting results, we need ground-truth spiking activity.
This can be generated with artificial simulations, e.g., using `MEArec <https://mearec.readthedocs.io/>`_, or
alternatively by generating so-called "hybrid" recordings.

The :py:mod:`~spikeinterface.comparison` module includes functions to generate such "hybrid" recordings:

* :py:func:`~spikeinterface.comparison.create_hybrid_units_recording`: add new units to an existing recording
* :py:func:`~spikeinterface.comparison.create_hybrid_spikes_recording`: add new spikes to existing units in a recording
