Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify compute_isi_violation docs and add UltraMegaSort2000 citation #3070

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
54 changes: 29 additions & 25 deletions doc/modules/qualitymetrics/isi_violations.rst
Original file line number Diff line number Diff line change
Expand Up @@ -10,46 +10,48 @@ Calculation
Neurons have a refractory period after a spiking event during which they cannot fire again.
Inter-spike-interval (ISI) violations refers to the rate of refractory period violations (as described by [Hill]_).

We aim to calculate the contamination rate :math:`C`, measuring the ratio of isi violations in the spike-train of a unit.
The calculation works under the assumption that the contaminant events happen randomly or come from another neuron that is not correlated with our unit.
A correlation will lead to an overestimation of the contamination, whereas an anti-correlation will lead to an underestimation.

Different formulas have been developed over the years.
Different formulas have been developed, but all require:

Calculation from the [Hill]_ paper
----------------------------------
- :math:`T` the duration of the recording in seconds.
- :math:`N` the number of spikes in the unit's spike train.
- :math:`t_r` the duration of the unit's refractory period in seconds.

Calculation from the [UMS]_ package
chrishalcrow marked this conversation as resolved.
Show resolved Hide resolved
-----------------------------------

Originally implemented in the `rpv_contamination` calculation of the UltraMegaSort2000 package: `<https://github.com/danamics/UMS2K/blob/master/quality_measures/rpv_contamination.m>`_.

The following quantities are required:
In this method the number of spikes whose refractory period are violated, denoted :math:`n_v`, is used.

- :math:`ISI_t` : biological threshold for ISI violation.
- :math:`ISI_{min}`: minimum ISI threshold enforced by the data recording system used.
- :math:`ISI_s` : the array of ISI violations which are observed in the unit's spike train.
- :math:`\#`: denotes count.
Here, the refactory period :math:`t_r` is adjusted to take account of the data recording system's minimum possible refactory
period. E.g. if a system has a sampling rate of :math:`f \text{ Hz}`, the closest that two spikes from the same unit can possibly
be is :math:`1/f \, \text{s}`. Hence the refactory period :math:`t_r` is the expected biological threshold minus this minimum possible
threshold.

The threshold for ISI violations is the biological ISI threshold, :math:`ISI_t`, minus the minimum ISI threshold, :math:`ISI_{min}` enforced by the data recording system used.
The array of inter-spike-intervals observed in the unit's spike train, :math:`ISI_s`, is used to identify the count (:math:`\#`) of observed ISI's below this threshold.
For a recording with a duration of :math:`T_r` seconds, and a unit with :math:`N_s` spikes, the rate of ISI violations is:
The contamination rate is calculated to be

.. math::

\textrm{ISI violations} = \frac{ \#( ISI_s < ISI_t) T_r }{ 2 N_s^2 (ISI_t - ISI_{min}) }
C = \frac{ n_v T }{ 2 N^2 t_r }

Calculation from the [Llobet]_ paper
------------------------------------

The following quantities are required:

- :math:`T` the duration of the recording.
- :math:`N` the number of spikes in the unit's spike train.
- :math:`t_r` the duration of the unit's refractory period.
- :math:`n_v` the number of violations of the refractory period.
In this method the number spikes which violate other spikes' refractory periods, denoted :math:`\tilde{n}_v`, is used.

The estimated contamination :math:`C` can be calculated with 2 extreme scenarios. In the first one, the contaminant spikes are completely random (or come from an infinite number of other neurons). In the second one, the contaminant spikes come from a single other neuron:
The estimated contamination :math:`C` is calculated in 2 extreme scenarios. In the first, the contaminant spikes
are completely random (or come from an infinite number of other neurons). In the second, the contaminant spikes
come from a single other neuron. In these scenarios, the contamination rate is

.. math::

C = \frac{FP}{TP + FP} \approx \begin{cases}
1 - \sqrt{1 - \frac{n_v T}{N^2 t_r}} \text{ for the case of random contamination} \\
\frac{1}{2} \left( 1 - \sqrt{1 - \frac{2 n_v T}{N^2 t_r}} \right) \text{ for the case of 1 contaminant neuron}
1 - \sqrt{1 - \frac{\tilde{n}_v T}{N^2 t_r}} \text{ for the case of random contamination} \\
\frac{1}{2} \left( 1 - \sqrt{1 - \frac{2 \tilde{n}_v T}{N^2 t_r}} \right) \text{ for the case of 1 contaminant neuron}
\end{cases}

Where :math:`TP` is the number of true positives (detected spikes that come from the neuron) and :math:`FP` is the number of false positives (detected spikes that don't come from the neuron).
Expand All @@ -58,7 +60,9 @@ Expectation and use
-------------------

ISI violations identifies unit contamination - a high value indicates a highly contaminated unit.
Despite being a ratio, ISI violations can exceed 1 (or become a complex number in the [Llobet]_ formula). This is usually due to the contaminant events being correlated with our neuron, and their number is greater than a purely random spike train.
Despite being a ratio, the contamination can exceed 1 (or become a complex number in the [Llobet]_ formula).
This is usually due to the contaminant events being correlated with our neuron, and their number is
greater than a purely random spike train.

Example code
------------
Expand Down Expand Up @@ -86,8 +90,8 @@ With SpikeInterface:
References
----------

Hill implementation (:code:`isi_violation`)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UMS implementation (:code:`isi_violation`)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. autofunction:: spikeinterface.qualitymetrics.misc_metrics.compute_isi_violations

Expand Down Expand Up @@ -160,5 +164,5 @@ Links to original implementations
Literature
----------

Introduced by [Hill]_ (2011).
Introduced in UltraMegaSort2000 [UMS]_ (2011).
Also described by [Llobet]_ (2022)
8 changes: 6 additions & 2 deletions doc/references.rst
Original file line number Diff line number Diff line change
Expand Up @@ -50,9 +50,11 @@ If you use the :code:`qualitymetrics` module, i.e. you use the :code:`analyzer.c
or :code:`compute_quality_metrics()` methods, please include the citations for the :code:`metric_names` that were particularly
important for your research:

- :code:`amplitude_cutoff` or :code:`isi_violation` [Hill]_
- :code:`amplitude_median` or :code:`sliding_rp_violation` [IBL]_
- :code:`amplitude_cutoff` [Hill]_
- :code:`amplitude_median` [IBL]_
- :code:`sliding_rp_violation` [IBL]_
- :code:`drift` [Siegle]_
- :code:`isi_violation` [UMS]_
- :code:`rp_violation` [Llobet]_
- :code:`sd_ratio` [Pouzat]_
- :code:`snr` [Lemon]_ [Jackson]_
Expand Down Expand Up @@ -122,6 +124,8 @@ References

.. [Siegle] `Survey of Spiking in the Mouse Visual System Reveals Functional Hierarchy. 2021. <https://pubmed.ncbi.nlm.nih.gov/33473216/>`_

.. [UMS] `UltraMegaSort2000 - Spike sorting and quality metrics for extracellular spike data. 2011. <https://github.com/danamics/UMS2K>`_

.. [Varol] `Decentralized Motion Inference and Registration of Neuropixel Data. 2021. <https://ieeexplore.ieee.org/document/9414145>`_

.. [Windolf] `Robust Online Multiband Drift Estimation in Electrophysiology Data. 2022. <https://www.biorxiv.org/content/10.1101/2022.12.04.519043v2>`_
Expand Down
30 changes: 21 additions & 9 deletions src/spikeinterface/qualitymetrics/misc_metrics.py
Original file line number Diff line number Diff line change
Expand Up @@ -241,7 +241,7 @@ def compute_isi_violations(sorting_analyzer, isi_threshold_ms=1.5, min_isi_ms=0,

It computes several metrics related to isi violations:
* isi_violations_ratio: the relative firing rate of the hypothetical neurons that are
generating the ISI violations. Described in [Hill]_. See Notes.
generating the ISI violations. See Notes.
* isi_violation_count: number of ISI violations

Parameters
Expand All @@ -261,22 +261,29 @@ def compute_isi_violations(sorting_analyzer, isi_threshold_ms=1.5, min_isi_ms=0,
Returns
-------
isi_violations_ratio : dict
The isi violation ratio described in [Hill]_.
The isi violation ratio.
isi_violation_count : dict
Number of violations.

Notes
-----
You can interpret an ISI violations ratio value of 0.5 as meaning that contaminating spikes are
occurring at roughly half the rate of "true" spikes for that unit.
In cases of highly contaminated units, the ISI violations ratio can sometimes be greater than 1.
The returned ISI violations ratio approximates the fraction of spikes in each
unit which are contaminted. The formulation assumes that the contaminating spikes
are statistically independent from the other spikes in that cluster. This
approximation can break down in reality, especially for highly contaminated units.
See the discussion in Section 4.1 of [Llobet]_ for more details.

This method counts the number of spikes whose isi is violated. If there are three
spikes within `isi_threshold_ms`, the first and second are violated. Hence there are two
spikes which have been violated. This is is contrast to `compute_refrac_period_violations`,
which counts the number of violations.

References
----------
Based on metrics described in [Hill]_
Based on metrics originally implemented in Ultra Mega Sort [UMS]_.

Originally written in Matlab by Nick Steinmetz (https://github.com/cortex-lab/sortingQuality)
and converted to Python by Daniel Denman.
This implementation is based on one of the original implementations written in Matlab by Nick Steinmetz
(https://github.com/cortex-lab/sortingQuality) and converted to Python by Daniel Denman.
"""
res = namedtuple("isi_violation", ["isi_violations_ratio", "isi_violations_count"])

Expand Down Expand Up @@ -324,7 +331,6 @@ def compute_refrac_period_violations(
Calculate the number of refractory period violations.

This is similar (but slightly different) to the ISI violations.
The key difference being that the violations are not only computed on consecutive spikes.

This is required for some formulas (e.g. the ones from Llobet & Wyngaard 2022).

Expand All @@ -351,6 +357,12 @@ def compute_refrac_period_violations(
-----
Requires "numba" package

This method counts the number of violations which occur during the refactory period.
For example, if there are three spikes within `refractory_period_ms`, the second and third spikes
violate the first spike and the third spike violates the second spike. Hence there
are three violations. This is in contrast to `compute_isi_violations`, which
computes the number of spikes which have been violated.
chrishalcrow marked this conversation as resolved.
Show resolved Hide resolved

References
----------
Based on metrics described in [Llobet]_
Expand Down