diff --git a/doc/modules/qualitymetrics/isi_violations.rst b/doc/modules/qualitymetrics/isi_violations.rst index e30a2334d5..4527cdffe9 100644 --- a/doc/modules/qualitymetrics/isi_violations.rst +++ b/doc/modules/qualitymetrics/isi_violations.rst @@ -10,46 +10,48 @@ Calculation Neurons have a refractory period after a spiking event during which they cannot fire again. Inter-spike-interval (ISI) violations refers to the rate of refractory period violations (as described by [Hill]_). +We aim to calculate the contamination rate :math:`C`, measuring the ratio of isi violations in the spike-train of a unit. The calculation works under the assumption that the contaminant events happen randomly or come from another neuron that is not correlated with our unit. A correlation will lead to an overestimation of the contamination, whereas an anti-correlation will lead to an underestimation. -Different formulas have been developed over the years. +Different formulas have been developed, but all require: -Calculation from the [Hill]_ paper ----------------------------------- +- :math:`T` the duration of the recording in seconds. +- :math:`N` the number of spikes in the unit's spike train. +- :math:`t_r` the duration of the unit's refractory period in seconds. + +Calculation from the [UMS]_ package +----------------------------------- + +Originally implemented in the `rpv_contamination` calculation of the UltraMegaSort2000 package: ``_. -The following quantities are required: +In this method the number of spikes whose refractory period are violated, denoted :math:`n_v`, is used. -- :math:`ISI_t` : biological threshold for ISI violation. -- :math:`ISI_{min}`: minimum ISI threshold enforced by the data recording system used. -- :math:`ISI_s` : the array of ISI violations which are observed in the unit's spike train. -- :math:`\#`: denotes count. +Here, the refactory period :math:`t_r` is adjusted to take account of the data recording system's minimum possible refactory +period. E.g. if a system has a sampling rate of :math:`f \text{ Hz}`, the closest that two spikes from the same unit can possibly +be is :math:`1/f \, \text{s}`. Hence the refactory period :math:`t_r` is the expected biological threshold minus this minimum possible +threshold. -The threshold for ISI violations is the biological ISI threshold, :math:`ISI_t`, minus the minimum ISI threshold, :math:`ISI_{min}` enforced by the data recording system used. -The array of inter-spike-intervals observed in the unit's spike train, :math:`ISI_s`, is used to identify the count (:math:`\#`) of observed ISI's below this threshold. -For a recording with a duration of :math:`T_r` seconds, and a unit with :math:`N_s` spikes, the rate of ISI violations is: +The contamination rate is calculated to be .. math:: - \textrm{ISI violations} = \frac{ \#( ISI_s < ISI_t) T_r }{ 2 N_s^2 (ISI_t - ISI_{min}) } + C = \frac{ n_v T }{ 2 N^2 t_r } Calculation from the [Llobet]_ paper ------------------------------------ -The following quantities are required: - -- :math:`T` the duration of the recording. -- :math:`N` the number of spikes in the unit's spike train. -- :math:`t_r` the duration of the unit's refractory period. -- :math:`n_v` the number of violations of the refractory period. +In this method the number spikes which violate other spikes' refractory periods, denoted :math:`\tilde{n}_v`, is used. -The estimated contamination :math:`C` can be calculated with 2 extreme scenarios. In the first one, the contaminant spikes are completely random (or come from an infinite number of other neurons). In the second one, the contaminant spikes come from a single other neuron: +The estimated contamination :math:`C` is calculated in 2 extreme scenarios. In the first, the contaminant spikes +are completely random (or come from an infinite number of other neurons). In the second, the contaminant spikes +come from a single other neuron. In these scenarios, the contamination rate is .. math:: C = \frac{FP}{TP + FP} \approx \begin{cases} - 1 - \sqrt{1 - \frac{n_v T}{N^2 t_r}} \text{ for the case of random contamination} \\ - \frac{1}{2} \left( 1 - \sqrt{1 - \frac{2 n_v T}{N^2 t_r}} \right) \text{ for the case of 1 contaminant neuron} + 1 - \sqrt{1 - \frac{\tilde{n}_v T}{N^2 t_r}} \text{ for the case of random contamination} \\ + \frac{1}{2} \left( 1 - \sqrt{1 - \frac{2 \tilde{n}_v T}{N^2 t_r}} \right) \text{ for the case of 1 contaminant neuron} \end{cases} Where :math:`TP` is the number of true positives (detected spikes that come from the neuron) and :math:`FP` is the number of false positives (detected spikes that don't come from the neuron). @@ -58,7 +60,9 @@ Expectation and use ------------------- ISI violations identifies unit contamination - a high value indicates a highly contaminated unit. -Despite being a ratio, ISI violations can exceed 1 (or become a complex number in the [Llobet]_ formula). This is usually due to the contaminant events being correlated with our neuron, and their number is greater than a purely random spike train. +Despite being a ratio, the contamination can exceed 1 (or become a complex number in the [Llobet]_ formula). +This is usually due to the contaminant events being correlated with our neuron, and their number is +greater than a purely random spike train. Example code ------------ @@ -86,8 +90,8 @@ With SpikeInterface: References ---------- -Hill implementation (:code:`isi_violation`) -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ +UMS implementation (:code:`isi_violation`) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ .. autofunction:: spikeinterface.qualitymetrics.misc_metrics.compute_isi_violations @@ -160,5 +164,5 @@ Links to original implementations Literature ---------- -Introduced by [Hill]_ (2011). +Introduced in UltraMegaSort2000 [UMS]_ (2011). Also described by [Llobet]_ (2022) diff --git a/doc/references.rst b/doc/references.rst index ace51db951..5fbcbecb63 100644 --- a/doc/references.rst +++ b/doc/references.rst @@ -50,9 +50,11 @@ If you use the :code:`qualitymetrics` module, i.e. you use the :code:`analyzer.c or :code:`compute_quality_metrics()` methods, please include the citations for the :code:`metric_names` that were particularly important for your research: -- :code:`amplitude_cutoff` or :code:`isi_violation` [Hill]_ -- :code:`amplitude_median` or :code:`sliding_rp_violation` [IBL]_ +- :code:`amplitude_cutoff` [Hill]_ +- :code:`amplitude_median` [IBL]_ +- :code:`sliding_rp_violation` [IBL]_ - :code:`drift` [Siegle]_ +- :code:`isi_violation` [UMS]_ - :code:`rp_violation` [Llobet]_ - :code:`sd_ratio` [Pouzat]_ - :code:`snr` [Lemon]_ [Jackson]_ @@ -122,6 +124,8 @@ References .. [Siegle] `Survey of Spiking in the Mouse Visual System Reveals Functional Hierarchy. 2021. `_ +.. [UMS] `UltraMegaSort2000 - Spike sorting and quality metrics for extracellular spike data. 2011. `_ + .. [Varol] `Decentralized Motion Inference and Registration of Neuropixel Data. 2021. `_ .. [Windolf] `Robust Online Multiband Drift Estimation in Electrophysiology Data. 2022. `_ diff --git a/src/spikeinterface/qualitymetrics/misc_metrics.py b/src/spikeinterface/qualitymetrics/misc_metrics.py index a7d51ad330..433c04d248 100644 --- a/src/spikeinterface/qualitymetrics/misc_metrics.py +++ b/src/spikeinterface/qualitymetrics/misc_metrics.py @@ -241,7 +241,7 @@ def compute_isi_violations(sorting_analyzer, isi_threshold_ms=1.5, min_isi_ms=0, It computes several metrics related to isi violations: * isi_violations_ratio: the relative firing rate of the hypothetical neurons that are - generating the ISI violations. Described in [Hill]_. See Notes. + generating the ISI violations. See Notes. * isi_violation_count: number of ISI violations Parameters @@ -261,22 +261,29 @@ def compute_isi_violations(sorting_analyzer, isi_threshold_ms=1.5, min_isi_ms=0, Returns ------- isi_violations_ratio : dict - The isi violation ratio described in [Hill]_. + The isi violation ratio. isi_violation_count : dict Number of violations. Notes ----- - You can interpret an ISI violations ratio value of 0.5 as meaning that contaminating spikes are - occurring at roughly half the rate of "true" spikes for that unit. - In cases of highly contaminated units, the ISI violations ratio can sometimes be greater than 1. + The returned ISI violations ratio approximates the fraction of spikes in each + unit which are contaminted. The formulation assumes that the contaminating spikes + are statistically independent from the other spikes in that cluster. This + approximation can break down in reality, especially for highly contaminated units. + See the discussion in Section 4.1 of [Llobet]_ for more details. + + This method counts the number of spikes whose isi is violated. If there are three + spikes within `isi_threshold_ms`, the first and second are violated. Hence there are two + spikes which have been violated. This is is contrast to `compute_refrac_period_violations`, + which counts the number of violations. References ---------- - Based on metrics described in [Hill]_ + Based on metrics originally implemented in Ultra Mega Sort [UMS]_. - Originally written in Matlab by Nick Steinmetz (https://github.com/cortex-lab/sortingQuality) - and converted to Python by Daniel Denman. + This implementation is based on one of the original implementations written in Matlab by Nick Steinmetz + (https://github.com/cortex-lab/sortingQuality) and converted to Python by Daniel Denman. """ res = namedtuple("isi_violation", ["isi_violations_ratio", "isi_violations_count"]) @@ -324,7 +331,6 @@ def compute_refrac_period_violations( Calculate the number of refractory period violations. This is similar (but slightly different) to the ISI violations. - The key difference being that the violations are not only computed on consecutive spikes. This is required for some formulas (e.g. the ones from Llobet & Wyngaard 2022). @@ -351,6 +357,12 @@ def compute_refrac_period_violations( ----- Requires "numba" package + This method counts the number of violations which occur during the refactory period. + For example, if there are three spikes within `refractory_period_ms`, the second and third spikes + violate the first spike and the third spike violates the second spike. Hence there + are three violations. This is in contrast to `compute_isi_violations`, which + computes the number of spikes which have been violated. + References ---------- Based on metrics described in [Llobet]_