Commit

Merge branch 'main' into sorter-window-doc
zm711 authored Mar 8, 2024
2 parents 032f920 + fe937a1 commit 594cc77
Showing 9 changed files with 337 additions and 78 deletions.
1 change: 1 addition & 0 deletions doc/how_to/index.rst
@@ -8,3 +8,4 @@ How to guides
analyse_neuropixels
handle_drift
load_matlab_data
process_by_channel_group
188 changes: 188 additions & 0 deletions doc/how_to/process_by_channel_group.rst
@@ -0,0 +1,188 @@
Processing a Recording by Channel Group
=======================================

In this tutorial, we will walk through how to preprocess and sort a recording
separately for *channel groups*. A channel group is a subset of channels grouped by some
feature - for example a multi-shank Neuropixels recording in which the channels
are grouped by shank.

**Why preprocess by channel group?**

Certain preprocessing steps depend on the spatial arrangement of the channels.
For example, common average referencing (CAR) averages over channels (separately for each time point)
and subtracts the average. In such a scenario it may make sense to group channels so that
this averaging is performed only over spatially close channel groups.
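
A minimal NumPy sketch of the idea (not the SpikeInterface API; the array sizes
and group indices here are made up for illustration):

.. code-block:: python

    import numpy as np

    # hypothetical traces: 4 time points x 6 channels, in two 3-channel groups
    traces = np.random.randn(4, 6)
    groups = [[0, 1, 2], [3, 4, 5]]

    referenced = traces.copy()
    for group in groups:
        # subtract the per-timepoint average of this group's channels only,
        # so channels are never referenced against a spatially distant group
        referenced[:, group] -= traces[:, group].mean(axis=1, keepdims=True)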

**Why sort by channel group?**

When sorting, we may want to keep channel groups completely separate so we can
consider their signals in isolation. If recording from a long
silicon probe, we might want to sort different brain areas separately,
for example using a different sorter for the hippocampus, the thalamus, or the cerebellum.


Splitting a Recording by Channel Group
--------------------------------------

In this example, we create a 384-channel recording with 4 shanks. However, this could
be any recording in which the channels are grouped in some way, for example
a multi-tetrode recording with channel groups representing the channels on each individual tetrode.

First, let's import the parts of SpikeInterface we need into Python, and generate our toy recording:

.. code-block:: python

    import spikeinterface.extractors as se
    import spikeinterface.preprocessing as spre
    from spikeinterface import aggregate_channels
    from probeinterface import generate_tetrode, ProbeGroup
    import numpy as np

    # Create a toy 384-channel recording with 4 shanks (each shank contains 96 channels)
    recording, _ = se.toy_example(duration=[1.00], num_segments=1, num_channels=384)

    four_shank_groupings = np.repeat([0, 1, 2, 3], 96)
    recording.set_property("group", four_shank_groupings)

    print(recording.get_channel_groups())
    """
    array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
           0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
           0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
           0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
           0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
           2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
           2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
           2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
           2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
           2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
           3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
           3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
           3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
           3, 3, 3, 3, 3, 3, 3, 3, 3, 3])
    """

We can split a recording into multiple recordings, one for each channel group, with the :py:func:`~split_by` method.

.. code-block:: python

    split_recording_dict = recording.split_by("group")

Splitting a recording by channel group returns a dictionary containing separate recordings, one for each channel group:

.. code-block:: python

    print(split_recording_dict)
    """
    {0: ChannelSliceRecording: 96 channels - 30.0kHz - 1 segments - 30,000 samples - 1.00s - float32 dtype
     10.99 MiB, 1: ChannelSliceRecording: 96 channels - 30.0kHz - 1 segments - 30,000 samples - 1.00s - float32 dtype
     10.99 MiB, 2: ChannelSliceRecording: 96 channels - 30.0kHz - 1 segments - 30,000 samples - 1.00s - float32 dtype
     10.99 MiB, 3: ChannelSliceRecording: 96 channels - 30.0kHz - 1 segments - 30,000 samples - 1.00s - float32 dtype
     10.99 MiB}
    """

Preprocessing a Recording by Channel Group
------------------------------------------

The essence of preprocessing by channel group is to first split the recording
into separate recordings, perform the preprocessing steps, then aggregate
the channels back together.

In the example below, we loop over the split recordings, preprocessing each channel group
individually. At the end, we use the :py:func:`~aggregate_channels` function
to combine the separate channel group recordings back together.

.. code-block:: python

    preprocessed_recordings = []

    # loop over the recordings contained in the dictionary
    for chan_group_rec in split_recording_dict.values():

        # Apply the preprocessing steps to the channel group in isolation
        shifted_recording = spre.phase_shift(chan_group_rec)
        filtered_recording = spre.bandpass_filter(shifted_recording)
        referenced_recording = spre.common_reference(filtered_recording)
        preprocessed_recordings.append(referenced_recording)

    # Combine our preprocessed channel groups back together
    combined_preprocessed_recording = aggregate_channels(preprocessed_recordings)

Now, when this recording is used in sorting, plotting, or whenever
calling its :py:func:`~get_traces` method, the data will have been
preprocessed separately for each channel group (then aggregated
back together under the hood).
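
For example, pulling a chunk of traces from the aggregated recording triggers the
per-group preprocessing transparently (a sketch; the exact shape assumes the
384-channel toy recording above):

.. code-block:: python

    traces = combined_preprocessed_recording.get_traces(start_frame=0, end_frame=1000)
    print(traces.shape)  # (1000, 384)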

It is strongly recommended to use the above structure to preprocess by channel group.

.. note::

    The splitting and aggregation of channels for preprocessing is flexible.
    Under the hood, :py:func:`~aggregate_channels` keeps track of when a recording was split. When
    :py:func:`~get_traces` is called, the preprocessing is still performed per-group,
    even though the recording is now aggregated.

To ensure data is preprocessed by channel group, the preprocessing step must be
applied separately to each split channel group recording.
For example, the code below will NOT preprocess by channel group:

.. code-block:: python

    split_recording = recording.split_by("group")
    split_recording_as_list = list(split_recording.values())
    combined_recording = aggregate_channels(split_recording_as_list)

    # will NOT preprocess by channel group
    filtered_recording = spre.common_reference(combined_recording)

In general, it is not recommended to apply :py:func:`~aggregate_channels` more than once.
This will slow down :py:func:`~get_traces` calls and may result in unpredictable behaviour.


Sorting a Recording by Channel Group
------------------------------------

We can also sort a recording for each channel group separately. It is not necessary to preprocess
a recording by channel group in order to sort by channel group.

There are two ways to sort a recording by channel group. First, we can split the preprocessed
recording ourselves (or, if it was already split during preprocessing as above, skip the
:py:func:`~aggregate_channels` step and use the ``split_recording_dict`` directly).

**Option 1: Manual splitting**

In this example, similar to above, we loop over the preprocessed recordings grouped
by channel and run the sorter on each group separately. We store the
sorting objects in a dictionary for later use.

.. code-block:: python

    from spikeinterface.sorters import run_sorter

    split_preprocessed_recording = combined_preprocessed_recording.split_by("group")

    sortings = {}
    for group, sub_recording in split_preprocessed_recording.items():
        sorting = run_sorter(
            sorter_name='kilosort2',
            recording=sub_recording,
            output_folder=f"folder_KS2_group{group}"
        )
        sortings[group] = sorting

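If a single sorting object is more convenient downstream, the per-group sortings
can be combined with :py:func:`~aggregate_units` (a sketch, assuming all groups
were sorted at the same sampling frequency):

.. code-block:: python

    from spikeinterface import aggregate_units

    combined_sorting = aggregate_units(list(sortings.values()))
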
**Option 2: Automatic splitting**

Alternatively, SpikeInterface provides a convenience function to sort the recording by property:

.. code-block:: python

    from spikeinterface.sorters import run_sorter_by_property

    aggregate_sorting = run_sorter_by_property(
        sorter_name='kilosort2',
        recording=combined_preprocessed_recording,
        grouping_property='group',
        working_folder='working_path'
    )
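
The returned ``aggregate_sorting`` combines the per-group results; each unit keeps
a ``group`` property indicating the channel group it was sorted on, which can be
inspected with (a sketch):

.. code-block:: python

    print(aggregate_sorting.get_property("group"))
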
12 changes: 8 additions & 4 deletions installation_tips/README.md
@@ -9,17 +9,14 @@ This environment will install:
* spikeinterface-gui
* phy
* tridesclous
* spyking-circus (not on mac)
* herdingspikes (not on windows)

Kilosort, Ironclust and HDSort are MATLAB based and need to be installed from source.
Klusta no longer works with python3.8; you should create a similar environment with python3.6.

### Quick installation

Steps:

1. Download anaconda individual edition [here](https://www.anaconda.com/products/individual)
1. Download anaconda individual edition [here](https://www.anaconda.com/download)
2. Run the installer. Check the box “Add anaconda3 to my Path environment variable”. It makes life easier for beginners.
3. Download with right click + save the file corresponding to your OS, and put it in "Documents" folder
* [`full_spikeinterface_environment_windows.yml`](https://raw.githubusercontent.com/SpikeInterface/spikeinterface/master/installation_tips/full_spikeinterface_environment_windows.yml)
@@ -64,3 +61,10 @@ This script tests the following:
* running herdingspikes (not on windows)
* opening the spikeinterface-gui
* exporting to Phy
## Installing before release
Some tools in the spikeinterface ecosystem receive regular bug fixes (spikeinterface, spikeinterface-gui, probeinterface, python-neo, sortingview).
We make releases 2 to 4 times a year. Between releases, if you want to install from source, you can use the `full_spikeinterface_environment_rolling_updates.yml` file to create the environment. This will install the packages of the ecosystem from source.
This is a good way to test whether a patch fixes your issue.
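
For example (a sketch, assuming conda is available and the yml file has been downloaded into the current directory):

```bash
# create the environment from the rolling-updates file;
# the environment name is taken from the yml itself
conda env create -f full_spikeinterface_environment_rolling_updates.yml
```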
76 changes: 45 additions & 31 deletions installation_tips/check_your_install.py
@@ -17,41 +17,54 @@ def _create_recording():
rec.save(folder='./toy_example_recording')


def _run_one_sorter_and_exctract_wf(sorter_name):
def _run_one_sorter_and_analyzer(sorter_name):
job_kwargs = dict(n_jobs=-1, progress_bar=True, chunk_duration="1s")
import spikeinterface.full as si
rec = si.load_extractor('./toy_example_recording')
sorting = si.run_sorter(sorter_name, rec, output_folder=f'{sorter_name}_output', verbose=False)
si.extract_waveforms(rec, sorting, f'{sorter_name}_waveforms',
n_jobs=1, total_memory="10M", max_spikes_per_unit=500, return_scaled=False)

def run_tridesclous():
_run_one_sorter_and_exctract_wf('tridesclous')
recording = si.load_extractor('./toy_example_recording')
sorting = si.run_sorter(sorter_name, recording, output_folder=f'./sorter_with_{sorter_name}', verbose=False)

sorting_analyzer = si.create_sorting_analyzer(sorting, recording,
format="binary_folder", folder=f"./analyzer_with_{sorter_name}",
**job_kwargs)
sorting_analyzer.compute("random_spikes", method="uniform", max_spikes_per_unit=500)
sorting_analyzer.compute("waveforms", **job_kwargs)
sorting_analyzer.compute("templates")
sorting_analyzer.compute("noise_levels")
sorting_analyzer.compute("unit_locations", method="monopolar_triangulation")
sorting_analyzer.compute("correlograms", window_ms=100, bin_ms=5.)
sorting_analyzer.compute("principal_components", n_components=3, mode='by_channel_global', whiten=True, **job_kwargs)
sorting_analyzer.compute("quality_metrics", metric_names=["snr", "firing_rate"])


def run_spykingcircus():
_run_one_sorter_and_exctract_wf('spykingcircus')
def run_tridesclous():
_run_one_sorter_and_analyzer('tridesclous')

def run_tridesclous2():
_run_one_sorter_and_analyzer('tridesclous2')

def run_herdingspikes():
_run_one_sorter_and_exctract_wf('herdingspikes')


def open_sigui():
import spikeinterface.full as si
import spikeinterface_gui
app = spikeinterface_gui.mkQApp()
waveform_forlder = 'tridesclous_waveforms'
we = si.WaveformExtractor.load_from_folder(waveform_forlder)
pc = si.compute_principal_components(we, n_components=3, mode='by_channel_local', whiten=True, dtype='float32')
win = spikeinterface_gui.MainWindow(we)

sorter_name = "tridesclous2"
folder = f"./analyzer_with_{sorter_name}"
analyzer = si.load_sorting_analyzer(folder)

win = spikeinterface_gui.MainWindow(analyzer)
win.show()
app.exec_()

def export_to_phy():
import spikeinterface.full as si
we = si.WaveformExtractor.load_from_folder("tridesclous_waveforms")
sorter_name = "tridesclous2"
folder = f"./analyzer_with_{sorter_name}"
analyzer = si.load_sorting_analyzer(folder)

phy_folder = "./phy_example"
si.export_to_phy(we, output_folder=phy_folder, verbose=False)
si.export_to_phy(analyzer, output_folder=phy_folder, verbose=False)


def open_phy():
@@ -61,10 +74,12 @@ def open_phy():
def _clean():
# clean
folders = [
'toy_example_recording',
"tridesclous_output", "tridesclous_waveforms",
"spykingcircus_output", "spykingcircus_waveforms",
"phy_example"
"./toy_example_recording",
"./sorter_with_tridesclous",
"./analyzer_with_tridesclous",
"./sorter_with_tridesclous2",
"./analyzer_with_tridesclous2",
"./phy_example"
]
for folder in folders:
if Path(folder).exists():
@@ -86,25 +101,24 @@ def _clean():
('Import spikeinterface', check_import_si),
('Import spikeinterface.full', check_import_si_full),
('Run tridesclous', run_tridesclous),
('Run tridesclous2', run_tridesclous2),
]

# backwards logic because default is True for end-user
if args.ci:
steps.insert(3, ('Open spikeinterface-gui', open_sigui) )
steps.append(('Open spikeinterface-gui', open_sigui))

steps.append(('Export to phy', export_to_phy)),
# phy is removed from the env because it force a pip install PyQt5
# which break the conda env
# ('Open phy', open_phy),

if platform.system() == "Windows":
pass
# steps.insert(3, ('Run spykingcircus', run_spykingcircus))
elif platform.system() == "Darwin":
steps.insert(3, ('Run herdingspikes', run_herdingspikes))
else:
steps.insert(3, ('Run spykingcircus', run_spykingcircus))
steps.insert(4, ('Run herdingspikes', run_herdingspikes))
# if platform.system() == "Windows":
# pass
# elif platform.system() == "Darwin":
# pass
# else:
# pass

for label, func in steps:
try:
28 changes: 13 additions & 15 deletions installation_tips/full_spikeinterface_environment_linux_dandi.yml
@@ -3,37 +3,35 @@ channels:
- conda-forge
- defaults
dependencies:
- python=3.10
- pip>=21.0
- mamba
- numpy<1.22
- python=3.11
- pip
- numpy
- scipy
- joblib
- tqdm
- matplotlib
- h5py
- pandas
- xarray
- zarr
- scikit-learn
- hdbscan
- networkx
- pybind11
- loky
- hdbscan
- numba
- jupyter
- mpi4py
- cxx-compiler
- libxcb
- pyqt=5
- pyqtgraph
- htop
- ipywidgets
- ipympl
- htop
- cxx-compiler
- libxcb
- pip:
- ephyviewer
- neo>=0.12
- probeinterface>=0.2.11
- MEArec>=1.8
- spikeinterface[full, widgets]
- MEArec
- spikeinterface[full,widgets]
- spikeinterface-gui
- tridesclous>=1.6.8
- spyking-circus>=1.1.0
- tridesclous
# - phy==2.0b5
