Commit

Merge branch 'main' into sorter-window-doc
zm711 authored Mar 8, 2024
2 parents 032f920 + fe937a1 commit 594cc77
Showing 9 changed files with 337 additions and 78 deletions.
1 change: 1 addition & 0 deletions doc/how_to/index.rst
@@ -8,3 +8,4 @@ How to guides
analyse_neuropixels
handle_drift
load_matlab_data
process_by_channel_group
188 changes: 188 additions & 0 deletions doc/how_to/process_by_channel_group.rst
@@ -0,0 +1,188 @@
Processing a Recording by Channel Group
=======================================

In this tutorial, we will walk through how to preprocess and sort a recording
separately for *channel groups*. A channel group is a subset of channels grouped by some
feature - for example a multi-shank Neuropixels recording in which the channels
are grouped by shank.

**Why preprocess by channel group?**

Certain preprocessing steps depend on the spatial arrangement of the channels.
For example, common average referencing (CAR) averages over channels (separately for each time point)
and subtracts the average. In such a scenario it may make sense to group channels so that
this averaging is performed only over spatially close channel groups.
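
A minimal NumPy sketch of the idea (not the SpikeInterface API; the array sizes
and group indices here are made up for illustration):

.. code-block:: python

    import numpy as np

    # hypothetical traces: 4 time points x 6 channels, in two 3-channel groups
    traces = np.random.randn(4, 6)
    groups = [[0, 1, 2], [3, 4, 5]]

    referenced = traces.copy()
    for group in groups:
        # subtract the per-timepoint average of this group's channels only,
        # so channels are never referenced against a spatially distant group
        referenced[:, group] -= traces[:, group].mean(axis=1, keepdims=True)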

**Why sort by channel group?**

When sorting, we may want to keep channel groups completely separate so we can
consider their signals in isolation. If recording from a long
silicon probe, we might want to sort different brain areas separately,
for example using a different sorter for the hippocampus, the thalamus, or the cerebellum.


Splitting a Recording by Channel Group
--------------------------------------

In this example, we create a 384-channel recording with 4 shanks. However, this could
be any recording in which the channels are grouped in some way, for example
a multi-tetrode recording with channel groups representing the channels on each individual tetrode.

First, let's import the parts of SpikeInterface we need into Python, and generate our toy recording:

.. code-block:: python

    import spikeinterface.extractors as se
    import spikeinterface.preprocessing as spre
    from spikeinterface import aggregate_channels
    from probeinterface import generate_tetrode, ProbeGroup
    import numpy as np

    # Create a toy 384-channel recording with 4 shanks (each shank contains 96 channels)
    recording, _ = se.toy_example(duration=[1.00], num_segments=1, num_channels=384)

    four_shank_groupings = np.repeat([0, 1, 2, 3], 96)
    recording.set_property("group", four_shank_groupings)

    print(recording.get_channel_groups())
    """
    array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
           0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
           0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
           0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
           0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
           2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
           2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
           2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
           2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
           2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
           3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
           3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
           3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
           3, 3, 3, 3, 3, 3, 3, 3, 3, 3])
    """

We can split a recording into multiple recordings, one for each channel group, with the :py:func:`~split_by` method.

.. code-block:: python

    split_recording_dict = recording.split_by("group")

Splitting a recording by channel group returns a dictionary containing separate recordings, one for each channel group:

.. code-block:: python

    print(split_recording_dict)
    """
    {0: ChannelSliceRecording: 96 channels - 30.0kHz - 1 segments - 30,000 samples - 1.00s - float32 dtype
     10.99 MiB, 1: ChannelSliceRecording: 96 channels - 30.0kHz - 1 segments - 30,000 samples - 1.00s - float32 dtype
     10.99 MiB, 2: ChannelSliceRecording: 96 channels - 30.0kHz - 1 segments - 30,000 samples - 1.00s - float32 dtype
     10.99 MiB, 3: ChannelSliceRecording: 96 channels - 30.0kHz - 1 segments - 30,000 samples - 1.00s - float32 dtype
     10.99 MiB}
    """

Preprocessing a Recording by Channel Group
------------------------------------------

The essence of preprocessing by channel group is to first split the recording
into separate recordings, perform the preprocessing steps, then aggregate
the channels back together.

In the example below, we loop over the split recordings, preprocessing each channel group
individually. At the end, we use the :py:func:`~aggregate_channels` function
to combine the separate channel group recordings back together.

.. code-block:: python

    preprocessed_recordings = []

    # loop over the recordings contained in the dictionary
    for chan_group_rec in split_recording_dict.values():

        # Apply the preprocessing steps to the channel group in isolation
        shifted_recording = spre.phase_shift(chan_group_rec)
        filtered_recording = spre.bandpass_filter(shifted_recording)
        referenced_recording = spre.common_reference(filtered_recording)
        preprocessed_recordings.append(referenced_recording)

    # Combine our preprocessed channel groups back together
    combined_preprocessed_recording = aggregate_channels(preprocessed_recordings)

Now, when this recording is used in sorting, plotting, or whenever
calling its :py:func:`~get_traces` method, the data will have been
preprocessed separately for each channel group (then aggregated
back together under the hood).
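
For example, pulling a chunk of traces from the aggregated recording triggers the
per-group preprocessing transparently (a sketch; the exact shape assumes the
384-channel toy recording above):

.. code-block:: python

    traces = combined_preprocessed_recording.get_traces(start_frame=0, end_frame=1000)
    print(traces.shape)  # (1000, 384)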

It is strongly recommended to use the above structure to preprocess by channel group.

.. note::

    The splitting and aggregation of channels for preprocessing is flexible.
    Under the hood, :py:func:`~aggregate_channels` keeps track of when a recording was split. When
    :py:func:`~get_traces` is called, the preprocessing is still performed per-group,
    even though the recording is now aggregated.

To ensure data is preprocessed by channel group, the preprocessing step must be
applied separately to each split channel group recording.
For example, the code below will NOT preprocess by channel group:

.. code-block:: python

    split_recording = recording.split_by("group")
    split_recording_as_list = list(split_recording.values())
    combined_recording = aggregate_channels(split_recording_as_list)

    # will NOT preprocess by channel group
    filtered_recording = spre.common_reference(combined_recording)

In general, it is not recommended to apply :py:func:`~aggregate_channels` more than once.
This will slow down :py:func:`~get_traces` calls and may result in unpredictable behaviour.


Sorting a Recording by Channel Group
------------------------------------

We can also sort a recording for each channel group separately. It is not necessary to preprocess
a recording by channel group in order to sort by channel group.

There are two ways to sort a recording by channel group. First, we can split the preprocessed
recording ourselves (or, if it was already split during preprocessing as above, skip the
:py:func:`~aggregate_channels` step and use the ``split_recording_dict`` directly).

**Option 1: Manual splitting**

In this example, similar to above, we loop over the preprocessed recordings grouped
by channel and run the sorter on each group separately. We store the
sorting objects in a dictionary for later use.

.. code-block:: python

    from spikeinterface.sorters import run_sorter

    split_preprocessed_recording = combined_preprocessed_recording.split_by("group")

    sortings = {}
    for group, sub_recording in split_preprocessed_recording.items():
        sorting = run_sorter(
            sorter_name='kilosort2',
            recording=sub_recording,
            output_folder=f"folder_KS2_group{group}"
        )
        sortings[group] = sorting

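If a single sorting object is more convenient downstream, the per-group sortings
can be combined with :py:func:`~aggregate_units` (a sketch, assuming all groups
were sorted at the same sampling frequency):

.. code-block:: python

    from spikeinterface import aggregate_units

    combined_sorting = aggregate_units(list(sortings.values()))
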
**Option 2: Automatic splitting**

Alternatively, SpikeInterface provides a convenience function to sort the recording by property:

.. code-block:: python

    from spikeinterface.sorters import run_sorter_by_property

    aggregate_sorting = run_sorter_by_property(
        sorter_name='kilosort2',
        recording=combined_preprocessed_recording,
        grouping_property='group',
        working_folder='working_path'
    )
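
The returned ``aggregate_sorting`` combines the per-group results; each unit keeps
a ``group`` property indicating the channel group it was sorted on, which can be
inspected with (a sketch):

.. code-block:: python

    print(aggregate_sorting.get_property("group"))
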
12 changes: 8 additions & 4 deletions installation_tips/README.md
@@ -9,17 +9,14 @@ This environment will install:
* spikeinterface-gui
* phy
* tridesclous
* spyking-circus (not on mac)
* herdingspikes (not on windows)

Kilosort, Ironclust and HDSort are MATLAB based and need to be installed from source.
Klusta no longer works with python3.8; you should create a similar environment with python3.6.

### Quick installation

Steps:

1. Download anaconda individual edition [here](https://www.anaconda.com/products/individual)
1. Download anaconda individual edition [here](https://www.anaconda.com/download)
2. Run the installer. Check the box “Add anaconda3 to my Path environment variable”. It makes life easier for beginners.
3. Download with right click + save the file corresponding to your OS, and put it in "Documents" folder
* [`full_spikeinterface_environment_windows.yml`](https://raw.githubusercontent.com/SpikeInterface/spikeinterface/master/installation_tips/full_spikeinterface_environment_windows.yml)
@@ -64,3 +61,10 @@ This script tests the following:
* running herdingspikes (not on windows)
* opening the spikeinterface-gui
* exporting to Phy
## Installing before release
Some tools in the spikeinterface ecosystem receive regular bug fixes (spikeinterface, spikeinterface-gui, probeinterface, python-neo, sortingview).
We make releases 2 to 4 times a year. Between releases, if you want to install from source, you can use the `full_spikeinterface_environment_rolling_updates.yml` file to create the environment. This will install the packages of the ecosystem from source.
This is a good way to test whether a patch fixes your issue.
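
For example (a sketch, assuming conda is available and the yml file has been downloaded into the current directory):

```bash
# create the environment from the rolling-updates file;
# the environment name is taken from the yml itself
conda env create -f full_spikeinterface_environment_rolling_updates.yml
```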
76 changes: 45 additions & 31 deletions installation_tips/check_your_install.py
@@ -17,41 +17,54 @@ def _create_recording():
rec.save(folder='./toy_example_recording')


def _run_one_sorter_and_exctract_wf(sorter_name):
def _run_one_sorter_and_analyzer(sorter_name):
job_kwargs = dict(n_jobs=-1, progress_bar=True, chunk_duration="1s")
import spikeinterface.full as si
rec = si.load_extractor('./toy_example_recording')
sorting = si.run_sorter(sorter_name, rec, output_folder=f'{sorter_name}_output', verbose=False)
si.extract_waveforms(rec, sorting, f'{sorter_name}_waveforms',
n_jobs=1, total_memory="10M", max_spikes_per_unit=500, return_scaled=False)

def run_tridesclous():
_run_one_sorter_and_exctract_wf('tridesclous')
recording = si.load_extractor('./toy_example_recording')
sorting = si.run_sorter(sorter_name, recording, output_folder=f'./sorter_with_{sorter_name}', verbose=False)

sorting_analyzer = si.create_sorting_analyzer(sorting, recording,
format="binary_folder", folder=f"./analyzer_with_{sorter_name}",
**job_kwargs)
sorting_analyzer.compute("random_spikes", method="uniform", max_spikes_per_unit=500)
sorting_analyzer.compute("waveforms", **job_kwargs)
sorting_analyzer.compute("templates")
sorting_analyzer.compute("noise_levels")
sorting_analyzer.compute("unit_locations", method="monopolar_triangulation")
sorting_analyzer.compute("correlograms", window_ms=100, bin_ms=5.)
sorting_analyzer.compute("principal_components", n_components=3, mode='by_channel_global', whiten=True, **job_kwargs)
sorting_analyzer.compute("quality_metrics", metric_names=["snr", "firing_rate"])


def run_spykingcircus():
_run_one_sorter_and_exctract_wf('spykingcircus')
def run_tridesclous():
_run_one_sorter_and_analyzer('tridesclous')

def run_tridesclous2():
_run_one_sorter_and_analyzer('tridesclous2')

def run_herdingspikes():
_run_one_sorter_and_exctract_wf('herdingspikes')


def open_sigui():
import spikeinterface.full as si
import spikeinterface_gui
app = spikeinterface_gui.mkQApp()
waveform_forlder = 'tridesclous_waveforms'
we = si.WaveformExtractor.load_from_folder(waveform_forlder)
pc = si.compute_principal_components(we, n_components=3, mode='by_channel_local', whiten=True, dtype='float32')
win = spikeinterface_gui.MainWindow(we)

sorter_name = "tridesclous2"
folder = f"./analyzer_with_{sorter_name}"
analyzer = si.load_sorting_analyzer(folder)

win = spikeinterface_gui.MainWindow(analyzer)
win.show()
app.exec_()

def export_to_phy():
import spikeinterface.full as si
we = si.WaveformExtractor.load_from_folder("tridesclous_waveforms")
sorter_name = "tridesclous2"
folder = f"./analyzer_with_{sorter_name}"
analyzer = si.load_sorting_analyzer(folder)

phy_folder = "./phy_example"
si.export_to_phy(we, output_folder=phy_folder, verbose=False)
si.export_to_phy(analyzer, output_folder=phy_folder, verbose=False)


def open_phy():
@@ -61,10 +74,12 @@ def open_phy():
def _clean():
# clean
folders = [
'toy_example_recording',
"tridesclous_output", "tridesclous_waveforms",
"spykingcircus_output", "spykingcircus_waveforms",
"phy_example"
"./toy_example_recording",
"./sorter_with_tridesclous",
"./analyzer_with_tridesclous",
"./sorter_with_tridesclous2",
"./analyzer_with_tridesclous2",
"./phy_example"
]
for folder in folders:
if Path(folder).exists():
@@ -86,25 +101,24 @@ def _clean():
('Import spikeinterface', check_import_si),
('Import spikeinterface.full', check_import_si_full),
('Run tridesclous', run_tridesclous),
('Run tridesclous2', run_tridesclous2),
]

# backwards logic because default is True for end-user
if args.ci:
steps.insert(3, ('Open spikeinterface-gui', open_sigui) )
steps.append(('Open spikeinterface-gui', open_sigui))

steps.append(('Export to phy', export_to_phy)),
# phy is removed from the env because it force a pip install PyQt5
# which break the conda env
# ('Open phy', open_phy),

if platform.system() == "Windows":
pass
# steps.insert(3, ('Run spykingcircus', run_spykingcircus))
elif platform.system() == "Darwin":
steps.insert(3, ('Run herdingspikes', run_herdingspikes))
else:
steps.insert(3, ('Run spykingcircus', run_spykingcircus))
steps.insert(4, ('Run herdingspikes', run_herdingspikes))
# if platform.system() == "Windows":
# pass
# elif platform.system() == "Darwin":
# pass
# else:
# pass

for label, func in steps:
try:
28 changes: 13 additions & 15 deletions installation_tips/full_spikeinterface_environment_linux_dandi.yml
@@ -3,37 +3,35 @@ channels:
- conda-forge
- defaults
dependencies:
- python=3.10
- pip>=21.0
- mamba
- numpy<1.22
- python=3.11
- pip
- numpy
- scipy
- joblib
- tqdm
- matplotlib
- h5py
- pandas
- xarray
- zarr
- scikit-learn
- hdbscan
- networkx
- pybind11
- loky
- hdbscan
- numba
- jupyter
- mpi4py
- cxx-compiler
- libxcb
- pyqt=5
- pyqtgraph
- htop
- ipywidgets
- ipympl
- htop
- cxx-compiler
- libxcb
- pip:
- ephyviewer
- neo>=0.12
- probeinterface>=0.2.11
- MEArec>=1.8
- spikeinterface[full, widgets]
- MEArec
- spikeinterface[full,widgets]
- spikeinterface-gui
- tridesclous>=1.6.8
- spyking-circus>=1.1.0
- tridesclous
# - phy==2.0b5
