Add import specific documentation #2168

Merged
merged 12 commits into from
Nov 22, 2023
85 changes: 85 additions & 0 deletions doc/import.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,85 @@
Importing SpikeInterface
========================

SpikeInterface allows for the generation of powerful and reproducible spike sorting pipelines.
Flexibility is built into the package starting from import to maximize the productivity of
the developer and the scientist. Thus there are three ways that SpikeInterface and its components
can be imported:


Importing by Module
-------------------

Since each spike sorting pipeline involves a series of often repeated steps, many of the developers
working on SpikeInterface recommend importing in a module-by-module fashion. This will allow you to
keep track of your processing steps (preprocessing, postprocessing, quality metrics, etc.). This can
be accomplished by:

.. code-block:: python

    import spikeinterface as si
Collaborator Author

Based on the speed metrics @h-mayorquin supplied why don't we suggest

import spikeinterface.core as si

It has a slight speed advantage directly importing rather than doing without core? And it follows the syntax for all the other submodules?

Member

I think this is perfect!

Collaborator Author

Done!

zm711 marked this conversation as resolved.

to import the :code:`core` module followed by:

.. code-block:: python

    import spikeinterface.extractors as se
    import spikeinterface.preprocessing as spre
    import spikeinterface.sorters as ss
    import spikeinterface.postprocessing as spost
    import spikeinterface.qualitymetrics as sqm
    import spikeinterface.exporters as sexp
    import spikeinterface.comparison as scmp
    import spikeinterface.curation as scur
    import spikeinterface.sortingcomponents as sc
    import spikeinterface.widgets as sw

to import any of the other modules you wish to use.

The benefit of this approach is that it is lighter than importing the whole library as a flat module and allows
you to choose which of the modules you actually want to use. It also reminds you which step of the pipeline each
submodule is meant to be used for. If you don't plan to export the results out of SpikeInterface, then you
don't have to :code:`import spikeinterface.exporters`. Additionally, the documentation of SpikeInterface is set up
in a modular fashion, so if you have a problem with the submodule :code:`spikeinterface.curation`, you will know
to go to the :code:`curation` section of this documentation. The disadvantage of this approach is that you have
more aliases to keep track of.
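
One way to check how light a given import style really is would be to inspect :code:`sys.modules` after importing. The sketch below is only an illustration and is not part of SpikeInterface: :code:`loaded_submodules` is a hypothetical helper, and the standard-library :code:`json` package stands in for :code:`spikeinterface` so that the snippet runs without installing anything.

```python
import sys
import json  # stand-in package; swap in `import spikeinterface.core as si` if installed


def loaded_submodules(package_name):
    """Return the sorted names of already-imported submodules of a package."""
    prefix = package_name + "."
    # Copy the keys first so the scan is safe if an import happens meanwhile.
    return sorted(name for name in list(sys.modules) if name.startswith(prefix))


print(loaded_submodules("json"))
```

Calling :code:`loaded_submodules("spikeinterface")` in a fresh interpreter after each import style should show which submodules that style pulled in; the exact list will depend on the installed version.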


Flat Import
Collaborator

Did anyone favor this option in the issue? I kind of forgot it. Could we move this to the end? I am not sure this is what numpy or pandas do, btw. Is that the case? NumPy at least also has np.linalg.

Collaborator Author

@zm711, Nov 9, 2023

That's fair. It was perhaps an oversimplification to explain a point.

numpy does hive some things off like numpy.testing, numpy.linalg, intentionally, but for most beginning users I think they really start with the things open with the import of numpy as np.

Pandas I don't know. I just picked it as a common library people would likely know. I've never had to import a submodule from Pandas. I think for me I would love to give an example that people latch on to. Is there a common package that you think is flat enough to count for this @h-mayorquin ?

Collaborator Author

@samuelgarcia , you like this sometimes right?

Collaborator

Yes, I mean, I think that your mind is in the right place. For the user, what matters is that "all the functions that I could use or want will be available without nesting", so si.function1, si.function2, si.function3 will work. I think your example is fine from that perspective, as it conveys the right idea with a familiar example.

Mine was more a technical caveat by saying that I don't think numpy or pandas are putting ALL their stuff the way that we are doing it:

https://github.com/SpikeInterface/spikeinterface/blob/329197618a9b48ef876d1e8b8e79f07f4abf5e49/src/spikeinterface/full.py#L26

Collaborator Author

Very true! They keep more functions private. I can change the wording slightly so more technical users aren't put off, but keep the simplicity of the example. Thanks :)

Member

I do not understand the question. Do I like this flat import? Yes, I use it almost always.

For me, the comparison to numpy/pandas is not the best, I think.
I would rather compare it to scipy, which has many modules that could be imported flat.

Collaborator Author

@samuelgarcia, I'm off GH for this week, but just to briefly comment on this one. I think this is a "simplification" for tutorial purposes vs being as exact as possible. (Pandas I think is closer than NumPy, though.) My problem with using scipy here is that above we say you can import modules, so if we then talk about scipy modules here it will be mixing concepts (even if their flatness is closer to the .full flatness). Maybe it is best just to remove the reference to other packages for flat, so as not to muddy the water, because I don't think we will find a perfect example.

Honestly for me I would expect

import spikeinterface as si # import the full package flat
import spikeinterface.core as si # import only core submodule

I think I understand why you did it your way for the purpose of dependency management (keep core light, but correct me if there was another design choice there), but it makes it a little trickier to find parallels with other packages.

tl;dr:
Should I just delete the outside examples?

I'll fix the numba etc this weekend when I'm back on.

Collaborator

@samuelgarcia
The question was whether anybody was using it. You have already answered it by saying you do. Thanks.

-----------

A second option is to import the SpikeInterface package in :code:`full` mode. This is similar to
packages like NumPy (:code:`np`) or Pandas (:code:`pd`), which offer the majority of
their functionality under a single alias, with the option to import additional functionality separately.
zm711 marked this conversation as resolved.
To accomplish this, use:


.. code-block:: python

    import spikeinterface.full as si


This import statement will import all of the SpikeInterface modules as one flattened module.
Note that importing :code:`spikeinterface.full` will take a few extra seconds, because some modules use
zm711 marked this conversation as resolved.
just-in-time :code:`numba` compilation performed at the time of import.
We recommend this approach for advanced (or lazy) users, since it requires a deeper knowledge of the API. The advantage
Collaborator

Is anyone of us using it?
I would recommend the functionality that we are using ourselves as that will be the one that is better tested.

Collaborator Author

In the old docs it went
Import core
Import full
no mention of by function.

So I just kept the same general flow and added by function at the bottom. Happy to move it unless Sam strongly wants it above. He added edits to the full paragraph, so he definitely took a peek.

Collaborator

All right. Makes sense to not make a lot of changes on the same PR. Thanks for explaining yourself.

Member

Yes. When I teach spikeinterface with tutorials, I very often use spikeinterface.full so participants do not have to focus on the submodule concept and can take spikeinterface as a whole.

Member

When I teach/show tutorials I use the by module approach

zm711 marked this conversation as resolved.
being that users can access all functions using one alias without needing to memorize all the aliases.
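
The extra import time mentioned in this section can be measured directly. The following is a minimal sketch, not an official benchmark: :code:`time_import` is a hypothetical helper built on the standard-library :code:`importlib` and :code:`time` modules, and :code:`json` is used as a stand-in module so the snippet runs without SpikeInterface installed. Because Python caches imported modules, a module that is already loaded returns almost immediately, so each measurement is only meaningful in a fresh interpreter.

```python
import importlib
import time


def time_import(module_name):
    """Return the seconds taken to import a module, or None if it is missing."""
    start = time.perf_counter()
    try:
        importlib.import_module(module_name)
    except ImportError:
        return None
    return time.perf_counter() - start


# Standard-library module used as a stand-in so the sketch runs anywhere.
# With SpikeInterface installed, try "spikeinterface.core" and
# "spikeinterface.full" in separate fresh interpreters.
elapsed = time_import("json")
print(f"json imported in {elapsed:.4f} s")
```

Running the helper once per interpreter session for each import style gives a rough comparison; the numbers will vary with the machine and the installed dependencies.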


Importing Individual Functions
------------------------------

Finally, some users may find it useful to have extremely light imports and only import the exact functions
they plan to use. This can easily be accomplished by importing functions directly into the namespace.

For example:

.. code-block:: python

    from spikeinterface.preprocessing import bandpass_filter, common_reference
    from spikeinterface.core import extract_waveforms
    from spikeinterface.extractors import read_binary

As mentioned, this approach imports only what you plan on using, so it is the most minimalist. It does require
knowledge of the API to know which module to pull a function from. It could also lead to naming clashes if you pull
functions directly from other scientific libraries as well. Type :code:`import this` in a Python interpreter to read
the Zen of Python, which touches on namespaces.
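
The naming-clash risk can be illustrated without SpikeInterface at all. In the sketch below, the standard-library :code:`math` and :code:`cmath` modules are stand-ins for two scientific libraries that happen to export a function with the same name; the variable names are illustrative only.

```python
import math
import cmath

from math import sqrt
from cmath import sqrt  # silently replaces the name imported from math

# The bare name now refers to cmath.sqrt, so the result is complex.
shadowed_result = sqrt(4)

# Module-qualified calls stay unambiguous regardless of import order.
explicit_real = math.sqrt(4)
explicit_complex = cmath.sqrt(-1)

print(type(shadowed_result).__name__, explicit_real, explicit_complex)
```

Python raises no error or warning when the second import overwrites the first, which is why the module-qualified form is the safer choice when two libraries share function names.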
1 change: 1 addition & 0 deletions doc/index.rst
@@ -40,6 +40,7 @@ SpikeInterface is made of several modules to deal with different aspects of the

overview
installation
import
modules/index
how_to/index
modules_gallery/index
12 changes: 8 additions & 4 deletions doc/installation.rst
@@ -14,7 +14,7 @@ To install the current release version, you can use:

The :code:`[full]` option installs all the extra dependencies for all the different sub-modules.

-Note that if using Z shell (:code:`zsh` - the default shell on mac), you will need to use quotes (:code:`pip install "spikeinterface[full]"`).
+Note that if using Z shell (:code:`zsh` - the default shell on macOS), you will need to use quotes (:code:`pip install "spikeinterface[full]"`).


To install all interactive widget backends, you can use:
@@ -63,14 +63,14 @@ as :code:`spikeinterface` strongly relies on these packages to interface with va


It is also sometimes useful to have local copies of :code:`neo` and :code:`probeinterface` to make changes to the code. To achieve this, repeat the first set of commands,
-replacing `https://github.com/SpikeInterface/spikeinterface.git` with the appropriate repository in the first code block of this section.
+replacing :code:`https://github.com/SpikeInterface/spikeinterface.git` with the appropriate repository in the first code block of this section.

For beginners
-------------

We provide some installation tips for beginners in Python here:

-https://github.com/SpikeInterface/spikeinterface/tree/master/installation_tips
+https://github.com/SpikeInterface/spikeinterface/tree/main/installation_tips



@@ -89,12 +89,16 @@ Requirements
Sub-modules have more dependencies, so you should also install:

* zarr
* h5py
* scipy
* pandas
* xarray
-* sklearn
+* scikit-learn
* networkx
* matplotlib
* numba
* distinctipy
* cuda-python (for non-macOS users)


All external spike sorters can be either run inside containers (Docker or Singularity - see :ref:`containerizedsorters`)
31 changes: 0 additions & 31 deletions doc/modules/core.rst
@@ -22,37 +22,6 @@ All classes support:
* multiple segments, where each segment is a contiguous piece of data (recording, sorting, events).


Import rules
------------

Importing the SpikeInterface module

.. code-block:: python

import spikeinterface as si

will only import the :code:`core` module. Other submodules must be imported separately:

.. code-block:: python

import spikeinterface.extractors as se
import spikeinterface.sorters as ss
import spikeinterface.widgets as sw


A second option is to import the SpikeInterface package in :code:`full` mode:

.. code-block:: python

import spikeinterface.full as si

This import statement will import all of SpikeInterface modules as a flattened module.
Note that importing :code:`spikeinterface.full` will take a few extra seconds, because some modules use
just-in-time :code:`numba` compilation performed at the time of import.
We recommend this approach to advanced users, since it requires a deeper knowledge of the API.



Recording
---------
