A couple updates to mountainsort5 sorter #2225

Merged · 11 commits merged into SpikeInterface:main on Jan 22, 2024
Conversation

magland (Member) commented Nov 17, 2023

Here are a couple of important updates for mountainsort5 that I would like to merge in.

First, it's important to use dtype=float for the bandpass filter, because otherwise the filter appears to preserve the recording's int dtype. I saw one example where this made a big difference for this algorithm.
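
For illustration, a minimal sketch of this fix using the standard SpikeInterface preprocessing API (the cutoff frequencies and the `recording` variable are placeholders):

```python
import spikeinterface.preprocessing as spre

# Filtering an int16 recording without an explicit dtype can keep the
# integer dtype and truncate the filtered signal; forcing float avoids that.
recording_filtered = spre.bandpass_filter(
    recording, freq_min=300, freq_max=6000, dtype="float32"
)
```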

Second, for efficiency, it makes a big difference to cache the recording after preprocessing to a temporary directory as a binary recording, because otherwise the data will be re-filtered throughout the process, including the potentially expensive whitening step.
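
A sketch of that caching step, assuming create_cached_recording is exported from mountainsort5.util (the file is linked later in this thread; recording_preprocessed stands for the filtered/whitened recording):

```python
from tempfile import TemporaryDirectory

import mountainsort5 as ms5
from mountainsort5.util import create_cached_recording

with TemporaryDirectory() as tmpdir:
    # write the preprocessed traces once as a binary recording, so the sorter
    # reads from disk instead of re-running filtering/whitening on every pass
    recording_cached = create_cached_recording(recording=recording_preprocessed, folder=tmpdir)
    sorting = ms5.sorting_scheme2(
        recording=recording_cached,
        sorting_parameters=scheme2_sorting_parameters,
    )
```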

```python
    sorting = ms5.sorting_scheme2(recording=recording, sorting_parameters=scheme2_sorting_parameters)
elif p["scheme"] == "3":
    sorting = ms5.sorting_scheme3(recording=recording, sorting_parameters=scheme3_sorting_parameters)

assert isinstance(recording, BaseRecording)
```
zm711 (Collaborator):

What is this assert doing? Is there a concern that users wouldn't supply a recording? Should we tell the user what they put in wrong?

magland (Member, Author):

This is to keep the linter happy. The output of load_recording_from_folder is BaseExtractor | None, whereas mountainsort5 expects a BaseRecording. I will update this to make it more clear.
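
For illustration, the narrowing could look something like this (a sketch, not the exact commit; `folder` is a placeholder argument):

```python
from spikeinterface.core import BaseRecording

recording = load_recording_from_folder(folder)  # typed as BaseExtractor | None
if not isinstance(recording, BaseRecording):
    raise TypeError(f"expected a BaseRecording, got {type(recording)}")
# past this point, static type checkers narrow `recording` to BaseRecording
```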

magland (Member, Author) commented Nov 17, 2023

@zm711 I made the type checking more clear in that last commit.

zm711 added the sorters (Related to sorters module) label Nov 17, 2023
samuelgarcia (Member) commented:

> Second, for efficiency, it makes a big difference to cache the recording after preprocessing to a temporary directory as a binary recording, because otherwise the data will be re-filtered throughout the process, including the potentially expensive whitening step.

This really depends on the I/O speed and on how many times you need to access each chunk.
Writing to a network drive is sometimes slow.
My plan is to add an in-memory (shared) way to cache the recording soon.

magland (Member, Author) commented Nov 17, 2023

> This really depends on the I/O speed and on how many times you need to access each chunk. Writing to a network drive is sometimes slow. My plan is to add an in-memory (shared) way to cache the recording soon.

Thanks @samuelgarcia. Is that something that would block this PR from being merged?

```python
    sorting = ms5.sorting_scheme3(recording=recording, sorting_parameters=scheme3_sorting_parameters)

with TemporaryDirectory() as tmpdir:
    # cache the recording to a temporary directory for efficient reading (so we don't have to re-filter)
    recording_cached = create_cached_recording(recording=recording, folder=tmpdir)
```
Member:

n_jobs=1 here will make this extremely slow for long recordings.

samuelgarcia (Member) commented:

> Thanks @samuelgarcia. Is that something that would block this PR from being merged?

Of course. I am just happy to chat.

alejoe91 (Member) commented:

> > Thanks @samuelgarcia. Is that something that would block this PR from being merged?
>
> Of course. I am just happy to chat.

I think @samuelgarcia means "of course NOT, it's not blocking the PR" :)

In my opinion, forcing n_jobs=1 will be a bottleneck, especially for high channel counts. Have you encountered any errors reading with remfile in parallel? IMO streaming is an edge case, so it should be handled separately.

magland (Member, Author) commented Nov 20, 2023

@alejoe91

> In my opinion, forcing n_jobs=1 will be a bottleneck, especially for high channel counts. Have you encountered any errors reading with remfile in parallel? IMO streaming is an edge case, so it should be handled separately.

That's a good point. Right now, remfile is not thread-safe, but it may be okay with multiprocessing; I'm not sure. I can make the necessary changes if needed.

But what about bandpass_filter and whiten? I know whiten requires an initial step of computing the whitening matrix; that should be done just once, right? In general, since an arbitrary recording extractor could be passed in, how do we know whether it can be parallelized for reading?

EDIT: In the case of either (a) reading from a remote file, (b) doing heavy multi-threaded preprocessing, or (c) both, the bottleneck may not be writing to disk, and so in those cases maybe n_jobs=1 is most appropriate.

@alejoe91 @samuelgarcia what would you recommend here?

alejoe91 (Member) commented:

@magland for whitening we have a trick to only compute the whitening matrix once (at initialization):
basically, we compute W and M and then add them to the self._kwargs, so that subprocesses just load them (see here).
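
A minimal sketch of that pattern, for illustration (compute_whitening_matrix is a placeholder for however W and M are actually estimated; the real WhitenRecording has more options):

```python
class WhitenSketch:
    def __init__(self, recording, W=None, M=None):
        if W is None:
            # expensive step: estimate whitening matrix W and channel means M
            # from random data chunks; this runs only once, in the parent process
            W, M = compute_whitening_matrix(recording)
        self.W, self.M = W, M
        # keep the computed matrices in the kwargs used to re-instantiate the
        # object, so worker subprocesses just load them instead of recomputing
        self._kwargs = dict(recording=recording, W=W, M=M)
```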

I don't understand your comment:

> EDIT: In the case of either (a) reading from a remote file, (b) doing heavy multi-threaded preprocessing, or (c) both, the bottleneck may not be writing to disk, and so in those cases maybe n_jobs=1 is most appropriate.

Even if writing is not the bottleneck, parallel processing should speed things up dramatically. Say you stream the file, apply the bandpass filter, and whiten in parallel. Each process will stream smaller portions of the file, process them, and write them to disk, so effectively streaming should be faster!

You could actually test this in the remfile PR!

magland (Member, Author) commented Nov 20, 2023

> @magland for whitening we have a trick to only compute the whitening matrix once (at initialization): basically, we compute W and M and then add them to the self._kwargs, so that subprocesses just load them (see here).

Okay great.

> I don't understand your comment:
>
> > EDIT: In the case of either (a) reading from a remote file, (b) doing heavy multi-threaded preprocessing, or (c) both, the bottleneck may not be writing to disk, and so in those cases maybe n_jobs=1 is most appropriate.
>
> Even if writing is not the bottleneck, parallel processing should speed things up dramatically. Say you stream the file, apply the bandpass filter, and whiten in parallel. Each process will stream smaller portions of the file, process them, and write them to disk, so effectively streaming should be faster!
>
> You could actually test this in the remfile PR!

Good idea, I'll test it out. What I'm saying is that since multi-threaded preprocessing is already parallelized (e.g., by numpy), there may not be further gains... similar for downloading... the streaming pipe may already be full (and remfile already uses some parallelization).

alejoe91 (Member) commented:

Oh for preprocessing we see a massive increase in performance :)

magland (Member, Author) commented Nov 20, 2023

Thinking more about remfile, I don't think it is suited for parallel use unless the different processes are processing entirely different parts of the file. That's because it needs to do smart chunk retrieval, and if there are multiple processes loading from generally the same part of the file, then the download will be very redundant and inefficient. I expect you'd encounter similar problems with fsspec.

Do you think we should address this by creating a (perhaps mandatory) n_jobs_for_preprocessing sorting parameter?

alejoe91 (Member) commented:

At least having the parameter would give some flexibility. Also, I wouldn't use an MS5 function to save the recording from within SpikeInterface. You can save it in the wrapper directly to the output_folder (which is exactly what the Matlab-based sorters do).
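
For comparison, a sketch of the suggested alternative using the standard recording.save() API (assuming output_folder is a pathlib.Path; the job kwargs are placeholders):

```python
# save the preprocessed recording as binary inside the sorter output folder,
# letting SpikeInterface handle chunked, parallel writing
recording_cached = recording.save(
    folder=output_folder / "recording_cached",
    n_jobs=4,
    chunk_duration="1s",
)
```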

magland (Member, Author) commented Nov 20, 2023

@alejoe91 thanks so much. I made a few adjustments in response to your comments:

- Now using tempfile.TemporaryDirectory()
- There are two new optional parameters for this sorter: n_jobs_for_preprocessing and temporary_base_dir (usage sketch below)
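
For context, a hypothetical invocation with the two new parameters (the values shown are placeholders, not defaults):

```python
import spikeinterface.sorters as ss

sorting = ss.run_sorter(
    "mountainsort5",
    recording,
    n_jobs_for_preprocessing=4,         # workers used when caching the preprocessed recording
    temporary_base_dir="/tmp/scratch",  # base directory for the TemporaryDirectory
)
```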

Comment on lines +189 to +191
```python
recording_cached = create_cached_recording(
    recording=recording, folder=tmpdir, n_jobs=p["n_jobs_for_preprocessing"]
)
```
Member:

@magland you should use the recording.save() function here!

magland (Member, Author):

I understand, but I want to have control of exactly how it is saved, and also be consistent with the mountainsort5 docs. Here's my implementation

https://github.com/flatironinstitute/mountainsort5/blob/main/mountainsort5/util/create_cached_recording.py

The SI implementation is doing a lot more indirect stuff. I see save_metadata_to_folder, _save, copy_metadata, and dump. I realize that is all needed in the more general context, but I don't need all of that, and I want to do only what is needed for mountainsort5 in a very straightforward way. That's why I created this as a utility of ms5.

Member:

@magland

Sam and Pierre made a new convenient function that allows you to "cache" a recording with different modes. See here: https://github.com/SpikeInterface/spikeinterface/pull/2276/files

Maybe worth using it here too?

Member:

maybe later ;)

alejoe91 added this to the 0.100.0 milestone Jan 9, 2024
alejoe91 merged commit fb3c9c9 into SpikeInterface:main Jan 22, 2024
11 checks passed
zm711 (Collaborator) commented Feb 5, 2024

I'm just writing myself a note so that I'll be able to find this later. After this update I tried running mountainsort5 on one of my datasets and it errored out. Mountainsort5 was working fine before; the error seems to be with the tempfile use on Windows. I'll open issues here and on the mountainsort5 repo tomorrow with the trace.
