
Update the method of creating an empty file with the right size when saving binary files #3408

Merged
merged 6 commits into SpikeInterface:main on Sep 16, 2024

Conversation

@jiumao2 (Contributor) commented Sep 13, 2024

The current method of creating an empty file with file_size_bytes is significantly slow, especially for large files. This PR addresses the issue by introducing a more efficient approach.

Current Implementation in write_binary_recording (located in src/spikeinterface/core/recording_tools.py):

file = open(file_path, "wb+")
file.truncate(file_size_bytes)
file.close()

Proposed Improvement:

file = open(file_path, "wb+")
file.seek(file_size_bytes - 1)
file.write(b'\0')
file.close()

Creating a 100 GB empty file on SSD:

Time using seek: 0.05744457244873047 seconds
Time using truncate: 88.06437873840332 seconds
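
(For anyone who wants to reproduce this comparison as a standalone script rather than in a notebook, a minimal sketch along these lines should work; the file names are arbitrary and the size can be shrunk to fit the available disk space:)

import os
import time

file_size_bytes = 100 * 1024**3  # 100 GiB; shrink this to fit your free disk space

# Time the current truncate-based approach
start = time.perf_counter()
with open("empty_truncate.bin", "wb+") as f:
    f.truncate(file_size_bytes)
print("truncate:  ", time.perf_counter() - start, "seconds")

# Time the proposed seek-and-write approach
start = time.perf_counter()
with open("empty_seek.bin", "wb+") as f:
    f.seek(file_size_bytes - 1)
    f.write(b"\0")
print("seek+write:", time.perf_counter() - start, "seconds")

# Remove the test files afterwards
os.remove("empty_truncate.bin")
os.remove("empty_seek.bin")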

jiumao2 and others added 2 commits September 13, 2024 21:20
Update the method to create an empty file with the right size
@h-mayorquin self-assigned this on Sep 13, 2024
@h-mayorquin self-requested a review on Sep 13, 2024
@zm711 (Collaborator) commented Sep 13, 2024

This is cool. Just tagging myself so I can follow this story as Heberto reviews it.

@h-mayorquin (Collaborator) commented Sep 13, 2024

Hi, thanks for the contribution. I am unable to reproduce your results on Ubuntu 24.04 with kernel 6.8 and an NVMe SSD:

[screenshot: timing results on Ubuntu]

I also kind of expect that one syscall (truncate) would be faster than two (seek and write). Can you give me more context about how you are doing your profiling, and why seek plus write should be faster? Maybe I am doing something wrong in my profiling.

@h-mayorquin added the performance (Performance issues/improvements) label on Sep 13, 2024
@jiumao2 (Contributor, Author) commented Sep 13, 2024

I’m working with Windows 10 and Python 3.10. Could the issue be related to the operating system?

@h-mayorquin (Collaborator)

@zm711 could you try the profile that I wrote above?

@h-mayorquin (Collaborator)

For your convenience:

%%timeit

file_size_bytes = 10 * 1024 * 1024 * 1024
file = open("temporary.bin", "wb+")
file.truncate(file_size_bytes)
file.close()


%%timeit
file_size_bytes = 10 * 1024 * 1024 * 1024
file = open("temporary.bin", "wb+")
file.seek(file_size_bytes - 1)
file.write(b'\0')
file.close()

@zm711 (Collaborator) commented Sep 13, 2024

Sure. Running to a meeting, but right after :)

@zm711 (Collaborator) commented Sep 13, 2024

Windows 10, Python 3.9, 10 GiB file

[screenshots: timeit results for truncate and seek on Windows 10]

@zm711 (Collaborator) commented Sep 13, 2024

This could be a Windows thing, so maybe the better strategy is actually per-OS. I've been telling my lab mates to deal with it, but waiting for large binary files to write for KS and MS5 is brutally slow.

@zm711 (Collaborator) commented Sep 13, 2024

I'll test on my Mac too and report that here as well.

@zm711 (Collaborator) commented Sep 13, 2024

macOS, Python 3.11, 10 GiB file

[screenshot: timeit results on macOS]

@zm711 (Collaborator) commented Sep 13, 2024

Just to be fair, I'm also testing Python 3.11 on Windows now, but it'll take a little while.

@h-mayorquin (Collaborator)

@zm711
If you can test the 100 GiB version, that would be great as well.

Yet another place where our Linux bias shows up. I am confused about why seeking would be faster on Windows, but it should be easy to make this procedure OS-based, as the difference in time reported by @jiumao2 is indeed substantial.

@zm711 (Collaborator) commented Sep 13, 2024

Sure, I'll start with my Mac and do Windows later.

@h-mayorquin (Collaborator)

Whenever you can, man, no rush : )
Thanks for helping!

@zm711 (Collaborator) commented Sep 13, 2024

Times are exactly the same for 100 GiB on Mac.
24 vs 56

@zm711 (Collaborator) commented Sep 13, 2024

To be honest, running the timeit on Windows is causing my computer to freak out... I don't know what was happening with the truncate. It worked instantly for seek...

@jiumao2 (Contributor, Author) commented Sep 13, 2024

@zm711 Great speed in testing!

@h-mayorquin (Collaborator)

Maybe this is related to the zero filling?

https://docs.python.org/3/library/io.html#io.IOBase.truncate
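
(One rough way to probe the zero-filling hypothesis on POSIX systems is to compare the file's logical size with the blocks actually allocated on disk; os.stat's st_blocks is not exposed on Windows, so this sketch only covers the Linux/macOS side:)

import os

path = "empty_seek.bin"  # hypothetical file created by either method above
st = os.stat(path)

logical_size = st.st_size                # the size the file system reports
blocks = getattr(st, "st_blocks", None)  # 512-byte blocks actually allocated (POSIX only)

if blocks is not None:
    print(f"logical size: {logical_size} bytes, allocated on disk: {blocks * 512} bytes")
    # If the allocated size is tiny compared to the logical size, the file is
    # sparse and no zero-filling was performed.
else:
    print(f"logical size: {logical_size} bytes (allocation info not exposed on this OS)")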

@zm711 (Collaborator) commented Sep 13, 2024

The 100 GiB test failed on my Windows computer due to lack of storage. (I've only got a small SSD so this test might be hard for me to do).

@zm711 (Collaborator) left a review comment

This would be my suggestion unless you have a better idea, Heberto.

src/spikeinterface/core/recording_tools.py: review comments (outdated, resolved)
@jiumao2 (Contributor, Author) commented Sep 14, 2024

@h-mayorquin I agree that Python on Windows does zero-filling in truncate. It will be slower on Windows with larger files and slower SSDs/HDDs. It took me hours to write a 2 TB file on my HDD, and it ultimately failed due to memory issues after creating the empty file...

Besides, since seek and write are both O(1) operations across systems, it might be less meaningful to differentiate the systems for a sub-millisecond speedup?

@zm711 (Collaborator) commented Sep 14, 2024

Besides, since seek and write are both O(1) operations across systems, it might be less meaningful to differentiate the systems for a sub-millisecond speedup?

I think this is a fair point, but at least for me it's less about differentiating for a sub-millisecond speedup and more about keeping the default for POSIX systems and then optimizing for Windows. Until someone feels comfortable knowing exactly why there is this difference (maybe zero-filling, but the docs aren't clear on this), we could keep our original implementation for POSIX and the speedup for Windows. Maybe you have some clearer docs on the situation so we can better understand the differences across OSs? Either way, we should probably add a comment explaining either why we added the Windows-specific path or why we changed everything to seek-write.
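
(A minimal sketch of the per-OS idea discussed here, assuming platform.system() is used for the branch; the helper name is made up for illustration, and this is not the code that was ultimately merged:)

import platform

def create_empty_file(file_path, file_size_bytes):
    """Pre-allocate an empty file of the requested size (per-OS sketch)."""
    with open(file_path, "wb+") as f:
        if platform.system() == "Windows":
            # truncate() was observed in this PR to be very slow on Windows for
            # large files, so seek to the last byte and write a single null byte.
            f.seek(file_size_bytes - 1)
            f.write(b"\0")
        else:
            # On POSIX systems truncate() is effectively instantaneous.
            f.truncate(file_size_bytes)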

@h-mayorquin (Collaborator)

@h-mayorquin I agree that Python on Windows does zero-filling in truncate. It will be slower on Windows with larger files and slower SSDs/HDDs. It took me hours to write a 2 TB file on my HDD, and it ultimately failed due to memory issues after creating the empty file...

Besides, since seek and write are both O(1) operations across systems, it might be less meaningful to differentiate the systems for a sub-millisecond speedup?

This is a good point, maybe it doesn't make sense to differentiate.

@jiumao2 (Contributor, Author) commented Sep 15, 2024

I think this is a fair point, but at least for me it's less about differentiating for a sub-millisecond speedup and more about keeping the default for POSIX systems and then optimizing for Windows. Until someone feels comfortable knowing exactly why there is this difference (maybe zero-filling, but the docs aren't clear on this), we could keep our original implementation for POSIX and the speedup for Windows. Maybe you have some clearer docs on the situation so we can better understand the differences across OSs? Either way, we should probably add a comment explaining either why we added the Windows-specific path or why we changed everything to seek-write.

I'm primarily a Windows user. It is totally up to you how to implement it for POSIX systems.

@zm711 (Collaborator) commented Sep 15, 2024

I'm primarily a Windows user. It is totally up to you how to implement it for POSIX systems.

I'm between Windows and Mac, so I think this Windows update will be wonderful. I could go either way on the POSIX implementation versus taking a minuscule slowdown to reduce complexity. Could you add a comment just linking back to this PR, where we did the profiling, to indicate why we needed to switch for Windows :) If you want to revert back to just one implementation, that's fine with me. But I do want a comment so we know why we are doing what we are doing for this really low-level operation.

@jiumao2 (Contributor, Author) commented Sep 15, 2024

@zm711 I reverted it back. I also found that with open fits better here. Is that comment okay?
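
(Presumably the reverted, single-path version with a context manager looks roughly like the following sketch; the helper name is illustrative rather than the exact merged code:)

def _create_empty_file(file_path, file_size_bytes):
    # Use seek + write rather than truncate(): truncate() is extremely slow on
    # Windows for large files (see the profiling in PR #3408).
    with open(file_path, "wb+") as f:
        f.seek(file_size_bytes - 1)
        f.write(b"\0")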

@zm711 (Collaborator) left a review comment

This works for me. The context manager seems safer too. Thanks @jiumao2.

@h-mayorquin (Collaborator) left a review comment

Thanks for bringing this to our attention. This is a great contribution that would improve the lives of many Windows users.

Thanks @zm711 for reviewing this as well!

@zm711 (Collaborator) left a review comment

Good by Heberto, good by me.

@alejoe91 (Member)

Should I squeeze it into the release, or can it wait for the next one?

@h-mayorquin (Collaborator)

My vote is for waiting on this one.

@zm711 (Collaborator) commented Sep 16, 2024

I have everyone in my group install from main, so for us it's not a big deal. But this is a huge quality-of-life improvement. We have so many issues on the tracker about Windows binary writing that we thought were concurrency problems, but I bet this could be a contributing factor.

@alejoe91 merged commit 55c7de1 into SpikeInterface:main on Sep 16, 2024
15 checks passed
alejoe91 added a commit that referenced this pull request Sep 16, 2024