Mmap's not getting closed? #132

Open
yalwan-iqvia opened this issue Feb 22, 2021 · 5 comments

Comments

@yalwan-iqvia

When reading several thousand Parquet files, I eventually hit an error like this:

ERROR: LoadError: SystemError: memory mapping failed: Cannot allocate memory

I've made sure to call Parquet.close on my files.
I also wrote a small program to track the number of memory maps the process holds while it runs.

image

The behaviour is very strange: it seems that some references to the memory maps get closed, but are somehow "held" and then re-opened on subsequent file opens (see the regular jumps).

Zoom in on the end portion:
image

To make this measurement, I simply ran wc -l against /proc/{PID}/maps.
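The measurement described above can be sketched as a small shell helper. This is a minimal sketch assuming a Linux system, where /proc/PID/maps has one line per mapping; the function names `count_maps` and `watch_maps` are hypothetical and not part of any tool or of Parquet.jl.

```shell
#!/bin/sh
# Count the memory mappings of a process.
# On Linux, /proc/PID/maps contains one line per mapping.
count_maps() {
    wc -l < "/proc/$1/maps"
}

# Print "<unix time> <mapping count>" every INTERVAL seconds (default 1)
# for as long as the process exists.
# watch_maps is a hypothetical helper name used for illustration only.
watch_maps() {
    pid=$1
    interval=${2:-1}
    while kill -0 "$pid" 2>/dev/null; do
        printf '%s %s\n' "$(date +%s)" "$(count_maps "$pid")"
        sleep "$interval"
    done
}
```

For example, `watch_maps 12345 5 > maps.log` would sample a process with PID 12345 (a placeholder) every five seconds, and the resulting log can then be plotted.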

I'm in the process of trying to figure out how I can make a reproducible example of this situation, so updates may follow.

@tanmaykm
Member

Could you check if invoking GC helps as discussed here: https://discourse.julialang.org/t/mmap-mmap-leaves-the-file-open/19413 ?

@yalwan-iqvia
Author

@tanmaykm thanks for responding -- I did try adding the GC call to the main loop and it "improved" the situation (in that the process could go further before crashing), but I still ended up with a similar crash (I didn't inspect the mmap count graphs for this case). I'm still working on an example that can be reproduced easily, but it's proving quite challenging.

@yalwan-iqvia
Author

It's also worth noting that the error rarely manifests as the SystemError; more often it shows up as an uninterpretable LLVM memory allocation error.

@tanmaykm
Member

A few more pointers: this is where Parquet does the memory mapping. Memory-mapped pages are cached as weakrefs here, so they should get GC'd under memory pressure.

It may also be possible that you are hitting some system limits, in which case reconfiguring your system to allow for more mmaps may help.
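One system limit worth checking on Linux is `vm.max_map_count`, the per-process ceiling on the number of mappings; `mmap` fails with "Cannot allocate memory" when a process would exceed it. Whether this is the limit being hit here is an assumption. The sketch below compares a process's mapping count with that ceiling; `map_limit` and `map_headroom` are hypothetical helper names.

```shell
#!/bin/sh
# Per-process ceiling on the number of mappings (Linux sysctl vm.max_map_count).
map_limit() {
    cat /proc/sys/vm/max_map_count
}

# Number of mappings a process could still create before reaching the ceiling.
# map_headroom is a hypothetical helper name used for illustration only.
map_headroom() {
    used=$(wc -l < "/proc/$1/maps")
    echo $(( $(map_limit) - used ))
}

# Raising the limit needs root and lasts until reboot.
# 262144 is only an example value, not a recommendation:
#   sudo sysctl -w vm.max_map_count=262144
```

If `map_headroom` for the reading process approaches zero shortly before the crash, the limit is a likely cause; if it stays large, the problem is probably elsewhere.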

Or you could ask Parquet not to use memory maps by invoking Parquet.use_mmap(false).

@yalwan-iqvia
Author

Thanks, I'll try use_mmap(false) and see what happens.
