OSError: [Errno 24] Too many open files when RandomForestRegressor has 140 estimators #22
Comments
I had a look at code_gen.py. Perhaps the CodeGenerator class could build a string in memory instead of opening a file and writing to it. When the .file method is called, it could write the string to a file, close it, and return the name.
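A minimal sketch of that suggestion, assuming a hypothetical `BufferedCodeGenerator` (the class name, `write`, and the method form of `file` are illustrative and not the library's actual API). The generated source accumulates in memory, and a file descriptor is held only for the moment the buffer is written out:

```python
import os
import tempfile


class BufferedCodeGenerator(object):
    """Hypothetical stand-in for CodeGenerator that buffers code in memory."""

    def __init__(self):
        self._chunks = []  # generated source fragments; no file handle is held
        self._path = None  # set once the buffer has been written to disk

    def write(self, line):
        """Append one line of generated source to the in-memory buffer."""
        self._chunks.append(line + "\n")

    def file(self):
        """Write the buffer to a temp file, close it, and return its name."""
        if self._path is None:
            fd, path = tempfile.mkstemp(prefix="compiledtrees_", suffix=".cpp")
            with os.fdopen(fd, "w") as handle:
                handle.write("".join(self._chunks))
            self._path = path
        return self._path
```

Unlike `NamedTemporaryFile(delete=True)`, a file created with `mkstemp` is not removed automatically, so in this sketch the caller would have to delete it after compilation.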
On Linux and macOS you simply have to raise the open-file limit with `ulimit -n`. On Windows there is no way to raise the limit globally, but there is an internal solution, which you have to include in your script:
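On POSIX systems the soft limit can also be raised from inside the script with the standard `resource` module. The helper below is only a sketch: `raise_open_file_limit` is a hypothetical name, it does not cover the Windows solution mentioned above, and it silently keeps the current limit if the OS refuses the request.

```python
try:
    import resource  # POSIX only; not available on Windows
except ImportError:
    resource = None


def raise_open_file_limit(target):
    """Try to raise the soft open-file limit to `target`.

    Returns the soft limit in effect afterwards, or None where the
    `resource` module is unavailable.
    """
    if resource is None:
        return None
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != resource.RLIM_INFINITY and soft < target:
        # The soft limit may never exceed the hard limit.
        wanted = target if hard == resource.RLIM_INFINITY else min(target, hard)
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
        except (ValueError, OSError):
            pass  # the OS refused the request; keep the existing limit
        soft, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft
```

Calling it before compiling, for example `raise_open_file_limit(2 * n_trees + 2)` with the figure quoted in this thread, would be one way to use it.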
I used to write one .cpp file, but it didn't work for large forests, especially if you have lots of data and allow for full growth. For my example this translates to 500 .cpp files of over 100 MB each (50 GB+ of RAM). Keeping all those files in StringIO objects would probably work, although the .o files would still be there, so we would go down to [...]

To sum up: I don't regard this as an issue, and overcoming it would probably cost a lot of RAM in return, which ultimately is a deal-breaker (at least for me).
I see what you mean. I've fixed the problem for myself; as you say, it isn't hard. I am concerned that users could be put off by this, though. How about an informative error for them, like this?

    import errno
    import sys
    import tempfile

    class CodeGenerator(object):
        def __init__(self):
            try:
                self._file = tempfile.NamedTemporaryFile(
                    prefix='compiledtrees_', suffix='.cpp', delete=True)
            except OSError as e:
                # EMFILE (errno 24): the per-process open-file limit was hit.
                if e.errno == errno.EMFILE:
                    print("Too many open files. Increase limit to 2 * n_trees + 2 "
                          "(unix / mac: ulimit -n [limit], windows: http://bit.ly/2fAKnz0)",
                          file=sys.stderr)
                raise  # re-raise with the original traceback
            self._indent = 0

Edit: added the `if` check.
That might be a good solution if [...], although I fear we will catch some false positives. Also, a unit test for that would be useful (see the hints on changing limits on all platforms).
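One way such a test could provoke the error on POSIX systems is sketched below with the standard `resource` module: lower the soft limit, open temporary files until the OS refuses, then restore the limit. The helper name and the limit of 64 are arbitrary choices for illustration, and Windows would need a different mechanism.

```python
import errno
import resource  # POSIX only
import tempfile


def errno_when_files_exhausted(low_limit=64):
    """Lower the soft open-file limit, exhaust it, and return the errno raised."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != resource.RLIM_INFINITY:
        low_limit = min(low_limit, soft)
    tempfile.gettempdir()  # resolve the temp directory before descriptors run out
    handles = []
    resource.setrlimit(resource.RLIMIT_NOFILE, (low_limit, hard))
    try:
        # No more than low_limit descriptors can exist, so this loop must fail.
        for _ in range(low_limit + 1):
            handles.append(tempfile.NamedTemporaryFile(
                prefix="compiledtrees_", suffix=".cpp"))
        return None
    except OSError as exc:
        return exc.errno
    finally:
        for handle in handles:
            handle.close()
        # Restore the original limit so later code is unaffected.
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
```

A test could then assert that the returned value equals `errno.EMFILE`, which is the condition the proposed error message checks for.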
Here's a loop that fits and compiles trees, stepping up the number of estimators each time:
It crashes on 140:
This is on macOS.
I haven't looked into workarounds - perhaps I can increase the number of files that can be open at once. But if there's a way to limit the open files in the library, that would probably be better.