find-symbol can fail with "Requested array size exceeds VM limit" #245

vemv · 2019-03-28T01:22:49Z

Hi there,

I've faced this issue repeatedly. In the past I had worked around it via :jvm-opts ["-Xmx18G"], but today I hit it again (long time no see!). I could try setting an even higher bound, but I guess a GH issue will be more useful.

The error is:

When I cljr-rename-symbol, through natural usage,
I get the attached stacktrace printed to Emacs' *Messages*

Fetching symbol occurrences...
Error in nrepl-refactor: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
 at java.util.Arrays.copyOf (Arrays.java:3332)
    java.lang.AbstractStringBuilder.ensureCapacityInternal (AbstractStringBuilder.java:124)
    java.lang.AbstractStringBuilder.append (AbstractStringBuilder.java:448)
    java.lang.StringBuffer.append (StringBuffer.java:270)
    java.io.StringWriter.write (StringWriter.java:101)
    clojure.core$fn__7283.invokeStatic (core_print.clj:159)
    clojure.core/fn (core_print.clj:158)
    clojure.lang.MultiFn.invoke (MultiFn.java:234)
    clojure.core$pr_on.invokeStatic (core.clj:3674)
    clojure.core$pr_on.invoke (core.clj:3668)
    clojure.core$print_prefix_map$fn__7317.invoke (core_print.clj:233)
    clojure.core$print_sequential.invokeStatic (core_print.clj:66)
    clojure.core$print_prefix_map.invokeStatic (core_print.clj:229)
    clojure.core$print_map.invokeStatic (core_print.clj:238)
    clojure.core$fn__7344.invokeStatic (core_print.clj:266)
    clojure.core/fn (core_print.clj:263)
    clojure.lang.MultiFn.invoke (MultiFn.java:234)
    clojure.core$pr_on.invokeStatic (core.clj:3674)
    clojure.core$pr_on.invoke (core.clj:3668)
    clojure.core$print_prefix_map$fn__7317.invoke (core_print.clj:233)
    clojure.core$print_sequential.invokeStatic (core_print.clj:66)
    clojure.core$print_prefix_map.invokeStatic (core_print.clj:229)
    clojure.core$print_map.invokeStatic (core_print.clj:238)
    clojure.core$fn__7344.invokeStatic (core_print.clj:266)
    clojure.core/fn (core_print.clj:263)
    clojure.lang.MultiFn.invoke (MultiFn.java:234)
    clojure.core$pr_on.invokeStatic (core.clj:3674)
    clojure.core$pr_on.invoke (core.clj:3668)
    clojure.core$print_prefix_map$fn__7317.invoke (core_print.clj:233)
    clojure.core$print_sequential.invokeStatic (core_print.clj:66)
    clojure.core$print_prefix_map.invokeStatic (core_print.clj:229)
    clojure.core$print_map.invokeStatic (core_print.clj:238)

Interestingly, I can reproduce the error via the repl, skipping the middleware.

(refactor-nrepl.find.find-symbol/find-symbol   {:file "/path/to/file/page.clj",
                                                :ns "some.ns"
                                                :name "handler"
                                                :line 43
                                                :column 6
                                                :ignore-errors false})

...and further playing at the repl, I could find the exact expression at fault:

refactor-nrepl/src/refactor_nrepl/find/find_symbol.clj

Lines 138 to 140 in bb27d13

    
           (->> (core/dirs-on-classpath) 
        
                (mapcat (partial core/find-in-dir (some-fn core/clj-file? core/cljc-file?))) 
        
                (mapcat (partial find-symbol-in-file fully-qualified-name ignore-errors)))))

I can run for example

(->> (core/dirs-on-classpath)
     (mapcat (partial core/find-in-dir (some-fn core/clj-file? core/cljc-file?)))
     (mapcat (partial find-symbol-in-file "clojure.core/+" false))
     count)

And it will reliably OOM.

Let me know if I can help you reproduce the error again. Perhaps you have it easy by setting a deliberately small -Xmx.

Victor

The text was updated successfully, but these errors were encountered:

vemv · 2019-03-28T01:23:57Z

Side note, the pinpointed code seems a good target for parallelization. Perhaps if done with reducers, the change wouldn't be too invasive.

vemv · 2019-03-28T01:27:24Z

One last note, (->> (core/dirs-on-classpath) (mapcat (partial core/find-in-dir (some-fn core/clj-file? core/cljc-file?))) count) returns 276 for me, each file having a reasonable size. i.e. it's not an humongous project.

vemv · 2019-03-28T02:01:51Z

I further debugged this.

The culrpit boils down to refactor-nrepl.find.find-symbol/find-symbol-in-file
- Specifically, its (ana/ns-ast file-content) call
One specific input that will cause it to OOM is the following:

(defn foo
  [{:keys [x]}]
  (go-loop []
    ))

...Namely:

clojure.core.async/go-loop must be there
A destructured argument must also be there.

You can apply this patch, and easily reproduce the issue:

lein repl
(require 'refactor-nrepl.analyzer)
(refactor-nrepl.analyzer/ns-ast (slurp "test/resources/testproject/src/com/example/vemv.clj")) -> OOM

benedekfazekas · 2019-03-28T06:34:31Z

thanks for the really good analysis. we have several probs with core.async as far as i remember due to its macro heavy nature. will have a closer look, ta

vemv · 2019-04-09T18:51:54Z

Hey there,

have you had a chance (or foresee one soon enough)?

Btw, this issue report is a bit messy, it boils down to the last comment. I can close + re-create this ticket if that tidies up things.

slipset · 2019-12-07T21:21:48Z

I've been hit by this as well.
On our largish codebase (70k lines of Clojure), it seems like the caching of the ast done in
refactor-nrepl.analyzer fails to handle the complete ast of our project (at least on my machine)

When commenting out https://github.com/clojure-emacs/refactor-nrepl/blob/master/src/refactor_nrepl/analyzer.clj#L53 it seems to run without the OOM, which I think makes sense, as I can imagine the total ast for this project would be rather large?

expez · 2019-12-09T16:09:27Z

@slipset I just had the same error occur on a smaller project (8k lines of clojure).

error in process filter: Error in nrepl-refactor: java.lang.OutOfMemoryError: Java heap space
 at java.util.Arrays.copyOf (Arrays.java:3332)
    java.lang.AbstractStringBuilder.ensureCapacityInternal (AbstractStringBuilder.java:124)
    java.lang.AbstractStringBuilder.append (AbstractStringBuilder.java:448)
    java.lang.StringBuffer.append (StringBuffer.java:270)
    java.io.StringWriter.write (StringWriter.java:101)
    cider.nrepl.print_method$eval31775$fn__31776.invoke (print_method.clj:69)
    clojure.lang.MultiFn.invoke (MultiFn.java:233)
    clojure.core$pr_on.invokeStatic (core.clj:3666)
    clojure.core$pr_on.invoke (core.clj:3660)
    clojure.core$print_prefix_map$fn__7041.invoke (core_print.clj:233)
    clojure.core$print_sequential.invokeStatic (core_print.clj:61)
    clojure.core$print_prefix_map.invokeStatic (core_print.clj:229)
    clojure.core$print_map.invokeStatic (core_print.clj:238)
    clojure.core$fn__7068.invokeStatic (core_print.clj:266)
    clojure.core/fn (core_print.clj:263)
    clojure.lang.MultiFn.invoke (MultiFn.java:233)
    clojure.core$pr_on.invokeStatic (core.clj:3666)
    clojure.core$pr_on.invoke (core.clj:3660)
    clojure.core$print_prefix_map$fn__7041.invoke (core_print.clj:233)
    clojure.core$print_sequential.invokeStatic (core_print.clj:61)
    clojure.core$print_prefix_map.invokeStatic (core_print.clj:229)
    clojure.core$print_map.invokeStatic (core_print.clj:238)
    clojure.core$fn__7068.invokeStatic (core_print.clj:266)
    clojure.core/fn (core_print.clj:263)
    clojure.lang.MultiFn.invoke (MultiFn.java:233)
    clojure.core$pr_on.invokeStatic (core.clj:3666)
    clojure.core$pr_on.invoke (core.clj:3660)
    clojure.core$print_prefix_map$fn__7041.invoke (core_print.clj:233)
    clojure.core$print_sequential.invokeStatic (core_print.clj:61)
    clojure.core$print_prefix_map.invokeStatic (core_print.clj:229)
    clojure.core$print_map.invokeStatic (core_print.clj:238)
    clojure.core$fn__7068.invokeStatic (core_print.clj:266)

This is the same stacktrace as in the OP, but the error is slightly different.

Looking at the stacktrace print_prefix_map occurs 6 times. Perhaps there's a problem with cycles here?

mbuczko · 2019-12-17T09:15:12Z

@slipset @vemv could you try to reproduce the problem against newest master?

vemv · 2019-12-18T23:54:51Z

Hi, thanks for attending the issue!

Yes I could still reproduce it.

Updated repro instuctions are as follow:

git checkout fresh https://github.com/clojure-emacs/refactor-nrepl master
curl https://gist.githubusercontent.com/vemv/a01b63102c3fb6b19d37f376e78546fd/raw/74d064cac449f6063f4751f6fa9f3e8e8fe53899/gistfile1.txt | git apply
(require 'refactor-nrepl.analyzer)
(refactor-nrepl.analyzer/ns-ast (slurp "testproject/src/com/example/vemv.clj")) -> OOM

My env:

Personal .lein profile stuff was left nil
High-end MBP

mbuczko · 2019-12-23T10:00:44Z

I think this is still related to the problem I tried to solve in #278. (refactor-nrepl.analyzer/ns-ast (slurp "test/resources/testproject/src/com/example/vemv.clj")) at some point tries to stringify the result to display it in REPL I guess, and we painfully end up with OOM.

Let's try the other way:

(def ast 
   (refactor-nrepl.analyzer/ns-ast (slurp "test/resources/testproject/src/com/example/vemv.clj")))

if we now try to stringify the result, like this:

(str ast)

again - we get OOM. This is more or less what I fixed in #278.

Lets instead redirect output directly to file:

(with-open [w (clojure.java.io/writer "/tmp/ast.txt")]
  (binding [*out* w] (pr ast)))

Now, after few minutes (and assuming you had enough free space on your drive) you can see where is the problem . Generated file is gigantic. I interrupted entire dump after reaching 37GB.

That's insane :)

expez · 2021-06-28T12:12:11Z

@vemv Any hope of narrowing this done? I'm not sure how to tackle this one, but it's probably one of the most important outstanding issues in refactor-nrepl atm.

Since eastwood manages to use tools.analyzer on this code I'm guessing we're doing something we shouldn't somewhere, much like the bug @mbuczko closed in #278.

vemv · 2021-06-28T12:27:58Z

As a thing that maybe can help, this wall of flags helps me avoid all sort of OOMs (most specifically, for stack-heavy workloads):

[;; Prevents a specific type of OOMs:
 "-XX:CompressedClassSpaceSize=3G"
 ;; Prevents trivial StackOverflow errors:
 "-XX:MaxJavaStackTraceDepth=1000000"
 ;; Set a generous limit as the maximum that can be allocated, preventing certain types of OOMs:
 "-Xmx24G"
 ;; increase stack size x6, for preventing SO errors:
 ;;   (The current default can be found with
 ;;    `java -XX:+PrintFlagsFinal -version 2>/dev/null | grep "intx ThreadStackSize"`)
 "-Xss6144k"]

Also @slipset's comment seems on point. Keeping an AST cache seems a delicate business: outdated data can grow without control, correct? So something like a core.cache SoftCache might be more cautious.

Or keeping things simpler, one could have a per-run cache: things would be cached for a specific refactor-nrepl op, but not across ops. I realise one will have more cache misses this way, but one gains simplicity and a reduced chance of OOMs.

Finally, one nice thing I came up for Eastwood recently is jonase/eastwood@aa569a5 . It might not be related directly to our issue here, but omitting core.async exceptions can be good in other contexts / for other issues.

expez · 2021-06-28T12:36:21Z

outdated data can grow without control, correct?

Looks like we have a bit of a bug in the cache, in the sense that it's append-only and old shit is never thrown out. On the other hand it won't grow on it's own. Unless you're using the refactoring a lot and/or have a very large change set I doubt this is it. I'll add another issue tracking that (should be very easy to fix), but I think this one here is something different.

edit: I just did a quick read-through of the cache code and it looks OK. It won't grow out of bounds. We also know that a single file can produce an AST that's too large to handle. Guess the next step might be to see what sort of recursion happens when such an AST is created...

vemv · 2021-06-28T12:41:51Z

Yes the #245 (comment) repro hints so

Part of #245

mbuczko mentioned this issue Dec 13, 2019

find-symbol leads to OutOfMemoryError #278

Closed

vemv self-assigned this Jul 1, 2021

vemv added a commit that referenced this issue Jul 9, 2021

Always perform a require prior to analyzing a ns

eeca2f6

Part of #245

vemv mentioned this issue Jul 9, 2021

Always perform a require prior to analyzing a ns #321

Merged

vemv added a commit that referenced this issue Jul 9, 2021

Always perform a require prior to analyzing a ns (#321)

c05e692

Part of #245

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

find-symbol can fail with "Requested array size exceeds VM limit" #245

find-symbol can fail with "Requested array size exceeds VM limit" #245

vemv commented Mar 28, 2019 •

edited

Loading

vemv commented Mar 28, 2019

vemv commented Mar 28, 2019

vemv commented Mar 28, 2019 •

edited

Loading

benedekfazekas commented Mar 28, 2019

vemv commented Apr 9, 2019

slipset commented Dec 7, 2019

expez commented Dec 9, 2019 •

edited

Loading

mbuczko commented Dec 17, 2019

vemv commented Dec 18, 2019 •

edited

Loading

mbuczko commented Dec 23, 2019

expez commented Jun 28, 2021

vemv commented Jun 28, 2021

expez commented Jun 28, 2021 •

edited

Loading

vemv commented Jun 28, 2021

find-symbol can fail with "Requested array size exceeds VM limit" #245

find-symbol can fail with "Requested array size exceeds VM limit" #245

Comments

vemv commented Mar 28, 2019 • edited Loading

vemv commented Mar 28, 2019

vemv commented Mar 28, 2019

vemv commented Mar 28, 2019 • edited Loading

benedekfazekas commented Mar 28, 2019

vemv commented Apr 9, 2019

slipset commented Dec 7, 2019

expez commented Dec 9, 2019 • edited Loading

mbuczko commented Dec 17, 2019

vemv commented Dec 18, 2019 • edited Loading

mbuczko commented Dec 23, 2019

expez commented Jun 28, 2021

vemv commented Jun 28, 2021

expez commented Jun 28, 2021 • edited Loading

vemv commented Jun 28, 2021

vemv commented Mar 28, 2019 •

edited

Loading

vemv commented Mar 28, 2019 •

edited

Loading

expez commented Dec 9, 2019 •

edited

Loading

vemv commented Dec 18, 2019 •

edited

Loading

expez commented Jun 28, 2021 •

edited

Loading