There is a PermGen memory leak that we have tracked down to protocol methods and multimethods called inside an {{eval}}, because of the caches these methods use. The problem only arises when the value being cached is an instance of a class (such as a function or reify) that was defined inside the {{eval}}. Thus extending {{IFn}} or dispatching a multimethod on an {{IFn}} are likely triggers.
*Reproducing:* The easiest way I have found to test this is to set "{{-XX:MaxPermSize}}" to a reasonably small value (e.g. 32M) so that the PermGen space fills up quickly, and to use "{{-XX:+TraceClassLoading}}" and "{{-XX:+TraceClassUnloading}}" to see the classes being loaded and unloaded.
{code:title=leiningen project.clj}
(defproject permgen-scratch "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.5.0-RC1"]]
  :jvm-opts ["-XX:MaxPermSize=32M"
             "-XX:+TraceClassLoading"
             "-XX:+TraceClassUnloading"])
{code}
You can use {{lein swank 45678}} and connect with slime in emacs via {{M-x slime-connect}}.
To monitor the PermGen usage, you can find the Java process to watch with "{{jps -lmvV}}" and then run "{{jstat -gcold +_<PROCESS_ID>_+ 1s}}". According to [the jstat docs|http://docs.oracle.com/javase/7/docs/technotes/tools/share/jstat.html#gcold_option], the first column (PC) is the "Current permanent space capacity (KB)" and the second column (PU) is the "Permanent space utilization (KB)". VisualVM is also a nice tool for monitoring this.
h2. Multimethod leak
Evaluating the following code will run a loop that eval's {{(take* (fn foo []))}}.
{code:title=multimethod leak}
(defmulti take* (fn [a] (type a)))

(defmethod take* clojure.lang.Fn
  [a]
  '())

(def stop (atom false))
(def sleep-duration (atom 1000))

(defn run-loop []
  (when-not @stop
    (eval '(take* (fn foo [])))
    (Thread/sleep @sleep-duration)
    (recur)))

(future (run-loop))
(reset! sleep-duration 0)
{code}
In the {{lein swank}} session, you will see many lines like below listing the classes being created and loaded.
{code}
[Loaded user$eval15802$foo__15803 from __JVM_DefineClass__]
[Loaded user$eval15802 from __JVM_DefineClass__]
{code}
These lines will stop once the PermGen space fills up.
In the jstat monitoring, you'll see the amount of used PermGen space (PU) increase to the max and stay there.
{code}
     PC       PU        OC       OU  YGC  FGC    FGCT     GCT
31616.0  31552.7  365952.0      0.0    4    0   0.000   0.129
32000.0  31914.0  365952.0      0.0    4    0   0.000   0.129
32768.0  32635.5  365952.0      0.0    4    0   0.000   0.129
32768.0  32767.6  365952.0   1872.0    5    1   0.000   0.177
32768.0  32108.2  291008.0  23681.8    6    2   0.827   1.006
32768.0  32470.4  291008.0  23681.8    6    2   0.827   1.006
32768.0  32767.2  698880.0  24013.8    8    4   1.073   1.258
32768.0  32767.2  698880.0  24013.8    8    4   1.073   1.258
32768.0  32767.2  698880.0  24013.8    8    4   1.073   1.258
{code}
A workaround is to run {{prefer-method}} before the PermGen space is all used up, e.g.
{code}
(prefer-method take* clojure.lang.Fn java.lang.Object)
{code}
Then, when the used PermGen space is close to the max, in the {{lein swank}} session, you will see the classes created by the eval'ing being unloaded.
{code}
[Unloading class user$eval5950$foo__5951]
[Unloading class user$eval3814]
[Unloading class user$eval2902$foo__2903]
[Unloading class user$eval13414]
{code}
In the jstat monitoring, there will be a long pause when used PermGen space stays close to the max, and then it will drop down, and start increasing again when more eval'ing occurs.
{code}
     PC       PU        OC       OU  YGC  FGC    FGCT     GCT
32768.0  32767.9  159680.0  24573.4    6    2   0.167   0.391
32768.0  32767.9  159680.0  24573.4    6    2   0.167   0.391
32768.0  17891.3  283776.0  17243.9    6    2  50.589  50.813
32768.0  18254.2  283776.0  17243.9    6    2  50.589  50.813
{code}
The {{defmulti}} defines a cache that uses the dispatch values as keys. Each eval call in the loop defines a new foo class which is then added to the cache when {{take*}} is called, preventing the class from ever being GCed.
The prefer-method workaround works because it calls {{clojure.lang.MultiFn.preferMethod}}, which calls the private {{MultiFn.resetCache}} method, which completely empties the cache.
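One way to observe the cache growing and being emptied is to read the cache field reflectively. The following is only a sketch: the helper {{multifn-cache-count}} is hypothetical, and it assumes the cache lives in a {{MultiFn}} field named {{methodCache}}, which is an implementation detail that may differ between Clojure versions.

{code:title=inspecting the multimethod cache (sketch)}
(defmulti take* (fn [a] (type a)))
(defmethod take* clojure.lang.Fn [a] '())

;; Hypothetical helper: counts entries in the multimethod's dispatch cache.
;; ASSUMPTION: the cache is stored in a MultiFn field named "methodCache".
(defn multifn-cache-count [mf]
  (let [field (doto (.getDeclaredField clojure.lang.MultiFn "methodCache")
                (.setAccessible true))]
    (count (.get field mf))))

;; Each eval defines a fresh foo class and dispatches on it.
(dotimes [_ 5]
  (eval '(take* (fn foo []))))

(println "cache entries after evals:" (multifn-cache-count take*))

;; prefer-method resets the cache as a side effect.
(prefer-method take* clojure.lang.Fn java.lang.Object)
(println "cache entries after prefer-method:" (multifn-cache-count take*))
{code}

On an affected version the first count should grow with the number of evals, while the second should be back near zero.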
h2. Protocol leak
The leak with protocol methods similarly involves a cache. You see essentially the same behavior as the multimethod leak if you run the following code using protocols.
{code:title=protocol leak}
(defprotocol ITake (take* [a]))

(extend-type clojure.lang.Fn
  ITake
  (take* [this] '()))

(def stop (atom false))
(def sleep-duration (atom 1000))

(defn run-loop []
  (when-not @stop
    (eval '(take* (fn foo [])))
    (Thread/sleep @sleep-duration)
    (recur)))

(future (run-loop))
(reset! sleep-duration 0)
{code}
Again, the cache is in the {{take*}} method itself, using each new {{foo}} class as a key.
*Workaround:* Run {{-reset-methods}} on the protocol before the PermGen space is all used up, e.g.
{code}
(-reset-methods ITake)
{code}
This works because {{-reset-methods}} replaces the cache with an empty MethodImplCache.
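Until a fix is applied, the workaround could be folded into the eval loop itself so the cache is cleared at a regular interval. This is a minimal sketch, not part of the patch: the function {{run-with-resets}} and its {{iterations}} and {{reset-every}} parameters are hypothetical names introduced for illustration.

{code:title=periodic protocol cache reset (sketch)}
(defprotocol ITake (take* [a]))

(extend-type clojure.lang.Fn
  ITake
  (take* [this] '()))

;; Hypothetical helper: evals (take* (fn foo [])) `iterations` times and
;; replaces the protocol method caches after every `reset-every` calls,
;; so the eval-defined foo classes are no longer referenced by the cache.
(defn run-with-resets [iterations reset-every]
  (dotimes [i iterations]
    (eval '(take* (fn foo [])))
    (when (zero? (mod (inc i) reset-every))
      (-reset-methods ITake))))

(run-with-resets 1000 100)
{code}

A smaller {{reset-every}} keeps fewer classes pinned at once, at the cost of rebuilding the cache more often.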
*Patch:* protocol_multifn_weak_ref_cache.diff
*Screened by:*