We used seque to manage a large database migration and observed a possible memory leak that required us to restart it several times.

We iterated through hundreds of queries to extract data via reduce, each query using its own seque, but eventually ran out of memory.

I believe it could have been related to seque and its use of agents, which CLJ-1125 predicted would leak memory. We have not since tried a migration without seque for comparison; instead, I turned my attention to seque and found possibly related problems.

seque uses an agent to offer items to its buffer. Agents have a memory leak: the bindings conveyed by a send (most relevantly *agent*) remain held by the executing Thread after the action completes. So even if the seque is gc'ed, its agent persists for as long as that thread lives in a cached thread pool, and the agent usually retains realized items from the producing seq (such as the first item that failed to be offered to the buffer).
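As a minimal sketch of the retention mechanism described above, the following runs one agent action on a dedicated single-thread executor and then reads *agent* from a plain task on that same thread. The helper name agent-retained-by-thread? is hypothetical, and the single-thread executor is an assumption made only so the thread is predictable; seque itself uses the shared cached pool. If the conveyed bindings are left in place as described, the plain task should still see the agent.

```clojure
(import '(java.util.concurrent Executors ExecutorService))

(defn agent-retained-by-thread?
  "Hypothetical helper: returns true if the executor's thread still sees
  the agent through *agent* after the agent's action has finished."
  []
  (let [^ExecutorService exec (Executors/newSingleThreadExecutor)
        a (agent nil)
        ;; A plain task with no binding conveyance of its own, so it reads
        ;; whatever binding frame was left behind on the thread.
        ^Callable read-agent (fn [] *agent*)]
    (try
      ;; send-via conveys the caller's bindings, plus *agent*, to the thread.
      (send-via exec a (constantly :done))
      ;; The single-thread executor runs tasks in submission order, so this
      ;; task runs only after the agent action has completed.
      (identical? a @(.submit exec read-agent))
      (finally
        (.shutdown exec)))))

(prn (agent-retained-by-thread?))
```

With a cached thread pool the same stale binding would live until the idle thread is reused by a task that conveys new bindings, or until the thread is retired.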

I believe this demonstrates seque leaking memory (Clojure 1.12.0):

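(let [pool-size 500]
  ;; Start pool-size futures that block until the last one delivers p,
  ;; growing the cached thread pool to at least pool-size threads.
  (defn expand-thread-pool! []
    (let [p (promise)]
      (mapv deref
            (mapv #(future (if (= (dec pool-size) %) (deliver p true) @p))
                  (range pool-size))))))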

(let [_ (expand-thread-pool!) ;; increases likelihood of observing leak
      ready (promise)
      strong-ref (volatile! (Object.))
      weak-ref (java.lang.ref.WeakReference. @strong-ref)
      the-seque (volatile! (seque 1 (lazy-seq
                                      (let [s (repeat @strong-ref)]
                                        (deliver ready true)
                                        s))))]
  @ready
  (vreset! strong-ref nil)
  (vreset! the-seque nil)
  (System/gc)
  (doseq [i (range 10)
          :while (some? (.get weak-ref))]
    (prn "waiting for gc...")
    (Thread/sleep 1000)
    (System/gc))
  (prn (if (nil? (.get weak-ref))
         "garbage collection successful"
         "seque memory leak!!")))
;"waiting for gc..."
;"waiting for gc..."
;"waiting for gc..."
;"waiting for gc..."
;"waiting for gc..."
;"waiting for gc..."
;"waiting for gc..."
;"waiting for gc..."
;"waiting for gc..."
;"waiting for gc..."
;"seque memory leak!!"

This can be reproduced with agents alone. Once an agent has executed an action, a strong reference to the agent persists via *agent* in the cached thread that executed it. Here we observe that the agent is not garbage collected once it has executed an action on a cached thread pool (Clojure 1.12.0):

(let [_ (expand-thread-pool!)
      strong-ref (volatile! (agent nil))
      weak-ref (java.lang.ref.WeakReference. @strong-ref)]
  ;#_#_ ;;uncomment this and the agent is freed
  (send-off @strong-ref vector)
  (doseq [i (range 10)
          :while (not (vector? @@strong-ref))]
    (Thread/sleep 1000))
  (vreset! strong-ref nil)
  (System/gc)
  (doseq [i (range 10)
          :while (some? (.get weak-ref))]
    (prn "waiting for gc...")
    (Thread/sleep 1000)
    (System/gc))
  (prn (if (nil? (.get weak-ref))
         "garbage collection successful"
         "agent memory leak!!")))
;"waiting for gc..."
;"waiting for gc..."
;"waiting for gc..."
;"waiting for gc..."
;"waiting for gc..."
;"waiting for gc..."
;"waiting for gc..."
;"waiting for gc..."
;"waiting for gc..."
;"waiting for gc..."
;"agent memory leak!!"
Since we used reduce to entirely consume each seque, I'm more skeptical that the memory leak in `seque` caused the database migration problem.
This leak could also affect overtone's ability to GC the cycled functions in this helper once the returned fn is GC'ed: https://github.com/overtone/overtone/blob/535c9c50cb52aa275f3e2e7474842dbcfaa6eff5/src/overtone/algo/fn.clj#L27
