
+13 votes
in Multimethods by

An easy way to reproduce:

Clojure 1.10.2
user=> (defmulti f identity)
user=> (defmethod f :default [x] nil)
#object[clojure.lang.MultiFn 0x4bb8855f "clojure.lang.MultiFn@4bb8855f"]
user=> (dotimes [_ 100000] (f (byte-array 1000000)))

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "main"

If you adjust the numbers so that there's no OOM error, you'll still see a persistent memory leak.

The culprit seems to be MultiFn.findAndCacheBestMethod, which caches every dispatch value that falls through to the :default method.
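One workaround is to normalize unexpected dispatch values to a single sentinel before they reach the cache, so the cache stays bounded. This is only a sketch — `known?` and `::other` are illustrative names, not anything built into clojure.core:

```clojure
;; Sketch: bound the dispatch-value space by collapsing everything
;; unrecognized into one sentinel keyword before caching happens.
(def known? #{:small :large})

(defmulti g (fn [x] (if (known? x) x ::other)))
(defmethod g :small  [_] "small")
(defmethod g ::other [_] "fallback")

;; Every unrecognized value now caches under the single key ::other,
;; instead of adding a fresh cache entry per distinct input:
(g :small)   ;=> "small"
(g 12345)    ;=> "fallback"
(g "junk")   ;=> "fallback"
```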

1 Answer

0 votes

I don't know if this is a leak so much as a design assumption that the number of dispatch values is bounded.

At one time we did not cache dispatch values that resolved to the :default case, and this made :default dispatch very slow (this was the primary reason multimethods were considered "slow" for a long time). In 1.8, http://dev.clojure.org/jira/browse/CLJ-1429 changed this, and it was a substantial performance boost for a lot of real programs.

Maybe there is some middle ground, not sure.

Is using protocols a better alternative for avoiding cache overflow in this scenario? Or maybe just implement a `case` manually and avoid multimethods and protocols altogether?
My concern, rather than accidental leaks due to a developer mistake, is that this could cause real issues if (potentially malicious, or highly random) user input is allowed to hit a multimethod.
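For comparison, here is a minimal protocol sketch (the `Describe` protocol is a made-up example) showing why protocols sidestep this: dispatch is on the argument's class, so the caches are bounded by the number of types seen, not by the number of distinct values:

```clojure
;; Protocols dispatch on type. Calling describe with a million
;; different strings still only ever caches one entry: String.
(defprotocol Describe
  (describe [x]))

(extend-protocol Describe
  String
  (describe [s] (str "string of length " (count s)))
  Object
  (describe [_] "something else"))

(describe "hi") ;=> "string of length 2"
(describe 42)   ;=> "something else"
```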

I am sure most Clojure programmers are very careful, and if it were made clear that this caching was going on, one would think twice about having a :default method, or find another way to close off the inputs to the multimethod - but it's not clear. I've been using Clojure professionally for many years and I didn't know :default matches were cached.

E.g. memoize is going to cause the same kind of issue, but I think it's clearer in that case (you might say obvious), whereas with multimethods it's a little too subtle for me.

I do not know if there is a solution; of course it is important to be fast, and almost all of the time I'm sure it's fine.
Supposing it were possible, would a defmulti option to use an LRU or other cache from clojure.core.cache resolve the problem?  (Was the "slow" un-cached multimethod lookup slower than the overhead of a safely-bounded cache?)
It's important to note that this caching occurs on the output of the dispatch function, not on the inputs to the multimethod or dispatch function. The most common dispatch functions are things like `type`, `class`, or a keyword used to extract a field from a map. While those all have unbounded ranges, they are generally much more bounded than the domains. I do not recall anyone else reporting this in the 5+ years since it was changed. Protocols also use caches (per call-site), and in fact (other than the ns/var tables) these are the two primary stateful things inside the Clojure runtime.

There are potentially a lot of controls you could add to this, very similar to memoize. In the latter case, we pushed that out to core.memoize/core.cache and kept the simpler version in core. I'm not arguing for/against any of that, would need some evaluation. Created https://clojure.atlassian.net/browse/CLJ-2626
It totally makes sense that there shouldn't be too many different dispatch values. But sometimes the dispatch value is not what the developer has in mind.

I encountered a memory leak by accident in production when using a keyword as a dispatch function. My intent was to dispatch on the value associated with a field in the first arg, but by mistake, when the key was not found, the keyword (called as a function) returned the second arg of the multimethod — its not-found default — as the dispatch value.

Here is a simple reproduction:

(defmulti foo :type)
(defmethod foo :default [_ _] nil)
(foo {} (Math/random))

My intent was to have:
(defmulti foo (fn [x y]
                (:type x)))
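The mechanism behind the bug is the optional not-found arity of keywords, which a quick REPL check makes clear (the values here are just illustrative):

```clojure
;; A keyword called with two args returns the second arg when the
;; key is absent from the map:
(:type {:type :a} :missing) ;=> :a
(:type {} :missing)         ;=> :missing

;; So in (foo {} (Math/random)), the dispatch value was the random
;; number itself - a fresh multimethod cache entry on every call.
```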

Notice that my buggy code didn't cause any functional misbehaviour, only a memory leak due to the caching mechanism of the multimethod.
Even though the caching applies to the results of dispatch, those results can be derived from input from the outside world or other programs, and cannot be assumed to be bounded unless the programmer has made sure that is the case. I do think this is rather different from the protocol caches, due to type vs value. Imagine a multimethod that dispatches on count: you could specialize for 0, 1, and 2, and then leave :default to handle other cases. Or consider a :type field whose value is selected on an HTML form, where :default is a fallback for invalid selections.
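The count-based example above can be sketched directly (the names are illustrative):

```clojure
;; Dispatching on count: a few specialized cases plus a :default
;; that still caches every distinct count it ever sees.
(defmulti describe-size count)
(defmethod describe-size 0 [_] "empty")
(defmethod describe-size 1 [_] "single")
(defmethod describe-size :default [_] "many")

(describe-size [])         ;=> "empty"
(describe-size [1])        ;=> "single"
(describe-size (range 50)) ;=> "many"
;; 50 is now a cached dispatch value; feed this arbitrary-length
;; collections and the cache grows without bound.
```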
Just to add another case to the list of accidental problems that can trip people up: a memory leak appeared in a service recently that was caused by upgrading some dependencies. The problem combination was plumatic schema + ring-swagger + compojure-api. Schema changed to return an anonymous fn in this commit: https://github.com/plumatic/schema/commit/2191f9e2982da74410c14686ca6c3436e802afc0 and ring-swagger uses `identity` as a dispatch fn: https://github.com/metosin/ring-swagger/blob/0ef9046174dec0ebd4cbf97d6cc5c6846ef11996/src/ring/swagger/coerce.clj#L178 — so when using a map in the query params in a compojure-api call, it eventually bottoms out at putting an anonymous fn into the method cache for `ring.swagger.coerce/custom-matcher` (and `time-matcher`) on every request. We solved our problem by downgrading schema and increasing the priority on migrating to spec/malli, but I just wanted to add a note that there are some libs in the wild that use `identity` as a multimethod dispatch fn, which seems like a red flag we should be aware of.
I was not aware of this and the ring-swagger thing is my doing. Happy to take a fix there.