This describes one example of poor performance, discovered by Michał Marczyk in the discussion thread linked below.
The difference between the compiled versions of:
(defn foo [x]
  (if (> x 0)
    (inc x)
    (locking o
      (dec x))))
and
(defn bar [x]
  (if (> x 0)
    (inc x)
    (let [res (locking o
                (dec x))]
      res)))
is quite significant. foo gets compiled to a single class, with invocations handled by a single invoke method. bar gets compiled to a class for bar plus an extra class for an inner function that handles the (locking o (dec x)) part, probably very similar to the output for the version with the hand-coded locking part (although I haven't really looked at that yet). The inner function is a closure, so calling it involves allocating a closure object; its constructor receives the closed-over locals as arguments and stores them in two fields (lockee and x). They are then loaded from those fields in the body of the closure's invoke method, and so on.
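To make the shape of the extra class concrete, here is a rough hand-written approximation of bar with the inner function spelled out. It is only a sketch of the behaviour described above, not the compiler's actual output: the name bar-expanded is hypothetical, and o is assumed to be an arbitrary lock object, since the original snippets do not define it.

```clojure
;; Assumption: any object can serve as the lock; the original snippets
;; refer to an `o` that is defined elsewhere.
(def o (Object.))

;; Hypothetical name. The locking form is wrapped in an anonymous
;; function that is invoked immediately, so each call that takes this
;; branch allocates a closure object capturing x.
(defn bar-expanded [x]
  (if (> x 0)
    (inc x)
    (let [res ((fn [] (locking o (dec x))))]
      res)))

(bar-expanded 1) ; => 2
(bar-expanded 0) ; => -1
```

Timing bar-expanded against foo on non-positive inputs should give a rough idea of the cost of the extra allocation and field loads, assuming the JIT does not eliminate them.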
Note: The summary line may be too narrow a description of the root cause; it may simply describe the first case where this issue was noticed and examined. Please make the summary and this description more accurate if you diagnose this issue.
See discussion thread on Clojure group here:
https://groups.google.com/forum/#!topic/clojure/x86VygZYf4Y