I think auto-gensym (using the # syntax) can shadow itself when used recursively, which seems like a bug. jumar confirmed that this happens in Clojure 1.10.3, 1.6.0, 1.3.0, and 1.0.0.

(defmacro recursive-macro-1 [[x & more :as xs] acc]
  (if (seq xs)
    `(let [x# ~x]
       (recursive-macro-1 ~more (conj ~acc x#)))
    ;; base case: nothing left to bind, so emit the accumulated form
    acc))

(defmacro recursive-macro-2 [[x & more :as xs] acc]
  (if (seq xs)
    (let [gx (gensym 'x)]
      `(let [~gx ~x]
         (recursive-macro-2 ~more (conj ~acc ~gx))))
    ;; base case: nothing left to bind, so emit the accumulated form
    acc))

(recursive-macro-1 [1 2] []) ;; => [2 2]
(recursive-macro-2 [1 2] []) ;; => [1 2]

I would expect recursive-macro-1 to produce the same results as recursive-macro-2 but you can see, from running clojure.walk/macroexpand-all on these forms, that the shadowing prevents this:

;; auto-gensym - the name of the local binding is the same in the nested let*
(let* [x__10885__auto__ 1]
  (let* [x__10885__auto__ 2]
    (conj (conj [] x__10885__auto__) x__10885__auto__)))

;; gensym
(let* [x11095 1]
  (let* [x11096 2] 
    (conj (conj [] x11095) x11096)))
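
The shadowing can also be checked programmatically rather than by eye, by collecting the symbols bound by each let* in the expansion. This is only a minimal sketch: `let-binders` is a hypothetical helper (not part of clojure.walk), and it looks at let* forms only.

```clojure
(require '[clojure.walk :as walk])

(defmacro recursive-macro-1 [[x & more :as xs] acc]
  (if (seq xs)
    `(let [x# ~x]
       (recursive-macro-1 ~more (conj ~acc x#)))
    acc))

(defn let-binders
  "Hypothetical helper: collect the symbols bound by every let*
  in a fully macroexpanded form."
  [form]
  (->> (tree-seq coll? seq form)
       (filter #(and (seq? %) (= 'let* (first %))))
       (mapcat #(take-nth 2 (second %)))))

(def expanded
  (walk/macroexpand-all '(recursive-macro-1 [1 2] [])))

;; Both let* forms bind the same auto-gensym name, so the
;; binders are not distinct.
(apply distinct? (let-binders expanded)) ;; => false
```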

1 Answer

Best answer

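Auto-gensym generates its symbol at read time, not at macroexpand time, so every expansion of the macro reuses the symbol that was created when the defmacro form was read.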
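
A small sketch of the difference, assuming a fresh REPL. The two macro names are made up for illustration; each macro simply returns the symbol it generated.

```clojure
;; x# is resolved once, when the reader processes this defmacro form.
(defmacro auto-gensym-name []
  `(quote x#))

;; gensym is called each time the macro is expanded.
(defmacro manual-gensym-name []
  `(quote ~(gensym "x")))

(= (auto-gensym-name) (auto-gensym-name))     ;; => true
(= (manual-gensym-name) (manual-gensym-name)) ;; => false

;; Reading the same text twice yields two different auto-gensym names.
(= (read-string "`x#") (read-string "`x#"))   ;; => false
```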

For auto-gensym to generate a fresh symbol at macroexpand time, the syntax-quote reader could be altered to produce a let binding.

user=> (read-string "`(x#)")
(clojure.core/seq (clojure.core/concat (clojure.core/list (quote x__9__auto__))))

would become:

user=> (read-string "`(x#)")
(clojure.core/let [foo (gensym 'x)] (clojure.core/seq (clojure.core/concat (clojure.core/list foo))))

A consequence of that change is that a macro like clojure.core/and (picked at random as a commonly used macro that relies on auto-gensym) would generate new symbols every time it is expanded, instead of once when its implementation is read. That might be fine? Gensym should be fast, the new symbols are used during compilation and then forgotten, and the counter behind gensym names can count very high.
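
Until the reader changes, the same effect can be had by hand. The sketch below uses a `with-gensyms` helper, a common macro-writing idiom that is not part of clojure.core; both it and `recursive-macro-3` are illustrative names, not anything proposed in this thread.

```clojure
(defmacro with-gensyms
  "Illustrative helper: bind each name to a fresh gensym while the
  body (normally a syntax-quoted template) is evaluated."
  [names & body]
  `(let [~@(mapcat (fn [n] [n `(gensym ~(str n))]) names)]
     ~@body))

(defmacro recursive-macro-3 [[x & more :as xs] acc]
  (if (seq xs)
    ;; gx is generated at macroexpand time, once per expansion
    (with-gensyms [gx]
      `(let [~gx ~x]
         (recursive-macro-3 ~more (conj ~acc ~gx))))
    acc))

(recursive-macro-3 [1 2 3] []) ;; => [1 2 3]
```

The cost is the one described above: a gensym call on every expansion instead of one at read time.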

Thanks - I'd noticed this may be the case when running my forms through `clojure.tools.reader/read-string`. I'm sure I'm overlooking something, but couldn't the macroexpansion re-run the forms through the reader each time (thereby incrementing the gensym id counter)? Either way, your suggestion above looks very sensible. Is this something that I (or someone) can raise as a Clojure JIRA issue, or would it need some sort of assent from the core team here beforehand?
I think a well-stated ticket would be worthwhile. It's unclear where it would fall in the priorities but having a clear problem statement and alternative approach ideas would help us to determine that more quickly. Thanks!
@Fogus I discovered, from my recent question in clojurians #clojure-dev, that I can't create a JIRA ticket because I'm not a contributor, but I do agree it's worthwhile.
I ran into this problem for exactly the instance of `clojure.core/and`. It was always using the gensym `and__5579__auto__`. This created a problem when transpiling Clojure code into Java, which does not allow shadowed local variables. I agree with Tom Dalziel that incrementing the gensym id counter would work.

@Fogus the desired behavior is that we don't reuse the same gensym symbols globally.
This sounds more like a problem with your transpiler?

If you are operating on Clojure source and compiling to Java, then since Clojure allows the same name to be bound in nested static scopes, you'll have to deal with it via something like alpha conversion.
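
As a rough illustration of what alpha conversion could look like, here is a sketch that renames every let* binder in an already macroexpanded form to a fresh symbol. `alpha-rename` is a hypothetical function written for this example. It deliberately ignores fn*, loop*, letfn*, catch, and quote, all of which a real transpiler would need to handle.

```clojure
(defn alpha-rename
  "Sketch only: give every let* binding a fresh name and rewrite
  references to it. Handles let*, symbols, seqs, and vectors."
  ([form] (alpha-rename form {}))
  ([form env]
   (cond
     ;; a reference: use the renamed symbol if one is in scope
     (symbol? form)
     (get env form form)

     (and (seq? form) (= 'let* (first form)))
     (let [[_ bindings & body] form
           [inner-env new-bindings]
           (reduce (fn [[e bs] [sym init]]
                     (let [fresh (gensym (str sym "_"))]
                       ;; init sees only the bindings made so far
                       [(assoc e sym fresh)
                        (conj bs fresh (alpha-rename init e))]))
                   [env []]
                   (partition 2 bindings))]
       (list* 'let* new-bindings
              (map #(alpha-rename % inner-env) body)))

     (seq? form)
     (apply list (map #(alpha-rename % env) form))

     (vector? form)
     (mapv #(alpha-rename % env) form)

     :else form)))

(def shadowed '(let* [x 1] (let* [x 2] (conj (conj [] x) x))))
(def renamed (alpha-rename shadowed))

;; The two binders now differ, and the meaning is unchanged.
(eval renamed) ;; => [2 2]
```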

If you are decompiling JVM bytecode to Java, the JVM doesn't actually name locals at all; they just get numbered slots. The names are stored as optional debug information, and there is no requirement for uniqueness: https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.7.13

So in either case, to faithfully capture the semantics (Clojure->Java, JVM bytecode->Java), you need to be able to deal with possibly duplicate names.