Share your thoughts in the 2024 State of Clojure Survey!

Welcome! Please see the About page for a little more info on how this works.

+4 votes
in Compiler by
Copied from the proposal email to the Clojure/dev Google group:

  https://groups.google.com/d/msg/clojure-dev/zlMGmv60MVA/beyIRTrhAgAJ

Hi,

Currently loop/recur is limited to "single-layered" loops: loop forms can occur inside other loop forms, but there is no facility for "recuring to an outer loop".

A few years ago I posted a proposal to add support for nested loops to Clojure with a proof-of-concept patch to ClojureScript with syntax and semantics that I think suffice to make nested loops feel natural while remaining a natural extension of the core loop/recur model, with the same explicit tail recursion benefits:


(loop foo [...]    ; named loop
  ...
  (loop [...]      ; intervening loop
    ...
    (loop [...]    ; innermost loop
      ...
      (recur-to foo ...)))) ; recur to the named loop;
                            ; NB. this must be in tail position
                            ; w.r.t. *all* three loops


  https://groups.google.com/d/msg/clojure-dev/imtbY1uqpIc/8DWLw8Ymf4IJ
  https://dev.clojure.org/display/design/Named+loops+with+recur-to
  https://github.com/michalmarczyk/clojurescript/tree/recur-to
  https://github.com/michalmarczyk/clojurescript/commit/feba0a078138da08d584a67e671415fc403fa093

I have now implemented a complete patch enabling the proposed feature in Clojure (the first link is to a branch based on current master, that is, the "prepare for next development iteration" commit after 1.9.0-alpha17; the second is to the current tip of this branch for future reference):

  https://github.com/michalmarczyk/clojure/tree/recur-to

  https://github.com/michalmarczyk/clojure/commit/212ea06d21d3b336ac35480c99170e81defaf956

I also opened a ticket in JIRA so as to have a place to post the above in the form of a patch file:

  https://dev.clojure.org/jira/browse/CLJ-2235

The remainder of this email sets out the proposal in more detail, states its key properties in a somewhat rigorous form, briefly summarizes the implementation approach and discusses certain design choices made in the patch.

Overview
========

The idea is that one could write e.g.


(let [m (two-dimensional-matrix)]
  (loop iloop [i 0]           ; named loop
    (if (< i i-lim)
      (loop [j 0]             ; nested anonymous loop
        (if (< j j-lim)
          (do
            (work)
            (recur (inc j)))  ; recur to the innermost loop
          (recur-to iloop (inc i)))) ; recur to iloop
      (done))))


and, provided that each recur-to form is in tail position with respect to all its enclosing loop forms up to and including its target and the number of arguments passed to each recur-to form matches the number of loop locals of the target loop (plus one for the leading loop name argument), this should compile and behave much like nested loops in Java.

The proposed syntax is modelled on Scheme's named lets, although semantically
those are quite different - this proposal is strictly limited to expanding the loop/recur model to nested loops in a natural way. Of course named fn forms ought also to be valid recur-to targets.

Key properties of named loops and recur-to
==========================================

With the above-linked patch in place, the following rules are enforced at compilation time:

1. Each recur-to form must be in tail position with respect to all its enclosing loop forms, whether named or not, up to and including its target (which may be a named loop or fn form).

2. It is an error to specify a recur-to target which does not occur among the names of the recur-to form's enclosing loop or fn forms with respect to which it is in tail position.

3. It is not possible to recur-to across try.

4. The number of arguments passed to recur-to beyond the initial target/label argument must match the number of formal parameters of the target loop or fn form.

5. Shadowing loop names is permissible; recur-to can only target the innermost loop of the given name among those with respect to which it is in tail position. Loop locals introduced by a shadowed named loop remain visible within the shadowing loop (unless they are themselves shadowed by identically named locals).

NB. loop names are *not* locals. In particular, they neither shadow nor are shadowed by locals of the same name. This point merits a longer discussion; see the section on design choices at the end of this email.

The innermost loop or fn form can always be targeted using plain recur, whether it is named or not. Additionally (recur-to nil ...) is equivalent to (recur ...) (even when the innermost loop or fn form is actually named), and (loop nil [...] ...) is equivalent to (loop [...]).

Summary of the implementation approach
======================================

The patch modifies the handling of loop labels in the compiler and implements the few necessary tweaks to the loop macro.

It also introduces an optional name argument to the loop* special form. (It is optional primarily so as to avoid breaking any non-core macros that emit loop* directly.)

Finally, it renames the recur special form to recur*; recur and recur-to become macros defined in clojure.core. See the section on design choices below for alternative approaches.

Design choices
==============

1. During development, purely as a matter of convenience at that stage, I had a separate loop-as macro that accepted a name argument. I thought it reasonable to add the naming feature directly to loop, particularly since fn already takes an optional name. Still, loop-as is a valid alternative design.

2. Should it be desirable to avoid renaming the existing recur special form to recur* and reimplementing recur as a macro, a new recur-to special form could be added. (Alternatively, one could keep recur as it is while adding recur-to as a macro delegating to a new recur* special form.)

3. Should it be desirable to preserve the option of treating loop names as locals in the future, it would probably be preferable to make them shadow and be shadowed by locals now, as otherwise elevating them to the status of locals at a later point would be a breaking change. To give an example of why such a future change might be useful, if tail call elimination support arrives in a future JDK spec, one might consider whether it'd be useful to adopt a Scheme-like approach with the loop name treated as a local bound to a function with a single arglist corresponding to the loop locals of the named loop; the closure allocations this would entail could perhaps be optimized away if the local is never referenced.

It bears pointing out that if TCE support does materialize, it will enable a range of alternative designs. For example, Scheme-style named lets could then be introduced as a feature of the let macro. Thus it seems to me that it is reasonable to restrict loop/recur/recur-to to label+goto-style looping only, even in a hypothetical future with VM TCE support, and that there is no reason to afford local-like treatment to loop names; hence the current no-shadowing-interactions approach of the patch.

Cheers,
Michał

5 Answers

0 votes
by
_Comment made by: alexmiller_

Thanks Michał, it looks like you've done a lot of good work here. I think you've just missed the window for looking at new feature stuff for 1.9 but would like to circle back in next release on this.

It seems undesirable for recur to change from special form to macro, so would probably be better to either extend the existing form or add a recur-to special form.

Did you consider other options for specifying a name, like a keyword? Keywords don't carry the expectation of lexical shadowing you have with locals so could side-step those issues maybe? Maybe they are undesirable for other reasons.
0 votes
by
_Comment made by: michalmarczyk_

Cheers, that's good to know.

*recur-to as a separate special form*

That's fair enough re: changing recur. Here's a version of the patch that makes recur-to a new, separate special form. Note that it still shares the RecurExpr class in the compiler the way loop and let share LetExpr. This patch is meant to be applied on top of the previous one to make it clear what the delta is:

    0002-CLJ-2235-implement-recur-recur-to-as-separate-specia.patch

    https://github.com/michalmarczyk/clojure/tree/recur-to

I've also prepared a squashed version that takes the current master directly to the same state:

    0001-CLJ-2235-implement-named-loop-recur-to-separate-special-form.patch

    https://github.com/michalmarczyk/clojure/tree/recur-to-separate-special-form

Incidentally, in implementing this patch I had to revert a change that the original patch makes to clojure.pprint that I guess demonstrates why it's a better idea to go with a new, separate recur-to – I overlooked it when preparing the original posting, otherwise it probably would have tipped the scales for me. (The change is marked by a #_ comment that I forgot to remove in the original patch. The new squashed patch cleans it up automatically by not touching that file.)

*Symbols vs keywords as loop names*

I used symbols partly because fn expects the optional name argument to be a symbol if present, and so recur-to had to support symbolic names anyway (assuming here that we want to recur-to to use the same form of the recur target name that was used to introduce it, which seems reasonable); and partly just "by default", because static symbolic references generally use symbols. (clojure.core/extend uses keywords, but those aren't really static references.) Not sure if the ability to attach metadata to loop names is likely to be useful, but there's that as well.

The second part about existing usage may or may not be a major consideration, particularly since loop names are somewhat unique in that they can only be referred to by a single special form – recur-to – and are otherwise invisible in the source. This also distinguishes them from fn-introduced named recur targets which of course do double as locals.

So I suppose we could use keywords for loop names and symbols for fn names if we're happy to have metadata-less loop names. We could even allow fn forms to use keyword names if the intention is to establish a named recur target without providing a local reference to the fn instance. (That'd basically be sugar over fn + loop.)

I'll have to think some more about which approach I prefer. I still like the consistency of using symbols for recur targets everywhere (fn / loop / recur-to), but having the local/not-a-local difference be accompanied by a syntactic distinction is tempting.

In the way of some brainstorming-in-the-open, I find it interesting that by using keyword loop names now we'd keep the possibility open of adding support for symbols in the future – perhaps for those VM-TCE-based Scheme-like loop names that would provide local references to closures. Or we could make "symbol labels" a generic feature of "let-like" forms (the ones backed by LetExpr, i.e. let & loop) once TCE lands. Then we'd have consider whether it should be possible for recur-to to target such named lets… And what about plain let? Might be simpler to stick to label+goto looping in loop/recur/recur-to and Scheme-like lets supported through a separate facility (perhaps simply let), with fn something of a point of intersection of the two models (which it already is, since it does introduce recur targets even when unnamed).

As a final note, the one instance where I think the possibility of using keywords for something comparable was brought up was shorthand field access syntax – IIRC (. x :foo) / (.:foo x) was brought up as a possible syntax for what has ultimately become (. x -foo) / (.-foo x). Despite being a road not taken, I think it illustrates how one could plausibly get used to keywords/symbols in effect accessing two namespaces. (Well, in the longhand version; .:foo is technically a symbol. Using keywords for loop names would not involve affording special treatment to any class of "keyword-derived" symbols.)

In any case, I'll see about preparing a version of the patch using keywords for loop names.
0 votes
by

Comment made by: michalmarczyk

Here's a first cut at a "keyword loop names" patch:

0002-CLJ-2235-use-keywords-as-loop-names.patch

This is to be applied on top of the squashed "separate special forms" patch:

0001-CLJ-2235-implement-named-loop-recur-to-separate-special-form.patch

Also attaching squashed patch for convenience:

0001-CLJ-2235-recur-to-keyword-loop-names.patch

Here's an example of the new scheme:

`
(let [m [[[1 2 3] [4 5 6] [7 8 9]]

             [[10 11 12] [13 14 15] [16 17 18]]
             [[19 20 21] [22 23 24] [25 26 27]]]]
      ((fn iloop [i ires]
         (if (== i (count m))
           ires
           (loop :jloop [j 0 jres 0]
             (if (== j (count (get m i)))
               (recur-to iloop (inc i) (+ ires jres))
               (loop [k 0 kres 0]
                 (if (== k (count (get-in m [i j])))
                   (recur-to :jloop (inc j) (+ jres kres))
                   (recur (inc k) (+ kres (get-in m [i j k])))))))))
        0
        0))

`

Note that recur-to still uses symbols where the target is an fn form.

Also note that the approach taken in this patch has the side effect that a loop named :foo won't shadow an fn-introduced recur target named foo. If we wanted eventually to introduce non-fn-introduced recur target using symbols as names (recur-to-targetable Scheme-style let/loop forms as discussed in the previous comment), that would definitely be the way to go. If not, perhaps it's still less arbitrary than declaring that :foo shadows foo in this context?

0 votes
by
Reference: https://clojure.atlassian.net/browse/CLJ-2235 (reported by michalmarczyk)
0 votes
by

I've been experimenting with this patch and found an unfortunate incompatibility. The following fails to compile with message "No matching recur/recur-to target"

(defn example
  [x]
  (loop top []
    (case (int x)
      0 (loop [] (recur-to top)))))

Looking at Compiler.java, I believe the problem is that, in this case at least, the case expression is emitting the branch using C.EXPRESSION (the stacktrace includes CaseExpr.emitExpr), rather than propagating the context passed to CaseExpr.emit (which in this case should be C.RETURN). This context causes the recur-to to disregard the outer loop.

For a regular recur in an unpatched Compiler.java, it looks like the tail position check is done during RecurExpr.parse. It looks like CaseExpr.parse is propagating the context, so the tail position check passes. So the following compiles without an error:

(defn example2
  [x]
  (loop top []
    (case (int x)
      0 (loop [] (recur)))))
...