Welcome! Please see the About page for a little more info on how this works.

+2 votes
in Transducers by

I've seen this bug report before: https://clojure.atlassian.net/browse/CLJ-1569.
Now, while trying to implement Clojure-like transducers in another language for educational purposes, I've realized how many things are affected by this. So some thoughts first...
Performing initialization lazily with the current implementation in general case requires extra checks in the step function and in the completion function as well in case of empty input. So it's not like there are no workarounds, but this is very unintuitive, especially with all standard transducers still having nullary versions that are never called (so in terms of implementation it's skipping a step in the process, but in terms of logic it's adding an implicit step of dropping a transformed initial value and replacing it with an unrelated value).
Before any further examples, I should note that alt-transduce in the linked issue doesn't check if the initial value is already reduced after initialization process, which I believe it should do because reduce itself doesn't do that. So my updated alt-reduce is

(defn alt-transduce
  ([xform f coll]
     (let [rf (xform f)
           result (rf)]
       (rf (if (reduced? result)
             (unreduced result)
             (reduce rf (rf) coll)))))
  ([xform f init coll]
     (let [rf (xform
               (fn
                 ([] init)
                 ([result] (f result))
                 ([result input] (f result input))))
           result (rf)]
       (rf (if (reduced? result)
             (unreduced result)
             (reduce rf (rf) coll))))))

Speaking of affected laziness and extra checks, examples are already there in the standard library. take is defined as

(defn take
...
  ([n]
     (fn [rf]
       (let [nv (volatile! n)]
         (fn
           ([] (rf))
           ([result] (rf result))
           ([result input]
              (let [n @nv
                    nn (vswap! nv dec)
                    result (if (pos? n)
                             (rf result input)
                             result)]
                (if (not (pos? nn))
                  (ensure-reduced result)
                  result)))))))
...

First, take 0 still has to consume one item because it's the first time it has control. Second, even though I don't know why it's implemented in this somewhat complicated way, there's only one value to keep track of and yet two comparisons, apparently, just because it has to handle n = 0 when it receives its extra input. With a functioning initialization stage, it can be implemented like this:

    (defn alt-take
  ([n]
     (fn [rf]
       (let [nv (volatile! n)]
         (fn
           ([]
              (let [result (rf)]
                (if (not (pos? n))
                  (ensure-reduced result)
                  result)))
           ([result] (rf result))
           ([result input]
              (let [nn (vswap! nv dec)
                    result (rf result input)]
                (if (not (pos? nn))
                  (ensure-reduced result)
                  result))))))))

(Close to original, can be reordered if necessary, the point is that the two comparisons are in the places where they naturally should be.)
If this issue isn't fixed yet, there must be some strong reasons for that? It looks like this questioned never was answered, I guess my question is the same at this point: https://clojure.atlassian.net/browse/CLJ-1569?focusedCommentId=18296.

I’d appreciate any further insight you can offer on why this design choice has been taken.

1 Answer

0 votes
by

I'm not sure if you're on Slack but there was a really good answer to this in a thread about six months ago: https://clojurians.slack.com/archives/C053AK3F9/p1657570994323509?thread_ts=1657568229.677049&cid=C053AK3F9

The TL;DR is that "you can use a transducer to build a reducing function, so that's why transducers must support the init arity" with the following example give:

(def useless-xform
  (fn [rf]
    (fn
      ([] (rf))
      ([result] (rf result))
      ([result input]
       (rf result input)))))

(transduce identity
           (useless-xform +)
           [1 2])
;; 3

This is the most plausible explanation I've read so far on the otherwise puzzling init arity of transducers "not being used".

by
edited by
I'm not on Slack and I'm getting an error trying to sign up there.
However, that answer seems to answer a totally different question. Again, the question essentially is: why was transduce deliberately made to ignore transforming arity 0? There have been examples showing why it should do it, but examples of why it shouldn't remain to be seen.
I've already shown an example of a useful transducer with zero arity (take) and mentioned another one that should be possible to create (prepending items to a sequence). Building a reducing function is the whole purpose of transducers and, as the meaning or even the type of the accumulator changes, this process requires two additional functions (arities) for starting and finishing the process. Both initializing (nullary) and finalizing (unary) functions are not only useful here, but required. But the current implementation of transduce fails to handle one of the two. I can show and example with + as well, although arithmetic examples may be less clear.
Do you see the big difference in meaning between
(transduce identity (xform +) 10 [1 2])
and
(transduce xform + 10 [1 2])
Providing an initial value is supposed to replace arity 0 in the same way as `completing` replaces arity 1. But it currently replaces the transformed (!) accumulator instead. It can be demonstrated with
(defn reinit-xform [init]
  (fn [rf]
    (fn
      ([] (rf) init) ; <-- !!!
      ([result] (rf result))
      ([result input]
       (rf result input)))))
(transduce identity ((reinit-xform 100) +) 10 [1 2])
(transduce (reinit-xform 100) + 10 [1 2])
The version with identity correctly returns 13 as the initial value is replaced by 10. The version with reinit outside of an untransformed (as seen by transduce) reduction function is supposed to return 103 as the initial value should be transformed by xform and this is what the suggested alt-reduce (in the original issue report or in my post, no difference in this context) returns, but not the current version of transduce.
by
You asked "Why doesn't transduce call initializers?" and I showed you an example where it does, so I'm not sure how to respond to this...?
by
The question in the header may be ambiguous on its own, but the whole explanation is a part of the question, otherwise it wouldn't be there. Sorry if it's too long, but previous attempts seem to have been ineffective, so I felt that more examples were necessary.
Of course, if the transformation is just identity, transduce becomes reduce. A counterexample is enough to disprove something, but a working example doesn't prove anything except that it happens to work in this specific case.
...