+1 vote

When str is called on a collection or sequence, Clojure returns an EDN-like string representation, similar to the result of pr-str:

(str (vector 1 2 3))
"[1 2 3]"

(str '(1 2 3))
"(1 2 3)"

(str (seq [1 2 3]))
"(1 2 3)"

(str {:a 1 :b 2})
"{:a 1, :b 2}"

(str #{1 2})
"#{1 2}"

But when used on a lazy-seq, str prints the type of the lazy-seq followed by the hash of its realized values:

(str (lazy-seq [1 2 3]))
"clojure.lang.LazySeq@7861"

Not only is this arguably useless for most use cases (most people would probably rather it stringified the same way a seq does), it also forces the lazy-seq to be realized:

(let [ls (map inc [1 2 3])]
  (realized? ls))
;;=> false
(let [ls (map inc [1 2 3])]
  (str ls)
  (realized? ls))
;;=> true

Similarly, when used on an eduction, str prints the type of the eduction, this time followed by what appears to be its memory location:

(str (eduction identity [1 2 3]))

Unlike for lazy-seq, it will not "realize" the eduction:

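(str (eduction (fn [e] (println e) (identity e)) [1 2 3]))
;; prints nothing: the function is never called

(str (map (fn [e] (println e) (identity e)) [1 2 3]))
;; prints 1, 2 and 3: the whole lazy-seq is realized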

It seems like the current behavior for lazy-seq and eduction is either inconsistent or wrong.

Ideally, it would behave in one of two ways:

  1. str would cause neither a lazy-seq nor an eduction to realize itself, and both would stringify as only their type. The idea is that str never realizes "pending" computation, so it would be safe to use on infinite lazy-seqs or repeatedly on eductions.
  2. Or str would always realize the "pending" computation, so a lazy-seq and an eduction would both stringify the same way other collections and sequences do, similar to calling pr-str on them.
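
In the meantime, a caller who wants something close to option 2 can ask for it explicitly. The following is only a minimal sketch, assuming pr-str is an acceptable substitute for str at the call site; bounded-str is a hypothetical helper name, and the limit of 5 elements is arbitrary:

```clojure
;; Hypothetical helper (not part of clojure.core): stringify like pr-str,
;; but cap the number of printed elements so that an infinite lazy seq
;; cannot hang the call.
(defn bounded-str [x]
  (binding [*print-length* 5]
    (pr-str x)))

(bounded-str (map inc [1 2 3]))            ;=> "(2 3 4)"
(bounded-str (eduction (map inc) [1 2 3])) ;=> "(2 3 4)"
(bounded-str (range))                      ;=> "(0 1 2 3 4 ...)"
```

Note that this still realizes up to the first few elements, so it does not give the "never realize" guarantee of option 1.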

P.S.: When making a decision, it might be relevant that ClojureScript currently behaves differently: it does option 2 for lazy-seq, and I'm not sure what it does for eduction:

cljs.user=> (str (lazy-seq [1 2 3]))
"(1 2 3)"

cljs.user=> (str (eduction identity [1 2 3]))
"[object Object]"

P.S.2: The behavior also appears to differ when a lazy-seq or an eduction is nested inside another collection or sequence. In that case they are stringified in an EDN-like fashion:

(str [1 (map identity [2 3]) 4])
"[1 (2 3) 4]"

(str (seq [1 (map identity [2 3]) 4]))
"(1 (2 3) 4)"

(str [1 (eduction identity [2 3]) 4])
"[1 (2 3) 4]"

(str (seq [1 (eduction identity [2 3]) 4]))
"(1 (2 3) 4)"

This seems to be the case in ClojureScript as well.
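
A quick check on JVM Clojure suggests why: str on a collection appears to go through the same printer that pr-str uses, so nested elements are printed rather than passed to str one by one. This sketch assumes default printer settings and has not been checked against ClojureScript:

```clojure
;; With default printer settings, str and pr-str agree for a vector,
;; including the nested lazy-seq.
(let [v [1 (map identity [2 3]) 4]]
  (= (str v) (pr-str v)))
;=> true

;; The same effect is visible with a nested string, which keeps its quotes.
(str ["a"])
;=> "[\"a\"]"
```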

3 Answers

0 votes

Your sentence near the end says "in that it does option 1 for lazy-seq". Did you mean option 2, which is what it appears to be from the example ClojureScript REPL interaction?

Yes, good catch, I corrected it to say option 2.
0 votes

Minor comments (i.e. ones that do not affect the substance of the requested changes in behavior at all, merely notes on current implementation details): all of the hex numbers printed in your output examples are the return value of the Java hashCode method, in hex. In general that may or may not equal the return value of clojure.core/hash, depending on the class of the object and whether it defines hasheq differently from hashCode.

For the lazy seq examples you give, the value is immutable, so the hashCode method return value is a pure function of the immutable collection contents.

For the eduction printed value, it is still the return value of hashCode, but for this class the return value of hashCode is the default Java identityHashCode value defined for the class java.lang.Object, and is not overridden: https://docs.oracle.com/javase/7/docs/api/java/lang/System.html#identityHashCode(java.lang.Object)
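
These claims can be checked at a REPL. The sketch below rebuilds the "type@hex" string by hand and compares it with what str returns; type-and-hash is a hypothetical helper name introduced only for this illustration:

```clojure
;; Hypothetical helper: rebuild the "ClassName@hexHashCode" format by hand.
(defn type-and-hash [^Object x]
  (str (.getName (class x)) "@" (Integer/toHexString (.hashCode x))))

(let [a (lazy-seq [1 2 3])
      b (map identity '(1 2 3))
      e (eduction identity [1 2 3])]
  {;; str on both types is the class name plus the hex hashCode
   :lazy-seq-format?         (= (str a) (type-and-hash a))
   :eduction-format?         (= (str e) (type-and-hash e))
   ;; lazy seqs with equal contents share a hashCode
   :same-contents-same-hash? (= (.hashCode ^Object a) (.hashCode ^Object b))
   ;; the eduction falls back to the identity hash code
   :eduction-identity-hash?  (= (.hashCode ^Object e) (System/identityHashCode e))})
;=> every value in the map is true
```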

I see, good to know. So in a way they both stringify as type + hashCode, but their hashCode implementations differ. Maybe it is the lazy-seq implementation of hashCode that also forces realization?
Yes, the `hashCode` method called on a lazy sequence forces the entire sequence to be realized and calls `hashCode` on every element, similar to what `clojure.core/hash` does. The two just calculate different values for most Clojure collections: Clojure 1.6.0 improved the variety of values returned by `clojure.core/hash` for several collection types, whereas `hashCode` gives a fairly low variety of values for some kinds of collections (e.g. 2-element or 3-element vectors/sequences of integers).
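
A small sketch of that point, calling `hashCode` directly so that `str` is not involved at all:

```clojure
(let [ls     (map inc [1 2 3])
      before (realized? ls)
      _      (.hashCode ^Object ls) ; the only operation performed on ls
      after  (realized? ls)]
  [before after])
;=> [false true]
```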
A tiny note:  `realized?` reports on what you might think of as the head element of a lazy seq - not the whole thing.  That's because there is, of course, no reason why the tail of a lazy seq should also be lazy.

    (let [a (map inc (range))]
      [(realized? a)
       (first a)
       (realized? a)
       (realized? (rest a))])
    ;;=> [false 1 true false]
0 votes