+2 votes
in Collections by
I couldn't find whether this was brought up earlier, but the {{empty?}} predicate appears to be broken for transient collections:


user=> (empty? (transient []))
IllegalArgumentException Don't know how to create ISeq from: clojure.lang.PersistentVector$TransientVector  clojure.lang.RT.seqFrom (RT.java:528)

user=> (empty? (transient {}))
IllegalArgumentException Don't know how to create ISeq from: clojure.lang.PersistentArrayMap$TransientArrayMap  clojure.lang.RT.seqFrom (RT.java:528)

user=> (empty? (transient #{}))
IllegalArgumentException Don't know how to create ISeq from: clojure.lang.PersistentHashSet$TransientHashSet  clojure.lang.RT.seqFrom (RT.java:528)


The workaround is to use a {{(zero? (count (transient ...)))}} check instead.

*Cause:* {{empty?}} is based on seqability, which transients don't implement.

*Proposed:* Add a branch to {{empty?}} for {{counted?}} colls. Transients implement Counted, so they gain support via this branch. Other counted colls take the same branch and get faster. The seq branch continues to work for seqs.
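
A minimal sketch of the proposed branching, assuming nothing beyond {{counted?}}, {{count}}, and {{seq}} from clojure.core. This is not the contents of clj-1872.patch; the name {{empty?*}} is hypothetical and is used only to avoid shadowing the core function.

```clojure
;; Hypothetical stand-in for the proposed empty?; not the actual patch.
(defn empty?*
  "Returns true if coll has no items. Counted colls (including transients)
  are checked via count; everything else falls back to seq."
  [coll]
  (if (counted? coll)
    (zero? (count coll))   ; counted branch: no seq is created
    (not (seq coll))))     ; seq branch: same as the existing behavior

(empty?* (transient []))      ;; => true
(empty?* (transient {:a 1}))  ;; => false
(empty?* nil)                 ;; => true
```

Because nil is not Counted, it still goes through the seq branch and is reported as empty.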

Perf test:


(def p [])
(def p1 [1])
(def t (transient []))
(def t1 (transient [1]))

;; take last time of all these
(dotimes [i 20] (time (dotimes [_ 10000] (empty? p))))
(dotimes [i 20] (time (dotimes [_ 10000] (empty? p1))))
(dotimes [i 20] (time (dotimes [_ 10000] (empty? t))))
(dotimes [i 20] (time (dotimes [_ 10000] (empty? t1))))


Results:

||coll||before||after||result||
|p|0.72 ms|0.08 ms|much faster when empty|
|p1|0.15 ms|0.13 ms|slightly faster when not empty|
|t|error|0.19 ms|no longer errors|
|t1|error|0.20 ms|no longer errors|

I'm not sure whether the doc string should be tweaked to be more generic, particularly the "same as (not (seq coll))" part, which now holds only on the Seqable path and not on the Counted one. I think the advice to use (seq coll) for seq checks is still good there.

I did a skim for other types that are Counted but not seqs/Seqable and didn't find much other than internal things like ChunkBuffer. Many are both and would thus use the counted path instead (all the persistent colls for example and any kind of IndexedSeq).

I guess another option would be just to fully switch empty? to be about (zero? (bounded-count 1 coll)) and lean on count's polymorphism completely.
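
A rough sketch of that alternative, assuming Clojure 1.9 or later (where {{bounded-count}} is available). The function name {{empty-via-bounded-count?}} is hypothetical.

```clojure
;; Hypothetical alternative: rely entirely on bounded-count.
;; bounded-count uses count for counted colls and otherwise walks
;; at most n elements of the seq, so it is safe on infinite seqs.
(defn empty-via-bounded-count?
  [coll]
  (zero? (bounded-count 1 coll)))

(empty-via-bounded-count? (transient #{}))  ;; => true
(empty-via-bounded-count? (range))          ;; => false, realizes at most one element
(empty-via-bounded-count? nil)              ;; => true
```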

*Patch:* clj-1872.patch

5 Answers

Comment made by: alexmiller

Probably similar to CLJ-700.

Comment made by: devn

As mentioned in CLJ-700, this is a different issue.

Comment made by: devn

First things first, the original description brings up (empty? (transient ())). Per the documentation at https://clojure.org/reference/transients, there is no benefit to be had for supporting transients on lists.

Current behavior for Java collections:

```
(empty? (java.util.HashMap. {}))
=> true

(empty? (java.util.HashMap. {1 2}))
=> false

(seq (java.util.HashMap. {1 2}))
=> (#object[java.util.HashMap$Node 0x4335c9c3 "1=2"])

(seq (java.util.HashMap. {}))
=> nil
```

The same behavior is true of Java arrays.

Over in CLJS-2802, the current patch's approach is to cond around the problem in empty? by explicitly checking whether it's a TransientCollection, and if so, using (zero? (count coll)) as the original description mentions as a workaround.

Currently, transient collections do not implement Iterable as the persistent ones do. If Iterable were implemented, I believe RT.seqFrom would work, and by extension, empty?.

Comment made by: alexmiller

I think there are good reasons for transient collections not to be Seqable - seqs imply caching, caching hurts perf, and the whole reason to be using transients is for batch load perf. So that seems counter-productive. Iterators are stateful and again, I suspect that is probably a bad thing to add just for the sake of checking empty?.

An explicit check for emptiness of counted? colls would cover all the transient colls and anything else counted without making a seq. That might be faster for all those cases, and doesn't require anything new of anybody in the impl.

Another option would be to have an IEmptyable interface and/or protocol to indicate explicit empty? check support. Probably overkill.
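
For illustration only, here is one way such a protocol could look. Everything in this sketch is an assumption: Clojure has no {{IEmptyable}} protocol, and the method name {{-empty?}} and the choice of extended types are hypothetical.

```clojure
;; Hypothetical protocol; nothing like this exists in clojure.core.
(defprotocol IEmptyable
  (-empty? [coll] "Returns true if coll has no items."))

(extend-protocol IEmptyable
  nil
  (-empty? [_] true)

  ;; Covers persistent and transient colls, which implement Counted.
  clojure.lang.Counted
  (-empty? [coll] (zero? (count coll)))

  ;; Fallback for anything else that seq can handle
  ;; (strings, Java collections, arrays, lazy seqs).
  Object
  (-empty? [coll] (not (seq coll))))

(-empty? (transient {}))              ;; => true
(-empty? (java.util.HashMap. {1 2}))  ;; => false
```

Protocol dispatch prefers an implementation on an interface such as Counted over the Object fallback, so transients never reach the seq path here. The extra moving parts compared with a plain {{counted?}} check are consistent with the "probably overkill" assessment.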

Reference: https://clojure.atlassian.net/browse/CLJ-1872 (reported by alex+import)