Why do sets not implements ILookup?

Question

Why do sets not implements ILookup?

asked Nov 24, 2019 in Clojure by Reut Sharabani
edited Aug 3, 2021 by Reut Sharabani

This is to follow an answer I gave on StackOverflow (which I hope is mostly correct, but I'm not experienced enough to know for sure). More details are there than in this question.

The gist is that (:k my-set) is much slower than (my-set :k) and (:k my-map). This is very counter-intuitive since when I hold some items that I want to repeatedly query for membership I hold them in a set. It makes much more sense to hold them in a map to always be performant (map performs well in both forms of invocation).

The reason I found for the latency difference was that calling set's invoke is much faster than calling keyword's invoke which does a bunch of delegations and instanceof checks.

I was able to improve the performance of {:k my-set} by implementing ILookup using proxy:

(def uids #{:a :b :c :d :e :f :g :h :i :j :k :l :m :n :o :p :a1 :b1 :c1 :d1 :e1 :f1 :h1 :i1 :j1 :k1 :l1 :m1 :n1 :o1 :p1})
(def uids-map (into {} (for [k uids] [k k])))
(def lookupable-set (proxy [clojure.lang.APersistentSet clojure.lang.ILookup] [uids-map]
                      (valAt [k] (get uids-map k))))

;; verify
(instance? clojure.lang.APersistentSet lookupable-set) ;; true
(instance? clojure.lang.ILookup lookupable-set) ;; true

(time (dotimes [i 1000000] (:o1 uids))) ;; 134.703101 msecs
(time (dotimes [i 1000000] (:o1 lookupable-set))) ;; 63.187353 msecs  <-- faster
(time (dotimes [i 1000000] (:o1 uids-map))) ;; 35.802762 msecs <-- still fastest

I was wondering why clojure's set(s) do not implement ILookup to begin with? Isn't lookup a big part of what people expect to do with sets? They already have the appropriate functions to do so easily. What would break if they just implemented ILookup? Or is there another reason not to do so?

Thanks.

Edit:

I've re-tested with (contains? uids :o1) as well, as @alexmiller suggested in the comments and its still slower than the naive ILookup implementation:

(println "kw set")
(time (dotimes [i 1000000] (:o1 uids)))

(println "kw lookupable set")
(time (dotimes [i 1000000] (:o1 lookupable-set)))

(println "kw map")
(time (dotimes [i 1000000] (:o1 uids-map)))

(println "contains? set")
(time (dotimes [i 1000000] (contains? uids :o1)))

Which gave me::

kw set
"Elapsed time: 283.526096 msecs"

kw lookupable set
"Elapsed time: 121.766786 msecs"

kw map
"Elapsed time: 70.514017 msecs"

contains? set
"Elapsed time: 153.092212 msecs"

Map is still X2 times faster than set for lookups and implementing ILookup for sets is faster than contains?.

1 Answer

commented Dec 8, 2019 by Reut Sharabani

Why do sets not implements ILookup?

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Categories

Why do sets not implements ILookup?

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Related questions

Categories