I am using core.cache and was wondering why the creation of lru cache with large threshold and no intial data was taking so long.
(require '[clojure.core.cache.wrapped :as cw])
(time (cw/lu-cache-factory {} :threshold 10000000))
The bottleneck seems to be this line where the lru list gets filled with dummy values. I looked at the LRUCache code and don't see any reason for the lru list to not grow organically and only initialize it with the base. I also tested the LRUCache with the simpler initialization function
(defn- build-leastness-queue
[base start-at]
(into (clojure.data.priority-map/priority-map) (for [[k _] base] [k start-at])))
and all LRUCache tests still pass.
The original build-leastness-queue
is also used for the LUCache and here the tests no longer pass when using the new version. While trying to also make the modified version work for the LUCache I realized that it is currently kind of broken.
(require '[clojure.core.cache :as c])
(def ca (c/lu-cache-factory {:a 1 :b 2} :threshold 2))
(-> ca
(assoc :c 3)
.lu)
;; => {:a 0, :c 1}
(-> ca
(dissoc :a)
(assoc :c 3)
.lu)
;; => {:c 0, :b 0}
So with the current implementation the initial LU priority of newly assoced (missed) item depends on whether the lu cache is currently full or not.
I signed the Clojure CA this morning and was wondering if somebody could add me to the JIRA so I can propose a patch for these issues (name is FiVo over there).
One more question concerning the semantics of the LUCache. The tests contain the comment ensure that the test result does not rely on arbitrary tie-breakers in number of hits, but it seems it relies on arbitrary tie-breakes:
(def ca (c/lu-cache-factory {:a 1} :threshold 2))
(-> ca
(assoc :b 2)
(assoc :c 3)
(assoc :d 4))
;; => {:b 2, :d 4}
Should later assoced items have a higher priority? This would make the patch a lot more complicated and I strongly favor keeping arbitrary tie-breaks.