Share your thoughts in the 2024 State of Clojure Survey!

Welcome! Please see the About page for a little more info on how this works.

+7 votes
in ClojureScript by

js/BigInt supports equality testing (= (js/BigInt 4) (js/BigInt 4)) is true so I expected to be able to use it as a key in a map (I don't know what hash function is used for the lookup)
However I see some unexpected behaviour:

(get {0 0 1 1 2 2 3 3 (js/BigInt 4) 4 5 5 7 7 8 8} (js/BigInt 4))

returns 4 as expected, but

(get {0 0 1 1 2 2 3 3 (js/BigInt 4) 4 5 5 7 7 8 8 9 9} (js/BigInt 4))

returns nil

This was tested using Firefox or Safari as the javascript engine.

Is this a bug or was I just lucky that it worked for maps with less than 10 entries?

Thanks

2 Answers

+1 vote
by

js/BigInt doesn't work as hash map key.

Hash maps require keys that correctly support hashCode and equals.
(https://clojure.org/reference/data_structures#Maps)

-
> Returns the hash code of its argument. Note this is the hash code
consistent with =.
> (https://cljs.github.io/api/cljs.core/hash)

Now if I check hash of js/Number it's always the same for the same number:

(hash (js/Number 4)) ;; => 4
(hash (js/Number 4)) ;; => 4

But if I do the same for js/BigInt, it's different each time:

(hash (js/BigInt 4)) ;; => 15
(hash (js/BigInt 4)) ;; => 16

(tested using online cljs repl: http://clojurescript.net/).

This suggests that hash function doesn't work for js/BigInt and therefore when computing place in the hashmap:

A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found.
(https://en.wikipedia.org/wiki/Hash_table#Hashing)

incorrect place is computed (as based on the hash function, the new js/BigInt is different value than the js/BigInt inside the hash map).

The reason it works with some hash maps is due to different implementation used in the 2 cases. ClojureScript implements more interfaces for hash maps: https://cljs.github.io/api/cljs.core/#IMap

(type {0 0 1 1 2 2 3 3 (js/BigInt 4) 4 5 5 7 7 8 8}) ;; => cljs.core/PersistentArrayMap
(type {0 0 1 1 2 2 3 3 (js/BigInt 4) 4 5 5 7 7 8 8 9 9}) ;; => cljs.core/PersistentHashMap

You can see example when implementation is switched here: https://cljs.github.io/api/cljs.core/PersistentArrayMap based on HASHMAP-THRESHOLD property.

Some more details about why different implementations exist: https://stackoverflow.com/questions/16516176/what-is-the-difference-between-clojures-apersistentmap-implementations

So if hash map is small, it works like array and for equality hash is not used, only = is. Then, when hash map becomes bigger, first hash is used to determine place in the hash map and = is used to get the concrete value in the position.

That's more or less how it works.

by
Thanks, that explains perfectly the reason it seemed to be working for the small size test.
I do remember reading about that for Clojure at some point but had obviously forgotten.
+1 vote
by
edited by

You can make js/BigInt work with hash maps by implementing the two core protocols required for that to work:

  • cljs.core/IHash
  • cljs.core/IEquiv

Meaning:

(extend-type js/BigInt
  IHash
  (-hash [this] ... impl ...)

  IEquiv
  (-equiv [this other] ... impl ...))

(it might work fine without IEquiv, not sure)

You can do this in your code. I haven't done anything with js/BigInt so I'm unsure how the implementation would look exactly. Should be straightforward though. You can look arround in the cljs.core sources to find examples of other types implementing those.

Without these protocols implemented it uses the default implementation which assigns a unique id as the "hash" for every new instance of js/BigInt. Thats why it doesn't work with a hash map.

by
Thanks for answering. I finally had some time to check this.

I added a simple definition which converted the BigInt to a hexstring and used the string hash function and then calculated the hash of that.
 
    (extend-type js/BigInt
        IHash
        (-hash  [this] (hash (.toString this 16)))
     )

I expect that IEquiv is needed in the case that the hash value collide. I will try to check this later.

I tested the implementation by porting my simple solution to the Project Euler #14 (Collatz problem) to Clojurescript and it generated the correct answer using a hashmap with 2168611 entries. Note, this problem doesn't actually need BigInt.
by
In my testing, using `(.toString this 32)` is faster, probably because it requires allocating shorter strings.

Also in my testing, IEquiv is not necessary.

(extend-protocol IHash
  js/BigInt
  (-hash [this]
    (hash (.toString this 32))))
...