

I can't get nth to work with a long value as the index:

(nth (cycle [:a]) 12321123212397) => Execution error (ArithmeticException) at java.lang.Math/toIntExact (Math.java:1371). integer overflow

The same happens if I explicitly coerce the index to a long:

(nth (cycle [:a]) (long 12321123212397)) => Execution error (ArithmeticException) at java.lang.Math/toIntExact (Math.java:1371). integer overflow

Any advice is appreciated. Thank you!

1 Answer

Best answer

The problem isn't long vs. int types: integer literals in Clojure are read as longs by default. It's the range of the value, which doesn't fit into a Java int (maximum 2147483647).
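A minimal sketch of that distinction, assuming a stock Clojure REPL. The name `idx` is only for illustration, and `Math/toIntExact` is called directly here because it is the method named in the stack trace from the question:

```clojure
;; The index from the question; the name idx is illustrative.
(def idx 12321123212397)

;; The literal is already a java.lang.Long, so coercing with (long ...) changes nothing.
(class idx)                   ;; java.lang.Long

;; The value is simply larger than the biggest Java int.
(> idx Integer/MAX_VALUE)     ;; true

;; Narrowing it to an int fails with the message seen in the question.
(def overflow-message
  (try
    (Math/toIntExact idx)
    (catch ArithmeticException e
      (.getMessage e))))      ;; "integer overflow"
```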

I don't know for certain, but I can speculate that the limit is there for practical reasons.
Strings and arrays in Java have a size limit that fits into an int. The generic List interface can also be indexed only with an int.

The same goes for indexed Clojure collections. Clojure's persistent vectors could in theory use longs as the base index type, since they are built from arrays of size 32, but it wouldn't be very practical: even a plain array of Integer/MAX_VALUE 8-byte values takes about 17 GB of RAM, and persistent vectors are not plain arrays.
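The memory estimate can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes 8 bytes per element and ignores object headers and any other overhead, and the var names are illustrative.

```clojure
;; Assumption: each element occupies 8 bytes (e.g. a primitive long).
(def bytes-per-element 8)

;; An array holding Integer/MAX_VALUE such elements.
(def max-int-array-bytes (* bytes-per-element Integer/MAX_VALUE))
max-int-array-bytes                 ;; 17179869176
(/ max-int-array-bytes 1e9)         ;; about 17.18 (GB)

;; The same arithmetic for the index used in the question.
(def question-index-bytes (* bytes-per-element 12321123212397))
question-index-bytes                ;; 98568985699176
(/ question-index-bytes 1e12)       ;; about 98.57 (TB)
```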

As for anything lazy or without random access, using nth with a high index there would be a very time-consuming anti-pattern, because every preceding element has to be walked first. That's not to say that it's never needed, but needing it is a strong sign that the wrong data structure has been chosen for the problem at hand.
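For the specific snippet in the question, where the sequence is a cycle over a finite vector, one possible way around the walk is to index into the underlying vector modulo its length. This is a sketch under that assumption only; `cycle-nth` is a hypothetical helper name, not a function from clojure.core, and it expects a non-empty vector.

```clojure
(defn cycle-nth
  "Hypothetical helper: the element at index i of (cycle v),
  computed without realizing the lazy sequence.
  Assumes v is a non-empty vector and i is non-negative."
  [v i]
  (nth v (mod i (count v))))

;; The index from the question now works, because only the reduced
;; index (always smaller than (count v)) reaches nth.
(cycle-nth [:a] 12321123212397)        ;; :a

;; Sanity check against the lazy version on a small index.
(= (cycle-nth [:a :b :c] 10)
   (nth (cycle [:a :b :c]) 10))        ;; true
```

The lookup costs the same regardless of how large the index is, whereas nth on the lazy cycle has to step through every element before the requested one.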

by
Thank you for your answer. I'm working in data science, and we do have very large collections of data taking up to several GB of RAM.
I understand the reasoning, though, and I will look for another approach.
by
A common approach for large data blobs is to use dedicated libraries instead of generic data structures, e.g. https://github.com/scicloj/tablecloth. And the long value that you posted in the OP is not several GB: at 8 bytes per element it's about 100 TB. :)
by
I will have a look at tablecloth. As for the long value, it's just one I made up for the code snippet ;) Thanks again!