I find it a bit inconsistent that I can type an integer literal that is too large for a 64-bit long, and the reader will auto-promote it to a BigInt to fit, without me needing to say so with the N suffix.

But if I type a decimal literal with more digits than a double can hold, it won't auto-promote to BigDecimal unless I explicitly add an M suffix. Instead it gets rounded to double precision.

Any reason why that is?

1 Answer

Best answer

My guess is that for the integer case, it is 100% certain that the value cannot be represented in a 64-bit or smaller integer size, and there is only one big integer type that Clojure supports as the default.

For the non-integer number you mention, that could be a float with too many digits of precision, or a double with too many digits of precision, or it could be desired to be a BigDecimal type. There is no unambiguous way I can think of for a compiler to determine which one the programmer intended.
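To make the asymmetry concrete, here is a small sketch in plain Java. It uses the JDK's own parsers as a stand-in and is not the Clojure reader itself; the class name, the helper, and the sample decimal text are made up for illustration. An integer that is too big for a long fails to parse, so the overflow is unmistakable, while the same decimal text parses without complaint into three different values depending on the target type.

```java
import java.math.BigDecimal;

final class LiteralAmbiguity {
    // Returns true when the digit string fits in a signed 64-bit long.
    static boolean fitsInLong(String digits) {
        try {
            Long.parseLong(digits);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Long.MAX_VALUE fits; one past it does not, and the failure is unambiguous.
        System.out.println(fitsInLong("9223372036854775807"));
        System.out.println(fitsInLong("9223372036854775808"));

        // A hypothetical decimal text with 20 digits after the point.
        String text = "0.12345678901234567890";
        float asFloat = Float.parseFloat(text);      // silently rounds to the nearest float
        double asDouble = Double.parseDouble(text);  // silently rounds to the nearest double
        BigDecimal asBigDec = new BigDecimal(text);  // keeps every digit as written

        System.out.println(asFloat);
        System.out.println(asDouble);
        System.out.println(asBigDec);
    }
}
```

None of the three decimal parses signals an error, which is why there is no obvious trigger point for a promotion.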

Note that when written in decimal, some exact representations of IEEE 754 values have a lot more digits after the decimal point than others, so you cannot simply distinguish this by counting the number of digits after the decimal point.

As a couple of examples of the last point, note that 2 to the power -40 can be represented in IEEE 754 arithmetic using one significant bit and an exponent of -40. Writing it down exactly in decimal takes 28 significant decimal digits (after 12 leading zeros), because it is equal to (5^40) / (10^40), and 5^40 has 28 decimal digits.
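The digit count for 2^-40 can be checked on the JVM. This is a sketch in plain Java with a made-up class name; it relies on the `BigDecimal(double)` constructor, which converts the binary value exactly rather than going through a shortened decimal string.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

final class PowerOfTwoDigits {
    // Exact decimal expansion of the binary value held in a double.
    static BigDecimal exactValue(double d) {
        return new BigDecimal(d);
    }

    public static void main(String[] args) {
        // Math.scalb(1.0, -40) is exactly 2^-40: one significant bit, exponent -40.
        BigDecimal exact = exactValue(Math.scalb(1.0, -40));

        System.out.println(exact.toPlainString());
        System.out.println(exact.precision());             // 28 significant decimal digits
        System.out.println(exact.scale());                 // 40 places after the decimal point
        System.out.println(BigInteger.valueOf(5).pow(40)); // the same 28 digits, as 5^40
    }
}
```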

Conversely, 0.1 in decimal requires a forever-repeating representation to write it down in binary. That is, it is similar to trying to write 1/3 or 1/7 exactly in decimal: it cannot be done without a forever-repeating series of decimal digits after the decimal point.
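For the 0.1 case, a short Java sketch (class and method names are just for illustration) shows the value a double actually stores when you write 0.1. The binary expansion has to be cut off somewhere, so the stored value is the nearest double, which is slightly above one tenth.

```java
import java.math.BigDecimal;

final class OneTenth {
    // The exact decimal value of the double nearest to the given decimal text.
    static BigDecimal nearestDoubleTo(String decimalText) {
        return new BigDecimal(Double.parseDouble(decimalText));
    }

    public static void main(String[] args) {
        BigDecimal stored = nearestDoubleTo("0.1");
        System.out.println(stored);
        // prints 0.1000000000000000055511151231257827021181583404541015625

        // The stored value is not equal to the decimal 0.1; it is a bit larger.
        System.out.println(stored.compareTo(new BigDecimal("0.1")) > 0); // prints true
    }
}
```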
In this case though, we're talking about the literal syntax. So why would you type more precision into the number than you need? Why type those if you don't want that level of precision?

Or are you saying somehow it is not possible to know by looking at the literal if it will fit in a double or not?
It is possible by looking at a literal to know whether it will fit, with no change in its exact value whatsoever, into a double, or not. However, those values, written in decimal, vary quite widely in how many decimal digits they contain, from 1 up to, I suspect, hundreds.
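A minimal sketch of such a check, in plain Java with hypothetical names (`isExactDouble`, `exactDigits`): parse the text both ways and compare the exact values. The second helper reports how many significant decimal digits a given double needs when written out exactly.

```java
import java.math.BigDecimal;

final class ExactDoubleCheck {
    // True when the decimal text denotes a value that a double holds with no change at all.
    static boolean isExactDouble(String decimalText) {
        BigDecimal written = new BigDecimal(decimalText);
        double nearest = Double.parseDouble(decimalText);
        if (Double.isInfinite(nearest)) {
            return false; // too large in magnitude for any finite double
        }
        // compareTo ignores scale, so 0.50 and 0.5 count as the same value.
        return written.compareTo(new BigDecimal(nearest)) == 0;
    }

    // Significant decimal digits needed to write a finite double exactly.
    static int exactDigits(double d) {
        return new BigDecimal(d).precision();
    }

    public static void main(String[] args) {
        System.out.println(isExactDouble("0.5"));
        System.out.println(isExactDouble("0.1"));
        System.out.println(isExactDouble("0.99999999999999999999"));
        System.out.println(exactDigits(0.5));
        System.out.println(exactDigits(Double.MIN_VALUE));
    }
}
```

Running `exactDigits` on a one-digit value like 0.5 and on `Double.MIN_VALUE` shows the two ends of the spread, the latter needing well over a hundred digits.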
Out of curiosity, what rule would you propose for taking a sequence of decimal digits, containing a decimal point, and deciding whether it should be represented as a double or BigDecimal?

Given that IEEE 754 double precision values, and BigDecimal values, have very different kinds of assurances on how arithmetic behaves, e.g. in terms of which arithmetic results will be exact, and which will not, and how they perform roundoff when the results are not exact, it seems to me to be much more predictable if one must explicitly opt in for using BigDecimal values.

Another way of expressing the point of the previous paragraph is to comment on something you said in your original post, that you consider BigDecimal to be a kind of "promotion" over IEEE 754 floating point.  I do not think that is true.  They are just different.

For example, I am pretty sure you can do as many additions and subtractions as you like on BigDecimal values with the same scale, and the results are always guaranteed to be exact. That is not the case for double values.
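A quick way to see the difference, sketched in plain Java (the class and method names are mine): add one tenth to itself ten times, once as a double and once as a BigDecimal with scale 1, and compare against one.

```java
import java.math.BigDecimal;

final class ExactAddition {
    // Sum of ten 0.1 doubles; each addition may round.
    static double doubleSum() {
        double sum = 0.0;
        for (int i = 0; i < 10; i++) {
            sum += 0.1;
        }
        return sum;
    }

    // Sum of ten 0.1 BigDecimals; add() without a MathContext never rounds.
    static BigDecimal bigDecimalSum() {
        BigDecimal tenth = new BigDecimal("0.1");
        BigDecimal sum = new BigDecimal("0.0");
        for (int i = 0; i < 10; i++) {
            sum = sum.add(tenth);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(doubleSum() == 1.0);                            // prints false
        System.out.println(bigDecimalSum().equals(new BigDecimal("1.0"))); // prints true
        System.out.println(0.1 + 0.2 == 0.3);                              // prints false
    }
}
```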
I will really, really try to stop thinking about this after writing this message :-)

Here is one possible rule one could imagine, but I think it would be pretty weirdly surprising in its behavior, if you implemented it.

"If the non-integer decimal literal can be represented exactly as a double, make it a double, otherwise make it an exact BigDecimal (which is always possible for any finite sequence of decimal digits)."

Consider all of the literals 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0.  Among those 11 literals, only 0.0, 0.5, and 1.0 can be represented exactly as a double, so according to the rule above, those 3 would be double, and the other 8 would be made BigDecimal.

Similarly, for the 101 literals with 2 digits after the decimal point from 0.00, 0.01, 0.02, etc. up to 1.00, only 5 of them can be represented exactly as double: 0.00, 0.25, 0.50, 0.75, and 1.00.  According to the rule above, only those 5 literals would be type double, and the rest would all be BigDecimal.
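The counts above can be reproduced with a small loop. This is a sketch in plain Java with a hypothetical helper `countExact`; it walks every literal from 0 to 1 with the given number of digits after the decimal point and counts the ones whose nearest double equals the written value exactly.

```java
import java.math.BigDecimal;

final class ExactLiteralCount {
    // Counts literals 0.0..1.0 (with `places` digits after the point) that are exact doubles.
    static int countExact(int places) {
        int steps = 1;
        for (int i = 0; i < places; i++) {
            steps *= 10;
        }
        int count = 0;
        for (int n = 0; n <= steps; n++) {
            BigDecimal written = BigDecimal.valueOf(n, places); // n / 10^places
            double nearest = Double.parseDouble(written.toPlainString());
            if (written.compareTo(new BigDecimal(nearest)) == 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countExact(1)); // prints 3  (0.0, 0.5, 1.0)
        System.out.println(countExact(2)); // prints 5  (0.00, 0.25, 0.50, 0.75, 1.00)
    }
}
```

Under the proposed rule, every literal not counted here would silently become a BigDecimal.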
I think my understanding of floats/double might have been too limited.

I understand now. For onlookers, this helped me: https://www.exploringbinary.com/why-0-point-1-does-not-exist-in-floating-point/

What you say makes sense now. You can't do what you do for integers. You can't just cut off after X digits past the decimal point, because some decimal numbers that short still don't fit exactly in binary.

So you can't just say 10 digits after the decimal point, because 0.1 would be allowed, and yet in binary 0.1 has an infinite number of bits after the binary point.

But maybe I'm still a bit confused. My thought was more that if the decimal I typed is within the range Double/MIN_VALUE to Double/MAX_VALUE, it should be a double, and if not it should be a BigDecimal. Wouldn't that still make sense? So 0.1 would be read as a double, but 0.99999999999999999999 would be read as a BigDecimal.
Double/MIN_VALUE is approximately 4.9E-324 and Double/MAX_VALUE is about 1.7976931348623157E308, so your example value of 0.99999999999999999999 _is_ between Double/MIN_VALUE and Double/MAX_VALUE.
I am assuming here, perhaps incorrectly, that by "x is between a and b" you mean "a <= x and x <= b".
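Here is that range test written out, as a sketch in plain Java with a made-up helper name `inDoubleRange`. It uses the "a <= x and x <= b" reading, and also shows what the 20-nines value becomes once it is read as a double.

```java
import java.math.BigDecimal;

final class DoubleRangeCheck {
    // True when Double.MIN_VALUE <= x <= Double.MAX_VALUE for the written decimal x.
    static boolean inDoubleRange(String decimalText) {
        BigDecimal x = new BigDecimal(decimalText);
        return x.compareTo(new BigDecimal(Double.MIN_VALUE)) >= 0
            && x.compareTo(new BigDecimal(Double.MAX_VALUE)) <= 0;
    }

    public static void main(String[] args) {
        System.out.println(inDoubleRange("0.1"));                    // prints true
        System.out.println(inDoubleRange("0.99999999999999999999")); // prints true

        // In range, yet not representable: the nearest double is exactly 1.0.
        System.out.println(Double.parseDouble("0.99999999999999999999") == 1.0); // prints true
    }
}
```

So a range test would read both values as doubles, and the second one would lose its trailing digits.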
Ya, I'm just being dumb. I keep treating the decimals as if they were integers, and that just doesn't make sense.

I'm starting to see that there's no easy way. You can't just promote to BigDecimal after X decimal digits, or above some value X. You would need to interleave as you said, which would be weird and unexpected: is 1.03 going to be a double or a BigDecimal? Hard to tell. And most of the time, for performance, you'd want to stay in the realm of double, so this auto-promotion would be harmful.

Thanks for all the details.