Rationale
Based on my own experience and on observing the beginners channel in Slack, an extremely common stumbling block for newcomers is how to parse a string into an int.
This goes wrong in a number of ways:
- People discover the `int` function, whose docstring is "Coerce to int", but it throws when passed a string.
- People are pointed to `Integer/parseInt`, which does the job but can't be used as `(map Integer/parseInt coll-of-strings)`. This is surprising, because it's their first experience with Java interop.
- People are pointed to `#(Integer/parseInt %)`, which does what's expected, but whose syntax can be a bit much for your first 10 minutes of Clojure.
- Sometimes people are pointed to `#(Integer. %)`, which relies on a constructor that has been deprecated since Java 9.
- People go through the same procedure in ClojureScript and have to pick a different interop approach there.
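The first and third pitfalls can be reproduced at a REPL. This is a minimal sketch, assuming a JVM Clojure REPL; the sample strings and var names are made up for illustration:

```clojure
;; `int` coerces numbers and characters, not strings, so this call throws.
(def int-result
  (try
    (int "42")
    (catch ClassCastException _ :class-cast-exception)))

;; Wrapping the static method in an anonymous function lets it be mapped.
(def parsed (mapv #(Integer/parseInt %) ["1" "2" "3"]))
```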
Proposal 1
Consider adding a `parse-*` family of functions to Clojure that are thin wrappers over the Java interop. See the Appendix for a possible list.
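As a rough illustration of what such thin wrappers might look like, here is a sketch of two entries from the Appendix. The names `parse-int` and `parse-uint` and their arities are taken from the proposal and are hypothetical; they are not existing core functions:

```clojure
;; Hypothetical wrappers following the Appendix; not part of clojure.core.
(defn parse-int
  "Parses s as a signed int, optionally in the given radix."
  ([^String s] (Integer/parseInt s))
  ([^String s radix] (Integer/parseInt s (int radix))))

(defn parse-uint
  "Parses s as an unsigned int. Values above Integer/MAX_VALUE come back
  as negative ints, because Java has no unsigned int type."
  ([^String s] (Integer/parseUnsignedInt s))
  ([^String s radix] (Integer/parseUnsignedInt s (int radix))))

;; Unlike the raw static method, the wrapper is an ordinary function value.
(map parse-int ["1" "2" "3"])
(parse-int "ff" 16)
```

Written this way, the wrappers return boxed values, since ordinary Clojure functions return objects unless they are hinted as returning `long` or `double`.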
Proposal 2
Consider exposing `clojure.lang.LispReader.matchNumber` as `parse-number`.
People can then use the various coercion functions to get the precision they need. This might better fit the rationale of this ticket, which is to make a very common "toy program" operation smoother for beginners; matching the Reader's behaviour would be the least surprising option.
People who are sensitive to performance should know more about the intricacies of boxed arithmetic on the JVM anyway. This is also pleasantly platform-agnostic: CLJS could expose `match-number`.
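To give a feel for reader-like behaviour, the sketch below approximates `parse-number` without touching `matchNumber`, which is not exposed as a Clojure function. It goes through `clojure.edn/read-string` instead, so it is a stand-in under that assumption rather than the proposed implementation:

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical stand-in for the proposed parse-number, built on the EDN
;; reader rather than on LispReader.matchNumber.
(defn parse-number
  "Returns the number read from s, or nil if s does not read as a number."
  [s]
  (try
    (let [v (edn/read-string s)]
      (when (number? v) v))
    (catch Exception _ nil)))

(parse-number "42")    ; a Long
(parse-number "1.5")   ; a Double
(parse-number "0xff")  ; hex, as the reader accepts it
(parse-number "abc")   ; nil, because it reads as a symbol
```

One caveat of this stand-in is that the reader consumes only the first form, so trailing content after a valid number is silently ignored; a real `parse-number` would presumably match the whole string.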
Questions/Alternatives
- Should the functions return primitives or boxed values?
- What should be the handling of strings like `"0xff"`? The `parseFoo` family of functions rejects those, but `0xff` can be read by the Clojure reader.
- OTOH, the `decode` family of functions handles some prefixes, but returns a boxed value. It also accepts numbers like `#10`, which is an invalid Clojure literal.
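The difference between the two Java families can be seen directly through interop. A small sketch, with var names chosen only for illustration:

```clojure
;; parseInt rejects the 0x prefix.
(def parse-result
  (try
    (Integer/parseInt "0xff")
    (catch NumberFormatException _ :number-format-exception)))

;; decode understands the 0x, 0X and # hex prefixes (and a leading 0 for
;; octal), and returns a boxed Integer.
(def decoded-hex   (Integer/decode "0xff"))   ; 255
(def decoded-hash  (Integer/decode "#10"))    ; 16
(def decoded-class (class decoded-hex))       ; java.lang.Integer
```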
Appendix
A hopefully complete list of primitive-returning functions (as of JVM 8) is:
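| name | args | ret-value |
| --- | --- | --- |
| `parse-int` | `s` | `int` |
| `parse-int` | `s, radix` | `int` |
| `parse-uint` | `s` | `int` |
| `parse-uint` | `s, radix` | `int` |
| `parse-long` | `s` | `long` |
| `parse-long` | `s, radix` | `long` |
| `parse-ulong` | `s` | `long` |
| `parse-ulong` | `s, radix` | `long` |
| `parse-short` | `s` | `short` |
| `parse-short` | `s, radix` | `short` |
| `parse-byte` | `s` | `byte` |
| `parse-byte` | `s, radix` | `byte` |
| `parse-float` | `s` | `float` |
| `parse-double` | `s` | `double` |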
The unsigned functionality was added in Java 8, so it should be safe to use in newer Clojure versions. Newer JVM versions add support for parsing parts of CharSequences.