A common question is why it is possible to programmatically create keyword instances that cannot be read (https://clojure.org/guides/faq#unreadable_keywords). There are two reasons:

a) There are many places where programmatic keywords are perfectly valid
b) It is undesirable to validate all keywords on construction due to the perf hit

However, there are use cases for creating keywords that you expect to be safe for pr/read roundtripping. In those cases it would be useful to have either a more restrictive variant of keyword that throws on invalid keywords, or a predicate that tells you whether a string would produce a readable keyword.
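
A minimal sketch of both options, under stated assumptions. The names readable-keyword-string? and strict-keyword are hypothetical, and "readable" is taken here to mean "round-trips through pr-str and clojure.edn/read-string", which is only one of several possible definitions. The check interns the keyword before testing it, so it does nothing for the performance concern in (b).

```clojure
(require '[clojure.edn :as edn])

(defn readable-keyword-string?
  "Hypothetical predicate: true when (keyword s) prints to text that the
  EDN reader reads back as an equal keyword."
  [s]
  (and (string? s)
       (let [kw (keyword s)]
         (try
           ;; read-string reads only the first form, so a name containing
           ;; whitespace or a delimiter comes back as a different keyword.
           (= kw (edn/read-string (pr-str kw)))
           ;; Text the reader rejects outright (an empty name, for example)
           ;; throws; treat that as "not readable".
           (catch Exception _ false)))))

(defn strict-keyword
  "Hypothetical restrictive variant of keyword: returns the keyword, or
  throws when it would not survive a pr/read round trip."
  [s]
  (if (readable-keyword-string? s)
    (keyword s)
    (throw (ex-info "String does not produce a readable keyword"
                    {:input s}))))
```

With these definitions, (strict-keyword "foo") returns :foo, while (strict-keyword "foo bar") throws an ExceptionInfo carrying the offending input.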

3 Answers

Comment made by: pbwolf

  1. Keywords are too often prepared far away from the eventual pr.

  2. It is impressive how seldom deviant keywords cause a problem this way. The fix should be proportionate to the problem.

  3. Unreadable keywords are explicitly not the problem; the problem belongs to pr and read. The immediate problem is that pr silently fails to print readably even while *print-readably* is true.

3b. This problem with pr/read might be temporary.

  1. Adding a special predicate for readable keywords presupposes that you can predict whether someone will later call pr. I think such a prediction would usually be baseless without mingling concerns.

  2. pr could increment a counter (say, irreadable-forms-printed) when emitting a non-readable keyword while *print-readably* is true. I/O is expensive anyway, so the extra check would be cheap by comparison.

  3. If pr/read someday gained the ability to round-trip all keywords, pr would still have to test a keyword to choose an output format for it. Thus a check installed today could be a first step, not a temporary measure.

  4. By comparison, a program littered with "readable-keyword?" predicates would stay littered forever -- and the predicate check might not even be necessary anymore.

  5. Would "readable-keyword?" match the reader's behavior, the reader's documentation, EDN, or some conservative common subset? Related: CLJ-1527 "Clarify and align valid symbol and keyword rules for Clojure (and edn)" and CLJ-1530 "Make foo/bar/baz unreadable".

  6. Would every spec writer agonize over whether to spec a keyword as "keyword?" or "readable-keyword?", and then always choose "readable-keyword?" just in case someone might someday decide to serialize the data, thereby cluttering the program and obscuring concerns?

  7. Who would test irreadable-forms-printed? An "apologetic", non-solution resolution could get quite complicated.

  8. Perhaps the cure would be worse than the disease. Why not improve pr/read instead: eliminate the problem instead of adding features to work around it?
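
The silent failure and the counter idea from the points above can be sketched in user code. This is only an illustration: irreadable-forms-printed is modelled as a plain atom, pr-str-counting and roundtrips? are hypothetical helpers, and nothing here changes pr itself. "Readable" is assumed to mean "round-trips through clojure.edn/read-string".

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical counter standing in for irreadable-forms-printed.
(def irreadable-forms-printed (atom 0))

(defn roundtrips?
  "True when kw prints to text that the EDN reader reads back as an equal keyword."
  [kw]
  (try
    (= kw (edn/read-string (pr-str kw)))
    (catch Exception _ false)))

(defn pr-str-counting
  "Like pr-str, but first bumps the counter once for every keyword nested
  in x that would not round-trip."
  [x]
  (doseq [node (tree-seq coll? seq x)
          :when (and (keyword? node) (not (roundtrips? node)))]
    (swap! irreadable-forms-printed inc))
  (pr-str x))

;; pr-str emits the map without complaint, although *print-readably*
;; is true by default.
(def printed (pr-str-counting {(keyword "foo bar") 1}))
```

Here printed is the string "{:foo bar 1}" and the counter ends up at 1. Reading that string back fails, because the reader sees three forms inside the map literal.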

Comment made by: alexmiller

The problem here is specifically about keywords created from user input or other data outside your control (arbitrary keys from JSON input, for example). When you need to verify this property, you are accepting inputs, converting them to keywords, and later expecting to print that data out. I have talked this through with many people, and when people ask about it, they know they are in this situation. Having printing fail or report a problem far down the line is not helpful. The problem is avoiding the creation of non-roundtrippable data in the first place (by either not accepting it or escaping it at that point).
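
One way to picture "not accepting it" at the input boundary is to keywordize only keys that match a deliberately conservative pattern and leave everything else as a string. This is a sketch under assumptions: safe-key-pattern, safe-key and keywordize-safe-keys are hypothetical names, and the pattern is much narrower than what the reader actually accepts.

```clojure
;; Conservative, hypothetical pattern: a letter followed by letters,
;; digits, '-' or '_'. It is NOT the reader's real grammar.
(def safe-key-pattern #"[A-Za-z][A-Za-z0-9_-]*")

(defn safe-key
  "Returns a keyword when s matches the conservative pattern, otherwise
  returns s unchanged, so no unreadable keyword is ever created."
  [s]
  (if (and (string? s) (re-matches safe-key-pattern s))
    (keyword s)
    s))

(defn keywordize-safe-keys
  "Applies safe-key to the top-level keys of map m (shallow, not recursive)."
  [m]
  (into {} (map (fn [[k v]] [(safe-key k) v])) m))
```

For example, (keywordize-safe-keys {"user-id" 1, "user id" 2}) yields {:user-id 1, "user id" 2}. To reject bad keys instead of passing them through, the else branch of safe-key could throw.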

That said, another possible solution is to add an escaping mechanism for literal symbols and keywords. We've done some design work on this in the past and ended up shelving it at the time, but it remains an option.

This is not a high priority issue right now, but I felt it was useful to leave this ticket here to capture the idea.

Reference: https://clojure.atlassian.net/browse/CLJ-2309 (reported by alexmiller)