
–1 vote
in Syntax and reader by
edited by

Problem:

Having had to include some JavaScript, XML and HTML inside my Clojure code here and there, I find it pretty annoying and error-prone to have to escape quotes. The same holds when scripting and running shell commands, where you can get into hairy escaping scenarios.

Solution:

Add a string literal which can be adapted to contain any sort of string without the need for escaping.

Suggestions:

Text blocks

Some other languages offer something called a text block where you can write a string using triple or more quotes, where all characters are then allowed:

(println """
         This " is allowed,
         and no need to escape it.
         """)

Text blocks often come with additional features, such as the first and last newlines not being part of the string. The position of the triple quote in the source code also delineates where each line of the string begins. Thus the above code prints:

This " is allowed,
and no need to escape it.

and not:

         This " is allowed,
         and no need to escape it.
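For illustration, the stripping rule described above can be approximated today with an ordinary function over a regular string. This is only a sketch: `text-block` is a hypothetical helper, not part of Clojure, and it assumes the last line of the raw string holds nothing but the indentation of the closing delimiter.

```clojure
(require '[clojure.string :as str])

(defn text-block
  "Hypothetical helper: drops the first and last line of `raw` and removes
  from every remaining line as many leading characters as the last line is
  long. Assumes those leading characters are indentation."
  [raw]
  (let [lines  (str/split raw #"\n" -1)   ; limit -1 keeps trailing empty lines
        indent (count (last lines))       ; width of the closing delimiter's indent
        body   (-> lines rest butlast)]   ; everything between first and last line
    (str/join "\n" (map #(subs % (min indent (count %))) body))))

;; What would sit between the triple quotes, written as a regular string.
(def raw
  (str "\n"
       "    This \" is allowed,\n"
       "    and no need to escape it.\n"
       "    "))

(println (text-block raw))
```

The point of the sketch is that the result depends on how the source is indented, which is exactly the whitespace dependence discussed next.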

Text blocks are neat visually, since they align nicely in the source code, but they are whitespace dependent. Clojure has so far been a whitespace-independent language, meaning whitespace does not matter, and I think it would be best to keep it that way. Hence the next two suggestions.

Raw strings

Sometimes the text block without the "block" features is known as a raw string literal:

(println """This " is allowed,
and no need to escape it.
Also support multi-line, but
not the "block" style of text blocks.""")

Thus:

(println """
         This " is allowed,
         and no need to escape it.
         """)

Prints:

         This " is allowed,
         and no need to escape it.

The indentation is kept, unlike with text blocks.

If you need a triple quote, just make the delimiter a quadruple quote:

""""This """ is now allowed as well.""""

The issue with raw strings shows up when, say, you use double quotes as your delimiter:

""This is a raw " string!""

but want a single quote at the very beginning or end of the string:

"""{{hello}}"""

I want the string "{{hello}}" (quotes included), not {{hello}}, but the reader cannot disambiguate the two, because it now sees a triple-quote delimiter.

One solution is to allow an escaped quote only at the beginning or end:

""\"{{hello}}\"""

But not in the middle:

""\"{{he\llo}}\"""

This is the string: "{{he\llo}}"

So the \ character is literal everywhere, except when it is immediately followed by a quote at the very beginning or the very end of the string, where it escapes that quote.

I still don't find this ideal. There are too many rules, and there are still cases where an escape is required.

Unescaped string (my favorite)

The idea here is to allow any string to be used as the delimiter. Thus given whatever possible string we want to nest inside our Clojure code, we can always find a string which is not contained in it to use as our delimiter.

Let's say a reader macro #text is added, which expects the next form to be a regular string naming the delimiter of the raw string that follows:

(println #text "|" |"{{hello}}"|)

Would print:

"{{hello}}"

The first arg to #text tells it what the delimiter for the following raw string should be. That way, you absolutely never need an escape sequence inside the raw string. For any given string, you can find a delimiter string not contained in it to handle it properly.
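To make the scanning rule concrete, here is a rough sketch of what the reader would have to do once it knows the delimiter. `read-delimited` is a made-up name for illustration, it works on a plain string rather than on the real reader's input stream, and it is not how any existing Clojure reader is implemented.

```clojure
(require '[clojure.string :as str])

(defn read-delimited
  "Hypothetical helper: given a delimiter and source text that starts with
  that delimiter, returns everything up to the next occurrence of the
  delimiter, verbatim. No escape sequences are interpreted."
  [delim source]
  (when-not (str/starts-with? source delim)
    (throw (ex-info "Source does not start with the delimiter" {:delim delim})))
  (let [start (count delim)
        end   (str/index-of source delim start)] ; nil when never closed
    (when-not end
      (throw (ex-info "Unterminated raw string" {:delim delim})))
    (subs source start end)))

;; The text after `#text "|"` in the example above, up to the closing paren.
(println (read-delimited "|" "|\"{{hello}}\"|)"))
```

A longer delimiter works the same way, for example `(read-delimited "END" "ENDsay \"hi\" | doneEND")`.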

A crazy thought I had with this approach, just throwing it out there: if you use a sufficiently random string as the delimiter, it could be a weird way to protect against some forms of injection:

(println #text "xIBgdSl4TCCOIdqdMu9G" xIBgdSl4TCCOIdqdMu9G
Can't nobody guess the delimiter to escape the string context :p
xIBgdSl4TCCOIdqdMu9G)

Thank You

2 Answers

+1 vote
by

This has been requested, and declined, several times in Clojure’s lifetime and I don’t think there are any new arguments here.

by
Hmm, I did try to search for it first but could not find anything. I was motivated by the upcoming Java JEPs about it: https://openjdk.java.net/jeps/355 and https://openjdk.java.net/jeps/326
by
There are many discussions on the Clojure and Clojure-dev google group mailing lists if you search for “raw string”, “string literal”, “heredoc”, “multiline comment”, etc.

Old design page: https://archive.clojure.org/design-wiki/display/design/Alternate%2Bstring%2Bquote%2Bsyntaxes.html

I guess I should reword “decline” to “lack of interest”; I think generally Rich finds this kind of thing has a lot of subtle complexities (esp for tools) with a relatively low benefit.
by
An argument for not using the triple quote approach is that this is already legal Clojure:

user=> (println """
This is already valid Clojure
""")
 
This is already valid Clojure
 
nil
user=>

(it doesn't mean what you want but it is valid today, even tho' I doubt any real code does this)
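A quick way to see why it parses, as a small sketch: the reader sees three adjacent string literals, an empty one, the middle one, and another empty one. Wrapping the same text in a vector and reading it shows the three forms (the vector is only there so `read-string` returns all of them).

```clojure
;; The source text """<newline>This is already valid Clojure<newline>"""
;; wrapped in [ ] so that read-string returns every form it contains.
(def forms
  (read-string "[\"\"\"\nThis is already valid Clojure\n\"\"\"]"))

(prn (count forms)) ; number of string literals the reader found
(prn forms)

;; println puts a space between its arguments, which is where the
;; lone spaces in the REPL output above come from.
(apply println forms)
```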
by
Okay, I admit having only searched google, and not the mailing list specifically :p.

I hadn't thought of doc-strings as a use case, but that's a good one as well.

Good point about tooling complexities this may introduce. Parsing a heredoc or similar can be more difficult.

Also if you look at Ruby, Perl and Python, you do see how no one seems to be able to settle on the right way to do this, and all three ended up supporting multiple ways.

Speaking of it not being worth the effort for tooling: this is actually something tooling can also address. Not for the doc-string use case, but for my snippet one. I know in IntelliJ, there is a mode where another buffer opens for you to type freely, and it automatically Java-escapes what you type inside your string. Maybe I'll work on something like that for Emacs as well.
by
@sean Good catch. Whatever the solution is, I'd personally lean towards having it use a reader tag, so #s """ """ or the like, which would solve this issue.
+1 vote
by

Looking at the bright side of the status quo: treating XML and HTML as text has perennially gotten non-Clojurists into awful troubles, including malformed output, injections, kluged transformations, and code that is simply very hard to evaluate for correctness!

Clojure owes some of its good reputation for robustness to conventions founded in data (not strings). "Hiccup" and "clojure.xml" conventions have proved so effective for HTML and XML, and (in my unquantified experience) so surprisingly cheaply at run-time, that they prove a wondrous up-side to refraining from planting blobs in code. If you have blobs to start with, you can parse them with Enlive etc. at the earliest possible opportunity, and complete the processing as data structures.
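As a rough illustration of the data-first style, here is a tiny renderer for Hiccup-shaped vectors. It is a sketch only, not the Hiccup library and not its API: `render` and `escape-html` are names made up here, and the renderer handles just keyword tags, an optional attribute map, and string or nested-vector children.

```clojure
(require '[clojure.string :as str])

(defn escape-html
  "Escapes the characters that are significant in HTML text and attributes."
  [s]
  (str/escape (str s) {\& "&amp;" \< "&lt;" \> "&gt;" \" "&quot;"}))

(defn render
  "Hypothetical minimal renderer for [:tag {attrs}? & children] vectors."
  [node]
  (if (vector? node)
    (let [[tag & more] node
          attrs    (when (map? (first more)) (first more))
          children (if attrs (rest more) more)]
      (str "<" (name tag)
           (apply str (for [[k v] attrs]
                        (str " " (name k) "=\"" (escape-html v) "\"")))
           ">"
           (apply str (map render children))
           "</" (name tag) ">"))
    ;; Anything that is not a vector is treated as text and escaped.
    (escape-html node)))

(println (render [:p {:class "note"} "1 < 2 & \"quoted\""]))
```

Because the markup is a data structure, the quotes and angle brackets in the text are escaped in one place, and the source never needs a string literal containing raw HTML.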

Er, well, JavaScript might be a stretch. Best just get it from a resource? If coding in ClojureScript, you can use a macro to fetch the resource at compile-time from the same classpath as the cljs files.

Overall, I am inclined to think Clojure is wise (and we are fortunate) to do without a sop to embed and tweak HTML and XML literals in code. At first it might seem like a missing feature, but its absence has been a boon to the reliability of Clojure code.

by
Great point. And for HTML and XML I agree. My drive came from doing server-side rendering and needing to embed some JavaScript in my Hiccup.

I've actually wondered if CLJS could be embedded in that way, but I don't think it can. Though if it could, that would be pretty great.
...