The JVM method java.util.regex.Pattern.compile can take a second parameter for flags, which is a bitwise combination of the allowed flag values. Has a similar arity been considered for the clojure.core/re-pattern function?
For instance:
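;; Map single-character flag names to java.util.regex.Pattern constants.
(def rflags {\i java.util.regex.Pattern/CASE_INSENSITIVE
             \m java.util.regex.Pattern/MULTILINE
             \s java.util.regex.Pattern/DOTALL
             \u java.util.regex.Pattern/UNICODE_CASE
             \d java.util.regex.Pattern/UNIX_LINES
             \x java.util.regex.Pattern/LITERAL ; note: Java's embedded (?x) means COMMENTS
             \c java.util.regex.Pattern/CANON_EQ})

;; Combine the flags named in the string s into one bitmask.
;; Unknown characters contribute 0.
(defn re-flags [s]
  (reduce bit-or 0 (map #(rflags % 0) s)))
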
(defn re-pattern
  "Returns an instance of java.util.regex.Pattern, for use, e.g. in
  re-matcher."
  {:tag java.util.regex.Pattern
   :added "1.0"
   :static true}
  ([s] (re-pattern s 0))
  ([s f] (if (instance? java.util.regex.Pattern s)
           s
           (. java.util.regex.Pattern (compile s f)))))
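
As a quick check of how the flag values combine, here is a minimal sketch that uses only existing java.util.regex.Pattern interop, not the proposed arity. The names flags and p are just for illustration.

```clojure
(import 'java.util.regex.Pattern)

;; Flags are plain ints, so they combine with bit-or.
(def flags (bit-or Pattern/CASE_INSENSITIVE Pattern/MULTILINE))

;; The two-argument Pattern/compile is what a two-arity re-pattern would call.
(def p (Pattern/compile "^hello$" flags))

(.flags p)                        ; => 10 (CASE_INSENSITIVE is 2, MULTILINE is 8)
(re-find p "first\nHELLO\nlast")  ; => "HELLO"
```

With the proposal above, (re-flags "im") would be expected to produce the same bitmask.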
Some notes on this:
- Most of these flags can already be added to a pattern today using an embedded (?...) modifier. For instance, a pattern can be made case-insensitive by adding (?i) to the start of the string. However, allowing a flags string would be compatible with JavaScript (and could be implemented in ClojureScript).
- There is currently no way to set LITERAL or CANON_EQ without using java.util.regex.Pattern directly.
- There is currently no way to implement any flags in ClojureScript without using interop.
- While not all of these flags are compatible with JavaScript, the more common ones are. Similarly, JavaScript allows for flags that are not compatible with Java, so there is already a small disconnect.
- Passing 0 for the default flags is indeed what java.util.regex.Pattern.compile(String) does.
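
To illustrate the first, second, and last notes, here is a small sketch of what works today. It assumes nothing beyond clojure.core and java.util.regex.Pattern; literal-dot is a name made up for the example.

```clojure
(import 'java.util.regex.Pattern)

;; Embedded flag: case-insensitivity expressed inside the pattern string.
(re-find (re-pattern "(?i)hello") "Say HELLO")  ; => "HELLO"

;; LITERAL has no embedded equivalent, so it needs interop today.
(def literal-dot (Pattern/compile "a.c" Pattern/LITERAL))
(re-find literal-dot "abc")  ; => nil, since "." is not a wildcard here
(re-find literal-dot "a.c")  ; => "a.c"

;; The current one-argument re-pattern compiles with no flags set.
(.flags (re-pattern "hello"))  ; => 0
```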