I am trying to write a parser for an s-expression-based language. It has the usual ;-to-end-of-line comments. My problem is that I don't want to strip the comments (my original goal was to write a pretty-printer/formatter), and comments can appear anywhere in the code.

For example, I can have

;; pretty normal -- this function does blah blah blah
(define-private (blah)
  ;; TODO: do something useful here
  (= 23 5))

or

(define-private ;; make this public maybe?
  (blah)
  (let (
    (enigma 23) ;; snicker
    (laws ;; this is a terrible example
      5))
    ;; inside the let body
   (= enigma 
    ;; todo: constant folding?
    laws)))

How do I get instaparse to handle that?

Stripping comments is trivial -- I can use something like this:

(defparser ws-or-comments
  "ws-or-comments = #'\\s+' | comment+
   comment = #';+[^\n]*'
" :auto-whitespace :standard)

(defparser my-parser ... :auto-whitespace ws-or-comments)

1 Answer

I have been working on an instaparse grammar for lua (eventually writing a little analyzer and optimizing compiler for it).

I ran into similar issues with comments (both in-line and block). I currently define comments in the grammar itself, so they are parsed into the resulting data structure.

https://github.com/joinr/bpdb/blob/master/src/bpdb/core.clj#L116
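
For an s-expression language, a minimal sketch of the same idea might look like the following. The grammar and its rule names (program, form, list, atom, comment, ws) are made up for illustration and are not taken from the linked code. It also assumes there are no string literals containing ; or parentheses. Whitespace is hidden with angle brackets, while comments are kept as ordinary nodes in the tree.

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical grammar: comments are regular forms, so they survive parsing.
;; <...> hides whitespace and the literal parentheses from the output tree.
(def sexpr-with-comments
  (insta/parser
    "program = (<ws> | form)*
     <form>  = list | atom | comment
     list    = <'('> (<ws> | form)* <')'>
     atom    = #'[^\\s();]+'
     comment = #';[^\\n]*'
     ws      = #'\\s+'"))

(def example-tree
  (sexpr-with-comments
    "(define-private ;; make this public maybe?\n  (blah))"))
```

Here example-tree should contain a [:comment ";; make this public maybe?"] node sitting between the define-private atom and the (blah) list, so a formatter can walk the tree and emit comments where they were found.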

Now I have the unfun task of preventing -- (Lua's comment syntax) from being parsed as two unary - operators. You should have no such problem in an s-expression language, though. I am still picking up instaparse, so I have low mileage with it; there are probably better answers.

Another option that comes to mind is two-pass parsing: the first pass scrapes out the comments but retains them for printing, and the second pass parses the "normal" code.
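
A rough sketch of what the first pass could look like, using only clojure.string. The function name split-comments and the shape of the returned map are assumptions for illustration. It also assumes that ; never appears inside a string literal, which a real implementation would have to handle.

```clojure
(require '[clojure.string :as str])

(defn split-comments
  "First pass (hypothetical helper): separates ;-comments from code.
   Returns {:code <source with comments removed>
            :comments [{:line n :column c :text s} ...]}.
   Assumes ';' never occurs inside a string literal."
  [source]
  (let [split-line (fn [idx line]
                     (if-let [pos (str/index-of line ";")]
                       {:code    (subs line 0 pos)
                        :comment {:line   (inc idx)   ; 1-based line number
                                  :column pos         ; 0-based column
                                  :text   (subs line pos)}}
                       {:code line}))
        parts (map-indexed split-line (str/split-lines source))]
    {:code     (str/join "\n" (map :code parts))
     :comments (vec (keep :comment parts))}))

(def first-pass
  (split-comments "(define-private ;; make this public maybe?\n  (blah))"))
```

The :code value can then go to the comment-free parser as the second pass. Line breaks are preserved in :code, so the recorded :line and :column positions still line up with the parsed code when the printer puts the comments back.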

...