Comment made by: bendlas
sax-js doesn't seem to support streaming ..
I'm not sure I understand: SAX (the "Simple API for XML") is an event-driven, streaming interface by design, and judging by the sax-js README example, I would expect it to allow multiple .write calls with partial chunks before the final .close.
What are the good options for a streaming Node parser to build on top of?
All the various streaming parsers I've found were built on sax-js, so I'd go for that.
Streaming support with lazy seqs would be great!
Here is the crux: lazy sequences are not really an option for IO in JS, since everything is non-blocking. Whereas in Java you can happily block your thread while waiting for input in a lazy seq's .next, you can't do that in JS. Presumably there are options for blocking IO in Node.js, but that would be Node-only and still horrible due to the single-threaded nature of Node programs. This is also the reason there is no XML pull API (StAX) for JS.
data.xml is built on StAX because it's a natural fit for lazy seqs and because this was the preferred way to do stream processing in Clojure back then. SAX, on the other hand, is a push model, which is a good fit for JS, since your program is driven by incoming IO.
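To make the push model concrete, here is a rough sketch of a toy tokenizer that is driven entirely by incoming chunks. It is not sax-js and not anything data.xml uses: makePushTokenizer and the onOpenTag / onCloseTag / onText handler names are made up for illustration, and it only understands bare tags and text.

```javascript
// Toy push-style tokenizer (hypothetical, NOT sax-js). It handles only
// <name>, </name> and text: no attributes, comments, CDATA or entities.
function makePushTokenizer(handlers) {
  let buffer = ""; // input that cannot be tokenized yet

  function drain(atEnd) {
    while (buffer.length > 0) {
      if (buffer[0] === "<") {
        const end = buffer.indexOf(">");
        if (end === -1) return; // tag split across chunks: wait for more input
        const name = buffer.slice(1, end);
        buffer = buffer.slice(end + 1);
        if (name.startsWith("/")) handlers.onCloseTag(name.slice(1));
        else handlers.onOpenTag(name);
      } else {
        const next = buffer.indexOf("<");
        if (next === -1 && !atEnd) return; // text may continue in the next chunk
        const text = next === -1 ? buffer : buffer.slice(0, next);
        buffer = next === -1 ? "" : buffer.slice(next);
        handlers.onText(text);
      }
    }
  }

  return {
    // Called whenever IO delivers a chunk; never blocks waiting for more.
    write(chunk) { buffer += chunk; drain(false); return this; },
    // Flushes any trailing text once the input is known to be complete.
    close() { drain(true); },
  };
}

// Usage: the chunk boundaries fall in the middle of text and of a tag.
const events = [];
const tokenizer = makePushTokenizer({
  onOpenTag: (name) => events.push(["open", name]),
  onCloseTag: (name) => events.push(["close", name]),
  onText: (text) => events.push(["text", text]),
});
tokenizer.write("<greeting>Hel").write("lo</gree").write("ting>").close();
```

The point is that the consumer never asks for the next event; the handlers are simply called as data arrives, so nothing has to block.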
Recently, Clojure has gained solid support for push streams in the form of transducers. I am hammocking on the possibilities for basing data.xml on transducers, so that StAX and SAX sources could be supported uniformly. As a bonus, this has the potential to make data.xml faster by reducing intermediate allocation.
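As a rough illustration of why transducers could unify the two kinds of source, here is a minimal hand-rolled sketch in JS. None of this is data.xml or library code: map, filter, comp, the { type, value } event shape, transducePull and makePushSink are all assumptions made for the example, and early termination (Clojure's reduced) is left out.

```javascript
// Minimal transducer helpers, hand-rolled for illustration (not a library API).
// A transducer takes a reducing step (acc, x) => acc and returns a new step.
const map = (f) => (step) => (acc, x) => step(acc, f(x));
const filter = (pred) => (step) => (acc, x) => (pred(x) ? step(acc, x) : acc);
// comp(a, b, c) sends each input through a, then b, then c.
const comp = (...xforms) => (step) =>
  xforms.reduceRight((s, xf) => xf(s), step);

// Hypothetical event shape: { type: "open" | "text" | "close", value: string }
const textOnly = comp(
  filter((ev) => ev.type === "text"),
  map((ev) => ev.value.trim()),
  filter((s) => s.length > 0)
);

// Final reducing step: collect results into an array.
const append = (acc, x) => { acc.push(x); return acc; };

// Pull-style source: the events are already available, so reduce over them.
function transducePull(xform, events) {
  return events.reduce(xform(append), []);
}

// Push-style source: events arrive one at a time through a callback.
function makePushSink(xform) {
  const step = xform(append);
  let acc = [];
  return {
    onEvent(ev) { acc = step(acc, ev); },
    result() { return acc; },
  };
}

// Usage: the same textOnly pipeline serves both kinds of source.
const sample = [
  { type: "open", value: "greeting" },
  { type: "text", value: " Hello " },
  { type: "text", value: "   " },
  { type: "close", value: "greeting" },
];
const pulled = transducePull(textOnly, sample);
const sink = makePushSink(textOnly);
sample.forEach((ev) => sink.onEvent(ev));
const pushed = sink.result();
```

Each event passes through the whole pipeline in one step call, so no intermediate collection is built between the filter and map stages; that is the allocation saving hinted at above.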