+4 votes
in Sequences

The doc string for sequence says:

  • "... Will not force a lazy seq..."
  • "... returns a lazy sequence..."

Based on that, I would not expect sequence (by itself) to cause any of its input to be consumed. I'm not sure whether this is expected behavior, but perhaps the wording could be tweaked to indicate that sequence will partially realize some of its input.

(do
  (->> (range 1000)
       (map #(doto % prn))
       (sequence (map inc)))
  nil)

prints the following to standard out:

0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31

This is in contrast to lazy sequence functions like map and filter, as well as delayed options like eduction. The following expressions don't print anything to standard out:

(do
  (->> (range 1000)
       (map #(doto % prn))
       (eduction (map inc)))
  nil)

(do
  (->> (range 1000)
       (map #(doto % prn))
       (map inc))
  nil)

Another reference also seems to indicate that sequence (by itself) doesn't consume any of its input: https://clojure.org/reference/transducers#_sequence

The resulting sequence elements are incrementally computed. These
sequences will consume input incrementally as needed and fully realize
intermediate operations. This behavior differs from the equivalent
operations on lazy sequences.

(emphasis mine)

It's not surprising that sequence can chunk, but it is surprising to me that sequence will partially realize its input without any additional calls to the returned result.

This is extremely important to me, as I want to use `sequence` to consume potentially stateful sequence generators.

In particular, I had to write my own equivalent of `sequence` that does not eagerly consume its input. I needed it for a binary parsing library in which multiple parsing functions consume different parts of the sequence; any chunking or eager consumption would produce bad offsets and therefore an invalid parse of the binary data.
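
For illustration only, here is a minimal sketch of one way to postpone the initial realization: wrapping the call in `lazy-seq`. This is not the commenter's implementation, which is not shown in the thread. The name `deferred-sequence` and the `touched` counter are hypothetical. Note that this only delays the first read until the result is used; it does not remove chunking, so on its own it would probably not be enough for the offset-sensitive parsing case described above.

```clojure
;; Hypothetical helper, not part of clojure.core: defers the call to
;; sequence until the returned seq is first consumed.
(defn deferred-sequence
  [xform coll]
  (lazy-seq (sequence xform coll)))

;; Counter used only to observe how much input has been realized.
(def touched (atom 0))

(def result
  (deferred-sequence (map inc)
                     (map (fn [x] (swap! touched inc) x) (range 1000))))

;; At this point the body of lazy-seq has not run, so no input was read.
(println "input elements read before first use:" @touched)

;; Asking for the first element runs sequence and realizes some input.
(println "first element:" (first result))
(println "input elements read after first use:" @touched)
```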

2 Answers

+2 votes
Best answer

The transducer arity of sequence creates a transducing iterator over N sources (sequence is the only function that can tap into that). That iterator is then wrapped into a seq via RT.chunkIteratorSeq(), and this is the method that forces the first element to determine whether it needs to return nil or a sequence.
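
The effect described here can be observed with a counter instead of printing. The following is a small sketch; `counting-seq`, `seq-count`, and `map-count` are hypothetical names introduced for the example. It compares how much input has been read immediately after creating the result, before anything consumes it. The exact number read by sequence depends on the input (the question reports 0 through 31 for a mapped range, which is presumably tied to the chunked input, though the answer does not say so), so the sketch only distinguishes "some" from "none".

```clojure
;; Hypothetical helper: returns a lazy seq over coll that bumps `counter`
;; each time one of its elements is realized.
(defn counting-seq
  [counter coll]
  (map (fn [x] (swap! counter inc) x) coll))

(def seq-count (atom 0))
(def map-count (atom 0))

;; Transducer arity of sequence: building the result already reads input,
;; because the first element is forced to decide between nil and a seq.
(def via-sequence
  (sequence (map inc) (counting-seq seq-count (range 1000))))

;; Plain lazy map: nothing is read until the result is consumed.
(def via-map
  (map inc (counting-seq map-count (range 1000))))

(println "read by sequence before any use:" @seq-count)
(println "read by map before any use:" @map-count)
```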

Without looking back at old notes, I don't remember if this was something we considered and decided wasn't important, or whether it's something that was an oversight and should be corrected. I agree with you that the behavior seems in conflict with the docstring.

There might already be a ticket related to this; I haven't looked yet.

+2 votes
...