# Faster reduce for lazy-seqs

Lazy seqs that are built from vectors/chunked seqs can end up taking the slow first/next reduce path.

Observation:

```
(def xs (vec (range 3000000)))

(time (reduce max xs)) ;; #1: 130ms, (reference)
(time (reduce max (lazy-cat xs))) ;; #2: 130ms, just as fast
(time (reduce max 0 (lazy-cat xs))) ;; #3: 500ms, 4x slower!!

;; Double them with concat and they're not 2x slower but 10x slower:
(time (reduce max (lazy-cat xs xs))) ;; #4: 1200ms
(time (reduce max 0 (lazy-cat xs xs))) ;; #5: 1200ms
```

Explanation for #3: The problem is that {{seq-reduce}}, when called without {{init}}, re-dispatches to {{reduce}} and takes a fast path. With {{init}} given, it never escapes to a faster reduce but sticks to first/next.
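To make the dispatch difference concrete, here is a simplified sketch of the two arities of {{seq-reduce}} (approximating the cljs.core version; {{reduced?}} handling is omitted for brevity):

```
(defn seq-reduce*
  ([f coll]
   (if-let [s (seq coll)]
     ;; no init: re-enter reduce, which can pick a fast path
     ;; (e.g. ChunkedSeq's own reduce) for the realized seq
     (reduce f (first s) (next s))
     (f)))
  ([f init coll]
   ;; with init: stays on first/next the whole way, even if the
   ;; realized seq is chunked and could reduce itself faster
   (loop [acc init, s (seq coll)]
     (if s
       (recur (f acc (first s)) (next s))
       acc))))
```

This is why #2 is as fast as #1 while #3 is several times slower: only the no-init arity ever gets back to a type-dispatched {{reduce}}.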

Note: In Clojure they scale "properly" (first 3 are ~45ms, last 2 are ~110ms).

The reason is that Clojure properly escapes to a fast path:

https://github.com/clojure/clojure/blob/2b242f943b9a74e753b7ee1b951a8699966ea560/src/clj/clojure/core/protocols.clj#L131-L143

This is an RFC, since I'm not 100% certain about the implementation. We need to be careful not to blow the stack here...

Questions:
1. Should ChunkedCons implement IReduce? I think so.

by

```
CHROME:
"OLD"
#1: "Elapsed time: 21.870000 msecs"
#2: "Elapsed time: 25.485000 msecs"
#3: "Elapsed time: 134.900000 msecs"
#4: "Elapsed time: 317.155000 msecs"
#5: "Elapsed time: 313.225000 msecs"
"NEW"
#1: "Elapsed time: 23.290000 msecs"
#2: "Elapsed time: 19.445000 msecs"
#3: "Elapsed time: 20.075000 msecs"
#4: "Elapsed time: 102.895000 msecs"
#5: "Elapsed time: 100.430000 msecs"
"OLD"
#1: "Elapsed time: 19.585000 msecs"
#2: "Elapsed time: 19.475000 msecs"
#3: "Elapsed time: 87.805000 msecs"
#4: "Elapsed time: 303.340000 msecs"
#5: "Elapsed time: 291.680000 msecs"
"NEW"
#1: "Elapsed time: 20.760000 msecs"
#2: "Elapsed time: 19.690000 msecs"
#3: "Elapsed time: 18.960000 msecs"
#4: "Elapsed time: 101.385000 msecs"
#5: "Elapsed time: 101.290000 msecs"

FIREFOX:

"OLD"
#1: "Elapsed time: 7.880000 msecs"
#2: "Elapsed time: 7.820000 msecs"
#3: "Elapsed time: 69.460000 msecs"
#4: "Elapsed time: 197.800000 msecs"
#5: "Elapsed time: 209.300000 msecs"
"NEW"
#1: "Elapsed time: 7.380000 msecs"
#2: "Elapsed time: 7.360000 msecs"
#3: "Elapsed time: 8.100000 msecs"
#4: "Elapsed time: 110.640000 msecs"
#5: "Elapsed time: 89.960000 msecs"
"OLD"
#1: "Elapsed time: 6.520000 msecs"
#2: "Elapsed time: 7.360000 msecs"
#3: "Elapsed time: 106.920000 msecs"
#4: "Elapsed time: 252.300000 msecs"
#5: "Elapsed time: 249.800000 msecs"
"NEW"
#1: "Elapsed time: 7.360000 msecs"
#2: "Elapsed time: 6.940000 msecs"
#3: "Elapsed time: 6.880000 msecs"
#4: "Elapsed time: 99.380000 msecs"
#5: "Elapsed time: 53.220000 msecs"

```

by

Impact on truly (unchunked) lazy seqs is negligible (hard to measure, since garbage collection causes a lot of variance).

by

New ticket for reduce for ChunkedCons: CLJS-2468

by

Depends on CLJS-2469 for tests to pass. Patch is attached.

by

ChunkedSeq already implements an efficient reduce strategy. I believe the issue is that LazySeq, which is just a container that may hold a different type of seq inside, always calls `seq-reduce` on that inner sequence. Sometimes, as in the case of ChunkedSeq, the inner sequence actually knows how to reduce itself best.

So I think the solution is as simple as changing the body of LazySeq's {{IReduce.-reduce}} from {{(seq-reduce f init coll)}} to {{(reduce f init (seq coll))}}. The key here is that calling {{seq}} pulls out the inner seq if there is one, and then {{reduce}} dispatches to the best pathway for it.
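Expressed as code, the suggestion amounts to something like this (a sketch of the proposed change only, not the attached patch; it shows just the init arity and ignores the stack-depth concern):

```
(extend-type LazySeq
  IReduce
  (-reduce [coll f init]
    ;; seq realizes the LazySeq and returns the inner seq;
    ;; reduce then dispatches on that seq's concrete type,
    ;; hitting e.g. ChunkedSeq's fast -reduce instead of
    ;; walking first/next element by element
    (reduce f init (seq coll))))
```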

Hope this helps.

by

I don't think that's a good idea. You can easily blow the stack with that approach. Though, I haven't tried it.

by

One thing worth considering: the patch improves one particular case, ChunkedSeqs. Seqs of hash maps, IndexedSeqs, and Ranges are not handled. Dispatching to {{reduce}} would fix this. You may have a point about blowing the stack; I haven't experimented, though.

For example, {{(reduce + (range 10000))}} is much faster than {{(reduce + (lazy-cat (range 10000)))}}, but making LazySeq's reduce call {{reduce}} on the inner collection should fix this issue.

by

Thomas, happy to bounce ideas back and forth. Feel free to compare your ideas to my patch with perf measurements. The problem with calling reduce on the inner seq is the stack, and also the non-preserving reduce. And wrapping the reducing function is pretty expensive; I tried that and it's much slower. IIRC that's why I did it that way. But again, we need to be very careful not to blow the stack. That'd be a huge problem for users.