Welcome! Please see the About page for a little more info on how this works.

0 votes
in Clojure by
Chunked sequential processing shows a general improvement when invoking {{reduce}} on {{ArrayChunk}} instead of the current {{dotimes}}. The functions affected are {{map}}, {{filter}} and {{keep}}. The following table shows the relevant benchmarks in ms. Numbers in [brackets] are worse than the original (although by a small margin). The best overall improvement is seen on long ranges.

*long range*
|| f                    || before(doall)|| after(doall)|| before(chunk-last)|| after(chunk-last)||
 | {{(map inc lr)}}      | 3.75          | 2.88         | 2.15               | 2.06              |
 | {{(keep identity lr)}}| 2.56          | 2.16         | 0.75               | 0.72              |
 | {{(filter odd? lr)}}  | 2.77          | 2.20         | 1.53               | 1.45              |

*range*
|| f                    || before(doall)|| after(doall)|| before(chunk-last)|| after(chunk-last)||
 | {{(map inc lr)}}      | 3.64          | [3.70]       | 2.32               | [2.50]            |
 | {{(keep identity lr)}}| 2.10          | 1.94         | 0.56               | 0.46              |
 | {{(filter odd? lr)}}  | 1.95          | [1.99]       | 1.19               | [1.66]            |

*vector*
|| f                    || before(doall)|| after(doall)|| before(chunk-last)|| after(chunk-last)||
 | {{(map inc lr)}}      | 3.81          | 3.68         | 2.44               | 2.15              |
 | {{(keep identity lr)}}| 2.03          | [2.16]       | 0.53               | 0.46              |
 | {{(filter odd? lr)}}  | 2.08          | [2.82]       | 1.67               | 1.39              |

*gvec*
|| f                    || before(doall)|| after(doall)|| before(chunk-last)|| after(chunk-last)||
 | {{(map inc lr)}}      | 3.69          | [3.83]       | 1.46               | 1.35              |
 | {{(keep identity lr)}}| 2.86          | 2.82         | 2.44               | 2.52              |
 | {{(filter odd? lr)}}  | 2.95          | 2.70         | 2.08               | 2.07              |

All benchmarks executed with Criterium "bench" on a freshly started JVM discarding results with big outlier variance. The general bench template used has the form: {{(let [xs chunked-seq] (bench (doall (f xs))))}} where:

* "chunked-seq" is one of: {{(range 100000)}}, {{(range 1e5)}}, {{(vec (range 1e5))}} or {{(into (vector-of :int) (range 1e5))}}
* "doall" is either {{doall}} or {{chunk-last}} (see below for definition)
* "f" is one of: {{(map inc xs)}}, {{(filter odd? xs)}} or {{(keep identity xs)}}.

Observations:

* The more and larger the chunks the more the benefits. Custom (chunked) sequences with larger chunks (than 32) could additionally benefit from the changes.
* The same change makes things worse with {{keep-indexed}} and {{map-indexed}}, so they have not being changed.
* {{for}} macro is also chunk-aware, but it uses an explicit loop to handle {{:let, :when, :while}} cases that is difficult to separate from the chunk-buffer changes.
* {{chunk-last}} is a chunk-aware function to access the last element by chunk. Compared to {{doall}} (that walks the sequence one by one), {{chunk-last}} is more efficient for both before code and after changes code. The function is: {{(defn chunk-last [xs] (when-let [xs (seq xs)] (if-let [cn (chunk-next xs)] (recur cn) (last xs))))}}
* The initial ad-hoc pre-definition of {{dotimes}} before the definition of {{{map}}} can be removed from core. This is included in the patch.

8 Answers

0 votes
by

Comment made by: alexmiller

Can you squash the patch?

0 votes
by

Comment made by: reborg

Sure done.

0 votes
by

Comment made by: alexmiller

Thanks!

0 votes
by

Comment made by: reborg

New in CLJ-2346-2.patch: changed interop invocation {{.count}} to normal {{count}} after seeing slight improvement in speed. Updated the table to reflect new benchmarks.

0 votes
by

Comment made by: alexmiller

Is there a reason to .reduce vs reduce?

0 votes
by

Comment made by: reborg

Because it's the reduce on IChunk which is not exposed through normal coll-reduce.

0 votes
by

Comment made by: alexmiller

It could be if IChunk extended IReduceInit.

0 votes
by
Reference: https://clojure.atlassian.net/browse/CLJ-2346 (reported by reborg)
...