Chunked sequential processing shows a general improvement when invoking {{reduce}} on {{ArrayChunk}} instead of the current {{dotimes}}. The functions affected are {{map}}, {{filter}} and {{keep}}. The following table shows the relevant benchmarks in ms. Numbers in [brackets] are worse than the original (although by a small margin). The best overall improvement is seen on long ranges.
*long range*
|| f || before(doall)|| after(doall)|| before(chunk-last)|| after(chunk-last)||
| {{(map inc lr)}} | 3.75 | 2.88 | 2.15 | 2.06 |
| {{(keep identity lr)}}| 2.56 | 2.16 | 0.75 | 0.72 |
| {{(filter odd? lr)}} | 2.77 | 2.20 | 1.53 | 1.45 |
*range*
|| f || before(doall)|| after(doall)|| before(chunk-last)|| after(chunk-last)||
| {{(map inc lr)}} | 3.64 | [3.70] | 2.32 | [2.50] |
| {{(keep identity lr)}}| 2.10 | 1.94 | 0.56 | 0.46 |
| {{(filter odd? lr)}} | 1.95 | [1.99] | 1.19 | [1.66] |
*vector*
|| f || before(doall)|| after(doall)|| before(chunk-last)|| after(chunk-last)||
| {{(map inc lr)}} | 3.81 | 3.68 | 2.44 | 2.15 |
| {{(keep identity lr)}}| 2.03 | [2.16] | 0.53 | 0.46 |
| {{(filter odd? lr)}} | 2.08 | [2.82] | 1.67 | 1.39 |
*gvec*
|| f || before(doall)|| after(doall)|| before(chunk-last)|| after(chunk-last)||
| {{(map inc lr)}} | 3.69 | [3.83] | 1.46 | 1.35 |
| {{(keep identity lr)}}| 2.86 | 2.82 | 2.44 | 2.52 |
| {{(filter odd? lr)}} | 2.95 | 2.70 | 2.08 | 2.07 |
All benchmarks executed with Criterium "bench" on a freshly started JVM discarding results with big outlier variance. The general bench template used has the form: {{(let [xs chunked-seq] (bench (doall (f xs))))}} where:
* "chunked-seq" is one of: {{(range 100000)}}, {{(range 1e5)}}, {{(vec (range 1e5))}} or {{(into (vector-of :int) (range 1e5))}}
* "doall" is either {{doall}} or {{chunk-last}} (see below for definition)
* "f" is one of: {{(map inc xs)}}, {{(filter odd? xs)}} or {{(keep identity xs)}}.
Observations:
* The more and larger the chunks the more the benefits. Custom (chunked) sequences with larger chunks (than 32) could additionally benefit from the changes.
* The same change makes things worse with {{keep-indexed}} and {{map-indexed}}, so they have not being changed.
* {{for}} macro is also chunk-aware, but it uses an explicit loop to handle {{:let, :when, :while}} cases that is difficult to separate from the chunk-buffer changes.
* {{chunk-last}} is a chunk-aware function to access the last element by chunk. Compared to {{doall}} (that walks the sequence one by one), {{chunk-last}} is more efficient for both before code and after changes code. The function is: {{(defn chunk-last [xs] (when-let [xs (seq xs)] (if-let [cn (chunk-next xs)] (recur cn) (last xs))))}}
* The initial ad-hoc pre-definition of {{dotimes}} before the definition of {{{map}}} can be removed from core. This is included in the patch.