Functions folder and reducer broken for types nil and array + fix for typo

Question

Functions folder and reducer broken for types nil and array + fix for typo

asked Dec 29, 2013 in ClojureScript by jira

This currently doesn't work:

(->> nil

 (r/map identity)
 (r/reduce + 0))
 ; org.mozilla.javascript.JavaScriptException: Error: No protocol method IReduce.-reduce defined for type null

The reason for this is that reducers created by r/reducer or r/folder, invoke -reduce (of IReduce) directly. They thereby bypass the special case for nil in the function r/reduce.

An entirely analogous problem exists for collections of type array.
The patch CLJS-700 mistakenly defined coll-fold for the type cljs.core/IPersistentVector. This should have been cljs.core/PersistentVector. (There exists no protocol IPersistentVector in ClojureScript.)

I will shortly attach a patch that addresses all of the above problems by implementing IReduce for nil and array. The patch also includes unit tests.

18 Answers

Page:

jira · Answer 1 · 2013-12-29T20:22:19+0000

Comment made by: jdevuyst

Alternative patch in which r/reduce and r/fold treat arrays and nil as special cases -- as opposed to having arrays and nil implement IReduce and CollFold.

The functions r/reducer, r/folder, and the protocol methods of r/Cat now call r/reduce and r/fold instead of calling -reduce and coll-fold directly.

This patch also fixes a bug in the coll-fold implementation for Cat, which previously used (reducef) as the initial value rather than (combinef). The new code is copied and pasted from the Clojure implementation and uses the fork-join stubs.

jira · Answer 2 · 2013-12-30T14:23:26+0000

_Comment made by: dnolen_

The {{implements?}} should probably be a {{satisfies?}} in the second patch. Have you run any benchmarks of before/after the patch?

jira · Answer 3 · 2013-12-30T17:24:25+0000

_Comment made by: jdevuyst_

If I understand correctly then {{(satisfies? x y)}} is roughly equivalent to {{(or (implements? x y) (natively-satisfies? x y))}}.

If native types (nil, array, object currently) are treated as special cases then {{implements?}} seems more appropriate.

{{satisfies?}} works also, however, so I have attached a new 'alt' patch.

jira · Answer 4 · 2013-12-30T17:26:05+0000

Comment made by: jdevuyst

The first patch is in fact faster when running the following code:

(time (->> (repeat 1000 (vec (range 1000)))
           vec
           (r/mapcat identity)
           (r/map inc)
           (r/filter even?)
           (r/fold +)))

This takes about 700 msecs. Using the first patch this terminates 100-300 msecs faster. This is after repeated (but informal) testing.

I guess the worry is that the first patch would slow down other random code since it involves extending the types nil, array, and object. I'm not sure what exactly I should test for though.

(Note that the 2nd and 3rd patch also contain a fix for {{Cat}} and include more unit tests. The first patch should preferably not be applied as-is.)

jira · Answer 5 · 2013-12-30T17:35:40+0000

Comment made by: dnolen

Yeah you're timing too many things, including {{vec}}, {{range}}, lazy sequences. Also testing a small N. Take a look at the reducers example on the Mori README - https://github.com/swannodette/mori. Thanks.

jira · Answer 6 · 2013-12-30T18:52:32+0000

_Comment made by: jdevuyst_

I tried running the following code:

(let [coll (vec (repeat 1000 (vec (range 10))))]
  (time (doseq [n (range 1000)]
               (->> coll
                    (r/mapcat identity)
                    (r/map inc)
                    (r/filter even?)
                    (r/fold +)))))

Some of the last results I got were:

1st patch: 75680 msecs
2nd patch: 76585 msecs

Truth be told, although the first patch seemed to win most of the times, sometimes the second patch was faster.

One other thing I tried was removing the {{implements?}}/{{satisfies?}} check from the second patch and overriding the protocol method coll-fold for the type {{object}} instead (as in the first patch). This 'hybrid' approach generally (but not always) seemed to result in a slowdown.

I'm not sure how I should proceed. Should I perhaps just run both patches simultaneously for several minutes?

jira · Answer 7 · 2013-12-30T19:21:17+0000

Comment made by: dnolen

This is still a bad way to do timing, you're recording the cost of range and seq'ing. Use {{dotimes}}.

jira · Answer 8 · 2013-12-30T22:33:14+0000

Comment made by: jdevuyst

Hm. I guess the lazy sequence does lead to a lot of allocations.

Alright, I rewrote my test and ran it a few more times. I now also tested on both vectors and arrays.

Patch 1 needed a slight tweak. When coll-fold is invoked, patch 1 only specifies a fallback for type {{object}} (i.e. {{r/reduce}} is called). I had to add the same fallback for type {{array}}. (This is weird!)

So here are the results.

For vectors:

`
(let [coll (vec (repeat 100 (vec (range 100))))]
(time (dotimes [n 3000]

      (->> coll
          (r/mapcat identity)
          (r/map inc)
          (r/filter even?)
          (r/fold +)))))

`

Patch 1: 205872 msecs
Patch 2: 210756 msecs

For arrays:

`
(let [coll (into-array (repeat 100 (into-array (range 100))))]
(time (dotimes [n 3000]

      (->> coll
          (r/mapcat identity)
          (r/map inc)
          (r/filter even?)
          (r/fold +)))))

`

Patch 1: 123567 msecs
Patch 2: 119704 msecs

I ran my tests a few times and the results were pretty consistent. Patch 1 is faster for vectors and patch 2 is faster for arrays.

This makes sense.

In patch 1 {{reducer}} will call {{-reduce}} directly. In patch 2, {{reducer}} first calls {{r/reduce}}, which calls {{-reduce}} if the collection is a vector and {{array-reduce}} if it's an array. Hence patch 2 contains an extra function call in the case of vectors, but avoids invoking a protocol method on a native type in the case of arrays.

Using macros (or copy and paste) the extra function call can be avoided. Would that be worth trying or is it more important to keep the code clean?

--

I just realized that patch 2 is semantically slightly different from what Clojure does, although perhaps this is a bug in Clojure: https://groups.google.com/forum/#!searchin/clojure-dev/kv-reduce/clojure-dev/bEqECvbExGo/iW4B2vEUh8sJ. My suggestion to use a macro (or copy and paste) to avoid the extra function call in patch 2, could also fix this discrepancy.

jira · Answer 9 · 2013-12-30T22:42:14+0000

Comment made by: dnolen

How are you benchmarking this? With V8? JavaScriptCore? SpiderMonkey? In the browser? What optimization settings, etc.

jira · Answer 10 · 2013-12-30T22:48:05+0000

Comment made by: jdevuyst

I used repljs (Rhino?). I'll test again in a more realistic setting tomorrow.

jira · Answer 11 · 2013-12-30T22:54:36+0000

Comment made by: dnolen

Yeah, benchmarking with Rhino isn't informative.

jira · Answer 12 · 2013-12-31T07:40:32+0000

Comment made by: jdevuyst

I compiled the same code (with n=3000) using {{cljs}} with {{"{:optimizations :advanced}"}}.

I then tested it in the latest stable releases of Firefox, Chrome, and Safari. I closed all my browsers. For each browser I then followed the following procedure:

Open the browser
Open the developer console
Run the benchmark for patch 1
Run the benchmark for patch 2
Run the benchmark for patch 1 and write down the result
Run the benchmark for patch 2 and write down the result
Close the browser

Firefox:
- Patch 1. Vectors: 26057 msecs
- Patch 1. Arrays: 25026 msecs
- Patch 2. Vectors: 26258 msecs
- Patch 2. Arrays: 36653 msecs
- Summary: Patch 1 is faster for vectors and arrays

Chrome:
- Patch 1. Vectors: 7804 msecs
- Patch 1. Arrays: 7092 msecs
- Patch 2. Vectors: 7754 msecs
- Patch 2. Arrays: 6768 msecs
- Summary: Patch 2 is faster for vectors and arrays

Safari:
- Patch 1. Vectors: 167230 msecs
- Patch 1. Arrays: 108780 msecs
- Patch 2. Vectors: 173940 msecs
- Patch 2. Arrays: 110012 msecs
- Summary: Patch 1 is faster for vectors and arrays

I'm not sure what to make of this.

jira · Answer 13 · 2013-12-31T08:47:41+0000

Comment made by: jdevuyst

I have attached a new version of the first patch.

This patch fixes an issue with {{r/Cat}}. (This issue was also addressed in the second and third patch. A unit test is included.).

This patch also fixes {{r/fold}} for arrays.

To summarize, a choice needs to be made between the following patches.

CLJS-736-patch-1-redux.patch
CLJS-736-alt.patch (uses implements?) / CLJS-alt-satisfies.patch (uses satisfies?)

The implementation details are patch-1-redux is more similar in spirit to the Clojure source code. The alt patches are more similar in spirit to the ClojureScript source code.

As explained above, the alt patches are semantically a bit different from the original Clojure source—but it's not clear which behavior is 'right'.

jira · Answer 14 · 2014-01-16T23:27:39+0000

Comment made by: dnolen

The benchmarks would be more informative if they explained the performance before and after that patch.

jira · Answer 15 · 2014-01-18T17:55:19+0000

Comment made by: jdevuyst

{{r/reduce}} previously didn't work for nil or JavaScript arrays.

One reason why I have trouble recommending a patch is that I don't know what use case you would like to optimize for.

Functions folder and reducer broken for types nil and array + fix for typo

Please log in or register to add a comment.

Please log in or register to answer this question.

18 Answers

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Categories

Functions folder and reducer broken for types nil and array + fix for typo

Please log in or register to add a comment.

Please log in or register to answer this question.

18 Answers

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Please log in or register to add a comment.

Related questions

Categories