Welcome! Please see the About page for a little more info on how this works.

+9 votes
in Clojure by

Whilst doing performance work recently, I discovered that unrolling to single assoc calls were significantly faster than using multiple keys (~10% for my particular application). Zachary Tellman then pointed out that clojure.core doesn't unroll assoc at all, even for the case of relatively low numbers of keys.

We already unroll other performance critical functions that call things via apply, e.g. update https://github.com/clojure/clojure/blob/master/src/clj/clojure/core.clj#L5914, but assoc (which is, I think in the critical path for quite a bunch of applications and libraries), would likely benefit from this.

I have not yet developed patches for this, but I did some standalone benchmarking work:

https://github.com/yeller/unrolling-assoc-benchmarks

benchmark results:

code: https://github.com/yeller/unrolling-assoc-benchmarks/blob/master/src/bench_assoc_unrolling.clj

| |1 |2 |3 |4 |
| :-- | :-- | :-- | :-- | :-- |
| empty array map (not unrolled) | 23ns | 93ns | 156ns | 224ns |
| empty array map (unrolled assoc) | N/A | 51ns | 80ns | 110ns |
| | | | | |
| 20 element persistent hashmap (not unrolled) | 190ns | 313ns | 551ns | 651ns |
| 20 element persistent hashmap (unrolled assoc) | N/A | 250ns | 433ns | 524ns |
| | | | | |
| record (not unrolled) | 12ns | 72ns | 105ns | 182ns |
| record (unrolled assoc) | N/A | 21ns | 28ns | 41ns |

Each measurement was made in a separate JVM, to avoid JIT path dependence.

Benchmarks were ran on a commodity server (8 cpus, 32gb ram), with ubuntu 12.04 and a recent release of Java 8. Attached are cpuinfo, uname and java -version output.

Relatively standard JVM production flags were enabled, and care was taken to disable leiningen's startup time optimizations (which disable many of the JIT optimizations).

Benchmarks can be run by cloning the repository, and running script/bench

There's one outstanding question for this patch: How far should we unroll these calls? update (which is unrolled in the 1.7 alphas) is unrolled to 3 arguments. Adding more unrolling isn't difficult, but it does impact the readability of assoc.

Patch: CLJ-1656-v5.patch

25 Answers

0 votes
by

Comment made by: alexmiller

Yes, inlining works through var invocation.

0 votes
by

Comment made by: tcrayford

Michael,

That group is just an uploaded version of clojure master with your patches applied, built in just the same way as before (you should be able to check out the repo and replicate).

0 votes
by

Comment made by: alexmiller

The patch CLJ-1656-v5.patch doesn't seem to do anything with the old version of assoc (in core.clj around line 179)?

The new one needs to have the arglists and other stuff like that. I'm not sure about the macro/private vars in there either. Did you try leveraging RT.assocN() with a vector?

Are there existing tests in the test suite for assoc with N pairs?

0 votes
by

Comment made by: michaelblume

The dependencies in clojure.core were such that assoc needed to be defined well before syntax-quoting, so I just let it be defined twice, once slower, once faster. I'll put up a patch with arglists. Does it need an arglist for every new arity, or are the existing arglists enough? (I'm afraid I'm not 100% solid on what the arglists metadata does) There is an annoying lack of existing tests of assoc. I have a generative test in my tree because that seemed more fun than writing cases for all the different arities. I can post it if it seems useful, it might be overkill though.

0 votes
by

Comment made by: michaelblume

Here's the test patch I mentioned, it's even more overkill than I remembered

0 votes
by

Comment made by: michaelblume

There, code and test.

This also checks that assoc! is passed an even number of kvs in the varargs case, which is the behavior of assoc. The test verifies that both assoc and assoc! throw for odd arg count.

0 votes
by

Comment made by: alexmiller

The existing arglist is fine - it just overrides the generated one for doc purposes.

Did you try any of the RT.assocN() stuff?

I guess another question I have is whether people actually do this enough that it matters? :)

0 votes
by

Comment made by: michaelblume

Updated patch to apply to master

0 votes
by

Comment made by: alexmiller

This still needs assessment on frequency of different counts. Based on some very quick checking, passing two kvs is common enough to matter. Passing three is far less common and so on. But I'd love to see some rough idea of frequency from running some stuff. I think the current unrolling is too much (2-3 kvs is probably sufficient).

0 votes
by
Reference: https://clojure.atlassian.net/browse/CLJ-1656 (reported by tcrayford)
...