Vectors of primitives produced by {{vector-of}} do not support transients.
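
A quick way to check this on a given build is to test for {{clojure.lang.IEditableCollection}}, the interface that {{transient}} requires. This is only a sketch: the helper name {{supports-transients?}} is made up for illustration, and the result for the gvec depends on whether the build includes transient support for gvec.

```clojure
;; Hypothetical helper: true when (transient coll) is supported.
(defn supports-transients? [coll]
  (instance? clojure.lang.IEditableCollection coll))

(supports-transients? [1 2 3])                  ; ordinary persistent vector
(supports-transients? (vector-of :long 1 2 3))  ; gvec: depends on the build

;; into only uses transients when the target is editable; otherwise it
;; falls back to repeated conj, so the result is the same either way.
(into (vector-of :long) (range 5))
```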

core.rrb-vector implements transient support for vectors of primitives. Such transient-enabled vectors of primitives can be obtained in several ways: (1) passing a gvec instance to {{fv/catvec}} (only if RRB concatenation actually happens, which is not guaranteed) or to {{fv/subvec}}; (2) passing a gvec instance to {{fv/vec}}, which as of core.rrb-vector 0.0.11 simply rewraps the gvec tree in an RRB wrapper; (3) using {{fv/vector-of}} instead of {{clojure.core/vector-of}}.

Native support in gvec would still be useful as part of an effort to make supported functionality consistent across vector flavours (see CLJ-787). gvec is also simpler and has, and is likely to keep, a performance edge.
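
The core.rrb-vector routes listed above can be sketched roughly as follows. This assumes org.clojure/core.rrb-vector 0.0.11 or later is on the classpath and aliased as {{fv}}; the helper {{conj-all}} is a made-up name for illustration.

```clojure
;; Assumption: core.rrb-vector (0.0.11 or later) is available.
(require '[clojure.core.rrb-vector :as fv])

(def gv (vector-of :long 1 2 3 4))      ; plain gvec from clojure.core

(def v1 (fv/subvec gv 1 3))             ; (1) slice a gvec
(def v2 (fv/vec gv))                    ; (2) rewrap the gvec tree
(def v3 (fv/vector-of :long 1 2 3 4))   ; (3) build an RRB vector directly

;; Hypothetical helper: append xs through a transient.
(defn conj-all [v xs]
  (persistent! (reduce conj! (transient v) xs)))

(conj-all v2 [5 6])
(conj-all v3 [5 6])
```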

A port of core.rrb-vector's transient support to gvec is available here:

https://github.com/michalmarczyk/clojure/tree/transient-gvec

I'll bring it up to date with current master shortly.

See the clojure-dev thread for some benchmarks:

https://groups.google.com/d/msg/clojure-dev/9ozYI1e5SCM/BAIazVOkUmcJ

10 Answers

Comment made by: michalmarczyk

Here's the current version of the patch (0001-CLJ-1416-transients-hash-caching-for-gvec-Object-met.patch). It includes a few additional changes -- here's the commit message:

{quote}
CLJ-1416: transients, hash caching for gvec, Object methods for gvec seqs

  • Adds transient support to gvec
  • Adds hash{eq,Code} caching to gvec and gvec seqs
  • Implements hashCode, equals, toString for gvec seqs
{quote}

https://github.com/michalmarczyk/clojure/tree/transient-gvec-1.6

Comment made by: michalmarczyk

Here's an updated patch with some additional interop-related improvements.

The new commit message:

{quote}
CLJ-1416: transients, hash caching, interop improvements for gvec

  • Adds transient support to gvec
  • Adds hash{eq,Code} caching to gvec and gvec seqs
  • Implements hashCode, equals, toString for gvec seqs
  • Correctly implements iterator-related methods for gvec and gvec seqs
  • Introduces throw-unsupported and caching-hash (both marked private)
{quote}

Comment made by: jafingerhut

Patch 0002-CLJ-1416-transients-hash-caching-interop-improvement.patch, dated Jul 5 2014, no longer applies cleanly to the latest master after some commits were made to Clojure on Aug 29, 2014. It applied cleanly before that day.

I have not checked how easy or difficult it might be to update this patch. See the section "Updating Stale Patches" on this wiki page for some tips on updating patches: http://dev.clojure.org/display/community/Developing+Patches

Comment made by: michalmarczyk

Patch updated to apply cleanly to master.

Comment made by: bbloom

Maybe this should be another ticket, but it would affect this patch, so I'll mention it here:

The ArrayManager interface is an incomplete abstraction. The original gvec code plus the new transients codepaths rely on System/arraycopy, rather than .arraycopy on the manager object. This means that it's impossible to create gvecs backed by non-JVM arrays. Or, in my case, to create a gvec of nibbles backed by an array of longs. See https://gist.github.com/brandonbloom/441a4b5712729dec7467
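
To make the nibble use case concrete, here is a minimal sketch of packing sixteen 4-bit values into each long of a {{long-array}}. The names {{nibble-get}} and {{nibble-set!}} are hypothetical and are not taken from the linked gist; the sketch only shows the index arithmetic that an array manager for nibbles would need, not how it would plug into gvec.

```clojure
;; Hypothetical helpers: nibble i lives in word (quot i 16),
;; at bit offset (* 4 (rem i 16)).
(defn nibble-get ^long [^longs arr ^long i]
  (let [word  (aget arr (quot i 16))
        shift (* 4 (rem i 16))]
    (bit-and (unsigned-bit-shift-right word shift) 0xF)))

(defn nibble-set! [^longs arr ^long i ^long v]
  (let [idx   (quot i 16)
        shift (* 4 (rem i 16))
        mask  (bit-shift-left 0xF shift)]
    ;; Clear the old nibble, then OR in the low 4 bits of v.
    (aset arr idx (bit-or (bit-and (aget arr idx) (bit-not mask))
                          (bit-shift-left (bit-and v 0xF) shift)))
    arr))

(def packed (long-array 2))   ; room for 32 nibbles
(nibble-set! packed 0 0xA)
(nibble-set! packed 17 0x5)
(nibble-get packed 17)
```

Copying a range of nibbles that does not start on a word boundary cannot be expressed as a single {{System/arraycopy}} over the backing array, which is why the copy step would have to go through the manager.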

Comment made by: bbloom

The current patch has a bug on line 762:

    (let [node ^clojure.core.VecNode (.ensureEditable this node)

There is no such signature, only these:

    (ensureEditable [this]
    (ensureEditable [this node shift]

I discovered this problem using https://github.com/ztellman/collection-check

Comment made by: michalmarczyk

Thanks for the catch! Fixed patch attached. (There was in fact one more bug in editableArrayFor, also fixed in this version.)

Comment made by: michalmarczyk

As for gvecs of nibbles, could that be a separate ticket with patches building on top of this one?

On a separate note, core.rrb-vector could support vectors of nibbles as an extra feature (and adopt built-in gvec's representation if indeed the built-in gvec comes to support this feature at some point). Do you think that'd be useful?

Comment made by: michalmarczyk

Of course, vectors of nibbles could be implemented today with a separate vector type wrapping a gvec of longs, but the implementation would be more involved. I wonder what kind of performance difference there would be between the wrapper approach and the "nibble AM" (ArrayManager) approach…
Reference: https://clojure.atlassian.net/browse/CLJ-1416 (reported by michalmarczyk)