
When using reducers, I occasionally use foldcat to "realize" the transformed collection at the end of a reducer chain. For downstream code, it's preferable to have a fully realized standard Clojure collection such as a vector. This often comes up when updating core collection code to use reducers for performance reasons.

Note the following works just fine:

(require '[clojure.core.reducers :as r])
(->> (repeat 2 3) (vec) (r/map inc) (r/foldcat) (vec))
=> [4 4]

If the input gets large enough, however, this triggers the parallel behavior, which changes the return type from an ArrayList to a Cat and causes the last vec to choke:

(->> (repeat 513 3) (vec) (r/map inc) (r/foldcat) (vec))
Execution error at user/eval19976 (form-init6687476487684153971.clj:1).
Unable to convert: class clojure.core.reducers.Cat to Object[]
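
The type switch can be observed directly. This is a minimal sketch, assuming the default fold partition size of 512; foldcat-type is a hypothetical helper name introduced only for illustration:

```clojure
(require '[clojure.core.reducers :as r])

;; Hypothetical helper: class of the value r/foldcat returns
;; for a vector of n items run through the same chain as above.
(defn foldcat-type
  [n]
  (class (->> (repeat n 3) (vec) (r/map inc) (r/foldcat))))

(foldcat-type 2)   ; small input, reduced serially
(foldcat-type 513) ; just past the partition size, so the fold splits
```

On the versions I would expect this to apply to, the first call reports java.util.ArrayList and the second clojure.core.reducers.Cat.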

I surmise that the "correct" way to do this is to use (into []) instead of vec in the last position which works just fine.
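
A minimal sketch of that workaround, wrapped in a helper so the size of the input no longer matters. The name foldcat->vec is hypothetical, not part of clojure.core.reducers:

```clojure
(require '[clojure.core.reducers :as r])

;; Hypothetical helper: realize a reducer chain into a persistent vector.
;; into goes through reduce, which works for both return types of foldcat.
(defn foldcat->vec
  [coll]
  (into [] (r/foldcat coll)))

(def small (foldcat->vec (r/map inc (vec (repeat 2 3)))))
(def large (foldcat->vec (r/map inc (vec (repeat 513 3)))))
```

Both small and large come back as ordinary vectors, so downstream code does not need to know which path foldcat took.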

That said, this was fairly surprising behavior, given that seq and into work just fine on both the ArrayList and the Cat. It's also quite dangerous, because such code can function quietly in production until the input data grows past a certain threshold. Is there a pragmatic reason that seq works for Cat but vec doesn't?

1 Answer

Seems like there is a missing case not being covered here: vec ultimately delegates to LazilyPersistentVector.create(), which matches none of the covered cases and falls through to trying array adoption.

The cases available for PV creation (since 1.7) are:

  • IReduceInit - self reduction
  • ISeq - traversal via seq
  • Iterable - traversal via iteration
  • Object array adoption
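
The cases above can be sketched as a cond in the same order. This is only an illustration of the dispatch, not the actual Java code; vec-path is a hypothetical helper, and it ignores any fast paths vec may take before reaching LPV:

```clojure
(require '[clojure.core.reducers :as r])

;; Hypothetical helper: names the branch a value would fall into,
;; following the order of the list above.
(defn vec-path
  [x]
  (cond
    (instance? clojure.lang.IReduceInit x) :self-reduction
    (instance? clojure.lang.ISeq x)        :seq-traversal
    (instance? Iterable x)                 :iteration
    :else                                  :array-adoption))

(def small-result (->> (repeat 2 3) (vec) (r/map inc) (r/foldcat)))
(def large-result (->> (repeat 513 3) (vec) (r/map inc) (r/foldcat)))

(vec-path small-result) ; ArrayList is Iterable
(vec-path large-result) ; Cat matches nothing, so array adoption is tried
```

The small result lands on :iteration, while the Cat lands on :array-adoption, which is where the "Unable to convert" error comes from.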

The Cat object is reducible (via the CollReduce protocol, which is not available in LPV as it's written in Java), seqable (but not a seq), and foldable. Cat pre-dated the existence of IReduceInit and the vec changes (CLJ-1546), both in 1.7.0, but trying with older releases, it seems like this has always been broken.
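
Those properties can be checked at the REPL. A small sketch, with big and cat-traits as illustrative names only:

```clojure
(require '[clojure.core.reducers :as r]
         '[clojure.core.protocols :as p])

;; A Cat produced by a fold large enough to split.
(def big (->> (repeat 513 3) (vec) (r/map inc) (r/foldcat)))

;; Which of the relevant abstractions does the Cat participate in?
(def cat-traits
  {:reducible?   (satisfies? p/CollReduce big)
   :seqable?     (instance? clojure.lang.Seqable big)
   :seq?         (seq? big)
   :iterable?    (instance? Iterable big)
   :reduce-init? (instance? clojure.lang.IReduceInit big)})
```

Reducible and seqable come back true; seq?, Iterable, and IReduceInit come back false, which is why none of the LPV cases match even though (seq big) and (into [] big) both work.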

Seems like either Cat or LPV should do something more so that a Cat can be vec'ed. Would need some evaluation to determine which is the better path. My first impression is that Cat does what it advertises, and LPV is the one missing a case.

Logged as https://clojure.atlassian.net/browse/CLJ-2785.

Thank you for the detailed explainer @alexmiller! I’ll be curious to learn which approach y’all take.