Why do r/foldcat return types give inconsistent results when passed directly to `vec`?

Question

asked May 30, 2023 in Sequences by Davis Shepherd
edited May 31, 2023 by Davis Shepherd

When using reducers, I occasionally use foldcat to "realize" the transformed collection at the end of the reducer chain. For downstream code, it's preferable to have a fully realized standard Clojure collection such as a vector. This often comes up when updating core collection code to use reducers for performance reasons.

Note the following works just fine:

(->> (repeat 2 3) (vec) (r/map inc) (r/foldcat) (vec))
=> [4 4]

If the input gets large enough, however, this triggers the parallel behavior, which changes the return type from ArrayList to Cat and causes the last vec to choke:

(->> (repeat 513 3) (vec) (r/map inc) (r/foldcat) (vec))
Execution error at user/eval19976 (form-init6687476487684153971.clj:1).
Unable to convert: class clojure.core.reducers.Cat to Object[]
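
One way to see the type switch is to check the class that foldcat returns on either side of the threshold. This is a minimal sketch, assuming the default fold partition size of 512 and the usual `r` alias for clojure.core.reducers; `foldcat-class` is a hypothetical helper name, not part of the library.

```clojure
(require '[clojure.core.reducers :as r])

(defn foldcat-class
  "Hypothetical helper: class of the value r/foldcat returns for n elements."
  [n]
  (class (r/foldcat (r/map inc (vec (repeat n 3))))))

;; 512 elements fit in a single partition, so no combining step runs.
(def small-class (foldcat-class 512))

;; 513 elements are split into two partitions and combined with r/cat.
(def large-class (foldcat-class 513))
```

Under those assumptions, small-class should be java.util.ArrayList and large-class should be clojure.core.reducers.Cat, which is the class named in the error above.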

I surmise that the "correct" way to do this is to use (into []) instead of vec in the last position, which works just fine.
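
A minimal sketch of that workaround, assuming the `r` alias for clojure.core.reducers; `foldcat->vec` is a hypothetical helper name used only for illustration. Because into goes through reduce, it should not care which type foldcat hands back.

```clojure
(require '[clojure.core.reducers :as r])

(defn foldcat->vec
  "Hypothetical helper: realize a reducer chain into a persistent vector.
  Uses into (reduce-based) rather than vec, so the concrete type returned
  by r/foldcat does not matter."
  [coll]
  (into [] (r/foldcat coll)))

;; Small input: foldcat returns a single-partition result.
(def small (foldcat->vec (r/map inc (vec (repeat 2 3)))))

;; Large input: foldcat combines partitions, yet into still works.
(def large (foldcat->vec (r/map inc (vec (repeat 513 3)))))
```

Both small and large come back as ordinary vectors, so downstream code sees the same type regardless of input size.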

That said, this came as fairly surprising behavior given that seq and into work just fine on both ArrayList and Cat. It's also quite dangerous because such code can function quietly in production until the input data grows past a certain threshold. Is there a pragmatic reason that seq works for Cat but vec doesn't?

1 Answer

alexmiller · Answer 1 · 2023-05-30T18:42:27+0000

Seems like there is a missing case here: vec ultimately delegates to LazilyPersistentVector.create(), which matches none of the covered cases and falls through to trying array adoption.

The cases available for PV creation (since 1.7) are:

IReduceInit - self reduction
ISeq - traversal via seq
Iterable - traversal via iteration
Object array adoption

The Cat object is reducible (via the CollReduce protocol, which is not available in LPV as it's written in Java), seqable (but not a seq), and foldable. Cat pre-dated both IReduceInit and the vec changes (CLJ-1546), both in 1.7.0, but trying with older releases, it seems like this has always been broken.
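
The mismatch can be probed from the REPL by testing a Cat against each of the cases listed above. This is a rough sketch, assuming the `r` alias for clojure.core.reducers and an input of 513 elements to force a Cat; the `checks` map and its keys are illustrative names only.

```clojure
(require '[clojure.core.reducers :as r])

;; 513 elements is enough to get a Cat back from foldcat.
(def c (r/foldcat (r/map inc (vec (repeat 513 3)))))

(def checks
  {:ireduce-init? (instance? clojure.lang.IReduceInit c) ; self reduction
   :seq?          (seq? c)                               ; ISeq traversal
   :iterable?     (instance? Iterable c)                 ; iteration
   :seqable?      (instance? clojure.lang.Seqable c)     ; seq can be called on it
   :reduced-sum   (reduce + 0 c)})                       ; reduce via CollReduce
```

The first three entries should be false, which is why vec falls through to array adoption, while :seqable? is true and the reduce call succeeds with a sum of 2052 (513 elements of 4).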

Seems like either Cat or LPV should do something more so that a Cat can be vec'ed. It would need some evaluation to determine which is the better path. My first impression is that Cat does what it advertises, and LPV is the one missing a case.

Logged as https://clojure.atlassian.net/browse/CLJ-2785.
