+1 vote
in Clojure by
(let [coll [{:a 1 :b 2}
            {:a 3 :b 4}
            {:a 7 :b 5}
            {:a 1 :b 4}]
      xf1 (comp (map :a) (filter #{1 3}))
      xf2 (comp (map :b) (filter #{2 4}))]
  (concat
    (into [] xf1 coll)
    (into [] xf2 coll)))

Is there a way to combine the two xforms above (xf1 and xf2) into a single xform, so that the collection coll is traversed only once?

4 Answers

0 votes
by

not really a combination of transducers, but:

(let [coll [{:a 1 :b 2}
            {:a 3 :b 4}
            {:a 7 :b 5}
            {:a 1 :b 4}]
      xf1 #(-> % :a #{1 3})
      xf2 #(-> % :b #{2 4})] 
  (into []
    (comp 
      (mapcat (juxt xf1 xf2))
      (remove nil?))
    coll))
=> [1 2 3 4 1 4]
by
nice try, but I'm looking for a way to combine transducers...
0 votes
by
  (let [coll [{:a 1 :b 2}
              {:a 3 :b 4}
              {:a 7 :b 5}
              {:a 1 :b 4}]
        xf1 (comp (map :a) (filter #{1 3}))
        xf2 (comp (map :b) (filter #{2 4}))]
    (eduction cat [(eduction xf1 coll) (eduction xf2 coll)]))

`eduction` can be replaced with `sequence`.
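To make the comparison concrete, here is a minimal sketch of both variants side by side, using the data from the question. The var names `via-eduction` and `via-sequence` are only illustrative. Note that either way `coll` is still reduced once per xform, so this does not give the single traversal the question asks for.

```clojure
(def coll [{:a 1 :b 2}
           {:a 3 :b 4}
           {:a 7 :b 5}
           {:a 1 :b 4}])
(def xf1 (comp (map :a) (filter #{1 3})))
(def xf2 (comp (map :b) (filter #{2 4})))

;; eduction: no caching, the xforms are re-run every time the result is reduced
(def via-eduction
  (into [] (eduction cat [(eduction xf1 coll) (eduction xf2 coll)])))

;; sequence: a lazy, cached seq built from the same xforms
(def via-sequence
  (vec (sequence cat [(sequence xf1 coll) (sequence xf2 coll)])))

[via-eduction via-sequence]
;; => [[1 3 1 2 4 4] [1 3 1 2 4 4]]
```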

0 votes
by
(defn facet [m]
  (fn [f]
    (let [m (into {} (for [[k v] m]
                       [k (v f)]))]
      (fn
        ([accum]
         (reduce
          (fn [accum1 [k accum2]]
            (assoc accum1 k ((get m k) accum2)))
          accum
          accum))
        ([accum value]
         (reduce
          (fn [accum1 [k accum2]]
            (assoc accum1 k ((get m k) accum2 value)))
          accum
          accum))))))

(let [coll [{:a 1 :b 2}
            {:a 3 :b 4}
            {:a 7 :b 5}
            {:a 1 :b 4}]
      xf1 (comp (map :a) (filter #{1 3}))
      xf2 (comp (map :b) (filter #{2 4}))]
  ((comp (partial apply concat)
         (juxt :a :b))
   (transduce
    (facet {:a xf1
            :b xf2})
    conj
    {:a []
     :b []}
    coll)))
by
A good place to look for interesting transducers is https://github.com/cgrand/xforms, and a good source of inspiration for the kinds of processing you can do with folds is https://github.com/aphyr/tesser
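For just two fixed xforms, the idea behind `facet` can be sketched with a plain `reduce`. This is a simplified illustration, not the answer's code: `one-pass` is a hypothetical helper name, and it assumes neither xform terminates early (no `take`, `take-while` and the like), since `reduced` values are not handled.

```clojure
(defn one-pass
  "Hypothetical helper: feeds every element of coll to both xforms in a
  single traversal, then concatenates the two results.
  Assumes neither xform signals early termination via `reduced`."
  [xf-a xf-b coll]
  (let [rf-a (xf-a conj)   ; reducing fn for the first xform
        rf-b (xf-b conj)   ; reducing fn for the second xform
        [a b] (reduce (fn [[a b] x]
                        [(rf-a a x) (rf-b b x)])
                      [[] []]
                      coll)]
    ;; call each completion arity so buffering xforms can flush
    (into (rf-a a) (rf-b b))))

(one-pass (comp (map :a) (filter #{1 3}))
          (comp (map :b) (filter #{2 4}))
          [{:a 1 :b 2}
           {:a 3 :b 4}
           {:a 7 :b 5}
           {:a 1 :b 4}])
;; => [1 3 1 2 4 4]
```

The accumulator is a pair of vectors, so each input element is visited once and handed to both step functions.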
0 votes
by
edited by

Oops, I realized I didn't actually answer your question. I'm moving my original one into a comment.

Using cgrand/xforms, a possible answer could be:

;; assumes the cgrand/xforms library is on the classpath
(require '[net.cgrand.xforms :as x])

(let [coll [{:a 1 :b 2}
            {:a 3 :b 4}
            {:a 7 :b 5}
            {:a 1 :b 4}]
      xf1 (comp (map :a) (filter #{1 3}))
      xf2 (comp (map :b) (filter #{2 4}))]
  (= (concat
      (into [] xf1 coll)
      (into [] xf2 coll))
     
     ;; do it in one pass
     (x/into []
       (comp (x/transjuxt [(comp xf1 (x/into [])) (comp xf2 (x/into []))])
             cat
             cat)
       coll)))
by
Good thing there's a search function around here. I recently ran into similar situations where I wanted to perform different transformations on a couple of input collections and end up with one collection, so I tried out a few different approaches.

Here's my REPL session for that. After some basic perf testing with criterium, I concluded that threading a collection through multiple `into` calls is probably the most idiomatic solution, and it performs no worse than the `catduce` one.

    (def input1 (into [] (range 100000)))
    (def input2 (into [] (range 999 9999)))
    (def xf1 (filter odd?))
    (def xf2 (map inc))
    (def xf3 (take 100))
    (def xf4 (comp (filter even?) (map dec) (take 1000)))
  
    ;; concat (do not like that it is lazy and there are several intermediate colls allocated)
    (def result (into []
                      (concat
                       (into [] xf1 input1)
                       (into [] xf2 input1)
                       (into [] xf3 input2)
                       (into [] xf4 input2))))
  
    ;; cat (still allocates intermediate collections)
    (= result
       (into []
             cat
             [(into [] xf1 input1)
              (into [] xf2 input1)
              (into [] xf3 input2)
              (into [] xf4 input2)]))
  
    ;; intos (no intermediate allocations, but multiple transient/persistent switches)
    ;; - this should not be too much of a problem though as they both are O(1)
    (= result
       (-> []
           (into xf1 input1)
           (into xf2 input1)
           (into xf3 input2)
           (into xf4 input2)))
  
    ;; reduce into
    ;; - almost the same as the intos solution, without having to write into multiple times
    (= result
       (reduce
         #(into %1 (second %2) (first %2))
         []
         [[input1 xf1]
          [input1 xf2]
          [input2 xf3]
          [input2 xf4]]))
  
    ;; catduce
    ;; - tried to see if it can be done without the many transient/persistent switches
    (defn catduce
      "adapted from clojure.core/cat"
      [rf]
      (fn
        ([] (rf))
        ([result] (rf result))
        ([result input+xf]
         (transduce (second input+xf) rf result (first input+xf)))))
  
    (= result
       (into [] catduce
             [[input1 xf1]
              [input1 xf2]
              [input2 xf3]
              [input2 xf4]]))
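One thing to keep in mind with `catduce`: `transduce` calls the completion arity of the reducing function it is given, so the outer `rf` gets completed once per input pair rather than once at the end. Whether that is harmless depends on the reducing function; if its completion step finalizes the accumulator (for example by making a transient persistent), later steps may fail. The sketch below is a hypothetical variant, `catduce-safe`, that hides the outer completion from the inner `transduce`. It is an illustration under those assumptions, not a benchmarked replacement.

```clojure
(defn catduce-safe
  "Hypothetical variant of catduce. Each input is an [input xf] pair.
  The inner transduce sees a step fn whose completion arity is a no-op,
  so the outer rf is only completed once, by the outer transduction."
  [rf]
  (fn
    ([] (rf))
    ([result] (rf result))
    ([result [input xf]]
     (let [step (fn
                  ;; inner completion: return the accumulator untouched
                  ([acc] acc)
                  ;; re-wrap a reduced value so early termination of the
                  ;; outer rf survives the inner reduce (as clojure.core/cat does)
                  ([acc x]
                   (let [ret (rf acc x)]
                     (if (reduced? ret) (reduced ret) ret))))]
       (transduce xf step result input)))))

(into [] catduce-safe
      [[(range 10)  (filter odd?)]
       [(range 5)   (map inc)]
       [(range 100) (take 2)]])
;; => [1 3 5 7 9 1 2 3 4 5 0 1]
```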
...