# Can I make this routine for scoring a graph bisection more efficient?

My code is spending most of its time scoring bisections: determining how many edges of a graph cross from one set of nodes to the other.

Assume `bisect` is a set of half of a graph's nodes (ints), and `edges` is a list of (directed) edges `[ [n1 n2] ...]` where `n1,n2` are also nodes.

``````(defn tstBisectScore
"number of edges crossing bisect"
([bisect edges]
(tstBisectScore bisect 0 edges))

([bisect nx edge2check]
(if (empty? edge2check)
nx

(let [[n1 n2] (first edge2check)
inb1 (contains? bisect n1)
inb2 (contains? bisect n2)]
(if (or (and inb1 inb2)
(and (not inb1) (not inb2)))
(recur bisect nx (rest edge2check))
(recur bisect (inc nx) (rest edge2check))))

)))
``````

The only clues I have via sampling the execution of this code (using VisualVM) shows most of the time spent in `clojure.core\$empty_QMARK_`, and most of the rest in `clojure.core\$contains_QMARK_`. (`first` and `rest` take only a small fraction of the time.)

Any suggestions as to how I could tighten the code?

`(empty? x)` is just `(not (seq x))`, and that `seq` can potentially be costly. If you know for sure that `edge2check` is always, for example, a vector, then you can replace that `(empty? edge2check)` with `(zero? (count edge2check))`. Or an even lower-level `(.isEmpty ^java.util.List edge2check)`. But given that you call `rest` on it, it stops being a vector after `recur`, so you'd have to use `subvec` instead of `rest`. Or, alternatively, invert the order of the elements in that vector and call `peek` and `pop` instead of `first` and `rest`, respectively.

Similar thing with `contains?` - you can replace `(contains? bisect x)` with `(.contains ^clojure.lang.IPersistentSet bisect x)` if you always know for sure that `bisect` is a persistent set.

edited
thanks very much @Eugene, these seem like excellent suggestions.  i implemented both your peek/pop and IPersistentSet ideas:

(defn tstBisectScore2
([bisect edges]
(tstBisectScore2 bisect 0 (vec edges)))

([bisect nx edge2check]
(if (zero? (count edge2check))
nx

(let [[n1 n2] (peek edge2check)
inb1 (.contains ^clojure.lang.IPersistentSet bisect n1)
inb2 (.contains ^clojure.lang.IPersistentSet bisect n2)]
(if (or (and inb1 inb2)
(and (not inb1) (not inb2)))
(recur bisect nx       (pop edge2check))
(recur bisect (inc nx) (pop edge2check))))

)))

but timing old vs. new shows only slight (6%) improvement?

(time (dotimes [i 100000]
(hpclj.test/tstBisectScore1 rp g6fe)))
"Elapsed time: 6103.62275 msecs"

(time (dotimes [i 100000]
(hpclj.test/tstBisectScore2 rp g6fe)))
"Elapsed time: 5732.2645 msecs"

i know timing clojure has fine-points (startup?), so maybe this simple experiment is insufficient?  thanks again in any case.
For benchmarks, it's better to use something like Criterium. But that beside, that call to `vec` is probably what's taking time now. What I meant in my original answer, is that `g6fe` should be a vector in the first place, if possible.
no, the `g6fe` comes as list of edges (from the loom graph package).  i've not played with Criterium before but will try that.
curiouser and curiouser!  criterium makes your ideas seem slightly (3%) SLOWER:

(criterium/quick-bench (tstBisectScore1 rp g6fe))

Evaluation count : 11148 in 6 samples of 1858 calls.
Execution time mean : 54.087476 µs
Execution time std-deviation : 387.196475 ns
Execution time lower quantile : 53.651538 µs ( 2.5%)
Execution time upper quantile : 54.485682 µs (97.5%)