From match.clj:

;; IMPORTANT NOTE: this calculation is very very slow,

;; we should look at this more closely - David

This has to do with computing the necessary column, ie, the column that contains the most number of matches we will have to do. This patch simplifies this logic which is spread out over several functions and puts it into one location. Further, it improves complexity. When computing the necessity of a column, only patters above wildcards in a column need to be computed. the way the previous implementation worked was to compute the value for each entry in the pattern matrix and then check above each entry to see if there were wildcards. Now, we just group by column and then reduce on the column, adding up the same values as last time and immediately returning the accumulator with `(reduced acc)`

if we hit a wildcard. This saves the repeated traversal of each column for each entry. A nice benefit is locality of everything inside of a single function.

I've timed this against master with the entries from `bench.clj`

and was actually a little surprised to see that there was no time speedup, however.