Since Clojure 1.5 will probably no longer target JVM 5, add support for the (Concurrent)?Navigable(Map|Set) interfaces. This should involve adding functions to PersistentTreeMap that descend the red-black tree to select the closest items.

Welcome! Please see the About page for a little more info on how this works.

0 votes

*Comment made by: michalmarczyk*

Just wanted to note that (link: https://github.com/clojure/data.avl text: data.avl) offers this capability, although with a Clojure interface. (Implementing descendingMap would be straightforward, but it would likely complicate the implementation of some features which I think may be more compelling. I can go into more detail if desired, though perhaps it'd be best to discuss that on data.avl's tracker.)

Implementing the entirety of NavigableMap in PersistentTreeMap would be trickier than it might seem, since presumably we'd want to maintain the property of count being O(1) on the resulting structure. (It's possible to do away with this requirement, of course, but see below.)

data.avl collections store rank information in nodes; this enables rank queries and the navigable map operations with O(1) count on their return values, but it comes at a cost in node size. I think it's useful to have the compact PTM and the more feature-packed, but node-heavy data.avl available for different use cases, particularly since data.avl is very fast for basic operations (as in, completely on a par with PTM for access to a node through a path of the same shape; some individual tree operations may be a little slower, but then AVL trees tend to be shallower).

0 votes

*Comment made by: jafingerhut*

I haven't checked in detail, but it seems like it may be true that NavigableMap and NavigableSet interfaces could be implemented using the existing clojure.core/subseq and clojure.core/rsubseq functions (plus first, seq, etc.)

0 votes

_Comment made by: michalmarczyk_

{{subseq}} and {{rsubseq}} + {{first}} solve the "neightbour lookup" part of {{NavigableMap}}, but not the submap methods.

Subsetting red-black trees in logarithmic time is entirely possible, but, as mentioned in my previous comment, not if you want to maintain the property that {{count}} on the resulting structure is, if not O(1), then at least significantly better than O\(n). This is because there is no way of knowing how many nodes are contained in any subrange of a BST without actually walking it or storing annotations in (at least some of) the nodes ahead of time.

I have described a possible workaround in my Conj talk last November and I have a sketch lying around somewhere, but it's not going to offer the type of convenience that data.avl does – the limitation described above is fundamental; it could be workable for use cases where {{count}} is (almost) never called.

As a side note, data.avl uses an implementation of neighbour lookup based on {{subseq}} et al. in its test suite, as a sanity check on the direct implementation used by {{clojure.data.avl/nearest}}. There's also an implementation of {{subrange}} (the universal subsetting function for data.avl collections) based on {{subseq}} in there; as you'd expect, it's extremely slow for all but the smallest output sizes.

{{subseq}} and {{rsubseq}} + {{first}} solve the "neightbour lookup" part of {{NavigableMap}}, but not the submap methods.

Subsetting red-black trees in logarithmic time is entirely possible, but, as mentioned in my previous comment, not if you want to maintain the property that {{count}} on the resulting structure is, if not O(1), then at least significantly better than O\(n). This is because there is no way of knowing how many nodes are contained in any subrange of a BST without actually walking it or storing annotations in (at least some of) the nodes ahead of time.

I have described a possible workaround in my Conj talk last November and I have a sketch lying around somewhere, but it's not going to offer the type of convenience that data.avl does – the limitation described above is fundamental; it could be workable for use cases where {{count}} is (almost) never called.

As a side note, data.avl uses an implementation of neighbour lookup based on {{subseq}} et al. in its test suite, as a sanity check on the direct implementation used by {{clojure.data.avl/nearest}}. There's also an implementation of {{subrange}} (the universal subsetting function for data.avl collections) based on {{subseq}} in there; as you'd expect, it's extremely slow for all but the smallest output sizes.