Critiques of FSet (Modern Common Lisp with FSet)

Next: History of Functional Collections, Previous: Features of FSet, Up: Recommendations for Language Designers [Contents][Index]

10.2 Critiques of FSet ¶

Before I critique other languages and libraries, I think it’s only fair to say a little about what aspects of FSet I would not encourage others to duplicate.

WB-trees are obsolete ¶

Perhaps the most prominent is that weight-balanced trees (WB-trees) are considered obsolete; the consensus seems to be that, even among ordered tree data structures, there are alternatives that produce better-balanced trees. Chris Okasaki’s functional red-black trees are a commonly cited example, but there may be others worth considering.

Poorly-balanced trees make access slower, by increasing the average depth of the tree, requiring lookups to walk down more levels and perform more comparisons, on average. That said, the amount of potential improvement that WB-trees leave on the table is not large — around 10% in the worst case.

The CHAMP trees, which are now FSet’s default for sets, maps, and bags, are faster than any ordered-tree data structure anyway. WB-trees are needed only for the cases in which you want the collection maintained in some specific order. So switching to another kind of ordered tree has not seemed urgent to me. Still, if you’re starting a functional-collections project from scratch, it’s worth looking at the alternatives.

For seqs, it appears that a better choice would be Phil Bagwell and Tiark Rompf’s relaxed-radix B-trees (RRB-trees). These were invented for Clojure, because its oriiginal functional vector type had poor time complexity for insertion and concatenation. They should have noticeably better indexing performance than FSet’s wb-seq, at the cost of making insertion, subsequencing, and concatenation more expensive (these are all constant-factor differences; the time complexities are all logarithmic). This sounds like a trade worth making for most applications, but I’m not sure it’s a slam dunk. As I’ve argued earlier, I think random access into a sequence is relatively rare; clients more commonly iterate over the sequence or a subrange of it, and even with FSet’s WB-trees, that operation is as fast per element as it would be with RRB-trees. So again, changing data structures for FSet has not seemed urgent. Still, for a new library, RRB-trees would probably be the recommended option.

Overgrown or overengineered in places ¶

It’s not too unusual an experience for me to look at some part of FSet that I have written and wonder whether anyone has ever used it. The poster child for this phenomenon is probably the reader macros; I haven’t used them myself, and don’t know whether anyone else has. What I didn’t know when I wrote them was that the constructor macros would be entirely sufficient in practice.

When I was first adding custom orderings to FSet, I started with WB-sets. I didn’t want WB-sets that didn’t use the feature to pay the space cost — just one word in the wrapper object — of allowing custom orderings, so I bifurcated the class into wb-default-set and wb-custom-set, where only the latter has the field holding the ordering. I now think this was excessively fussy, though on the other hand, I don’t see the point of removing it. (I didn’t do this with any of the other types; they already have more space overhead, anyway, so the relative benefit would have been even smaller.)

Some features are historical ¶

The WB-tree implementations of replay sets and maps are clearly obsolete and I should probably remove them, in favor of the CHAMP versions. Iteration order is not an issue with the replay types, because it’s controlled by the client anyway.

No laziness ¶

FSet has no lazy collections, though it does define an iterator protocol — two of them, actually, one stateful and one functional — that can be used for similar purposes, if not as elegantly. (GMap argument generators are even more like lazy sequences.)