Choosing a Data Structure (Modern Common Lisp with FSet)

Next: Transients, Previous: Custom Comparison and Hash Functions, Up: FSet Data Structures [Contents][Index]

5.8 Choosing a Data Structure ¶

This discussion applies to the “setlike” types: sets, maps, and bags — the ones for which FSet gives you a choice between WB or CHAMP.

For most applications, CHAMP performs better, which is why I’ve made it the default in FSet 2. However, there are still reasons to use the WB implementations in some cases:

Controlling the ordering: You might want to maintain a collection in a particular order — for example, as mentioned above, you might want to keep a map whose keys are strings in lexicographic order. This might be just for convenience while viewing the collection in the debugger or printing it out, or maybe you specifically want to iterate over it in that order.
Using order-related operations: Closely related to the above is the fact that the WB collections support some operations that depend on the ordering, operations CHAMP doesn’t support. These include least, greatest, rank, at-rank, split-from, split-above, split-through, and split-below.
Better locality: For extremely large collections, it is possible that the WB collections could give better performance, in a situation where the elements or keys being added or looked up tend to cluster so that those referenced closely together in time also tend to be close together in the ordering. For instance, if you have a map indexed by long strings, and operations on the map often come in groups whose keys share a common prefix, there is a possibility that it would be better as a WB-map. This is because ordered trees tend to group items together in memory that are close together in the ordering; this is especially true if the Lisp implementation uses a compacting garbage collector. A CHAMP map is likely to scatter such keys across the memory used by the map, resulting in more cache misses and TLB faults. If you think you might be in a situation where this effect could matter, I recommend making some timing measurements.

These FSet functions are sometimes useful for compare-functions for WB-collections:

Generic Function: compare-lexicographically a b &key ¶
Method: compare-lexicographically (string string) &key ¶
Method: compare-lexicographically (list list) &key (val-compare-fn #’compare) ¶
Method: compare-lexicographically (vector vector) &key (val-compare-fn #’compare) ¶
Method: compare-lexicographically (seq seq) &key (val-compare-fn #’compare) ¶: A generic function with methods to compare strings, lists, or vectors lexicographically. In the case of lists and vectors, you can optionally supply a function to compare their elements with.

Function: compare-strings-lexicographically a b ¶
Function: compare-lists-lexicographically a b &optional (val-compare-fn #’compare) ¶
Function: compare-vectors-lexicographically a b &optional (val-compare-fn #’compare) ¶
Function: compare-seqs-lexicographically a b &optional (val-compare-fn #’compare) ¶: Non-generic functions to compare strings. lists. or vectors lexicographically. Use when you know the type and you want to avoid the generic function dispatch overhead (which is, however, only rarely noticeable). In the case of lists and vectors, you can optionally supply a function to compare the individual values (elements) with.