CHAMP (Compressed Hash-Array Mapped Prefix-tree) is the subject of a 2017 Ph.D. dissertation by Michael Steindorfer. It’s an evolution of the Hash Array-Mapped Trie (HAMT) invented by Phil Bagwell in 2001 [&&& ref; see Steindorfer].
CHAMP, like HAMT, is hash-based, rather than ordering-based. FSet’s CHAMP collections are significantly faster than the WB-tree collections at simple operations such as element/key updates and lookups, and several times faster at operations that combine two collections in some way: equality testing, union, intersection, etc. However, CHAMP is appropriate only for setlike types; there’s no CHAMP alternative to the WB-tree seqs.
So FSet has CHAMP implementations of sets, maps, and bags, named ch-set, ch-map, and
ch-bag respectively. As with the WB-tree types, point updates take O(log n) time, but
the constant factors are smaller. The combining operations — union, intersection,
etc. — are much faster on CHAMP; a factor of 4 seems to be typical, and I’ve seen it go
as high as 10.
One downside is that the order in which CHAMP yields elements during iteration is a rather complex function of their hash values. This ordering will appear random — it isn’t actually random, but it won’t make any sense from the application’s perspective.
In the event of a hash collision, where two unequal values nonetheless have the same hash value, FSet’s CHAMP implementation stores the collision sets as WB-trees within the CHAMP tree. This keeps the time complexities of operations down (e.g., insertion remains logarithmic) even when collision sets grow unboundedly, provided the comparison function can order most pairs of values. This design choice reduces the pressure on the hash function to return unique values, which in turn can have performance benefits. For example, if the keys of a CHAMP map are long FSet seqs, the time taken to hash an entire seq may become problematic. FSet avoids this problem by hashing only a bounded number of elements; this makes collisions more likely, as some subsets of the keys may have those elements in common, but the time complexities are unaffected nonetheless, because of this WB-tree fallback feature.