You may have noticed that I refer to FSet collections as “functional” rather than “immutable”. There is more to functional semantics than mere immutability: functional collections also support “functional update” operations: operations that take a collection and return a new collection that is different in some specified way. Some “immutable” collections libraries do not provide these.
Functional collections are therefore also persistent, in the data-structure sense: since performing a functional update returns a new collection, the collection in its previous state still exists and can be operated on further if desired. (The term “persistent” is also used for data written to nonvolatile storage; this meaning is not intended here.)
To develop an exact definition of functional collections, or functional datatypes in general, let me start with some simple examples in C++ and Java. The first could be in either language:
int a = 3;
int b = a;
++a; // b is unchanged
Simple, right? Everyone knows that incrementing a doesn’t affect b. This is because
the type int has an immediate implementation: there is no way that b could be an
alias of a, because the assignment copies the value itself, not a pointer to the
value. We say that int has value semantics.
Now consider this, in Java:
int[] a = { 3 };
int[] b = a;
a[0] = 4; // b is also affected
Here we see a case where assignment creates an alias to a mutable object, so that mutations to the object are visible through the alias. This example demonstrates reference semantics.
Interestingly, to have reference semantics, a type must be aliasing (meaning that assignment creates an alias) and mutable. C++, for instancee, has collection types that are mutable but not aliasing:
std::vector<int> a{3};
std::vector<int> b = a;
a[0] = 4; // b is unchanged!
Why is b not an alias of a? Because std::vector redefines assignment to copy
the contents. So even though the implementation is pointer-mediated — as opposed to
immediate, like int — the class arranges for assignment not to create an alias. (Of
course, this being C++, you could create an alias using an explicit reference or pointer, but let’s
leave that aside. And yes, C++ pedants, I am overlooking the distinction between initialization and
assignment, which doesn’t matter for this discussion.)
Curiously enough, the result is that (in the absence of explicit references or pointers) the
standard C++ collections have value semantics despite being mutable! I said above that the Java
int[] example shows a case where assignment creates an alias to a mutable object. Here we
see that both aspects are required: the type must be aliasing and mutable to have reference
semantics; otherwise, it has value semantics.
Let’s have another look at that first example:
int a = 3;
int b = a;
++a; // b is unchanged
How do we want to describe the action performed by the third statement? The usual way is to say
that ++a is an abbreviation for a = a + 1; and that’s valid. But interestingly, it’s
equally valid to consider int to be a mutable but unaliasing type, like std::vector.
We could just as well say that the increment operation is mutating a in place; in fact,
that’s often how we think about it, and it’s also very likely to be how it’s implemented at the
machine-instruction level. We see, then, that if a type is unaliasing, the mutable/immutable
distinction no longer has the force we expect: it no longer determines whether the type has
reference or value semantics.
So the first requirement for a functional datatype is simply that it have value semantics.
Okay, back to Common Lisp and FSet. Collections have to have pointer-mediated implementations
because they’re of arbitrary size, and Common Lisp gives us no way to redefine assignment. So
aliasing is forced on us. (This is true of most languages.) To get value semantics, then, the FSet
types must be immutable. As we will see, though, FSet provides operations that superficially
resemble mutations; in practice, you can mentally pretend that FSet types are mutable but
unaliasing, like std::vector, although that’s not how they’re actually implemented — the
apparently mutating operations are constructing a new representation and assigning it back to the
variable holding the collection.
If that sounds odd to you as a CL programmer, let me assure you that it is not. Consider this example:
(let* ((a (list 4))
(b a))
(push a 3)) ; b is unchanged
Here the push form expands to (setq a (cons 3 a)), which obviously doesn’t touch
b. Although Lisp lists are technically a reference type, since cons cells can be mutated, in
practice they are rarely mutated once they are fully constructed. (By “fully constructed” I’m
referring to the fact that lists are often built backwards and then reversed in place with
nreverse; usually, once that reversal is done, the list is fully constructed.) So Lisp lists
are almost always used as a functional type, and the language supports this usage fairly well.