Lisp is often referred to as one of the earliest functional languages, but the way the term “functional” is used has shifted over time. Where it once referred to a language with first-class functions — functions that can be assigned to variables and passed to and returned from other functions — now, languages called “functional” are often pure functional, that is, languages with no side-effects at all, such as Haskell.
Also, when I refer to first-class functions, I should probably be putting “functions” in quotes. What we are used to calling functions in Lisp and many other languages are more correctly called procedures; they aren’t functions in the mathematical sense, as they often have side-effects.
I like to refer to Lisp as “semi-functional”. For programming in the small, Lisp allows code to be written functionally; of course, the whole point of FSet is to greatly expand the amount of such code that can be written. But on a larger scale, Lisp programs tend to have some amount of state that gets updated as well.
There are many languages in this “semi-functional” category: Scala, the ML family — even Python can be used this way.
I want to digress a bit here to ask a different question: can a program be considered functional if it uses local variable assignment? In my view, yes: if the only side-effects it performs are assignments to local variables, for most purposes it can be considered functional code.
At the machine-code level, after all, everything is side-effects: even a pure function in Haskell has to allocate and initialize a stack frame and store computed values in it, or at the very least, modify machine registers. We refer to a procedure as “pure” when it doesn’t have any side-effects that are detectable at the level of the language semantics. But a procedure that assigns to local variables, but performs no other mutating operation, similarly lacks side-effects that are detectable by its callers.
In fact, there exists an algorithm, the subject of this 1994 paper by Dan Weise et al., that can take a procedure written with local assignment and mechanically convert it to a functional form. (The paper doesn’t describe its algorithm in those terms, but close examination will reveal that the Value Dependence Graph is easily reinterpreted as a functional form of the input program.)
So although it can be a fun exercise to write an algorithm without assignment, typically using tail recursion instead, I don’t think you’re necessarily giving up on functional programming if you prefer to write with some local assignments. This point will be worth keeping in mind when we turn to Quasi-Mutating Operators.