3.3 A Brief Introduction to Big-O

This book uses “Big-O” notation to characterize the time complexity of operations. Most readers are probably familiar with it, but for those who are not, this section covers the basics. If you have already studied the topic, feel free to skip it.

Time complexity — more accurately, asymptotic time complexity — is a way to characterize how the worst-case running time of an algorithm grows with the size of its input, as that input gets arbitrarily large. It is not the only important measure of performance! If you have two implementations of an algorithm, one of which always takes 100 times longer than the other — perhaps it is running on a very slow interpreter — these two implementations nonetheless have the same time complexity. Time complexity abstracts away from these constant-factor differences to focus on the dependence of running time on input size.

Imagine, for example, that there is some function we want to compute on collections of some kind, and that we have two algorithms to do it. Algorithm A works by performing some simple operation on every pair of elements of the input. If the collection is of size n, there are roughly n² such pairs, so the time taken by this algorithm must grow in proportion to n². Algorithm B performs a more complex operation on every element by itself; let’s suppose that this more complex operation takes time independent of the size of the collection. The time taken by algorithm B is then proportional to n.

Even if algorithm A is faster for small inputs — even if it’s much faster — the fact that its running time grows as the square of the size of the input means that there must exist an input large enough that algorithm B will beat A on that input; and as the input gets larger yet, algorithm B’s advantage will only grow.
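The crossover can be made concrete by counting abstract “steps” instead of measuring time. In this sketch (the operation counts are made up for illustration), algorithm A performs one cheap step per pair of elements, while algorithm B performs one step per element that costs 100 times as much:

```python
def steps_a(n):
    return n * n        # one cheap step per pair of elements: quadratic

def steps_b(n):
    return 100 * n      # one expensive step per element: linear

# B's constant factor makes it slower on small inputs, but A's
# quadratic growth overtakes it once n exceeds that factor.
for n in (10, 100, 1000):
    print(n, steps_a(n), steps_b(n))
```

At n = 10, A does 100 steps to B’s 1,000; at n = 100 they tie; at n = 1,000, A does a million steps to B’s hundred thousand, and the gap keeps widening from there.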

We say that algorithm A has quadratic or O(n²) time complexity, and B has linear or O(n) time complexity. Some commonly encountered complexity classes — including the ones used in this book — are, in increasing order:

constant, O(1)

The running time is independent of the size of the input.

logarithmic, O(log n)

The running time is proportional to the logarithm of the size of the input. Note that the base of the logarithm doesn’t matter, because logarithms of different bases are related by a constant factor, and time complexity abstracts away constant factors.

It is common for algorithms to have logarithmic time complexity when they involve splitting the input into roughly-equally-sized partitions, selecting one of those, and repeating until the current partition picks out a single element. A familiar example is binary search on a sorted vector, which maintains a range of indices within which the value being searched for must lie if present; at each iteration, it compares the item at the center of that range against the search key, and based on the result of that comparison, discards the top or bottom half of the range. Finding a path through an ordered binary tree similarly discards half the remaining tree at each step.
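The binary-search loop described above can be sketched as follows (a minimal Python version; the function name and interface are illustrative):

```python
def binary_search(sorted_items, key):
    """Return an index of `key` in `sorted_items`, or None if absent."""
    lo, hi = 0, len(sorted_items)      # candidate range: indices [lo, hi)
    while lo < hi:
        mid = (lo + hi) // 2           # item at the center of the range
        if sorted_items[mid] == key:
            return mid
        elif sorted_items[mid] < key:
            lo = mid + 1               # discard the bottom half
        else:
            hi = mid                   # discard the top half
    return None                        # range is empty: key not present

# Each iteration halves the remaining range, so searching n elements
# takes about log2(n) iterations: logarithmic time.
```

Because the range shrinks by half each time, doubling the input size adds only one more iteration, which is exactly the logarithmic growth described above.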

linear, O(n)

The running time grows at the same rate as the size of the input. Algorithms that perform an operation on every element of the input must have at least linear time complexity.

linearithmic, O(n log n)

The running time grows as the size of the input times its logarithm, that is, slightly faster than linearly. Algorithms that perform an operation on every element of the input, where the time for that operation is logarithmic in the size, will have this time complexity. Many sorting algorithms are in this class.
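Merge sort is one classic member of this class: splitting the input in half repeatedly produces about log n levels of recursion, and the merging done across each level touches all n elements. A compact (not in-place) Python sketch:

```python
def merge_sort(items):
    """Return a sorted copy of `items` (stable, O(n log n))."""
    if len(items) <= 1:
        return items                      # base case: already sorted
    mid = len(items) // 2
    left = merge_sort(items[:mid])        # about log n levels of splitting...
    right = merge_sort(items[mid:])
    merged = []                           # ...with O(n) merge work per level
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])               # append whichever side remains
    merged.extend(right[j:])
    return merged
```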

quadratic, O(n²)

The running time grows as the square of the size of the input.

Most of the time, in practice, constant-time operations are very fast — they could require as little as one machine instruction — and logarithmic-time operations are fast enough not to worry about. Linear time is generally acceptable for bulk operations, and linearithmic (O(n log n)) time isn’t much worse — again, the difference is rarely worth worrying about. And these tend to be the best time complexities possible for many bulk operations (e.g. sorting), anyway. Quadratic time has the potential to be a problem, though in practice, people do live with quadratic or even cubic (O(n³)) algorithms when nothing better is available. Beyond that are all the higher-order polynomials, and then exponential time (O(kⁿ) for some constant k) — these are considered increasingly intractable.

Again, time complexity is not all there is to performance; there are situations where constant-factor differences are more important. But usually, time complexity is the first consideration, because when programs are put to actual use, there tend to be at least a few users who run them on very large inputs.

One more point, about the word “asymptotic”: this refers to the time taken by the algorithm as the input size grows without bound. So it’s not something you can measure, because all measurements must be made with specific, finite inputs. It can be arrived at only by analysis of the algorithm in question. The most that measurements can tell you is how the algorithm performs on a certain range of input sizes.