Histogram Construction (Modern Common Lisp with FSet)

2.1 Histogram Construction

Suppose we have a sequence of numbers; I’ll assume here they’re in an FSet seq. We want to calculate the data for a histogram, which divides the input axis into evenly-spaced buckets, then counts how many of the numbers fall into each bucket. (We’re just going to prepare the bucket counts; actually plotting the histogram is left as an exercise for the reader.)

An easy way to do this is with an FSet bag. Here’s one way to write it:

(in-package :fset2-user)

(defun histogram-1 (nums bucket-width)
  (let ((result (wb-bag)))
    (do-seq (num nums)
      (includef result (* bucket-width (floor num bucket-width))))
    result))

I’ve used wb-bag instead of the default CHAMP bags because a wb-bag keeps its elements in sorted order, so the result will be printed in increasing bucket order.
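To see where each number lands, it can help to look at the bucket computation on its own. The following is a minimal sketch in plain Common Lisp; bucket-base is a hypothetical helper name used only for illustration and is not part of FSet.

```lisp
;; Hypothetical helper (not part of FSet): the bucket base that histogram-1
;; computes for each number.
(defun bucket-base (num bucket-width)
  ;; FLOOR with two arguments divides and rounds the quotient toward negative
  ;; infinity; only its first return value (the quotient) is passed on to *.
  (* bucket-width (floor num bucket-width)))

(bucket-base 47.3 20.0)   ; => 40.0
(bucket-base 40.0 20.0)   ; => 40.0, a value on a boundary starts its own bucket
(bucket-base -5.0 20.0)   ; => -20.0, negative inputs round down, not toward zero
```

Because floor rounds down rather than toward zero, each bucket covers a half-open interval that includes its base and excludes the next one, and that stays true for negative inputs.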

Go ahead, try it now. You can generate some sample data by doing this:

(defparameter *nums* (gmap (:result seq)
                           (fn (_) (+ (random 100.0) (random 100.0) (random 100.0)))
                           (:arg index 0 1000)))

Each sample is the sum of three uniformly distributed random numbers, so the samples fall between 0 and 300 and should begin to approximate a normal distribution, better known as a bell curve. Here’s a sample result I got:

FSET2-USER> (histogram-1 *nums* 20.0)
#{% #%(20.0 13) #%(40.0 16) #%(60.0 59) #%(80.0 96) #%(100.0 116) #%(120.0 141) #%(140.0 138)
   #%(160.0 141) #%(180.0 106) #%(200.0 77) #%(220.0 57) #%(240.0 25) #%(260.0 15) %}

In each pair, the first number is the bucket base; the second is the number of samples that fell in the bucket. So 13 samples were at least 20.0 and less than 40.0, and so on. You can get a more detailed view with a smaller bucket width; play with it!
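Individual counts can also be read back out of the bag rather than off the printed form. The sketch below assumes that histogram-1 is already defined as above, and that the FSet operations multiplicity and size are accessible from the fset2-user package; the five input numbers are made up for illustration.

```lisp
(in-package :fset2-user)

;; Assumes HISTOGRAM-1 from above is already defined in this package.
(let ((h (histogram-1 (seq 25.0 31.0 47.0 52.0 58.0) 20.0)))
  ;; Two of the inputs fall in the bucket based at 20.0,
  ;; and three in the bucket based at 40.0.
  (list (multiplicity h 20.0)
        (multiplicity h 40.0)
        ;; SIZE on a bag counts repeated elements, so it should equal
        ;; the number of inputs.
        (size h)))
```

The same check applies to the larger sample: the thirteen counts printed above add up to 1000, the number of values in *nums*.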

Personally, I would write it using GMap:

(defun histogram-2 (nums bucket-width)
  (gmap (:result wb-bag) (fn (x) (* bucket-width (floor x bucket-width)))
        (:arg seq nums)))
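A quick way to convince yourself that the two versions agree is to compare their results on the same data. This is only a sketch: it assumes histogram-1, histogram-2, and *nums* are all defined as above, and it uses FSet's equal? predicate to compare the two bags by content.

```lisp
(in-package :fset2-user)

;; Assumes HISTOGRAM-1, HISTOGRAM-2 and *NUMS* from above are already defined.
;; EQUAL? compares FSet collections by content, so this should be true
;; whenever both functions put every number in the same bucket.
(equal? (histogram-1 *nums* 20.0)
        (histogram-2 *nums* 20.0))
```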