Lisp

McCarthy’s classic 1960 paper introducing Lisp is life-changing. Paul Graham’s homage The Roots of Lisp [PDF] to the original paper is more accessible, and corrects bugs.

Functional programming languages began with John McCarthy’s invention of Lisp, which showed computer scientists the importance of lambda calculus. Without Lisp, would we still be stuck with Turing machines for the theory of computation? Would we still be ignorant of the connections between programming and logic and category theory?

We’ll build a Lisp interpreter based on Graham’s paper:

For the command-line version, if it appears that an expression is incomplete, rather than complain immediately, our interpreter asks for another line. This feature requires importing the Parsec library’s Error module.

{-# LANGUAGE CPP #-}
#ifdef __HASTE__
import Haste.DOM
import Haste.Events
#else
import System.Console.Readline
import Text.ParserCombinators.Parsec.Error
#endif

The interpreter itself only needs a few imports:

import Control.Monad
import Data.List
import Text.ParserCombinators.Parsec

Tree Processor

We define a data structure that will hold a Lisp expression. Lisp should really be called Trep: it processes trees, not lists!

Atoms and the empty list are external nodes and non-empty lists are internal nodes on a full binary tree. The atom primitive of Lisp is slightly confusing because it returns true for all atoms and the empty list; presumably the name is-external-node is too unwieldy.

To emphasize that Lisp expressions are binary trees, we construct external node with Lf, which stands for "leaf", and use the infix constructor (:.) to build internal nodes from left and right subtrees.

The infix constructor also makes pattern matching easier. We give it right-associativity so we can write Lisp lists with fewer parentheses, because a flattened Lisp list is actually an unbalanced binary tree whose spine extends to the right and terminates with a sentinel node (nil), and the elements of the list are the left child nodes.

Our interpreter accepts a sequence of expressions, and later expressions may use labels defined by earlier expressions. To help with this, we add the Label constructor. This design is unclean, as it means a label in the wrong place will cause problems, but it suffices for our demo.

We also define Bad for crude but serviceable error-handling.

infixr 5 :.
data Expr = Lf String | Expr :. Expr | Label String Expr | Bad

instance Show Expr where
  show Bad         = "?"
  show (Label s _) = s
  show (Lf "")     = "()"
  show (Lf s)      = s
  show (l :. r)    = "(" ++ show l ++ showR r ++ ")" where
    showR (Lf "")  = ""
    showR (l :. r) = " " ++ show l ++ showR r
    showR x        = " " ++ show x

Haskell is more deserving of the term "list processor" because its lists are true singly-linked lists, and the language encourages their use thanks to extensive library support and notation such as [1, 2, 3] and (h:t). This may not always be good: lists are less general than trees because monoids are fussier than magmas.

While we can easily define other data structures, they will never be as easy to use. Infix constructors, the FTP proposal, and view patterns have improved matters, but lists are still king. Indeed, initially I wrote the interpreter with:

data Expr = Atom String | List [Expr]

so I could write:

  f "quote" [x] = x
  f "atom" [Atom _ ] = Atom "t"
  f "atom" [List []] = Atom "t"
  f "atom" [_      ] = List []
  f "car"  [List (h:_)] = h
  f "cdr"  [List (_:t)] = List t
  f "cons" [h,  List t] = List (h:t)
  ...

This felt like cheating, so ultimately I defined a tree structure from scratch.

7 Primitive Operators

To take advantage of Haskell’s pattern matching we introduce a function f to handle primitive operators. We write this function first, and worry about how it fits into the bigger picture later.

eval env = g where
  f "quote" (x :. Lf "") = x

  f "atom" (Lf _ :. Lf "") = Lf "t"
  f "atom" _               = Lf ""

  f "eq" (Lf x :. Lf y :. Lf "") | x == y    = Lf "t"
                                 | otherwise = Lf ""
  f "eq" (_    :. _    :. Lf "") = Lf ""

  f "car"  ((l :. r) :. Lf "") = l
  f "cdr"  ((l :. r) :. Lf "") = r
  f "cons" ( l :. r  :. Lf "") = l :. r

  f "cond" ((l :. r :. Lf "") :. rest) | Lf "t" <- g l = g r
                                       | otherwise     = f "cond" rest

I could have added a Nil sentinel to the Expr data type and used it instead of Lf "", but the code seemed more honest without it.

The f function is also a good place to handle label, and some Lisp shorthand:

  f "label" (Lf s :. e :. Lf "") = Label s e

  f "defun" (s :. etc) = g $ Lf "label" :. s :. (Lf "lambda" :. etc) :. Lf ""
  f "list" x = x

  f _ _ = Bad

Eval

Locally, we give the name g to the evaluation function for a given environment:

  • For labels, we add a binding to the environment, then evaluate the newly bound function on the given arguments.

  • For atoms, we look up its binding in the environment.

  • For lambda, we modify the environment accordingly before recursing.

  • Otherwise we expect the left child of the root to be a leaf. If it has a binding in the environment, we recursively evaluate it on the rest of the tree. (This means it’s possible to override the builtin functions.) If it’s cond, quote, defun, or label, then we call f without evaluating its arguments first. If not, then we do evaluate the arguments first before calling f.

Again we see the overheads incurred by using a non-list data structure in Haskell. In my initial list-based code, I could simply use the default map instead of mapL, and fromTree would be unneeded.

  g (Label s e :. r) = eval ((s, e):env) $ e :. r
  g (Lf s) | Just b <- lookup s env = b
           | otherwise              = Bad

  g ((Lf "lambda" :. args :. body :. Lf "") :. t) =
    eval (zip (fromLeaf <$> fromTree args) (fromTree $ mapL g t) ++ env) body
  g (Lf h :. t)
    | Just b <- lookup h env                  = g $ b :. t
    | elem h $ words "cond quote defun label" = f h t
    | otherwise                               = f h $ mapL g t
  g _ = Bad

  fromTree (Lf "")  = []
  fromTree (h :. t) = h:fromTree t
  fromLeaf (Lf x)   = x

  mapL f (Lf "")  = Lf ""
  mapL f a@(Lf _) = f a
  mapL f (l :. r) = f l :. mapL f r

Parser and User Interface

A simple parser suits Lisp’s simple grammar:

expr :: Parser Expr
expr = between ws ws $ atom <|> list <|> quot where
  ws   = many $ void space <|> comm
  comm = void $ char ';' >> manyTill anyChar (void (char '\n') <|> eof)
  atom = Lf <$> many1 (alphaNum <|> char '.')
  list = foldr (:.) (Lf "") <$> between (char '(') (char ')') (many expr)
  quot = char '\'' >> expr >>= pure . (Lf "quote" :.) . (:. Lf "")

oneExpr = expr >>= (eof >>) . pure

We use a list to store bindings. We permanently augment the global environment when the eval function returns a Label.

We preload definitions of the form (defun cadr (x) (cdr (car x))) from caar to cddddr.

addEnv (Label s e) = ((s, e):)
addEnv _           = id

preload = foldl' f [] $ concat $ genCadr <$> [2..4] where
  f env s = let Right expr = parse oneExpr "" s in addEnv (eval env expr) env

genCadr n = [concat ["(defun c", s, "r (x) (c", [h], "r (c", t, "r x)))"] |
  s@(h:t) <- replicateM n "ad"]

For this webpage, we set up one button to run the program supplied in the input textarea and place the results in the output textarea. Another button fills the input textarea with a predefined program.

#ifdef __HASTE__
main = withElems ["input", "output", "evalB",
                  "surpriseB", "surpriseP",
                  "quoteB", "quoteP"] $
  \[iEl, oEl, evalB, surB, surP, quoB, quoP] -> do
  surB  `onEvent` Click $ const $ do
    getProp surP "innerHTML" >>= setProp iEl "value" >> setProp oEl "value" ""
  quoB  `onEvent` Click $ const $ do
    getProp quoP "innerHTML" >>= setProp iEl "value" >> setProp oEl "value" ""

  evalB `onEvent` Click $ const $ do
    let
      run (out, env) expr = case r of
        Label _ _ -> (out, env1)
        r         -> (out ++ show r ++ "\n", env1)
        where r    = eval env expr
              env1 = addEnv r env
    s <- getProp iEl "value"
    setProp oEl "value" $ case parse (many expr >>= (eof >>) . pure) "" s of
      Left  e  -> "Error: " ++ show e ++ "\n"
      Right es -> fst $ foldl' run ("", preload) es
#else

The command-line variant of our program provides more immediate gratification, printing results on every newline if it terminates a valid expression.

expectParen (Expect "\"(\"") = True
expectParen (Expect "\")\"") = True
expectParen _                = False

repl pre env = do
  ms <- readline $ if null pre then "> " else ""
  case ms of
    Nothing -> putStrLn ""
    Just s  -> addHistory s >> case parse oneExpr "" $ pre ++ s of
      Left err  -> case find expectParen $ errorMessages err of
        Nothing -> do
          putStrLn $ "parse error: " ++ show err
          repl "" env
        _ -> repl (pre ++ s ++ "\n") env
      Right expr -> do
        let r = eval env expr
        print r
        repl "" $ addEnv r env

main = repl "" preload
#endif

And with that, our interpreter is done!

Surprised Again

McCarthy’s eval function must have been astonishing for its time. Graham calls it The Surprise. By adding a handful of primitives to lambda calculus, we can write a self-interpreter that fits on a page. To play with it, click the “Surprise Me!” button at the start of this page.

This is impressive, and more elegant than universal Turing machines. But if surprise is inversely proportional to self-interpreter simplicity and size, then prepare to be amazed by Mogensen’s one-line self-interpreter in plain unadulterated lambda calculus:

(λf.(λx.f(xx))(λx.f(xx)))(λem.m(λx.x)(λmn.em(en))(λmv.e(mv)))

The indelible impact of Lisp

The spirit of Lisp lives on in languages with modern conveniences.

Haskell is a fashionable five-star high-tech luxurious language, but stripping away its contemporary furnishings reveals a humble core surprisingly similar to Lisp. For example, take a typical function from Paul Graham’s On Lisp:

(defun our-remove-if (fn lst)
  (if (null lst)
    nil
    (if (funcall fn (car lst))
      (our-remove-if fn (cdr lst))
      (cons (car lst) (our-remove-if fn (cdr lst))))))

We can translate this almost word-for-word to Haskell:

if' a b c = if a then b else c

ourRemoveIf fn lst =
  (if' (null lst)
    []
    (if' (fn (head lst))
      (ourRemoveIf fn (tail lst))
      ((:) (head lst) (ourRemoveIf fn (tail lst)))))

The family resemblance is obvious, but it’s best to hide this beneath generous servings of Haskell syntax sugar:

ourRemoveIf _ []                 = []
ourRemoveIf f (x:xs) | f x       = ourRemoveIf f xs
                     | otherwise = x : ourRemoveIf f xs

In this small example we see various sweeteners: off-side rule]; pattern matching; guards; infix and prefix notation; concise notation for lists.

There is substance behind the delicious style. Patterns are sophisticated enough to be useful, yet elementary enough so compilers can detect overlapping patterns or incomplete patterns in a function definition. This catches bugs that would go unnoticed in a Lisp cond.

With this in mind, we see the source of our interpreter is almost the same as Graham’s, except it’s less cluttered and more robust. For example, for the 7 primitives, thanks to pattern matching, the function f reduces duplicated code such as eq (car e) gives the compiler more power, and detects errors when the wrong number of arguments are supplied.

By the way, as with Lisp, in reality we would never bother defining the above function, because ourRemoveIf = filter . (not .).

Less is more

Haskell is really the Core language, coated in heavy layers of syntax sugar. The Core grammar only takes a handful of lines:

data Expr b
  = Var   Id
  | Lit   Literal
  | App   (Expr b) (Arg b)
  | Lam   b (Expr b)
  | Let   (Bind b) (Expr b)
  | Case  (Expr b) b Type [Alt b]
  | Cast  (Expr b) Coercion
  | Tick  (Tickish Id) (Expr b)
  | Type  Type
  | Coercion Coercion
  deriving Data

type Arg b = Expr b

data Bind b = NonRec b (Expr b)
            | Rec [(b, (Expr b))]

Parallels with the Lisp are obvious, for example, Lam is lambda, Case is cond, and App is the first cons cell in a Lisp list, There’s bookkeeping for types, and source annotation (Tick) for profilers and similar tools, but otherwise Core and Lisp share the same minmalist design. Both are extensions of lambda calculus.

Core changes

Nonetheless, there are profound differences between Haskell and Lisp. Although almost invisible to the untrained eye, they have enormous implications.

Haskell has benefited from the experience of functional programmers, as well as advances in theory involving the foundations of mathematics and computer science.

  • Lisp programmers pioneered programming with pure functions. Haskell has taken purity to heart: its type system means the compiler knows which functions are pure and which are not, and steers programmers to isolate impure code in tiny functions, leaving the rest of the code pure.

  • The discovery that the monads of category theory applied to programming meant Haskell could stay pure yet handle I/O beautifully.

  • Lazy evaluation largely obviates the need for macros. Good language support for pure functions is a prerequisite for effective lazy evaluation. Programs that may never terminate with eager evaluation might terminate with lazy evaluation: in fact, if some evaluation strategy causes a program to terminate, then it will terminate with lazy evaluation.

  • The Hindley-Milner type system underpinning Haskell 98 lets us write code without a single type annotation, so it still feels like Lisp, yet an efficient type inference algorithm means the compiler rejects badly typed programs. Haskell has since gone beyond Hindley-Milner, but even so, type annotation is inconspicuous.

  • The Core language is built on System F, which is formalizes parametric polymorphism and also guarantees programs terminate. This guarantee is voided by Haskell’s support for recursion, but we can regain it by restricting recursion, as Idris does.

  • Core pushes beyond System F to enrich types, but not so much more that we need full-blown dependent types. (This explains the weirder parts of the Core grammar.)

  • The Curry-Howard correspondence and Haskell’s expressive type system means more of the program can be shown to be correct at compile-time.

  • Roughly speaking, Lisp reads (min 1 x) as (min (1 x)), while Haskell reads it as ((min 1) x). For function evaluation, Haskell’s parse tree is more troublesome because we must repeatedly traverse left from the root until we find the function to evaluate, rather than simply take the left child of the root. However, it’s a net win because a curried function is a subtree. We have combinatory logic to thank for left-associative function application.

  • Lisp is perhaps the best language for appreciating the equivalence of code and data, since a program is its own representation. However, although artistically and intellectually engaging, this blurring of the use-mention distinction trips up everyone from students (who have trouble with quote) and theorists (who have trouble formally reasoning about it). Haskell wisely chose a more explicit form of metaprogramming.


Ben Lynn blynn@cs.stanford.edu 💡