Bug-free Code

I love programming. Literally by barely lifting my fingers, I can turn dreams into reality. Accordingly, I hate bugs. Apart from delaying me, each bug rudely exposes my imperfections. Each bug inescapably shows I have less ability than I thought.

The puzzle-solving aspect of debugging can be fun and even educational, but this is negligible compensation. I’d much rather avoid bugs in the first place.

To err is human, so my only hope is to have the computer find mistakes for me.

Make No Mistake

Incredibly, in the right language, buggy code is impossible. The compiler simply refuses to build erroneous programs.

How can this be? Are we cheating by locking ourselves in a Sapir-Whorf prisonhouse such as the Newspeak of George Orwell’s 1984? Absolutely not. We can express just about any idea we want. We must forgo Turing-completeness, but we otherwise retain a lot of power.

Also, to be clear, no language can prevent us from writing, say, a program that correctly reverses a string when we really meant to write a program to detect palindromes. This is a different kind of slip-up: the program runs to specifications, but the specifications themselves are wrong.

Let’s first look at a suboptimal approach to bug prevention.

A Brief History of Types

We ignore theoreticians for the moment and focus on programmers in the field.

In the beginning, languages were untyped. While the freedom must have been exhilirating, early programmers must have also learned that a little complexity goes a long way. Undisciplined code quickly becomes hard to maintain and debug. For example, confusing a pointer to an integer with a pointer to a string might go undetected until the user encounters strange behaviour.

Over time, static type checking gained prominence. Despite the overhead, programmers appreciated the language preventing them from, say, confusing a number with a string.

So far so good. But then disaster struck. Well-intentioned but ill-informed programmers popularized an abomination known as object-oriented programming. After them, the deluge: meaningless discussions of is-a versus has-a; of the circle-ellipse “problem”; of the inheritance anomaly; of the pros and cons of multiple inheritance. Copy-and-paste programming gained the lofty title of “implementation inheritance”. Band-aids on self-inflicted wounds were termed “design patterns”.

There would be little harm if their only product was pseudo-philosophical gobbledygook. Unfortunately, because of their influence, two horrific languages, C++ and Java, became industry standards.

Their verbose and clumsy object-oriented type systems fooled the public into equating static typing with stifling boilerplate. This in turn led to a resurgence of untyped languages, except this time around they were called dynamically-typed languages. Object-oriented languages had made static typing so painful that many preferred to give up and just wing it.

Forgive me if I sound bitter. I was an impressionable undergraduate when the object-oriented junta were at the height of their power. I fell victim to the regime; indeed even Edgser W. Dijkstra’s pleas fell on deaf ears in those dark times.

At least I can end on a happier note. I escaped after years of struggle. C++ and Java are now mending their ways. C++11 supports type inference and lambda expressions, and Java 8 supports lambda expressions. The newer Go language is similar enough to C to have broad appeal, has had type inference and anonymous functions since day one, and notably lacks implementation inheritance. With luck, Go will also teach the unwashed masses more sensible approaches to modularity and concurrency.

Mathematics = Computer Science

If only we had listened to the theoreticians. In 1934, years before ENIAC was born, logicians discovered the Curry-Howard correspondence. Had we paid attention, today we would all know that:

  1. Theorems are equivalent to types in a Haskell-like language.

  2. Writing a proof of a given theorem is equivalent to writing an expression with a given type.

The second fact may seem unbelievable. Any old expression with the right type will do? Yes! However, recall languages like Haskell are fastidious with types. We are unable to sneak in, say, an I/O operation unless the type signature allows it. Indeed, as the type becomes more complex, it becomes tougher to find a program that fits.

The Curry-Howard correspondence is a fundamental deep connection between programming and mathematics that is so strong that we can readily see it for ourselves. If we look up the axioms of propositional logic, we find:

a -> (b -> a)

(that is, “a implies (b implies a)”), as well as:

(a -> (b -> c)) -> ((a -> b) -> (a -> c))

These are also precisely the types of Haskell’s const function and ap function in the Reader monad, also known as the K and S combinators:

-- K combinator
const :: a -> (b -> a)

-- S combinator
ap :: (a -> (b -> c)) -> ((a -> b) -> (a -> c))

In a typical overview of propositional logic, we learn that:

a -> a

(“a implies a”, also known as reflexivity of implication) follows from the above two axioms. This also happens to be the type of Haskell’s id function, which can be built from the above functions:

id :: a -> a
id = ap const const

or as they say in combinatory logic: “I = SKK”.

We can find Haskell equivalents for other logical connectives. Hence every logical formula is equivalent to a type, and proving a theorem by applying rules of inference to axioms is the same as writing an expression of a given type by composing language primitives. We normally think a compiler’s job is to check all our types match. We can now also view it as a proof-checker: a tool that ensures we are correctly applying the rules of inference.

We’ve focused on propositional logic. We get first-order logic and Peano arithmetic by introducing dependent types.

The Curry-Howard correspondence implies we can define a type that states: “for all arrays of integers A, there exists an sorted version of that array”, and moreover, any function that we define of type is guaranteed to sort a given array of integers correctly. Errors in our algorithm are equivalent to type mismatches. Bugs are impossible: if our program compiles, then it is correct.

A cure worse than the disease?

Although the above sounds promising, one does not simply tweak a run-of-the-mill sort routine until the compiler accepts it. Programs are proofs, so whatever we do will be equivalent to writing a machine-friendly proof that our sorting algorithm is correct. Today’s proof-checkers need substantial guidance: we must spell out each step in detail.

I’m far too impatient to tolerate this. Even mathematicians state proofs in broad strokes and expect their audience to fill in the blanks, with notable exceptions such as Russell and Whitehead’s proof that 1+1=2.

Some day, perhaps compilers will be so advanced that they can automatically fill in enormous blanks, but until then, I must accept that my code may contain bugs.

The good news is the Curry-Howard equivalence lives on one end of a continuum. On the opposite end, we find untyped languages where anything goes, and often crashes. We can choose a point on this lawless-to-flawless continuum that best suits us.

We can choose an untyped language that ditches all formalities, and thus is unable to catch bugs. We can choose a language where we provide detailed proofs of correctness and avoid all bugs. Or we can choose an intermediate language where we spend less on safety but still receive substantial protection.

Haskell occupies a sweet spot on this spectrum. Haskell types have a good power-to-weight ratio: they can almost be ignored, yet they also prevent us from making all kinds of mistakes. Haskell’s sharp line between pure and impure code, functional programming constructs, and type-related language extensions mean we can squeeze out a surprising amount out of the Curry-Howard equivalence.

Haskell Power

Paul Graham’s Revenge of the Nerds illustrates the power of Lisp by exhibiting a function that generates accumuators.

Accumulators are impure, so in Haskell we need something like Data.IORef.

Lisp has built-in functions that modify then return the value of a reference, which partly accounts for the succinct Lisp solution. In Haskell, these are separate functions, but we can define a helper to combine them:

lispf f r = modifyIORef r f >> readIORef r

Now we can solve the challenge in one line:

foo n = newIORef n >>= (\r -> pure $ \i -> lispf (+i) r)

We add a little test to check that if we increment n by two n times, and then by zero, we wind up with 3*n:

prop_triple n = (3*n ==) <$> do
  acc <- foo n
  replicateM_ n $ acc 2
  acc 0

The Haskell definition is longer, but not by much, and in fact, the extra characters are helpful. For example, pure tells us that the returned function has side effects, while the Lisp code lacks this information. Most languages mix up pure and impure functions.

Also, the type of the reference is important. For a concurrent program we might want Control.Concurrent.MVar instead of an IORef. The problem description fails to mention whether the generated accumulators should be thread-safe, but judging from the example solutions, it seems we may assume a single-threaded program. Swapping IORef for MVar is easy in Haskell; I suspect supporting concurrency in other languages may be tougher.

Naturally, we can always reduce mentions of references with point-free programming:

foo' n = flip (\i -> (lispf (+i))) <$> newIORef n

A more zealously point-free version [I’ve chosen fmap over (<$>) because I’m less fond of parentheses than Lisp programmers]:

foo'' = fmap (flip $ lispf . (+)) . newIORef

We now turn the tables and illustrate the power of Haskell with a problem: prove the reflexivity of implication using only the THEN-1 and THEN-2 axioms. Above, we saw a solution in Haskell:

p :: a -> a
p = ap const const

By Curry-Howard, we check this proof by compiling the above code.

How about other languages? If a language lacks a sufficiently powerful static type checker, the only recourse is to implement enough of one to carry through the above proof.

As computers take over more aspects of our lives, the importance of program correctness grows. Sometimes, the value of catching bugs early becomes obvious only in hindsight, and static analysis is retrofitted into existing systems: see MyPy; TypeScript. One day, we may see a follow-up to Greenspun’s Tenth Rule:

Any sufficiently productionized dynamically typed language is checked by an ad hoc informally-specified bug-ridden slow implementation of half of Haskell.

Extremist Programming

Suppose we insist on perfection. What languages can we use?

I first came across Coq, which emphasizes the mathematical side of the Curry-Howard correspondence. The user states theorems and specifies proof tactics to direct their proof. For example, the Four Colour Theorem can be proved wth Coq.

Since I was more interested in the programming side of Curry-Howard, I looked into Agda. I was initially excited after seeing examples such as a merge sort that can do no wrong.

However, Agda almost forces Emacs upon the user, and is irritatingly fussy. For example, the filename must match the mandatory module name, and when compiling, we must supply at least two "-i" options (one for the standard library and one for the current directory). I was unable to get a "Hello, World!" example running on my system. Perhaps its Ubuntu package needs work. In any case, Agda refused to cooperate despite due diligence.

I moved on to Idris. Finally! Idris was trivial to install, works well with any text editor, and "Hello, World!" is easy to write:

main : IO ()
main = putStrLn "Hello, world!"

and easy to run:

$ idris hello.idr -o hello
$ ./hello
Hello, World!

Haskell programmers will feel at home with the syntax of Idris. The most obivous difference is that : and :: are swapped. There is also one huge semantic difference: Idris is eagerly evaluated.

Idris allows a configurable amount of slop. We can define partial functions, and more generally we can act as we do in less safe languages. Hence we can develop rapidly and sinfully, then later concentrate our efforts on proving the correctness of the most important parts.

The highest form of academic work

My late friend Alain Fournier once told me that he considered the lowest form of academic work to be taxonomy. And you know what? Type hierarchies are just taxonomy. You need to decide what piece goes in what box, every type’s parent, whether A inherits from B or B from A.

That is, object-oriented programming corresponds to taxonomy. In contrast, the Curry-Howard isomorphism means a dependently typed functional programming language is equivalent to mathematics.

Now, according to Gauss, mathematics is “the queen of sciences”. A well-known joke expresses a similar sentiment:

Biologists think they are biochemists,
Biochemists think they are Physical Chemists,
Physical Chemists think they are Physicists,
Physicists think they are Gods,
And God thinks he is a Mathematician.

Therefore object-oriented programming is the lowest form of academic work, and functional programming is the highest!


Ben Lynn blynn@cs.stanford.edu