Practice and Learn

I’d like to do better in programming contests, but I have little time to train. Furthermore, I’m mostly interested in the algorithms; I’m far less interested in perfecting boilerplate, memory allocation, avoiding overflow and so on.

Hence Haskell: it takes care of much of the paperwork, thanks to features like type inference, garbage collection, and arbitrary precision integer arithmetic.

Until they stopped running it, I focused on Google Code Jam’s Past Problems, because at the time they:

Judged solutions online.
Accepted any language.
Had diverse high-quality problems.
Posted solutions, so I had no qualms about publishing my own.

Reverse Words

(From Google Code Jam Qualification Round Africa 2010.)

Our task is easy to express in English: with separating spaces, join the words of the reverse of the list of words in a given line.

Our task is arguably even easier to express in Haskell:

unwords ( reverse ( words ( "you can cage a swallow can't you" ) ) )

As one might guess, words splits a line into a list of words, reverse reverses a list, and unwords joins a list of words into a single line with separating spaces.

We turn this into a full program satisfying contest conditions with interact, a function that passes the entire input as a giant string to a function of our choice, then prints the returned string to standard output.

-- reverse-words.hs
main = interact $ unlines . zipWith (++)
  ["Case #" ++ show t ++ ": " | t <- [1..]] . map solve . tail . lines

solve = unwords . reverse . words

The ($) operator saves us a pair of parentheses. In general, we can rewrite something like a(b(c(d))) as the less cluttered a $ b $ c $ d.

unwords . reverse . words $ "you can cage a swallow can't you"

The code following the $ defines the function we pass to interact. Because the function composition operator (.) is right-associative, to understand our definition, we read it right to left.

It breaks the input string into lines (lines), then ignores the first line (tail); the first line contains the number of test cases which is handy for languages with primitive memory allocation facilities.

We then run our algorithm on each input line (map solve), and prepend "Case #x:" to the result (zipWith (++) and the list comprehension). Finally, we join all the output lines together into a giant string (unlines) to return to interact.

This tiny program showcases several crown jewels of Haskell:

point-free programming: instead of defining "f(x) = g(h(x)) for all x", we dispense with the distracting variable x and simply say f = g . h.
list comprehensions: expressions like [t^2 | t ← [1..]] mimic how mathematically inclined humans define sets when communicating with each other.
lazy evaluation: why does the infinite list [1..] work? Because Haskell only computes as much as it needs. Lazy evalution also implies our program doesn’t wait for the entire input before moving to the next step. On the contrary, it’s more like piping in the shell: it only reads and computes on as much input is needed to print the next character.
map: half of Google’s famed MapReduce is named after this function. (The other half is named after another functional programming gem which Haskell programmers call fold instead of "reduce".)
currying: we’re composing map solve with other functions with (.), but how is map solve a function? Easy: it’s the function that maps the solve function over a given input list. In other words, if f = map solve, then f x = map solve x.
flexible fixity: we can dance between infix and prefix notation, mixing the two for maximum clarity.
boilerplate-free strong static types: we have no type declarations, yet not only is it impossible to confuse a string with an int, but it’s also impossible to confuse a pure function with an impure function. The only impure function here is interact; the rest of the program is pure.
equal rights: in many languages, the equal sign means assignment, not equality. That is, the right side gets assigned to something the left side refers to. In Haskell, provided the function is pure, any time we see the stuff on one side of the equals sign, we can swap it out for the stuff on the other, a property known as referential transparency.

But enough navel-gazing for now. After downloading the input data (via the webpage), we compile and run the program, then submit the output (via the webpage) for some easy points:

$ ghc reverse-words
$ ./reverse-words < in > out

We could write a faster solution in C: we’re given bounds on the input size, so we could do everything on the stack and avoid heap manipulations that Haskell must be doing behind the scenes. My younger self would have done this, and relished the speed.

How much running time would this shave off? Let’s be generous and say we could save 5 seconds by using C. Now consider how much time it takes to write a complete C program, versus 3 words and 2 dots in Haskell. Even with an advanced IDE, it’s hard to imagine the savings in running time compensating for the cost in programming time.

Store Credit

(From Google Code Jam Qualification Round Africa 2010. If you’re lucky you might still find the original problem statement somewhere on the web.)

Since we may assume each input test case has a unique pair of items whose cost sum to the desired amount, we can write a straightforward solution with a list comprehension:

solve c ps = head [show i ++ " " ++ show j | let m = length ps,
    i <- [1..m-1], j <- [i+1..m], ps!!(i - 1) + ps!!(j - 1) == c]

solve 100 [5,75,25]
solve 200 [150,24,79,50,88,345,3]
solve 8 [2,1,9,4,4,56,90,3]

Here, c is the desired amount, and ps is the list of prices of the items. We slap on a dab of code to parse the input and print the "Case #x:" preamble:

main = interact $ unlines . zipWith (++)
  ["Case #" ++ show t ++ ": " | t <- [1..]] . f . tail . lines

f [] = []
f (_c:_:_ps:s) = solve (read _c) (map read $ words _ps) : f s

But this solution is slow. Haskell lists truly are linked lists, so looking up the nth member takes n steps, implying cubic running time. We should instead use arrays for a quadratic-time solution. (A hash table could solve the problem in linear time, but the inputs are so small we may as well stick with arrays.)

import Data.Array

main = interact $ unlines . zipWith (++)
  ["Case #" ++ show t ++ ": " | t <- [1..]] . f . tail . lines

f [] = []
f (_c:_:_ps:s) = solve (read _c) (map read $ words _ps) : f s

solve c ps = let
  n = length ps
  a = listArray (1, n) ps
  in head [show i ++ " " ++ show j |
    i <- [1..n-1], j <- [i+1..n], a!i + a!j == c]

Lists and Arrays

The two solutions above serve as an introduction to lists and arrays in Haskell.

Haskell’s lists are a pleasure to work with. Standing on the shoulders of Lisp, much thought went into their design, and over the years, a rich and luxurious library of list functions has evolved and matured.

Arrays on the other hand, though tolerable, are less refined. Within the Data.Array namespace there are several kinds of arrays to choose from, all slower than good old C arrays. Moreover, Data.Vector may be better sometimes, and this module must be installed separately to the base system.

Updating a single array element takes constant time in other languages, but unless we make a special effort, in Haskell, this operation is as expensive as copying the entire array.

Thus for smaller problems, we often reach for lists first, and consider arrays later, the opposite of our behaviour when programming in C-like languages.

Ben Lynn blynn@cs.stanford.edu 💡

Practice and Learn

Reverse Words

Store Credit

Lists and Arrays

Contents