{# LANGUAGE CPP #}
#ifdef __HASTE__
{# LANGUAGE PackageImports #}
#endif
{# LANGUAGE LambdaCase, TupleSections, RecordWildCards #}
{# LANGUAGE FlexibleContexts #}
#ifdef __HASTE__
import "mtl" Control.Monad.State.Strict
import Haste.DOM
import Haste.Events
import Haste.Graphics.Canvas
import Data.IORef
import Text.Read (readMaybe)
#else
import Control.Monad.State.Strict
#endif
import Data.List (find)
import Data.Map (Map, (!))
import qualified Data.Map as M
Get a Life
Zoom: Steps: 2^
Presets:
When I was a kid, I wrote a BASIC program to explore Conway’s Game of Life on my 8088. It was painfully slow, so I began an assembly rewrite. But I failed at Life: I abandoned the project in frustration because I kept freezing up my system.
It was just as well. I believe the only optimizations I had in mind were bit twiddling and focusing on or around live cells. So I was astounded when I later learned of Bill Gosper’s Hashlife algorithm, which miraculously simulates an exponential number of generations in linear time. All the bit twiddling in the world couldn’t come close.
I eagerly studied any explanations I could find. Unfortunately, although Hashlife’s ingredients are elementary in isolation (quadtrees; memoization; interning), some aspects of the mixture confused me. I hoped one day to figure it all out and run Life in the fast lane.
I have at last fulfilled this lifelong ambition. Perhaps I acted out of anger, as now and then I’ll see a post about coding the Game of Life, only to discover the naive approach lurking behind it. Seriously?! Who cares about the elegant source or whatever if the algorithm is mediocre? [I exaggerate; in truth, I enjoy seeing the Game of Life in one line of APL, with comonads, in the Game of Life itself, and so on.]
At some point it’s easier to quit complaining and fight back. Here, then, is yet another writeup on programming the Game of Life, but one of the minority featuring Hashlife. Batteries included: we walk through the entire source for the demo above.
A matter of life and death
Life takes places on an infinite grid of cells, each of which is dead or alive. A cell’s neighbours are the eight cells surrounding it (Moore neighbourhood). Then:

A dead cell with exactly 3 live neighbours comes to life.

A live cell with exactly 2 or 3 live neighbours remains alive.

Otherwise the cell will be dead.
We represent a dead cell with 0 and a live cell with 1. The following function summarizes the rules:
nextLife :: Int > Int > Int
nextLife 0 3 = 1
nextLife 1 2 = 1
nextLife 1 3 = 1
nextLife _ _ = 0
Curiously, this is the same as saying a cell is alive exactly when the sum or product of these two numbers is 3.
Quad workout
Quadtrees are fundamental to Hashlife. In our rendition, at the lowest level, we represent a single cell with a bit.
At the level above, we organize the cells in 2x2 block. We represent the 4 bits of a block with an Int. We provide functions to convert between an Int and the 4 cells in a 2x2 block:
data ZNode = ZNode Int Int Int Int deriving (Show, Eq, Ord)
pack4 :: Int > Int > Int > Int > Int
pack4 a b c d = a + 2*b + 4*c + 8*d
unpack4 :: Int > ZNode
unpack4 n = ZNode (f 1) (f 2) (f 4) (f 8) where f k = div n k `mod` 2
We record the 4 cells of a 2x2 block in Zorder. If we imagine a compass, then the order is NW, NE, SW, SE, the same order a pen travels when we write the letter Z. We use screen coordinates.
zorder :: [(Int, Int)]
zorder = [(0,0), (1,0), (0,1), (1,1)]
At the next level above, we organize the 2x2 blocks themselves in 2x2 superblocks. These are represented as a ZNode; each number represents a 2x2 block of cells in Zorder.
Consider a 4x4 block of labelled cells, where each cell’s state is represented with a bit:
a0 a1 b0 b1 a2 a3 b2 b3 c0 c1 d0 d1 c2 c3 d2 d3
We represent the above with ZNode a b c d where a is the Int with binary representation a3 a2 a1 a0, and similarly for the other fields.
Given a 4x4 block of cells, the following returns the 2x2 central block of cells of the next generation. The rules imply no cell outside the 4x4 grid may affect this calculation. They also imply this 2x2 central block is the only part of the next generation we may reliably compute.
base :: Int > Int > Int > Int > Int
base a b
c d = pack4 nw ne
sw se
where
ZNode a0 a1
a2 a3 = unpack4 a
ZNode b0 b1
b2 b3 = unpack4 b
ZNode c0 c1
c2 c3 = unpack4 c
ZNode d0 d1
d2 d3 = unpack4 d
nw = nextLife a3 $ sum
[ a0, a1, b0
, a2, b2
, c0, c1, d0
]
ne = nextLife b2 $ sum
[ a1, b0, b1
, a3, b3
, c1, d0, d1
]
sw = nextLife c1 $ sum
[ a2, a3, b2
, c0, d0
, c2, c3, d2
]
se = nextLife d0 $ sum
[ a3, b2, b3
, c1, d1
, c3, d2, d3
]
The building blocks of Life
We devise a method to represent a ZNode as an Int, so that we represent an 8x8 block of cells as a ZNode, whose fields now refer to 4x4 cells. We can go further: by induction, for any n, we can represent any 2^{n} x 2^{n} block of cells with a ZNode, which we interpret as a 2x2 block of 2^{n1} x 2^{n1} cells.
We sometimes call this n the level. It is not stored in the ZNode itself, so functions taking ZNodes often need another argument that contains the level.
We only store one copy of a ZNode (think hash consing; string interning; maximal sharing; the flyweight pattern). Life patterns are often repetitive, so a handful of ZNodes sometimes suffice for representing many cells.
This is how Hashlife got its name, though we shall use an ordered tree instead of hashing to map ZNodes to Ints.
We maintain Data.Map of ZNodes to Ints. On encountering a ZNode, we first check this map: if already present, then we reuse its corresponding Int. Otherwise we assign it an unused Int and add it to the map. The Ints start from 16 to avoid colliding with numbers that represent 2x2 blocks of cells.
We also maintain an inverse map from an Int to the ZNode it represents.
In addition, each ZNode is paired with an Int that represents its future center; see below.
Empty cells are a special case. The Int 0 represents empty squares of any level. This comes in handy when we want to embed a given pattern in a large empty space.
data Mem = Mem (Map Int (ZNode, Int)) (Map ZNode Int) deriving Show
initMem :: Mem
initMem = Mem mempty mempty
intern :: ZNode > Int > State Mem Int
intern z future = do
Mem m idxs < get
let next = M.size idxs + 16
put $ Mem (M.insert next (z, future) m) $ M.insert z next idxs
pure next
recall :: Int > State Mem (ZNode, Int)
recall 0 = pure (ZNode 0 0 0 0, 0)
recall k = (\(Mem m _) > m!k) <$> get
Half Life
Here’s a puzzle that might appear in a book for children. How many 2x2 boxes can you see in a 4x4 grid?
+++++ a0a1b0b1 +++++ a2a3b2b3 +++++ c0c1d0d1 +++++ c2c3d2d3 +++++
There are nine overlapping 2x2 boxes:
+++ +++ +++ a0a1 a1b0 b0b1 +++ +++ +++ a2a3 a3b2 b2b3 +++ +++ +++ +++ +++ +++ a2a3 a3b2 b2b3 +++ +++ +++ c0c1 c1d0 d0d1 +++ +++ +++ +++ +++ +++ c0c1 c1d0 d0d1 +++ +++ +++ c2c3 c2d0 d2d3 +++ +++ +++
If we replace each 2x2 box with a single 1x1 box centered at the same point, then they no longer overlap, but tile perfectly to make a 3x3 grid:
++++ x0x1x2 ++++ x3x4x5 ++++ x6x7x8 ++++
We can repeat this on the 3x3 grid. We have four overlapping 2x2 boxes:
+++ +++ x0x1 x1x2 +++ +++ x3x4 x4x5 +++ +++ +++ +++ x3x4 x4x5 +++ +++ x6x7 x7x8 +++ +++
Again, if we replace each 2x2 box with a single 1x1 box centered at the same point, they tile perfectly:
+++ y0y1 +++ y2y3 +++
In summary, we started with a 4x4 grid, and finished with a 2x2 grid in the middle. Hashlife combines this process with time travel.
Double Life
Our base function takes 4x4 cells and returns the central 2x2 cells one generation into the future.
Suppose we are given 8x8 cells. We view them as a 4x4 grid of 2x2 cells and follow the above procedure, calling the base function whenever we wish to replace a 2x2 box (measuring 4x4 cells) with a 1x1 box (measuring 2x2 cells). Hence we advance one generation to get to the 3x3 grid, and advance another generation to get to the 2x2 grid.
In summary, given 8x8 cells, repeated calls to base reveals the central 4x4 cells 2 generations into the future.
By induction, we can do the same for higher powers of 2. For example, given 256x256 cells, a bunch of recursions gives us the central 128x128 cells exactly 64 generations into the future.
We have the perfect conditions for memoization. Thus we record the future central cells the first time we compute them, and look them up every time thereafter. This is why we associate each ZNode with an Int; the latter references the future central cells.
We split off a function to reduce a 3x3 grid to a 2x2 grid, as it will turn out to have more uses later. The reduce3x3 function applies a function f to each 2x2 box in a 3x3 grid, then applies g to their results.
gosper :: Int > Int > Int > Int > Int > State Mem Int
gosper 0 a b c d = pure $ base a b c d
gosper n a b
c d = do
(ZNode _ a1
a2 a3, x0) < recall a
(ZNode b0 _
b2 b3, x2) < recall b
(ZNode c0 c1
_ c3, x6) < recall c
(ZNode d0 d1
d2 _, x8) < recall d
x1 < rec a1 b0
a3 b2
x3 < rec a2 a3
c0 c1
x4 < rec a3 b2
c1 d0
x5 < rec b2 b3
d0 d1
x7 < rec c1 d0
c3 d2
reduce3x3 rec (memo n) x0 x1 x2
x3 x4 x5
x6 x7 x8
where
rec a b c d = memo n a b c d >>= fmap snd . recall
reduce3x3 f g
x0 x1 x2
x3 x4 x5
x6 x7 x8 = do
y0 < f x0 x1
x3 x4
y1 < f x1 x2
x4 x5
y2 < f x3 x4
x6 x7
y3 < f x4 x5
x7 x8
g y0 y1
y2 y3
memo :: Int > Int > Int > Int > Int > State Mem Int
memo _ 0 0 0 0 = pure 0
memo 0 a b c d = pure $ pack4 a b c d
memo n a b c d = seek >>= maybe create pure
where
z = ZNode a b c d
seek = (\(Mem _ idxs) > M.lookup z idxs) <$> get
create = intern z =<< gosper (n  1) a b c d
Such is Life
We’re done! We’ve presented a complete implementation of Hashlife. However, we need a bit more code to show it off.
We define a bookkeeping data type to hold the current pattern’s size (it measures 2^{lifeSize+1} x 2^{lifeSize+1} cells), the coordinates of the topleft corner, and its ZNode index.
data Life = Life
{ lifeSize :: Int
, lifeOrigin :: (Int, Int)
, lifeIndex :: Int
, lifeMemory :: Mem
} deriving Show
A straightforward recursion turns a list of points into a quadtree:
enliven :: [(Int, Int)] > Life
enliven [] = Life 0 (0, 0) 0 initMem
enliven ps = uncurry (Life sz (xmin, ymin))
$ runState (enc sz (xmin, ymin)) initMem where
(xs, ys) = unzip ps
xmin = minimum xs
ymin = minimum ys
xmax = maximum xs
ymax = maximum ys
loggish n = max 0 $ head (filter (\k > 2^k >= n) [0..])  1
sz = loggish $ max (ymax  ymin) (xmax  xmin) + 1
enc _ (ox, oy)  ox > xmax  oy > ymax = pure 0
enc n (ox, oy) = mapM go zorder >>= (\[a,b,c,d] > memo n a b c d) where
go (dx, dy)
 n == 0 = pure $ fromEnum $ elem (ox + dx, oy + dy) ps
 otherwise = enc (n  1) (ox + dx*p, oy + dy*p)
p = 2^n
We’d like an inverse to this function, but we first deal with an annoyance. For quadtree nodes at the lowest level, it’s irritating that we break a number into 4 bits rather than look it up in a map. We add a function to hide this:
visit :: Int > State Mem ZNode
visit k  k < 16 = pure $ unpack4 k
 otherwise = fst <$> recall k
Listing the coordinates of the live cells of a given pattern is now another straightforward recursion:
points :: Life > [(Int, Int)]
points Life{..} = evalState (coords lifeSize lifeOrigin lifeIndex) lifeMemory
where
coords :: Int > (Int, Int) > Int > State Mem [(Int, Int)]
coords _ _ 0 = pure []
coords n (x, y) k = do
ZNode a b c d < visit k
concat <$> zipWithM go [a,b,c,d] zorder
where
go p (dx, dy)
 n == 0 = pure $ if p == 0 then [] else [(x+dx, y+dy)]
 otherwise = coords (n  1) (x + e*dx, y + e*dy) p
e = 2^n
We perform a quick sanity check with GHCi. We define a pattern:
rPentomino :: [(Int, Int)]
rPentomino = [(1,0), (2,0), (0,1), (1,1), (1,2)]
Then confirm:
points $ enliven rPentomino
returns the same points.
Let’s predict the future! The following function takes a 2^{n+1} x 2^{n+1} pattern and returns its 2^{n} x 2^{n} center, 2^{n1} generations into the future. We just pluck out the memoized future center index and repackage it:
scry :: Life > Life
scry Life{..} = Life n (ox + p, oy + p) x lifeMemory where
(ox, oy) = lifeOrigin
n = lifeSize  1
p = 2^n
x = snd $ evalState (recall lifeIndex) lifeMemory
We find:
points $ scry $ enliven rPentomino
returns the single point [(1,2)]. The rpentomino fits in a 4x4 square, so scry returns the middle 2x2 square 1 generation into the future. The result is correct, but underwhelming.
To travel further through time and space, we repeatedly pad our pattern with empty squares before calling scry. Here, we find it handy that 0 represents an empty square of any size. For the sake of code reuse, to double the dimensions of a pattern, we surround it with empty squares to make a 3x3 grid, then reduce with middle, a function which takes a 4x4 grid and discards all but the central 2x2 grid.
pad :: Life > Life
pad Life{..} = Life
{ lifeSize = n
, lifeOrigin = (ox  p, oy  p)
, lifeIndex = i'
, lifeMemory = st
} where
(ox, oy) = lifeOrigin
p = 2^lifeSize
n = lifeSize + 1
i = lifeIndex
(i', st) = runState (reduce3x3 (middle n) (memo n)
0 0 0
0 i 0
0 0 0) lifeMemory
middle :: Int > Int > Int > Int > Int > State Mem Int
middle n a b c d = do
ZNode _ _ _ a3 < visit a
ZNode _ _ b2 _ < visit b
ZNode _ c1 _ _ < visit c
ZNode d0 _ _ _ < visit d
memo (n  1) a3 b2 c1 d0
Rather than return a list of points, let’s draw the central 64x64 cells:
#ifndef __HASTE__
main :: IO ()
main = putStr $ unlines $
[[if elem (c, r) ps then '#' else ' '  c < [32..32]]  r < [32..32]]
where
ps = points $ scry $ iterate pad (enliven rPentomino) !! 10
#endif
What should we see? Well, we started with a 4x4 pattern, then padded it 10 times, yielding a 2^{12} x 2^{12} pattern. Hashlife should return the central 2^{11} x 2^{11} cells 2^{10} = 1024 generations into the future, and we display the central 64x64 cells:
## # # # # # # # # ## ## ## # # # # # ### ### # # # # # # # # # # ## # # ### ## # # # ## # ## # ## ## # ## #### # # ## # # # # # # # # # # # ## # ## # ## ## # # # # ### ##
This took under a quarter of a second on my laptop. Evidently we take less than 250 microseconds per generation.
If we bump up the 10 to 20, we find the running time hardly changes, so we’re somehow beating 250 nanoseconds a generation. Bump it up some more, and it seems we compute each generation in less than a CPU clock cycle! Of course, in reality, the pattern has stabilized and we’re just recycling memoized answers.
Baby steps
Hashlife is great for exploring the distant future, but what if we want to take it slow and watch the pattern change one generation at a time?
No problem. Instead of traveling some power of 2 generations into the future in each reduction step, we simply travel 0 generations, except for the base case which remains the same as before, namely, it computes one generation ahead.
More generally, to step forward 2^{k} generations at a time for any k, we travel 0 generations as we reduce until we reach the desired level, then switch to the original Hashlife algorithm.
We start with a more general version of gosper that is like the big brother of reduce3x3: a function that slides a 2x2 window over a 4x4 grid and applies a function f to each, followed by applying g to the 3x3 results.
We could rewrite gosper to use this function, but that would result in more lookups, and we prefer to have the code above to stand on its own.
reduce4x4 f g a b
c d = do
ZNode a0 a1 a2 a3 < visit a
ZNode b0 b1 b2 b3 < visit b
ZNode c0 c1 c2 c3 < visit c
ZNode d0 d1 d2 d3 < visit d
x0 < f a0 a1
a2 a3
x1 < f a1 b0
a3 b2
x2 < f b0 b1
b2 b3
x3 < f a2 a3
c0 c1
x4 < f a3 b2
c1 d0
x5 < f b2 b3
d0 d1
x6 < f c0 c1
c2 c3
x7 < f c1 d0
c3 d2
x8 < f d0 d1
d2 d3
g x0 x1 x2
x3 x4 x5
x6 x7 x8
We begin by recursively reducing with middle, as this function extracts the center cells with no time travel. Upon reaching the level k we look up the future center.
baby :: Int > Life > Life
baby k Life{..} = Life
{ lifeSize = sz
, lifeOrigin = (ox + p, oy + p)
, lifeIndex = i'
, lifeMemory = st
} where
(ox, oy) = lifeOrigin
sz = lifeSize  1
p = 2^sz
go _ 0 0 0 0 = pure 0
go n a b c d
 n <= k = memo (n + 1) a b c d >>= fmap snd . recall
 otherwise = reduce4x4 (middle n)
(reduce3x3 (go (n  1)) (memo n)) a b c d
(i', st) = runState (visit lifeIndex
>>= \(ZNode a b c d) > go sz a b c d) lifeMemory
There may be live cells on the edge of a pattern, so before we step, we pad it with enough empty cells to guarantee no live cells are lost after reduction. At the least, this requires quadrupling both dimensions.
Afterwards, the extra room may prove unnecessary. In an interactive setting, we repeatedly display the pattern after advancing some number of generations, so to avoid runaway padding, we shrink the pattern when possible. We look for a 2x2 box in a 4x4 grid such that all cells outside the box are dead. If one exists, then we discard all but the box and recurse.
shrink :: Life > Life
shrink Life{..} = uncurry ($) $
runState (go lifeSize lifeOrigin lifeIndex) lifeMemory
where
f a b c d = pure $ ZNode a b c d
zsum (ZNode a b c d) = a + b + c + d
go 0 d k = pure $ Life 0 d k
go n (dx, dy) k = do
ZNode a b c d < visit k
reduce4x4 f g a b c d
where
g x0 x1 x2 x3 x4 x5 x6 x7 x8 = let
tot = sum $ zsum <$> [x0, x2, x6, x8]
xs = [x0,x1,x2,x3,x4,x5,x6,x7,x8]
xds = zip xs [0..]
in case find ((tot ==) . zsum . fst) xds of
Just (ZNode a b c d, i) > let
(y, x) = divMod i 3
in go (n1) (dx + x*2^(n1), dy + y*2^(n1))
=<< memo (n1) a b c d
Nothing > pure $ Life n (dx, dy) k
run :: Int > Life > Life
run k lf@Life{..} = shrink $ baby k $ iterate pad lf !! n where
n = max 2 $ k + 1  lifeSize
Our display is limited to a viewport, so we refine our points function to prune cells outside a given rectangle. We also use difference lists to avoid costly concatenation.
  Assumes x0 y0 even, x1 y1 odd, x0 < x1, y0 < y1.
crop :: (Int, Int) > (Int, Int) > Life > [(Int, Int)]
crop (x0, y0) (x1, y1) Life{..} = evalState (go lifeSize lifeOrigin lifeIndex) lifeMemory []
where
go _ _ 0 = pure id
go n (x, y) k
 x > x1  y > y1  x + 2*e <= x0  y + 2*e <= y0 = pure id
 otherwise = do
ZNode a b c d < visit k
foldr (.) id <$> zipWithM f [a,b,c,d] zorder
where
f p (dx, dy)
 n == 0 = pure $ if p == 0 then id else ((x+dx, y+dy):)
 otherwise = go (n  1) (x + e*dx, y + e*dy) p
e = 2^n
Lastly, we throw together a makeshift UI for the interactive demo at the top of this page:
#ifdef __HASTE__
main :: IO ()
main = withElems
["canvas", "goB", "level", "customB", "customT", "zoomDown", "zoomUp"]
$ \[canvasE, goB, lvl, customB, customT, zoomUp, zoomDown] > do
Just canvas < fromElem canvasE
lf < newIORef $ enliven rPentomino
zoomRef < newIORef 4
let
snapshot = do
zoom < readIORef zoomRef
let
w = div 320 (zoom * 2)
h = div 240 (zoom * 2)
cell (x', y') = renderOnTop canvas $ fill
$ rect (x, y) (x + z, y + z)
where
x = fromIntegral $ zoom*(x' + w)
y = fromIntegral $ zoom*(y' + h)
z = fromIntegral zoom
render canvas $ setFillColor (RGB 0 0 0)
mapM_ cell . crop (w,h) (w1,h1) =<< readIORef lf
setup but pat = do
Just b < elemById but
void $ b `onEvent` Click $ const $ do
writeIORef lf $ enliven pat
snapshot
snapshot
setup "acorn" [(1,0),(3,1),(0,2),(1,2),(4,2),(5,2),(6,2)]
setup "rpent" rPentomino
setup "pihept" [(0,0),(1,0),(2,0),(0,1),(0,2),(2,1),(2,2)]
setup "rabbits" [(0,0),(4,0),(5,0),(6,0),(0,1),(1,1),(2,1),(5,1),(1,2)]
void $ customB `onEvent` Click $ const $ do
ps < maybe [] id . readMaybe <$> getProp customT "value"
writeIORef lf $ enliven ps
snapshot
void $ goB `onEvent` Click $ const $ do
n < maybe 0 id . readMaybe <$> getProp lvl "value"
setProp lvl "value" $ show n
modifyIORef lf $ run n
snapshot
void $ zoomUp `onEvent` Click $ const $ do
modifyIORef zoomRef $ max 1 . (`div` 2)
snapshot
void $ zoomDown `onEvent` Click $ const $ do
modifyIORef zoomRef $ min 16 . (*2)
snapshot
#endif
A better Life
We could replace our persistent treebased maps with mutable hash maps in the ST monad. I avoided this to cause less trouble when compiling with Haste, which I use to build the JavaScript version.
It may be cleaner to split off the cached future center into another map. And for faster baby steps, perhaps we could memoize multiple center blocks at different points in the future and only compute futures lazily.
It turns out we can remove the pack4 and unpack4 mess (and use our generic ZNode scheme to hold 4 bits) at the cost of some performance; about 10% on the rpentomino. The code would be marginally cleaner, but the base case is still special, and our writeup would need to mention interning earlier.
On the other hand, to speed up our program, we could double down on bit twiddling and make 8x8 the lowest level. We could place entries of a 4x4 or 3x3 matrix in an array instead of making them function arguments and writing out everything explicitly. See Gil Dogon’s annotated C implementation of Hashlife.