Import:
= Penney's Game =

Alice and Bob play a game.
Alice chooses a sequence of 3 coin flips. Then Bob chooses another
sequence of the same length. They flip a coin over and over again and a player
wins as soon as their sequence appears.

What sequence should Alice pick? Does it matter? After all, if you flip a coin
3 times, every sequence has the same chance of occurring.

Well, here's one choice that she should avoid. Suppose Alice picks HHH, and Bob
subsequently picks THH. Then Alice can only win if the first 3 flips are all
heads, that is, Bob wins with probability 7/8.

== How to play ==

This game, known as https://en.wikipedia.org/wiki/Penney%27s_game[Penney's
game], is trickier than it may seem. But with a few clever observations, we can
figure out the best strategies for sequences of any size.

Define:

  * `aintOver goals s` is `True` when no element of `goals` appears somewhere in the string `s`, that is, the game is still ongoing.

  * `wins goals k s` is `True` when the string in `goals` of index `k` has just won the game described by `s`, that is, the string `s` ends with the goal string of index `k` and this is the first goal string of any index to appear.

[ ]:
aintOver goals s = not $ any (`isInfixOf` s) goals

wins goals k s = (goals!!k) `isSuffixOf` s && aintOver goals (init s)

-- From Data.List:
isPrefixOf [] _         =  True
isPrefixOf _  []        =  False
isPrefixOf (x:xs) (y:ys)=  x == y && isPrefixOf xs ys
isSuffixOf xs ys = isPrefixOf (reverse xs) (reverse ys)
isInfixOf needle haystack = any (isPrefixOf needle) (tails haystack)
tails [] = [[]]
tails xs@(x:xt) = xs : tails xt

-- These should all be True:
aintOver ["HHH", "THH"] "HHTHTHTH"
wins ["HHH", "THH"] 0 "HHH"
wins ["HHH", "THH"] 1 "HHTHTHTHH"
The following shows all coin flip sequences of a given length:

[ ]:
allFlips n = replicateM n "HT"
allFlips 4
For HHH versus THH, we see:

  * There are many sequences of length 4 where neither player has won yet
  (though Bob will inevitably win soon enough).

  * It is impossible for Alice to win in 4 flips.

  * Bob has 2 ways to win in 4 flips.

[ ]:
filter (aintOver ["HHH", "THH"]) $ allFlips 4
filter (wins ["HHH", "THH"] 0) $ allFlips 4
filter (wins ["HHH", "THH"] 1) $ allFlips 4
Some 4-letter strings are absent from these lists because they do not represent
any game. For example, HHHH is an invalid game because it was already over
after the 3rd coin flip.

Let's switch to a less trivial example. Suppose Alice, knowing that HHH is a
poor choice, picks HTH. Next, Bob picks HHT, with an innocent expression on his
face. Who is more likely to win? Both goals have two Hs and one T, so maybe no
one has an advantage?

[ ]:
goals = ["HTH", "HHT"]
filter (aintOver goals) $ allFlips 4
filter (wins goals 0) $ allFlips 4
filter (wins goals 1) $ allFlips 4
There are two games of length 4 where Bob wins, but only one where Alice wins.
Already there is an asymmetry worth investigating.

As we wish to compute probabilities, we care more about the number of ways than
particular combinations. Hence for a given 0-indexed list of goal strings,
define:

  * \(f(n)\) to be the number of \(n\)-letter strings where no one has won yet.

  * \(g_k(n)\) to be the number of ways that the \(k\)th goal string wins on the \(n\)th flip.

[ ]:
f n   = length $ filter (aintOver goals) $ allFlips n
g k n = length $ filter (wins goals k)   $ allFlips n
Then the probability that the \(k\)th goal string wins a game is:

\[
\sum_n g_k(n) 2^{-n}
\]

It looks like we're close to solving the problem, but this is an infinite sum,
and our functions have exponential time complexity. We'll need to be smarter.

== Recursion to the rescue ==

Let \(X\) be a string of length \(n\) where neither player has won yet. There
are 2 ways to append one more letter to get a string of length \(n + 1\),
namely H or T, and then either the game continues or someone wins:

\[
2 f(n) = f(n + 1) + g_0(n + 1) + g_1(n + 1)
\]

(Aside: can we prove this from our code via equational reasoning?)

We check this formula for \(n = 4\) for our example:

[ ]:
f 4
f 5
g 0 5
g 1 5
This is a great start, but as we have three functions, we need more recurrence
relations.

Again let \(X\) be a string of length \(n\) where neither player has won yet.
This time, we append Alice's goal string \(A\), which we denote \(X || A\).

If nothing else, Alice wins. For example, if \(X\) is TTHTT then \(X || A\) is
TTHTTHTH and Alice has just won.

However, we might have gone too far, so to speak. It could be that some strict
prefix of \(X || A\) describes a game that is over. For example, if \(X\) is
HTTTH then \(X || A\) is HTTTHHTH, which is an invalid game because Bob already
won one flip earlier.

This subtlety occurs exactly when a proper suffix of \(A\) is also the prefix
of some goal pattern, possibly itself. In our example, HT is both a prefix of
\(A\) and a suffix of \(B\). As we append \(A\) to a string ending in H, we
complete \(B\) before we finish \(A\).

This motivates the following. For any strings \(X, Y\), define the
_correlation_ \(XY\) to be the set of positive integers \(r\) such that
the last \(r\) letters of \(X\) are the same as the first \(r\) letters of
\(Y\).

The _autocorrelation_ of a string \(X\) is the correlation \(XX\). We must have
\(|X| \in XX\) as \(X\) is a prefix and suffix of itself; below this counts the
catch-all case of Alice winning once the entirety of \(A\) is appended.

[ ]:
correlate s t = filter (\r -> take r t `isSuffixOf` s) [1..length t]
autocorrelate s = correlate s s

correlate "HTH" "HTH"
correlate "HHT" "HTH"
[ ]:
-- Example from paper.
correlate "HTHTTH" "HTTHT"
correlate "HTTHT" "HTHTTH"
Then the above subtlety is:

\[
f(n) = \sum_{r\in AA} g_0 (n + r) + \sum_{r\in BA} g_1 (n + r)
\]

[ ]:
f 5
g 0 (5 + 1) + g 0 (5 + 3) + g 1 (5 + 2)
Similarly:

\[
f(n) = \sum_{r\in AB} g_0 (n + r) + \sum_{r\in BB} g_1 (n + r)
\]

== What is the matrix? ==

We now have 3 recurrence relations involving \(f, g_0, g_1\), and
we'll soon see these are enough to figure out who wins. Define generating
functions:

\[
F(z) = \sum_{n\in[0..]} f(n) z^{-n}
\]

\[
G_k(z) = \sum_{n\in[0..]} g_k(n) z^{-n}
\]

Thus the probability Alice wins is \(G_0(2)\), and the probability Bob wins is
\(G_1(2)\).

For a correlation \(XY\) and any element \(z\) of any ring, define:

\[
XY_z = \sum_{r\in XY} z^{r-1}
\]

We call this `correlateBase` in our code because a subset of \([1..n]\) can
be represented by an \(n\)-bit number which in turn can be interpreted as a
base-\(z\) number.

[ ]:
correlateBase z x y = sum $ (z^) . pred <$> correlate x y
From our first recurrence relation, and since \(f(0) = 1\) and
\(g_0(0) = g_1(0) = 0\):

\[
(z - 2) F(z) + z G_0(z) + z G_1(z) = z
\]

The other two recurrence relations imply:

\[
F(z) - z AA_z G_0(z) - z BA_z G_1(z) = 0
\]

\[
F(z) - z AB_z G_0(z) - z BB_z G_1(z) = 0
\]

We can now solve for \(F(z), G_0(z), G_1(z)\). A unique solution exists
because the determinant of these 3 linear equations is:

\[
\phi(z) =
\left|\begin{array}{c}
z-2 & z & z \\
1 & -z AA_z & -z BA_z \\
1 & -z AB_z & -z BB_z
\end{array}\right|
\]

The product of the main diagonal entries is a polynomial of degree higher than
all other terms of the determinant because it's where the autocorrelations
reside (recall \(|X| \in XX\)), ensuring a nonzero determinant.

Solving for \(G_0(z)\) and evaluating at \(z = 2\):

\[
G_0(2) = \frac{BB_2 - BA_2}{BB_2 - BA_2 + AA_2 - AB_2}
\]

In other words, the odds that Alice wins are:

\[
\frac{BB_2 - BA_2}{AA_2 - AB_2}
\]

[ ]:
oddsBase q a b = (num `div` d, denom `div` d) where
  go = correlateBase q
  num = go b b - go b a
  denom =  go a a - go a b
  d = gcd num denom

oddsCoin = oddsBase 2

gcd a b
  | b == 0 = a
  | b > a = gcd b a
  | otherwise = gcd b $ mod a b

oddsCoin "HTH" "HHT"
We see in the HTH versus HHT battle, Alice's odds are 1 to 2, that is, Bob is
twice as likely to win.

The above generalizes for alphabets with \(q\) different symbols, in which
case Alice's odds of winning are:

\[
\frac{BB_q - BA_q}{AA_q - AB_q}
\]

For example, the PIN number 2357 is easier to guess than 0000 in the sense that
it's more likely to show up first if we stab digits at random.

[ ]:
oddsBase 10 "2357" "0000"
We can also generalize to any number of players, even zero players.

Let \(S_1, ..., S_m\) be the goal strings for some nonnegative integer \(m\)
such that no goal string is a substring of any other goal string. This was
implied in our earlier examples because the goal strings were distinct and
had the same length; here, there is no such constraint on the lengths.

(We index from 1 this time to get nicer formulas.)

Then:

\[
(z - q) F(z) + z G_1(z) + ... +  z G_m(z) = z
\]

and for \(k \in [1..m]\):

\[
F(z) - z (S_1 S_k)_z G_1(z) - ... - z (S_m S_k)_z G_m(z) = 0
\]

== Table for one ==

Consider the solitaire variant of this game, namely, suppose Alice is the lone
player. What goal \(A\) should Alice choose to minimize the expected number of
flips?

The probability of needing more than \(n\) random letters is \(f(n)q^{-n}\),
hence the expected wait time is:

\[ \sum_n f(n) q^{-n} \]

From above, this turns out to be:

\[ F(q) = q AA_q \]

Presumably, Alice should choose a sequence that minimizes the wait time,
such as HTT for sequences of length 3.

[ ]:
eThrows q a = q * correlateBase q a a

[(a, eThrows 2 a) | a <- allFlips 3]
Then she has the best chance of seeing her goal show up before Bob's, and
has at least even odds of winning, right?

Wrong!

== After you. I insist. ==

The early bird gets the worm, but the second mouse gets the cheese.

If Alice chooses HTT, then Bob chooses HHT:

[ ]:
oddsCoin "HTT" "HHT"
Again, Bob is twice as likely to win.

If Alice tries to pre-empt this by choosing HHT herself, then Bob chooses
THH:

[ ]:
oddsCoin "HHT" "THH"
This time, Bob is three times more likely to win! (Alice only wins if the
first two flips are both heads.) What's going on?

From Bob's goal string, we get Alice's goal by adding the right letter, but not
the other way around. This hints that the expected wait time to reach Bob's
goal starting from Alice's string is much longer than the other way around.

Imagine a long random string of coin flips. We expect it to contain many
occurrences of both \(A\) and \(B\). If we randomly snip this string, we're
more likely to interrupt a transition from \(A\) to \(B\) than the other way
around. Starting from where we made the cut, we're more likely to encounter
\(B\) first.

We can quantify the expected wait time from one string to another with the
following handy generalization:

Let \(X\) be a string that contains none of
the goals \(S_1, ..., S_m\), though we allow \(X\) to be a substring of a goal.
(Indeed, setting \(X\) to the empty string yields the previous generalization.)

Define \(f(n)\) to be the number of strings of length \(n\) that start with
\(X\) and contain none of the goal strings, and \(g_k(n)\) to be the number of
strings of length \(n\) that start with \(X\) and end with \(S_k\), and
otherwise contain no occurrences of any goal string.

Then it can be shown:

\[
(z - q) F(z) + z G_1(z) + ... +  z G_m(z) = z^{1 - |X|}
\]

and for \(k \in [1..m]\):

\[
F(z) - z (S_1 S_k)_z G_1(z) - ... - z (S_m S_k)_z G_m(z) = z^{1-|S_k|}(X S_k)_z
\]

Returning to Alice and Bob, set \(X = B\) and again consider the solitaire
variant with \(A\) as the lone goal. Then we find the expected wait time to
reach \(A\) after \(B\) is:

\[
q AA_q - q BA_q
\]

[ ]:
eThrowsFrom q a b = q * correlateBase q a a - q * correlateBase q b a
This leads to an alternative derivation of the formula for Alice's odds. But
more importantly, we can use the expected wait time to shed light on Bob's
optimal strategy.

Let \(A = a_1 ... a_m\) be Alice's goal. Then Bob maximizes his chance of
winning by choosing \(B = b a_1 ... a_{m - 1}\) for some \(b\).

The case \(m = 2\) is small enough to be verified directly from the formula for
Alice's odds. Brute force shows Bob should choose \(b \ne a_1\) and
when possible, also ensure that \(b \ne a_2\).

When \(a_1 = a_2\), his odds of winning are \(q + 1\) to \(q - 1.\)
When \(a_1 \ne a_2\), Bob has even odds when \(q = 2\) (where he must choose
\(b = a_2\)), and for \(q \ge 3\) his odds to win are \(q\) to \(q - 1.\)

[ ]:
bruteCoin a = [(b, oddsCoin a b) | b <- allFlips $ length a, b /= a]
bruteCoin "HH"
bruteCoin "HT"
For goal strings of length 3, it turns out Bob should choose his first flip to
be the opposite of his third flip.

[ ]:
bobs a = (`oddsCoin` a) <$> filter (/= a) ['H':ai, 'T':ai] where ai = init a

bobs <$> allFlips 3
[ ]:
bobs <$> allFlips 5
We see that for coin flipping goals of length 5, the best strategy for Alice is
to pick a goal that caps Bob's odds of winning at 17 to 9. More generally, it
turns out for \(m \ge 4\), Alice's best choice is one head at the beginning,
two heads at the end, and tails otherwise, or the same string with everything
flipped, though with optimal play Bob's advantage is always close to 2 to 1.

For general \(m\), it is unknown if there is a simple scheme for picking \(b\)
optimally. For \(q = 2\), the best choice for \(b\) seems unique, but no proof
of this is known.

Penney's game is nontransitive, like the game of rock, paper, scissors. If one
player moves first, they suffer a great disadvantage.

=== Sketch of proof ===

Define the _periods_ of a string as follows:

[ ]:
periods s = filter (\r -> drop r s `isPrefixOf` s) [1..length s]

-- In terms of the autocorrelation:
periods' s = tail $ reverse $ (length s -) <$> 0 : autocorrelate s

periods "abcabc"
periods "abcabcd"
periods "abcdabc"
periods "aaaaaa"
We can interpret periods as the prefixes of \(S\) that can be repeated to
generate a string beginning with \(S\).

The _basic period_ of a string is its shortest period:

[ ]:
basicPeriod s = head $ periods s
The periods of a string are the same as its autocorrelation except we count the
number of removed letters rather than the number that remain, and we word the
definition to exclude zero but include the string length. Half-empty versus
half-full.

Let \(k\) be the length of Alice's goal string \(A\).

Assume \(k \ge 3\), as the \(k \le 2\) cases are easily dispatched.

Let \(A'\) be all but the last letter of \(A\) (Haskell's `init` function), and
let \(r\) be the basic period of \(A'\). Select \(b\) so that it differs from
the \(r\)th letter of \(A'\), and set \(B = b : A'\), where we have borrowed
Haskell's notation for consing a list of letters.

Let \(C\) be a string of length \(k\) whose tail (borrowing another term from
Haskell) differs from \(A'\). Then it turns out the odds of \(B\) appearing
before \(A\) must exceed the odds of \(C\) appearing before \(A\). This does
not imply Bob's best choice is \(B\); indeed, we can easily find examples where
Bob can do better than \(B\). But it does imply the tail of any best choice
must be \(A'\).

This is easy to show when \(r = 1\). Relabel so that \(A\) is HH...H and \(B\)
is TH...H. Let \(C\) be any string of length \(k\) whose tail contains at
least one letter that differs to H.

We have:

\[
AC_q \ge 0 = AB_q
\]

Also:

\[
CC_q \ge q^k = BB_q
\]

Lastly, let \(m\) be the number of trailing Hs in \(C\). We have
\(m \lt k - 1\) because of how we chose \(C\), hence:

\[
CA_q \le q + ... + q^m \lt q + ... + q^{k-1} = BA_q
\]

Therefore:

\[
\frac{AA_q - AB_q}{BB_q - BA_q} \gt \frac{AA_q - AC_q}{CC_q - CA_q}
\]

In other words, the goal \(B\) has strictly better odds of beating \(A\)
than any goal whose tail differs to the tail of \(A\).

Here we happen to know \(B\) is optimal, as \(B \ne A\) and by symmetry all
other choices of the first letter have the same odds, which we can
compute with a touch of algebra. Bob's odds of victory are:

\[
\frac{q^k - 1}{q^k - 2q^{k-1} + 1}
\]

It remains to handle the cases \(r \ge 2\).

Let \(t\) be the basic period of \(B\).

Suppose \(t \le k - 2\). By construction \(t\) is also a period of \(A'\), and
is not a multiple of \(r\). It can be shown this implies \(t + r \ge k\).
Briefly, suppose at least one integer less than \(k - r\) is period of \(A'\)
yet is not a mulitple of \(r\), and let \(s\) be the smallest such integer.
Then we can show \(s - r\) is a period of \(A'\), contradicting the minimality
of \(s\).

This implies \(t \ge (k + 1) / 2\), which can be refined to
\(t \ge (k + 2) / 2\).

When \(t \gt k - 2\) we also have \(t \ge (k + 2)/2 \) because
\( k - 1 \ge (k + 2) / 2 \) for \(k \ge 4\) and \(k = 3, t = 2\) leads to a
contradiction.

This bound then leads to bounds for \(AB_q\) and \(BB_q - BA_q\) which yield:

\[
\frac{AA_q - AB_q}{BB_q - BA_q} \ge
\frac{AA_q - \frac{q^{\lfloor k/2 \rfloor} - 1}{q - 1}}{q^{k-1} - q^{k-2} + q^{\lfloor k/2 \rfloor - 2}}
\]

This shows Bob's odds of winning are at least:

\[
\frac{q}{q-1} - O(q^{k/2})
\]

as \(k -> \infty\).

We have \(CA_q \le q^{k-3} + q^{k-4} + ... + 1\) and \(CC_q \ge q^{k-1}\) which
implies:

\[
\frac{AA_q - AC_q}{CC_q - CA_q} \le
\frac{AA_q}{q^{k-1} - \frac{q^{k-2} - 1}{q-1}}
\]

It remains to show one bound is strictly greater than the other. This turns out
to only take a few steps when \(q \ge 3\), while \(q = 2\) becomes a tedious
case analysis.

== References ==

Guibas and Odlyzko, _String Overlaps, Pattern Matching, and Nontransitive
Games_.

Ben Lynn blynn@cs.stanford.edu 💡