= Generating Functions, Generating Fun =

link:powser.html[We previously explored Doug McIlroy's beautiful code that
represents power series with lazy lists]. However, there's a drawback to this
direct approach: it could be unclear whether a given list represents a power
series or not.

McIlroy suggests a dedicated data type for power series, at the cost of a few
more characters:

[ ]:
infixr 5 :+:
data Ps a = Pz | a :+: Ps a deriving (Eq, Show)
where `Pz` is the power series zero.

We must tolerate some mess due to limitations of my compiler:

[ ]:
-- My Base class lacks Foldable.
foldrPs f z = \case
  Pz -> z
  a:+:at -> f a $ foldrPs f z at

-- My compiler lacks DeriveFunctor.
instance Functor Ps where fmap f = foldrPs ((:+:) . f) Pz

-- I don't have IsList either.
fromList = foldr (:+:) Pz
toList = foldrPs (:) []

zipPs f = \cases
  Pz _ -> Pz
  _ Pz -> Pz
  (a:+:at) (b:+:bt) -> f a b:+:zipPs f at bt
Then we rewrite the original definitions. Since we're paying extra, we may as
well extend our routines to support polynomials.

[ ]:
instance Ring a => Ring (Ps a) where
  (+) = zipZero (+)
  (-) = zipZero (-)
  (f:+:ft) * gs@(g:+:gt) = f*g:+:fmap (f*) gt + ft*gs
  _ * _ = Pz
  fromInteger = series . fromInteger

series = (:+:Pz)

zipZero (##) = \cases
  Pz ys -> (0##) <$> ys
  xs Pz -> (##0) <$> xs
  (x:+:xt) (y:+:yt) -> x##y :+: zipZero (##) xt yt

instance (Eq a, Ring a, Field a) => Field (Ps a) where
  (0:+:ft) / (0:+:gt) = ft/gt
  (f:+:ft) / (g:+:gt) = qs where qs = f/g:+:fmap ((1/g)*) (ft - qs*gt)

diff (_:+:ft) = zipPs (*) ft (fromList [1..])
int fs = 0:+:zipPs (/) fs (fromList [1..])

(f:+:ft) # gs@(0:+:gt) = f :+: gt*(ft#gs)
revert (0:+:ft) = rs where rs = 0 :+: 1/(ft#rs)
sqrt1 = \case
  0:+:0:+:ft -> 0:+:sqrt1 ft
  fs@(1:+:ft) -> qs where qs = 1 + int((diff fs)/(2*qs))
This definition makes some code marginally prettier:

[ ]:
x = (0:+:)
And we're done! We can perform magic with power series once more:

[ ]:
exps = 1 + int exps
tans = revert $ int $ 1/(1 + x(x 1))
peep = take 12 . toList
peep exps
peep tans
This time, we can also show off polynomials:

[ ]:
(1 + x(2 + x 3))^3
We take a moment to redefine `peep` so it also pretty-prints a power series
with MathJax.

[ ]:
texTerm a n = let p :% q = a/ 1 in (. texPower n) case q of
  1 | p == 1, n > 0 -> id
    | otherwise -> shows p
  _ -> ("\\frac{"++) . shows p . ("}{"++) . shows q . ('}':)
texPower = \case
  0 -> id
  1 -> ('x':)
  n -> ("x^{"++) . shows n . ('}':)
tex = go 0 True where
  go n isFirst = \case
    Pz | isFirst -> ('0':)
       | otherwise -> id
    a :+: at
      | n == 12 -> ("..."++)
      | a == 0 -> go (n + 1) isFirst at
      | otherwise -> sepPlus . texTerm a n . go (n + 1) False at
      where sepPlus = if a > 0 && not isFirst then ('+':) else id
peep ps = do
  print $ take 12 $ toList ps
  jsEval $ [r|
const e = document.createElement("div"); e.style["font-size"]="85%";
e.textContent = String.raw`\[|] ++ tex ps [r|\]`;
repl.outdiv.appendChild(e); MathJax.typeset([e]);
|]

peep tans
peep $ (1 + x(2 + x 3))^3
== Formal Dress ==

By now, we're comfortable with representing a power series with a list.
But this idea cuts both ways: we can represent a list with a power series.

McIlroy demonstrates a power series whose coefficients are themselves power
series. Consider the identity:

\[
\frac{1}{1-z} = 1 + z + z^2 + ...
\]

If we choose \(z\) to be the polynomial \(1 + x\), then this produces all the
powers of \(1 + x\), whose coefficients form Pascal's triangle.

[ ]:
pascal = 1/(1 - x(series (1 + x 1)))
putStr $ unlines $ map (unwords . map (show . numerator)) $ toList <$>
  take 10 (toList pascal)
Although `iterate ((1 + x 1)*) 1` behaves identically, we'll see that
writing computations with power series comes in handy.

== Special ops ==

Our initial concern also cuts both ways: with our new way of thinking, it could
be unclear whether a given power series represents a list or not. Thus we say
_formal power series_ when we are viewing a power series as a list, and we're
interested in tasks such as computing members of a sequence rather than
actually summing up a sum and worrying about convergence. This occurs often in
combinatorics, where formal power series are called _generating functions_.

We don't abandon analysis completely. Occasionally we do care about convergence
after all; it can shed light on the asymptotic behaviour of a sequence. Also,
all power series converge when \(x = 0\), and we write \(f(0)\) for the
constant term of a power series \(f(x)\). And even without convergence,
differentiation and integration remain sensible if we view them as operations
that modify the terms of a sequence in a certain way.

It turns out there is more than one way to map lists to formal series. Herbert
Wilf's _Generatingfunctionology_ explains how to disambiguate. We call:

\[f(x) = a_0 + a_1 x + a_2 x^2 + ... \]

the _ordinary power series generating function_ for the _sequence of
coefficients_:

\[ \left\{a_n\right\}_0^\infty = a_0, a_1, a_2, ... \]

and we write:

\[
\newcommand{\ops}{\overset{ops}\longleftrightarrow}
f \ops \left\{a_n\right\}_0^\infty
\]

We may write \(f\) for \(f(x)\) when the variable is clear from context.

Suppose we drop the first \(h\) members of the sequence so
\(a_h, a_{h+1}, ...\) remains. Then we have:

+\[
\frac{f - a_0 - ... - a_{h-1} x^{h-1}}{x^h} \ops \left\{a_{n+h}\right\}_0^\infty
\]+

Haskell's `drop h` is more concise, but again, we'll see that writing
computations as arithmetic operations comes in handy.
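For instance, dropping the first two coefficients of
\(1/(1-2x) = 1 + 2x + 4x^2 + ...\) with arithmetic alone; a quick check using
only the definitions above:

[ ]:
peep $ 1/(1 - x 2)
-- Subtract the first two terms and divide by x^2; expect 4, 8, 16, ...
peep $ (1/(1 - x 2) - 1 - x 2) / x(x 1)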

Because power series are sums, the correspondence is linear, that is, for all
\(g \ops \left\{b_n\right\}_0^\infty\):

\[f + g \ops \left\{a_n + b_n\right\}_0^\infty\]

and for any constant \(c\):

\[cf \ops \left\{ca_n\right\}_0^\infty\]

== Why algebra matters ==

Combining the above ideas, if \(f\) is the ordinary power series for the
Fibonacci numbers 0, 1, 1, 2, 3, 5, ...

\[
f = 0 + x + x^2 + 2x^3 + 3x^4 + 5x^5 + ...
\]

then the recursion \(a_{n+2} = a_{n+1} + a_n\) and initial conditions
\(a_0 = 0, a_1 = 1\) become:

\[
f = x + x^2 \left(f + \frac{f}{x}\right)
\]

[ ]:
fibs = x(1 + x(fibs + fibs/x 1))
peep fibs
which perhaps impresses everyone except seasoned Haskell programmers, who may
say we've merely disguised the classic:

[ ]:
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
If so, fine, but it's only because Haskell is such an amazing language to begin
with! At any rate, even if the classic edition is more elementary, we stress
again that it's handy to be able to express ideas with formal power series.

And this time, we reveal why:
formal power series allow us to draw on a vast wealth of
mathematical knowledge.
https://en.wikipedia.org/wiki/Timeline_of_algebra[Humankind has toiled in rings
and fields for thousands of years]. We've picked up a few tricks along the way,
such as the invention of symbolic algebra several centuries ago, and today we
think nothing of mechanically solving for \(f\):

\[
f = \frac{x}{1 - x - x^2}
\]

Indeed, a https://en.wikipedia.org/wiki/Computer_algebra_system[computer
algebra system] could automate this. We persevere without one, though we can
check our work:

[ ]:
peep $ x 1/(1 - x 1 - x(x 1))
We have, via
https://en.wikipedia.org/wiki/Partial_fraction_decomposition[partial fraction
decomposition]:

\[
f = \frac{x}{1 - x - x^2} = \frac{1}{\sqrt{5}}
\left(
\frac{1}{1 - \frac{1}{2}(1 + \sqrt{5})x}
-
\frac{1}{1 - \frac{1}{2}(1 - \sqrt{5})x}
\right)
\]

Then exploiting the above \(1 / (1 - z)\) identity leads to
https://en.wikipedia.org/wiki/Fibonacci_sequence#Binet's_formula[Binet's
formula] for the \(n\)th Fibonacci number:

\[
\frac{1}{\sqrt{5}} \left(\frac{1 + \sqrt{5}}{2}\right)^n
-
\frac{1}{\sqrt{5}} \left(\frac{1 - \sqrt{5}}{2}\right)^n
\]

Can you deduce this by reasoning with `drop` and `iterate` and `zipWith`?
Maybe, but why bother when ordinary algebra hands you the answer on a silver
platter?
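As a sanity check, here is Binet's formula transcribed into ordinary Haskell;
this sketch steps outside our power series code and uses `Double` arithmetic,
rounding away the floating-point error (`binet` is just a throwaway name):

----------------------------------------------------------------
binet n = round $ (phi^n - psi^n) / sqrt 5 where
  phi = (1 + sqrt 5) / 2
  psi = (1 - sqrt 5) / 2
-- map binet [0..11] == [0,1,1,2,3,5,8,13,21,34,55,89]
----------------------------------------------------------------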

For other sequences, we might never find nice formulas, though their generating
functions can help us compute the members of the sequence, or failing that,
tell us some of their properties.

== Derivative work ==

Writing \(D\) for differentiation with respect to \(x\), we have
\(x D x^n = x (n x^{n-1}) = n x^n \).

[ ]:
xD = x . diff

peep $ iterate x 1 !! 5
peep $ xD $ iterate x 1 !! 5
Hence for any polynomial \(p(x)\):

\[p(xD) f \ops \left\{p(n) a_n\right\}_0^\infty\]

For example, the generating function for the squares is:

\[ (xD)^2\frac{1}{1-x} \ops \left\{n^2 \right\}_0^\infty\]

[ ]:
peep $ xD (xD (1 / (1 - x 1)))
When \(a_0 = 0\), this generalizes to a Laurent polynomial \(p\) by defining
+\(D^{-1}\)+ to mean integration from 0 to \(x\) and:

+\[(xD)^{-1} = D^{-1} x^{-1}\]+

Thus the generating function for the reciprocals of the nonzero squares is:

[ ]:
unxD = int . (/x 1)

peep $ unxD (unxD (1 / (1 - x 1) - 1))
This is the https://en.wikipedia.org/wiki/Dilogarithm[_dilogarithm_]
\(\mathrm{Li}_2(z)\), the order 2
https://en.wikipedia.org/wiki/Polylogarithm[_polylogarithm_]. The
https://en.wikipedia.org/wiki/Basel_problem[Basel problem] asks for the
value of \(\mathrm{Li}_2(1)\). (Spoiler: it's \(\pi^2 / 6\).)

== Convolution ==

Multiplying ordinary power series results in convolution:

\[f g \ops \left\{\sum_{k=0}^{n} a_k b_{n - k}\right\}_0^\infty\]

A useful specialization is +\(g = 1/(1 - x) \ops \left\{1\right\}_0^\infty \)+.
Then we get the partial sums of the \(a_n\):

\[\frac{f}{1-x} \ops \left\{\sum_{k=0}^{n} a_k \right\}_0^\infty\]

For example, the ordinary power series for the sum of the first \(n\) squares
is:

[ ]:
peep $ xD (xD (1/(1 - x 1))) * (1/(1 - x 1))
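The coefficients should match the familiar closed form \(n(n+1)(2n+1)/6\); a
quick check, reusing only definitions from above:

[ ]:
-- Six times each coefficient should equal n(n+1)(2n+1).
take 12 (map ((6*) . numerator) $ toList $ xD (xD (1/(1 - x 1))) * (1/(1 - x 1)))
  == [n*(n+1)*(2*n+1) | n <- take 12 [0..]]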
McIlroy demonstrates convolution by counting binary trees. Let \(t\) be
the ordinary power series for the number of binary trees of \(n\) nodes.
Observe:

  1. There is a unique empty binary tree.

  2. A non-empty binary tree is a parent node with two child nodes that are themselves binary trees.

Or as we would say in Haskell:

----------------------------------------------------------------
data BinaryTree = Empty | Node BinaryTree BinaryTree
----------------------------------------------------------------

We have the generating function:

\[
t = 1 + x t^2
\]

The 1 counts the empty tree, the \(x\) accounts for the parent node, and
\(t^2\) is a convolution that counts the cases where the left child has \(k\)
nodes and the right has \(n - k\) nodes for each \(k\).

[ ]:
ts = 1 + x (ts^2)
peep ts
These are the Catalan numbers. Algebra shows:

\[
t = \frac{1 - \sqrt{1 - 4x}}{2x}
\]

[ ]:
peep $ (1 - sqrt1 (1 - x 4))/(x 2)
Considering the binomial expansion of the square root leads to a formula
for the \(n\)th Catalan number:

\[
\frac{1}{n+1} {2n \choose n}
\]

We omit the details here; we are content just celebrating another victory for
algebra.
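Still, the closed form is easy to check against the coefficients above; a
sketch in ordinary Haskell, outside our power series code (`catalan` is just a
throwaway name):

----------------------------------------------------------------
-- (2n)! / (n! (n+1)!), written to avoid an explicit factorial function.
catalan n = product [n+2..2*n] `div` product [1..n]
-- map catalan [0..11] == [1,1,2,5,14,42,132,429,1430,4862,16796,58786]
----------------------------------------------------------------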

McIlroy generalizes by allowing each node to have any number of children,
that is, ordered trees. We start with perhaps the simplest kind of tree: unary
trees, otherwise known as lists. Each node has zero or one children:

----------------------------------------------------------------
data List = EmptyList | ListNode List
----------------------------------------------------------------

so the generating function that counts them is:

[ ]:
list = 1 + x list
peep list
Although it's obvious there is exactly one list of each size, it's gratifying
to see our machinery work as intended.

A nonempty tree is a parent node with a list of child nonempty trees, also
known as a forest. Empty trees complicate matters because an empty child tree
would be the same as no child, so we stick to counting nonempty trees; we can
always account for the empty case afterwards.

These Haskell declarations cut a long story short:

----------------------------------------------------------------
data NonEmptyTree = Node Forest
data Forest = Forest (List NonEmptyTree)
----------------------------------------------------------------

and translate to the ordinary power series:

[ ]:
tree = x forest
forest = list#tree
In doing so, we score another win for algebra: Haskell's algebraic data types
map neatly to generating functions, sparing us from thinking hard. See
https://codewords.recurse.com/issues/three/algebra-and-calculus-of-algebraic-data-types[Joel
Burget's post for more], including how differentiation produces zippers.

[ ]:
peep tree
The Catalan numbers strike again! A sprinkling of algebra confirms the `forest`
function satisfies the same equation as `ts`.
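Or we can skip the algebra and simply compare the first dozen coefficients:

[ ]:
take 12 (toList forest) == take 12 (toList ts)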

== Cancel that order ==

What about unordered unlabeled trees? That is, we consider two trees to be
the same if they are _isomorphic_, which means there exists a bijection
\(\phi\) from the nodes of one tree to the other such that \(v, w\) are
connected if and only if \(\phi(v), \phi(w)\) are connected.

Without labels, it's hard to get at a tree, so we first consider trees with one
labeled node. Such trees turn out to be easier to reason about, and counting
them later helps us count unlabeled trees.

A _node-rooted tree_ (or just _rooted tree_ if clear from context) is a tree
where one node is designated to be the _root node_. This imposes an extra
condition on an isomorphism: now the bijection \(\phi\) must map the root of
one tree to the root of the other.

The presence of a single root allows us to easily navigate a tree, because as
programmers know well, each child node of the root node can be viewed as the
root of a subtree.

From link:change.html[our money-changing demo], if we have coins with
denominations \(c_1, ..., c_k\) then the number of ways to make \(c\) cents
is the coefficient of \(x^c\) in the power series:

\[
\frac{1}{(1 - x^{c_1})...(1 - x^{c_k})}
\]
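For example, with 1-, 2-, and 5-cent coins; a quick sketch with our code,
spelling out \(x^5\) with nested applications of `x`:

[ ]:
-- Expect 1, 1, 2, 2, 3, 4, 5, 6, 7, 8, 10, 11, ... ways.
peep $ 1/((1 - x 1) * (1 - x(x 1)) * (1 - x(x(x(x(x 1))))))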

This formula is valid even if \(c_1,...,c_k\) are not distinct, which happens
when, say, we wish to distinguish between limited-edition commemorative 10c
coins and regular 10c coins when counting the ways to make change. Thus if
\(a_n\) is the number of different coins worth \(n\) cents we can rewrite this
power series as:

+\[
\frac{1}{(1 - x)^{a_1}(1 - x^2)^{a_2}...}
\]+

Building rooted trees is similar to making change. A rooted tree of size
\(c + 1\) has one root node and some number of children which are rooted trees
whose sizes add up to \(c\); the number of ways this can happen is the
coefficient of \(x^c\) in:

+\[
\frac{1}{(1 - x)^{a_1}(1 - x^2)^{a_2}...}
\]+

where \(a_n\) is the number of rooted trees of size \(n\).

Let \(r\) be the ordinary power series for the number of rooted trees with
\(n\) nodes. We have \(r(0) = a_0 = 0\) since the empty tree lacks a root.
Then:

+\[
r = \sum_{n=0}^\infty a_n x^n = x \sum_{n=0}^\infty a_{n+1} x^n = x \prod_{n=1}^\infty \frac{1}{(1 - x^n)^{a_n}}
\]+

Wilf derives this equation at the end of Chapter 3 via a more general
framework, and notes that while it is sufficient for determining the
coefficients \(a_0, a_1, ...\) of \(r\), the equation is "fairly
formidable".
Indeed, the infinite product makes calculation unwieldy, as do the
exponentiations by \(a_n\).

== Easy Money ==

https://people.brandeis.edu/~gessel/homepage/slides/hunting-slides.pdf[Ira
Gessel's slides] describe how to tame the above equation. Following
link:napier.html[Napier], we apply logarithms to change products to sums and
exponentiations to multiplications. Then we interchange the order of summations
to tightly bond each \(a_n\) and \(x^n\) so we can write expressions in terms
of \(r\) rather than its individual coefficients.

Recall:

\[ \log 1/(1 - z) = z + z^2/2 + z^3/3 + ... \]

For fun, we confirm several terms:

[ ]:
logs = int $ 1/(1 - x 1)
peep logs
Then:

+\[
\begin{align}
\log \prod_{n=1}^\infty \frac{1}{(1 - x^n)^{a_n}}
& = \sum_{n=1}^\infty a_n \log \frac{1}{1 - x^n} \\
& = \sum_{n=1}^\infty a_n \sum_{k=1}^\infty \frac{x^{nk}}{k} \\
& = \sum_{k=1}^\infty \frac{1}{k} \sum_{n=1}^\infty a_n (x^k)^n \\
& = \sum_{k=1}^\infty \frac{1}{k} r(x^k) \\
\end{align}
\]+

Therefore:

+\[
r(x)
= x \exp \sum_{k=1}^\infty \frac{1}{k} r(x^k) \\
\]+

Define \(M\) to be the operator that takes a power series with a zero constant
term and returns the power series that arises in the money-changing problem:

+\[ M f(x)
= \exp \sum_{k=1}^\infty \frac{1}{k} f(x^k) \]+

Then we can write:

\[ r(x) = x M r(x) \]

We define a `money` function to compute \(M\). Some care is needed because our
mutually recursive lazy lists require bootstrapping to avoid infinite loops;
typically, we use `x` in our code to hive off deeper recursive calls behind
a `(:+:)`.

We can produce \(f(x^k)\) by inserting \(k - 1\) zero coefficients between
consecutive coefficients of \(f(x)\). Since \(a_0 = 0\), we have \(f(x^k) = a_1 x^{k} + ...
\), so we only start doing this on the \(k\)th step, giving the recursion
enough runway to take off.

[ ]:
money (0:+:t) = exps # x (go id 1) where
  go zs k = fmap ((1/k)*) (foldrPs (\x -> ((x:+:) . zs $)) Pz t)
    + x(go (zs . x) (k + 1))
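As a quick test on a non-recursive input: with exactly one coin of every
positive denomination, that is \(f = x/(1-x)\), \(M f\) should be the
generating function for the partition numbers 1, 1, 2, 3, 5, 7, 11, 15, 22, 30, ...

[ ]:
peep $ money $ x 1 / (1 - x 1)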
Now we can count the number of rooted trees:

[ ]:
r = x (money r)
peep r
Our sequence agrees with https://oeis.org/A000081[the On-line Encyclopedia of
Integer Sequences (OEIS)].

== Root Removal ==

Gessel's slides apply the dissymmetry theorem of Pierre Leroux to count
unrooted trees.

First, we define new kinds of rooted trees:

  * An _edge-rooted tree_ is a tree where one edge is designated to be the root
  edge.

  * A _node-and-adjacent-edge-rooted tree_ is a node-rooted tree where one of
  the edges adjacent to the root node is designated to be the root edge.

Conditions for isomorphisms of such trees are tightened as we'd expect, that
is, if root edges are present, then \(\phi\) must map the root edge to the root
edge.

Next, we discover that it is possible to get a grip on an unrooted tree, even
though it may feel like a slippery amorphous blob. Given a tree, delete every leaf
node (and the edge that connected it to the tree). If we iterate this process,
then eventually, just before the tree vanishes, either a single node or a
single edge remains.

Call this last node or edge standing the _center node_ or _center edge_.
Label it \(c\) either way. Any tree isomorphism must map the center of
one tree to the center of the other.

And now, an incredible correspondence. Let \(T\) be a tree, and let:

  * \(T^\bullet\) be the set of node-rooted versions of \(T\).

  * \(T^-\) be the set of edge-rooted versions of \(T\).

  * \(T^{\bullet-}\) be the set of node-and-adjacent-edge-rooted versions of \(T\).

Consider an element of \(T^\bullet\). If its root node is not the center, then
there is a unique path from the root to the center and we can mark the first
edge on this path to get an element of \(T^{\bullet-}\).

Similarly, given an element of \(T^-\), if its root edge is not the center,
then there is a unique path from the root to the center and we can mark the
first node on this path to get an element of \(T^{\bullet-}\).
(This reminds me of a key step in link:eades.html[my favourite proof of
Cayley's formula for counting labeled trees].)

We have exhibited a bijection +\((T^\bullet\cup T^-)\setminus\{c\} \longleftrightarrow T^{\bullet-}\)+, which implies:

\[ |T^\bullet| + |T^-| - | T^{\bullet-}| = 1 \]

Therefore, we can count a set of unrooted trees by a method that sounds
silly at first. For each tree in the set, construct every possible node-rooted,
edge-rooted, and node-and-adjacent-edge-rooted tree. Then the number of trees
is the number of node-rooted and edge-rooted trees minus the number of
node-and-adjacent-edge-rooted trees!

While it is indeed silly to do all this for a single tree, it's actually
sensible for larger sets of trees because it's easy to count various styles of
rooted trees. Indeed, we already conquered \(r(x)\), the generating function
for node-rooted trees with \(n\) nodes.

A node-and-adjacent-edge-rooted tree of size \(n\) corresponds to an ordered
pair of node-rooted trees whose sizes sum to \(n\), because on either side of
the root edge lies a rooted subtree, and the root of one of them is also the
root node of the original tree. By convolution, the ordinary power series for
counting node-and-adjacent-edge-rooted trees of size \(n\) is simply:

\[ r(x)^2 \]

An edge-rooted tree of size \(n\) corresponds to an unordered pair of
node-rooted trees whose sizes sum to \(n\), because on either side of the root
edge lies a rooted subtree, but there is no reason to order them one way or the
other. Thus \( r(x)^2 \) doesn't quite work because it counts each pair of
distinct trees twice, and counts each tree paired with itself once. But we can
correct for this:

\[ \frac{r(x)^2 + r(x^2)}{2} \]

because \(r(x^2)\) counts pairs of the same tree (if there are \(a_k\) trees of
size \(k\), then there are \(a_k\) pairs of identical trees whose total size is
\(2k\)).
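As a sanity check, the corrected series should have integer coefficients,
since it counts edge-rooted trees:

[ ]:
peep $ (1/2)*(r^2 + r#x(x 1))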

Then the ordinary power series for the number of unrooted trees of size \(n\)
is:

\[
r(x) + \frac{r(x)^2 + r(x^2)}{2} - r(x)^2 =
r(x) - \frac{r(x)^2 - r(x^2)}{2}
\]

[ ]:
peep $ r - (1/2)*(r^2 - r#x(x 1))
Our numbers agree with https://oeis.org/A000055[the OEIS].
(The numbers on Gessel's slides differ, and are likely erroneous.)

== Irreducible trees ==

Gessel then counts homeomorphically irreducible trees, that is, trees where no
node has degree 2. The movie _Good Will Hunting_ features a scene with
link:eades.html[the irreducible trees with 10 nodes, which we use to test a
graph layout algorithm]. Here, we find a way to count the number of trees for a
given number of nodes.

We start by counting node-rooted trees where no node has exactly one child:

\[
s(x) = x (M s(x) - s(x))
\]

[ ]:
s = x (money s - s)
peep s
This is nearly what we want. Each non-root node has exactly one parent, so if
it does not have exactly one child node, its degree is not 2. However, the root
node has no parents, so we're incorrectly including trees where the root node
has degree 2, and incorrectly omitting trees where the root node has degree 1.

We fix this by counting trees where the root node does not have exactly two
children, and the other nodes do not have exactly one child:

\[
\begin{align}
t^\bullet & = x\left(M s(x) - \frac{s(x)^2 + s(x^2)}{2}\right) \\
& = x\left(M s(x) - s(x) + s(x) - \frac{s(x)^2 + s(x^2)}{2}\right) \\
& = s(x) + x\left(s(x) - \frac{s(x)^2 + s(x^2)}{2}\right) \\
& = (1 + x)s(x) - x\left(\frac{s(x)^2 + s(x^2)}{2}\right)
\end{align}
\]

[ ]:
peep $ x (money s - (1/2)*(s^2 + s#x(x 1)))
We proceed as before to unroot. We count node-and-adjacent-edge-rooted
irreducible trees:

\[
t^{\bullet-} = s(x)^2
\]

then edge-rooted irreducible trees:

\[
t^- = \frac{s(x)^2 + s(x^2)}{2}
\]

Thus the ordinary power series for the number of irreducible trees is:

\[
t = t^\bullet + t^- - t^{\bullet-} = (1 + x)s(x) + (1 - x)\left(\frac{s(x)^2 + s(x^2)}{2}\right) - s(x)^2
\]

[ ]:
t = (1 + x 1)*s + (1/2)*(1 - x 1)*(s2 + s#(x (x 1))) - s2 where s2 = s^2
peep t
We found 10 irreducible trees of 10 nodes when we drew them, and thankfully
this matches the coefficient of \(x^{10}\). Our numbers also agree with
https://oeis.org/A000014[the OEIS].

== The Roads Not Taken ==

https://www.dmg.tuwien.ac.at/drmota/trees.pdf[Michael Drmota walks through
another way to count unrooted trees] (Theorem 6): we start by counting
rooted trees with link:../polya/[Pólya's Enumeration Theorem]. Just as we must
consider rotations and reflections when counting ways of stringing coloured
beads on a necklace, we must consider arbitrary permutations of a parent node's
children when counting trees.

The cycle index polynomial for the symmetric group on \(n\) elements is:

+\[ \sum \frac{x_1^{\lambda_1} ... x_n^{\lambda_n}}{1^{\lambda_1} \lambda_1 !
2^{\lambda_2}\lambda_2 ! ... n^{\lambda_n} \lambda_n !}
\left[ \lambda_1 + 2\lambda_2 + ... + n\lambda_n = n \right] \]+

Summing for \(n \ge 0\) yields:

+\[
\left(1 + x_1 + \frac{x_1^2}{2!} + ...\right)
\left(1 + \frac{x_2}{2} + \frac{x_2^2}{2^2 2!} + ...\right)...
\]+

which is:

\[ \exp\left(x_1 + \frac{1}{2} x_2 + \frac{1}{3} x_3 + ...\right) \]

This leads to the equation for \(M\). Then we apply a result due to Otter to
count unrooted trees. We omit the details; it's also an ingenious bijection
that involves the center node or center edge, but more complicated.

We also tour other methods of representing lists with formal series. We call:

\[f(x) = a_0 + \frac{a_1}{1!} x + \frac{a_2}{2!} x^2 + ... \]

the _exponential generating function_ for
\( \left\{a_n\right\}_0^\infty = a_0, a_1, a_2, ... \)
and we write:

\[
\newcommand{\egf}{\overset{egf}\longleftrightarrow}
f \egf \left\{a_n\right\}_0^\infty
\]

[ ]:
egfPeep = peep . zipPs (flip (/)) exps
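For instance, since the coefficients of \(e^x\) are \(1/n!\), its sequence
under this correspondence is all ones:

[ ]:
egfPeep exps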
This correspondence is also linear, and the \(xD\) trick for polynomials
remains the same. However, shifting a sequence is achieved with differentiation:

\[
D^h f \egf \left\{a_{n+h}\right\}_0^\infty
\]

and multiplication involves an extra factor which suits some problems:

\[
f g \egf \left\{ \sum_{k=0}^n {n \choose k} a_k b_{n-k} \right\}_0^\infty
\]

where \(g \egf \left\{b_{n}\right\}_0^\infty\).

For example, a _derangement_ is a permutation with no fixed points.
If \(a_n\) is the number of derangements of \(n\) objects, then:

\[ n! = \sum_{k=0}^n {n \choose k} a_k \]

because every permutation can be viewed as fixing \(k\) objects and deranging
the others.

Since \(e^x \egf \left\{1\right\}_0^\infty\), taking the egf on both
sides gives:

\[ \frac{1}{1 - x} = f(x) e^x \]

Then the following code counts derangements:

[ ]:
egfPeep $ 1 / (exps * (1 - x 1))
A little algebra gives a formula for the number of derangements of \(n\)
objects:

\[ n!\left(1 - \frac{1}{1!} + \frac{1}{2!} - ... + (-1)^n \frac{1}{n!}\right) \]
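Transcribing this into ordinary Haskell (a plain-Haskell sketch outside our
power series code; `derangements` is just a throwaway name) reproduces the
coefficients above:

----------------------------------------------------------------
-- n!/k! = product [k+1..n], so the alternating sum becomes:
derangements n = sum [(-1)^k * product [k+1..n] | k <- [0..n]]
-- map derangements [0..9] == [1,0,1,2,9,44,265,1854,14833,133496]
----------------------------------------------------------------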

Next, we call:

\[f(s) = a_1 + \frac{a_2}{2^s} + \frac{a_3}{3^s} + ... \]

the _Dirichlet series generating function_ for
\( \left\{a_n\right\}_1^\infty = a_1, a_2, ... \)
and we write:

\[
\newcommand{\Dir}{\overset{Dir}\longleftrightarrow}
f \Dir \left\{a_n\right\}_1^\infty
\]

These are too far from power series for our code, so we just state some facts
here. Multiplying Dirichlet series gives:

\[ fg \Dir \left\{ \sum_{d|n} a_d b_{n/d} \right\}_1^\infty \]

where \( g \Dir \left\{b_n\right\}_1^\infty \).

The Dirichlet series for the sequence of all ones is the _Riemann zeta function_:

\[ \zeta(s) = 1^{-s} + 2^{-s} + 3^{-s} + ... \Dir \left\{1\right\}_1^\infty \]

Thus
\( \zeta^2 \Dir \left\{d(n)\right\}_1^\infty \)
where \(d(n)\) is the number of divisors of \(n\).
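Our `Ps` type doesn't cover Dirichlet series, but the convolution itself is
easy to transcribe into ordinary Haskell; a plain-Haskell sketch (`dirichlet`
is just a throwaway name):

----------------------------------------------------------------
-- Dirichlet convolution of two sequences indexed from 1.
dirichlet a b n = sum [a d * b (n `div` d) | d <- [1..n], n `mod` d == 0]
-- The all-ones sequence plays the role of zeta, so squaring gives d(n):
-- map (dirichlet (const 1) (const 1)) [1..12] == [1,2,2,3,2,4,2,4,3,4,2,6]
----------------------------------------------------------------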

Let \(\mathbb{P}\) be the prime numbers. For any
https://crypto.stanford.edu/pbc/notes/numbertheory/mult.html[multiplicative
function] \(f\), we have:

+\[
\sum_{n=1}^\infty \frac{f(n)}{n^{s}}
= \prod_{p\in\mathbb{P}} \left(1 + f(p)p^{-s} + f(p^2)p^{-2s} + ...\right)
\]+

In particular, since \(f(n) = 1\) is multiplicative:

+\[ \zeta(s)
= \prod_{p\in\mathbb{P}} \left(1 + p^{-s} + p^{-2s} + ...\right)
= \prod_{p\in\mathbb{P}} \frac{1}{1 - p^{-s}}
\]+

Let \(\mu(n)\) be the multiplicative function satisfying
\(\mu(1) = 1, \mu(p) = -1, \mu(p^k) = 0\) for all primes \(p\) and \(k > 1\).
Then:

+\[
\sum_{n=1}^\infty \frac{\mu(n)}{n^{s}}
= \prod_{p\in\mathbb{P}} \left(1 - p^{-s}\right)
\]+

so +\( \zeta^{-1} \Dir \left\{ \mu(n) \right\}_1^\infty \)+.

We call \(\mu(n)\)
https://crypto.stanford.edu/pbc/notes/numbertheory/mobius.html[the _Möbius
function_], which we use to perform _Möbius inversion_: suppose
\( f \Dir \left\{a_n\right\}_1^\infty \) and
\( g \Dir \left\{b_n\right\}_1^\infty \)
are sequences satisfying:

\[ a_n = \sum_{d|n} b_d \]

Then \(f = g \zeta \), hence \(g = f \zeta^{-1} \), implying:

\[ b_n = \sum_{d|n} \mu(n/d) a_d \]

For example, https://en.wikipedia.org/wiki/Cyclotomic_polynomial[the cyclotomic
polynomials] satisfy \(\prod_{d|n} \Phi_d (x) = x^n - 1 \). We take logarithms
to convert a product to a sum:

\[\sum_{d|n} \log \Phi_d (x) = \log (x^n - 1) \]

Möbius inversion followed by exponentiation yields a formula for cyclotomic
polynomials:

\[ \Phi_n(x) = \prod_{d|n} (x^d - 1)^{\mu(n/d)} \]
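For example, \(\mu(1) = 1\), \(\mu(2) = \mu(3) = -1\), and \(\mu(6) = 1\), so
the formula gives \(\Phi_6(x) = (x-1)(x^6-1)/((x^2-1)(x^3-1))\), which our
power series code happily expands:

[ ]:
-- Expect the sixth cyclotomic polynomial, 1 - x + x^2, padded with zeros.
peep $ ((x 1 - 1) * (x 1^6 - 1)) / ((x 1^2 - 1) * (x 1^3 - 1))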

Ben Lynn blynn@cs.stanford.edu 💡