Euclid’s Algorithm vs. Euclid’s Geometry

Euclid alone has looked on Beauty bare. On the other hand, Euclid’s algorithm alone shrouds the beauty of Euclidean geometry in a fog of algebra, yet is surprisingly effective at providing answers:

Theorem:

Divide and conquer

Euclid’s algorithm is powered by a division that finds a remainder smaller than the divisor. Traditionally, we start from two natural numbers \(s\) and \(p \ne 0\), and find \(q, r\) such that:

\[ s = p q + r \]

with \(r < p\). We iterate until we reach the smallest possible remainder, namely zero, and in doing so we find the answer seek, which in this case is the greatest common divisor of \(s\) and \(p\).

Euclid’s algorithm also works for polynomials if the coefficients are rational, or from some other field. In this setting, we measure size by looking at the degree of a polynomial, and once again the remainder polynomial is guaranteed to decrease until we reach zero. Namely, given the dividend and divisor polynomials \(s(x)\) and \(p(x) \ne 0\), we find quotient and remainder polynomials \(q(x), r(x)\) such that:

\[ s(x) = p(x) q(x) + r(x) \]

with \(\deg r(x) < \deg p(x)\). (Here, we consider the degree of 0 to be negative.)

But what if the coefficients are integers, or themselves polynomials in other variables? Then the first step of polynomial long division only works if the leading coefficient of the dividend is an exact multiple of the leading coefficient of the divisor. Even then, we run into the same issue in the next step of the division, and again require compatible leading coefficients.

This suggests loosening the rules. We permit multiplying the dividend by some constant, that is, we allow the choice of some constant factor \(a\) in the following pseudo-division:

\[ a s(x) = p(x) q(x) + r(x) \]

As before \(p \ne 0\) and we want \(\deg r < \deg p\).

If we choose \(a\) to be a sufficiently high power of the leading coefficient of the divisor \(p(x)\), then pseudo-division succeeds.

Wu’s method uses the Euclidean algorithm with pseudo-division to solve Euclidean geometry problems. See chapter 5 of John Harrison, Handbook of Practical Logic and Automated Reasoning.

{-# LANGUAGE CPP #-}
import Data.Function (on)
import Data.Either (partitionEithers)
import Data.List (minimumBy, intersperse, nub, sortOn)
import Data.Map (Map)
import qualified Data.Map as M
import Data.Maybe (catMaybes)
#ifdef __HASTE__
import Text.Parsec
import Control.Monad (void)
import Haste.DOM
import Haste.Events
import Haste.Foreign (ffi)
import Haste.JSString (pack)
lookupMax m = if M.null m then Nothing else Just $ M.findMax m
type Parser = Parsec String ()
digitChar = digit; letterChar = letter
anySingleBut = noneOf . pure; some = many1
#else
import Text.Megaparsec
import Text.Megaparsec.Char
lookupMax = M.lookupMax
type Parser = Parsec () String
#endif

Multivariate Madness

We represent a monomial (product of variables) with a map of variables to their multiplicities:

type Var = String
type Mo = Map Var Int

We represent a polynomial with a map of monomials to their coefficients:

newtype Poly = Poly (Map Mo Integer) deriving Eq

In some applications, the monomials should be ordered by (combined) degree. Fortunately, we won’t need this.

Unfortunately, we do need the ability to view a multivariate polynomial as a univariate polynomial in some given variable x. We represent a univariate polynomial with a map from the degree of a nonzero term to its coefficient, which is a multivariate polynomial from which x should be absent.

type Univ = Map Int Poly

We now have three varieties of related maps bouncing around our code. It might be best to hide the maps behind data types and use a Ring typeclass so we can reuse various mathematical operators. However, this requires more declarations than I’d prefer for a toy demo, so I settled on only hiding the map for the multivariate polynomials.

We give the multivariate polynomials pretty binary operators. In contrast, operations on monomials are mostly inlined.

noZ :: (Num n, Eq n) => Map k n -> Map k n
noZ = M.filter (/= (fromIntegral 0))

pZero = Poly mempty
pOne = Poly $ M.singleton mempty 1
pIntegral n = Poly $ M.singleton mempty $ fromIntegral n
pVar v = Poly $ M.singleton (M.singleton v 1) 1
pNull (Poly p) = M.null p
pNegate (Poly p) = Poly $ M.map negate p
(Poly p) .+ (Poly q) = Poly $ noZ $ M.unionWith (+) p q
p .- q = p .+ pNegate q
(Poly p) .* (Poly q) = Poly $ noZ $ M.fromListWith (+)
  [(k1 `moMul` k2, v1 * v2) | (k1, v1) <- M.toList p, (k2, v2) <- M.toList q]
  where moMul xs ys = noZ $ M.unionWith (+) xs ys
pSetZero v (Poly p) = Poly $ M.filterWithKey (\xs _ -> M.notMember v xs) p

Operations on univariate polynomials are similar. One difference arises in multiplication: instead of multiplying monomials, we add degrees.

uCanon = M.filter $ \(Poly p) -> not $ M.null p
uConst a = uCanon $ M.singleton 0 a
uAdd p q = uCanon $ M.unionWith (.+) p q
uNegate p = M.map pNegate p
uSub p q = uAdd p $ uNegate q
uMul p q = uCanon $ M.fromListWith (.+)
  [(k1 + k2, v1 .* v2) | (k1, v1) <- M.toList p, (k2, v2) <- M.toList q]

Now for pseudo-division on polynomials, the star of the show. Let s and p be univariate polynomials. Let a be the leading coefficient of the divisor. Then our next function returns k and r such that

\[ a^k s = p q + r \]

for some quotient q.

Each step of the long division involves \(x^{m-n}\) times the divisor times an appropriate constant, where \(m\) is the degree of the dividend and \(n\) is the degree of the divisor.

We optimize slightly: if the leading coefficients happen to match then we can skip one step of scaling up the dividend by a.

euclid :: Univ -> Univ -> (Int, Univ)
euclid s p = go 0 s where
  (n, a) = M.findMax p
  go k s
    | M.null s = (k, s)
    | m < n = (k, s)
    | a == b = go k $ s `uSub` p'
    | otherwise = go (k + 1) $ (uConst a `uMul` s) `uSub` (uConst b `uMul` p')
    where
    (m, b) = M.findMax s
    p' = M.mapKeysMonotonic (+ (m - n)) p

The respect function views a multivariate polynomial as a univariate polynomial with respect to a given variable. I only used its inverse disrespect during testing.

respect :: Var -> Poly -> Univ
respect v (Poly p) = M.map Poly $ M.fromListWith M.union $ go <$> M.toList p
  where
  go (xs, coeff) = case M.lookup v xs of
    Nothing -> (0, M.singleton xs coeff)
    Just k -> (k, M.singleton (M.delete v xs) coeff)

disrespect :: Var -> Univ -> Poly
disrespect v u = Poly $ M.fromList $
  [(if k > 0 then M.insert v k xs else xs, r)
    | (k, Poly p) <- M.toList u, (xs, r) <- M.toList p]

Triangles in Algebra

Cartesian coordinates transform questions of geometry into questions of algebra. Consider the theorem: if M is the midpoint of AC, and BM is perpendicular to AC, then AB = AC. We translate it as follows.

If:

\(A_x + C_x - 2 M_x = 0\)
\(A_y + C_y - 2 M_y = 0\)
\((A_x - C_x) (M_x - B_x) + (A_y - C_y) (M_y - B_y) = 0\)

Then:

\((A_x - B_x)^2 + (A_y - B_y)^2 - (B_x - C_x)^2 - (B_y - C_y)^2 = 0\)

More generally, theorems of Euclidean geometry often take the following form: when the multivariate polynomials \(p_1, …, p_n\) are all zero, the multivariate polynomial \(p\) is zero.

Suppose the variable \(x\) appears in at least two of the \(p_1,…,p_n\). We view them as univariate polynomials \(f(x)\) and \(g(x)\), labeled so that \(\deg f \ge \deg g\). By pseudo-division, we can scale up \(f\) and subtract a multiple of \(g\) to produce a polynomial \(f'\) with \(\deg f' < \deg g\).

The zeros of \(f\) and \(g\) are also zeros of \(f'\), so we may replace \(f\) with \(f'\). This is like Gaussian elimination, but each step is irreversible as the converse may be untrue. Recall the scaling factor possibly contains variables that we are temporarily viewing as constants, so it may really be a polynomial with its own zeros.

Iterating this removes \(x\) from all but one polynomial in \(p_1,…,p_n\). We have thus accomplished a noteworthy feat simply by chasing smaller and smaller remainders. Euclid’s algorithm strikes again!

We repeat on the remaining \(n - 1\) polynomials: we pick some other variable \(y\) and eliminate it from all but one of them, and so on. After \(n\) repetitions we reach a triangular form \(p_1,…,p_k\): for some variables \(x_1,…,x_n\), if we view all other variables as constants, then the polynomial \(p_k\) only contains the variables \(x_1,…,x_k\).

Our triangulate function eliminates variables in the order they appear in a given list. It returns a list of polynomials viewed as univariate polynomials in the given variables. The variable names are recorded with their corresponding polynomials.

For pseudo-division we choose a divisor from the candidates of lowest degree as heuristically this ought to reduce the degrees of the other polynomials most rapidly.

triangulate :: [Var] -> [Poly] -> [(Var, [Univ])]
triangulate [] [] = []
triangulate [] ps = error "leftovers"
triangulate (v:vt) ps = go [] $ respect v <$> ps where
  findConstant p = case lookupMax p of
    Nothing -> Left pZero
    Just (0, c) -> Left c
    _ -> Right p
  go done todo
    | length ncs <= 1 = (v, ncs) : triangulate vt (cs ++ done)
    | otherwise = go (cs ++ done) $ p : filter (not . M.null) (map (\s -> snd $ euclid s p) ncs)
    where
    (cs, ncs) = partitionEithers $ findConstant <$> todo
    p = minimumBy (compare `on` (fst . M.findMax)) ncs

Let us state our transformed problem: given a triangular set of polynomials \(p_1,…,p_n\) with variables \(x_1,…,x_n\) where all other variables are viewed as constants, our goal is to show that when they are all zero, \(p\) is also zero.

Once again, we call upon pseudo-division and recursion. Due to triangulation, only \(p_n\) can possibly shed any light on the effect of \(x_n\) on our answer. Viewing our polynomials as univariate in \(x_n\), we pseudo-divide:

\[ a p(x_n) = p_n(x_n) q(x_n) + r(x_n) \]

When \(p_n(x_n) = 0\), we have \(p(x_n) = 0\) if \(a \ne 0\) and \(r(x_n)\) is the zero polynomial, that is, each coefficient of \(r(x_n)\) is zero. Again, the implication is one-way; these conditions are sufficient, but may not be necessary.

We record \(a \ne 0\), and as for the coefficients, each is a polynomial not containing \(x_n\), so we can recursively generate more conditions with the triangular set \(p_1,…p_{n-1}\).

We bottom out when nothing remains in the triangular set. At this point, we have no choice but to simply generate the condition \(r = 0\). As an extreme example, if asked to prove \(AB = CD\) with no premises, then the triangular set is initially empty, and we simply return the condition \(AB = CD\).

Typically, Wu’s method generates conditions that prevent degeneracy. For example, an equation might state that a certain length is nonzero.

genDegens :: [(Var, [Univ])] -> Poly -> [(Poly, Bool)] -> [(Poly, Bool)]
genDegens [] p acc = (p, False):acc
genDegens ((v, qs):tri) p acc
  | pNull p = acc
  | null qs = genDegens tri p acc
  | k == 0 = acc'
  | otherwise = (snd $ M.findMax q, True):acc'
  where
  q = head qs
  (k, r) = euclid (respect v p) q
  acc' = foldr (genDegens tri) acc $ M.elems r

Popularity Contest

The order we process variables during triangulation affects the speed of the algorithm, thus our wu function takes a vars list so the caller can choose the order.

The results of Euclidean geometry are invariant under translation and rotation, so we may choose one of the points to lie at the origin, and another one of the points to the lie on one of the axes, or more generally, two points to lie on one axis, and a point to lie on the other. If the theorem has certain constraints, we may zero even more variables. Accordingly, our wu function also accepts a list of variables to replace with zero, which should be disjoint from the vars list.

Scale-invariance suggests we could replace another variable with unity, but we must take care when points can possibly coincide. We pass on this idea.

We allow multiple polynomials in the conclusion. We prove each is zero separately.

wu vars zeros antes concls =
  (flip (,) False . pVar <$> zeros) ++
  concat [genDegens tri p [] | p <- unpop concls]
  where
  tri = triangulate vars $ unpop antes
  unpop = map $ \p -> foldr pSetZero p zeros

Heuristically, for a general problem, it seems we should always select the variable that appears in the fewest polynomials, so there is less to eliminate, and we should zero out the most frequently appearing variables to reduce our workload. We ensure the three winners of the popularity contest contain at least one \(x\)-coordinate and at least one \(y\)-coordinate to avoid losing generality.

wuPop antes concls = wu (reverse rest) pops antes concls where
  fvs = map (\(Poly p) -> nub $ concatMap M.keys $ M.keys p) antes
  freqs = foldr (M.unionWith (+)) mempty $
    map (M.fromList . map (flip (,) 1)) fvs
  byPop = map fst $ sortOn snd $ M.toList freqs
  (pre0, (popx:post0)) = break ((== 'x') . last) $ reverse byPop
  (pre1, (popy:post1)) = break ((== 'y') . last) $ pre0 ++ post0
  (pop3:rest) = pre1 ++ post1
  pops = [popx, popy, pop3]

Fermat and Descartes

We provide one grammar for polynomials, and a higher-level macro grammar so we can type things like "line A B C", which automatically expands to a polynomial representing that these points are collinear.

The last line is the conclusion of the theorem, and the other lines are the antecedents. A single line may expand into multiple polynomials.

macroDefs =
  [ ("line", ["(1_x - 2_x) * (2_y - 3_y) - (1_y - 2_y) * (2_x - 3_x)"])
  , ("para", ["(1_x - 2_x) * (3_y - 4_y) - (1_y - 2_y) * (3_x - 4_x)"])
  , ("perp", ["(1_x - 2_x) * (3_x - 4_x) + (1_y - 2_y) * (3_y - 4_y)"])
  , ("mid", ["2_x + 3_x - 2 * 1_x", "2_y + 3_y - 2 * 1_y"])
  , ("intersect",
    [ "(1_x - 2_x) * (2_y - 3_y) - (1_y - 2_y) * (2_x - 3_x)",
      "(1_x - 4_x) * (4_y - 5_y) - (1_y - 4_y) * (4_x - 5_x)"])
  , ("len", ["(1_x - 2_x)^2 + (1_y - 2_y)^2 - (3_x - 4_x)^2 - (3_y - 4_y)^2"])
  , ("sumsq", ["(1_x - 2_x)^2 + (1_y - 2_y)^2 + (3_x - 4_x)^2 + (3_y - 4_y)^2 - (5_x - 6_x)^2 - (5_y - 6_y)^2"])
  , ("same", ["1_x - 2_x", "1_y - 2_y"])
  ]

Fermat and Descartes both pioneered analytic geometry. Fermat started from algebraic equations, while Descartes started from geometric curves, so we name our parsers accordingly.

Thus fermat parses a language representing multivariate polynomials and returns a Poly, while descartes parses a macro language describing geometric constraints and returns strings of our polynomial language.

spc = many $ char ' '

fermat :: Parser Poly
fermat = expr where
  sp = (<* spc)
  spch = sp . char
  expr = foldl (flip ($)) <$> term <*> many
    (   (.+) <$> (spch '+' *> term)
    <|> (flip (.-)) <$> (spch '-' *> term)
    )
  term = foldl (.*) <$> pwr <*> many (spch '*' *> pwr)
  pwr = do
    a <- atom
    option a $ (iterate (a .*) pOne !!) <$> sp (spch '^' *> (read <$> some digitChar))
  atom = between (spch '(') (spch ')') expr
    <|> pIntegral . read <$> sp (some digitChar)
    <|> pVar <$> var
  var = sp $ some $ letterChar <|> char '_'

macro :: Parser ([String], [String])
macro = go . catMaybes <$> some line where
  go ls = (concat $ init ls, last ls)
  line = spc *> (newline *> pure Nothing <|> nonEmpty <* newline)
  nonEmpty = do
    mac <- some (anySingleBut ' ') <* spc
    case mac of
      "raw" -> Just . (:[]) <$> some (anySingleBut '\n')
      "#" -> some (anySingleBut '\n') *> pure Nothing
      _ -> Just <$> do
        args <- some $ some letterChar <* spc
        let
          sub (n:t@('_':_)) = args !! (read [n] - 1) ++ sub t
          sub (h:t) = h:sub t
          sub x = x
        maybe (fail mac) (pure . map sub) $ lookup mac macroDefs

descartes :: String -> Either String ([Poly], [Poly])
descartes s = either (Left . show) Right $ do
  (antes, concls) <- parse macro "" s
  (,) <$> mapM (parse fermat "") antes <*> mapM (parse fermat "") concls

Who’s a Pretty Polynomial?

We pretty-print equations in LaTeX format, so MathJax can render them above.

prMo xs = foldr (.) id $ intersperse (' ':) $ (\(x, k) -> (x++) . prPow k) <$> M.toList xs
  where
  prPow 1 = id
  prPow k = showString "^" . shows k

prettyPoly (Poly p) = (if snd h < 0 then ('-':) else id) . go h .
  foldr (.) id (map (\(k, v) -> ((if v < 0 then " - "  else " + ")++)  . go (k, v)) t)
  where
  go (k, v)
    | M.null k = shows $ abs v
    | otherwise = prCoeff (abs v) . prMo k
  prCoeff 1 = id
  prCoeff n = shows n . (' ':)
  (h:t) = M.toList p
prettyEqn (p, b) = prettyPoly p . showString (if b then " \\ne 0" else " = 0")

I called the following function from GHCi extensively during development:

solve s = putStr $ either show (($ "") . foldr (.) id .
  map ((. ('\n':)) .  prettyEqn) . uncurry wuPop) $ descartes s

▶ Toggle test cases

-- Check against an example given by Harrison.
bookCheck = putStr $ either show
  (unlines . map (($ "") . prettyEqn) . uncurry (wu vs zs))
  $ descartes simson
  where
  vs = reverse $ words "B_x B_y C_x C_y D_x D_y E_x E_y F_x F_y G_x G_y O_x"
  zs = ["A_x", "A_y", "O_y"]

sanity = unlines
  [ "mid M A B"
  , "line A M B"
  ]

sanity2 = unlines
  [ "line A P B"
  , "perp A B C D"
  , "perp A P C D"
  ]

isosceles = unlines
  [ "# If M is the midpoint of AC:"
  , "mid M A C"
  , "# and AC is perpendicular to BM:"
  , "perp A C M B"
  , "# then AB = BC:"
  , "len A B B C"
  ]

pythag = unlines
  [ "# AC is perpendicular to AB:"
  , "perp A C A B"
  , "# Then AB^2 + AC^2 = BC^2"
  , "sumsq A B A C B C"
  ]

simson = unlines
  [ "# Simson's Theorem: Let ABCD be a cyclic quadrilateral."
  , "len O A O B"
  , "len O A O C"
  , "len O A O D"
  , "# Drop perpendiculars from D to the (extended) sides of ABC."
  , "# Let E F G be the points of intersection."
  , "line E B C"
  , "line F A C"
  , "line G A B"
  , "perp B C D E"
  , "perp A C D F"
  , "perp A B D G"
  , "# Then EFG is collinear."
  , "line E F G"
  ]

orthobad = unlines
  [ "line A P X"
  , "line B P Y"
  , "line C P Z"
  , "line A Z B"
  , "line B X C"
  , "line C Y A"
  , "perp A X B C"
  , "perp B Y C A"
  , "perp C P A B"
  ]

orthocenter = unlines
  [ "# Suppose AP is perpendicular to BC"
  , "perp A P B C"
  , "# and BP is perpendicular to CA."
  , "perp B P C A"
  , "# Then CP is perpendicular to AB."
  , "perp C P A B"
  ]

centroidOops = unlines
  [ "# My first attempt at proving the medians of a triangle meet at a single point."
  , "# I thought it failed because of a bug in my code."
  , "# I later realized the antecedents didn't require A B Z to be collinear."
  , "# The program says the conclusion holds if CP:PZ is 2:1."
  , "# Thus it can discover theorems as well as prove them!"
  , "line A P X"
  , "line B P Y"
  , "line C P Z"
  , "mid X B C"
  , "mid Y C A"
  , "mid Z A B"
  ]

centroid = unlines
  [ "# Let X be the midpoint of BC."
  , "# Let Y be the midpoint of CA."
  , "# Let P be the intersection of AX and BY."
  , "# Let CP intersect AB at Z."
  , "line A P X"
  , "line B P Y"
  , "line C P Z"
  , "mid X B C"
  , "mid Y C A"
  , "line Z A B"
  , "# Then Z is the midpoint of AB."
  , "mid Z A B"
  ]

twoToOne = unlines
  [ "# Let AX be a median, P the centroid."
  , "line A P X"
  , "line B P Y"
  , "mid X B C"
  , "mid Y C A"
  , "# Then AP:PX = 2:1"
  , "# We should show this for the y-coordinates too, but the macro language is clunky."
  , "raw A_x - P_x - 2 * (P_x - X_x)"
  ]

desargues = unlines
  [ "# One direction of Desargues' Theorem."
  , "# Project ABC onto abc from X."
  , "line A a X"
  , "line B b X"
  , "line C c X"
  , "# D E F are intersections of corresponding edges."
  , "line A B D"
  , "line a b D"
  , "line A C E"
  , "line a c E"
  , "line B C F"
  , "line b c F"
  , "# Then D E F are collinear:"
  , "line D E F"
  ]

area = unlines
  [ "perp A X B C"
  , "line B X C"
  , "# Compare base * height with determinant."
  , "raw ((A_x - X_x)^2 + (A_y - X_y)^2)*((B_x - C_x)^2 + (B_y - C_y)^2) - ((C_x - B_x)*(A_y - B_y) - (C_y - B_y)*(A_x - B_x))^2"
  ]

Frontend

Barebones code for a barebones user interface:

#ifdef __HASTE__
renderEqns es = ("<ul>"++)
  . foldr (.) id [("<li>\\("++) . prettyEqn e . ("\\)</li>"++) | e <- es]
  . ("</ul>"++)

main :: IO ()
main = withElems ["theorem", "out", "wu"] $ \[iEl, oEl, wuB] -> do
  let
    setTheorem s = do
      setProp oEl "innerHTML" "<p><i>Push the button!</i></p>"
      setProp iEl "value" s
    handle buttonId s = do
      Just b <- elemById buttonId
      void $ b `onEvent` Click $ const $ setTheorem s
  handle "isosceles" isosceles
  handle "centroid" centroid
  handle "oops" centroidOops
  handle "two_one" twoToOne
  handle "orthocenter" orthocenter
  handle "simson" simson
  handle "pythag" pythag
  handle "desargues" desargues
  handle "area" area
  void $ wuB `onEvent` Click $ const $ do
    s <- getProp iEl "value"
    case descartes s of
      Left e -> setProp oEl "innerHTML" $ "<pre>" ++ show e ++ "</pre>"
      Right (as, cs) -> do
        setProp oEl "innerHTML"
          $ ("<p>"++)
          . ("If:"++)
          . renderEqns (flip (,) False <$> as)
          . ("then:"++)
          . renderEqns (flip (,) False <$> cs)
          . ("provided"++)
          . renderEqns (wuPop as cs)
          . ("(up to translation and rotation)</p>"++)
          $ ""
        ffi $ pack "MathJax.typeset"
  setTheorem isosceles
#endif

Geometric Algebra

Having been wowed by conformal geometric algebra as well as Wu’s method, I naturally wondered what might happen if we mixed the two. Could we tackle more problems? Would proofs be shorter and simpler? Could we look on Beauty bare?

It turns out others thought of this years ago. For example, see Li, Automated Theorem Proving in the Homogeneous Model with Clifford Bracket Algebra and other papers.

Ben Lynn blynn@cs.stanford.edu 💡