= All Hail Geometric Algebra! =
Rotate a cube by 90 degrees around the $x$-axis, then rotate it by 90 degrees
around the $z$-axis. This is equivalent to a single rotation. What is the axis
and angle of this single rotation?
If we hold a physical cube in our hands, we can turn it a couple of times and
intuitively find the answer. Is there a similarly straightforward mathematical
method that rigorously solves this problem?
What if we generalize? That is, given any two rotations that fix the origin,
what is the single rotation that is equivalent to their composition? Can we
solve this just as easily?
Yes! With 'geometric algebra', we can solve the first problem comfortably with
pen and paper, and apply the same method to its generalization.
We'll take a whirlwind tour of
http://faculty.luther.edu/\~macdonal/GA&GC.pdf['A Survey of Geometric Algebra
and Geometric Calculus'] by Alan Macdonald.
We'll also assume our constructions are well-defined; for proofs, refer to
http://faculty.luther.edu/\~macdonal/GAConstruct.pdf['An elementary
construction of the geometric algebra'] by the same author. See also:
- Papers by Leo Dorst and Stephen Mann: http://www.cgl.uwaterloo.ca/smann/Papers/dorst-mann-I.pdf[Part I]; http://www.cgl.uwaterloo.ca/smann/Papers/dorst-mann-II.pdf[Part II].
- A paper https://www.staff.science.uu.nl/~kreve101/asci/GAraytracer.pdf[comparing geometric algebra and vector algebra for ray tracing] by Daniel Fontijne and Leo Dorst.
We'll explain some ideas in code, for which we need a few declarations.
(link:ga.lhs[The source of this page] is literate Haskell.)
\begin{code}
{-# LANGUAGE OverloadedLists #-}
module AllHailGeometricAlgebra where
import Control.Monad
import Data.List
import Data.Map (Map)
import qualified Data.Map as M
\end{code}
== A number is worth 1000 pictures ==
I was spellbound by Cartesian coordinates as a child. Humble numbers
tame the wild world of geometry. Instead of
https://www.cs.utexas.edu/users/EWD/transcriptions/EWD10xx/EWD1012.html[dubiously
hoping a hand-drawn picture represents the truth], we can solve geometry
problems rigorously by mechanically applying the laws of algebra.
(It turns out http://repository.cmu.edu/cgi/viewcontent.cgi?article=1020&context=philosophy[Euclid's Elements can be formalized] without losing too much
of its flavour.)
However, in practice, coordinates are unwieldy. For example, try finding the
area of a triangle by determining its side lengths with Pythagoras' Theorem,
then plugging them into Heron's formula. Or try finding the point of
intersection between a line and a plane by solving simultaneous equations.
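To feel the clutter, here is a minimal sketch of that triangle recipe; the helper names `dist` and `heron` are our own, not from any library.

```haskell
-- A sketch of the coordinate recipe: side lengths via the Pythagorean
-- theorem, then Heron's formula. Helper names are ours.
type Point = (Double, Double)

dist :: Point -> Point -> Double
dist (x1, y1) (x2, y2) = sqrt ((x2 - x1)^2 + (y2 - y1)^2)

heron :: Point -> Point -> Point -> Double
heron p q r = sqrt (s * (s - a) * (s - b) * (s - c)) where
  [a, b, c] = [dist p q, dist q r, dist r p]
  s = (a + b + c) / 2
```

Even this tiny example buries the high-level idea, area, under two layers of square roots.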
Cartesian coordinates are the assembly language of geometry. Unimportant
details clutter our reasoning. High-level ideas are difficult to express.
Nonetheless, coordinates suffice for solving our first problem in a rough
manner: we take a cube centered at the origin and figure out where its corners
go. We then determine the axis by finding which corners are fixed, and
determine the angle by counting rotations until the cube returns to the
original position. Of course, this method fails for arbitrary axes and angles.
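The rough method fits in a few lines of Haskell; this toy sketch hard-codes the two 90-degree turns, and all names are ours:

```haskell
-- Brute force on the cube's corners. For these particular turns,
-- both (x, y, z) works out to the cyclic permutation (z, x, y).
type P3 = (Int, Int, Int)

rotX, rotZ, both :: P3 -> P3
rotX (x, y, z) = (x, -z, y)   -- 90 degrees about the x-axis
rotZ (x, y, z) = (-y, x, z)   -- 90 degrees about the z-axis
both = rotZ . rotX            -- x-rotation first, then z-rotation

corners :: [P3]
corners = [(x, y, z) | x <- [-1, 1], y <- [-1, 1], z <- [-1, 1]]

-- Corners on the axis of the combined rotation stay put.
fixedCorners :: [P3]
fixedCorners = filter (\p -> both p == p) corners

-- How many applications return every corner home? The angle is then
-- 360 divided by this count.
period :: Int
period = head [k | k <- [1 ..], iterate (map both) corners !! k == corners]
```

Here `fixedCorners` picks out the two corners on the axis, and `period` comes out to 3, giving an angle of 120 degrees.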
== I could not have invented vector algebra! ==
Years later, I learned about vectors, which simplified calculations. For
example, http://geomalgorithms.com/a01-_area.html[the area of a triangle is
basically a cross product]. Though grateful for the labour saved, the
mysterious ad hoc nature of these new tools bothered me. How did they figure it
all out?
The dot product almost looks like it could be discovered algebraically: we
might consider applying the multiplication from the direct sum of rings, that
is, `zipWith (*) v w`. We may notice that the squared norm of a vector `v` is
`sum (zipWith (*) v v)`, which suggests we may eventually realize `sum (zipWith
(*) v w)` has geometric significance.
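That chain of guesses might look like the following sketch; the names are ours:

```haskell
-- Componentwise multiply-and-sum, as suggested above.
dot :: [Double] -> [Double] -> Double
dot v w = sum (zipWith (*) v w)

-- The squared norm is the dot product of a vector with itself.
normSquared :: [Double] -> Double
normSquared v = dot v v

-- The geometric payoff: dot v w = |v| |w| cos theta.
cosAngle :: [Double] -> [Double] -> Double
cosAngle v w = dot v w / sqrt (normSquared v * normSquared w)
```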
However, the cross product is baffling. Without a superstructure such as
quaternions
(https://en.wikipedia.org/wiki/History_of_quaternions[which themselves took
some effort to discover]), the only plausible route to discovering it is to
http://math.oregonstate.edu/bridge/papers/dot+cross.pdf[derive the algebra from
the geometry]. https://richbeveridge.wordpress.com/2015/03/04/derivation-of-the-cross-product/[Lagrange did this in 1773].
Even if we manage to derive them somehow, these products are pale shadows
of the products of abstract algebra. The dot product is a scalar, a type
different to the inputs. Calling it a product feels like an abuse of
terminology. The cross product only works in precisely three dimensions, and
although it satisfies some identities, it is far from elementary. For
starters, it's non-associative. Is a product with nice properties too much to
ask?
And how about https://en.wikipedia.org/wiki/Pl%C3%BCcker_coordinates[Plücker
coordinates]? They seem underappreciated even by those well-acquainted with
cross products, likely because they are more abstruse still.
Worst of all, despite all the above machinery, low-level details remain.
We would like to write down a handful of symbols describing geometric
concepts, then manipulate them with the laws of algebra to perform
amazing feats. Vector algebra only partially succeeds.
If we're skilled with matrices, we can solve our rotation problem by following
along
https://en.wikipedia.org/wiki/Euler%27s_rotation_theorem#Matrix_proof[the
proof of Euler's rotation theorem].
We multiply the matrices for the two given rotations to produce a single
matrix, then find an eigenvector to determine the axis of rotation. As for the
angle, we could study the effect of the matrix on a vector perpendicular to
the axis.
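Here is a sketch of that matrix route for our cube problem, with helper names of our own choosing. As a shortcut, we read the axis off the antisymmetric part of the matrix and the angle off the trace, instead of computing an eigenvector explicitly:

```haskell
import Data.List (transpose)

type Mat = [[Double]]

matMul :: Mat -> Mat -> Mat
matMul a b = [[sum $ zipWith (*) row col | col <- transpose b] | row <- a]

-- 90-degree rotations about the x- and z-axes.
rotX90, rotZ90 :: Mat
rotX90 = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]
rotZ90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]

-- Axis (up to scale) from the antisymmetric part, angle from the trace.
matAxisAngle :: Mat -> ([Double], Double)
matAxisAngle m = (axis, acos ((trace - 1) / 2)) where
  trace = sum [m !! i !! i | i <- [0 .. 2]]
  axis = [ m !! 2 !! 1 - m !! 1 !! 2
         , m !! 0 !! 2 - m !! 2 !! 0
         , m !! 1 !! 0 - m !! 0 !! 1 ]
```

`matAxisAngle (matMul rotZ90 rotX90)` recovers the composite rotation's axis and angle. (The antisymmetric-part shortcut breaks down for 180-degree rotations, one of the cases a full eigenvector argument must handle.)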
This is disproportionately tedious. Why must we work so hard to handle
a simple idea like rotation?
In contrast, geometric algebra gives a clean and simple mathematical framework
for rotations, built from a clean and simple vector product. We really can
scribble down a few symbols and apply the laws of algebra to do geometry.
Additionally, the weirdest parts of vector algebra, including the cross product
and Plücker coordinates, arise naturally in geometric algebra.
== Complex numbers simplify ==
Thanks to polar coordinates, complex numbers excel at describing 2D rotations.
See also https://en.wikipedia.org/wiki/Phasor[phasors] for a completely
different problem that also benefits from complex numbers and polar
coordinates.
Complex numbers should be more widely used, even for discrete problems. For
example, when navigating a 2D maze, we might need to rotate our orientation
left or right by 90 degrees. If we index the array with Gaussian integers,
90-degree rotations are simply multiplications by $\pm i$.
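A sketch of the Gaussian-integer trick, with our own pair representation (Haskell's `Data.Complex` insists on floating-point components):

```haskell
-- A Gaussian integer a + bi as a pair of Ints; names are ours.
type Gaussian = (Int, Int)

gMulZ :: Gaussian -> Gaussian -> Gaussian
gMulZ (a, b) (c, d) = (a * c - b * d, a * d + b * c)

turnLeft, turnRight :: Gaussian -> Gaussian
turnLeft  = gMulZ (0, 1)    -- multiply by  i: 90 degrees counterclockwise
turnRight = gMulZ (0, -1)   -- multiply by -i: 90 degrees clockwise
```

Four left turns restore the original heading, just as $i^4 = 1$.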
To go beyond 2D, we can augment complex numbers so they become quaternions,
which excel at describing 3D rotations.
http://www.gamedev.net/page/resources/_/technical/math-and-physics/do-we-really-need-quaternions-r1199[A contentious
article advocates vectors over quaternions]. While the vigorous prose is
admirable, the argument is weak. The claim that quaternions are ``very
complicated'' is false, but more risible is the howler:
``But why has nobody looked beyond quaternions for a simpler solution until
now?''. In fact, a simpler solution was found long ago: geometric algebra.
The real problems with quaternions are that translations and rotations mix
poorly, and that they limit us to 3D.
Geometric algebra is a common framework for vector algebra and quaternions.
Despite its power, geometric algebra is simple, and rotations in any dimension
closely resemble the elegant descriptions of 2D rotations with complex numbers
and 3D rotations with quaternions.
Thus the conclusion of the article is right for the wrong reasons. We should
skip learning quaternions, but not because vectors are better. Rather, it's
because we should study the more powerful geometric algebra, which teaches us
both quaternions and vector algebra along the way.
== Geometric algebra for (almost) free ==
We can trivially define a product with nice properties on any set: just take
https://en.wikipedia.org/wiki/Free_monoid[the free monoid], or as programmers
call it, string concatenation. (Here, each character represents an element of
the set and the empty string is the identity element.)
Similarly, we can define a vector product with nice properties thanks to
https://en.wikipedia.org/wiki/Free_algebra[free algebras]. Again, we
concatenate strings: given vectors $
\newcommand{\n}{\mathbf{n}}
\newcommand{\u}{\mathbf{u}}
\newcommand{\v}{\mathbf{v}}
\newcommand{\w}{\mathbf{w}}
\newcommand{\i}{\mathbf{i}}
\newcommand{\e}{\mathbf{e}}
\newcommand{\f}{\mathbf{f}}
\newcommand{\vv}[1]{\mathbf{#1}}
\u, \v, \w$, we can form products such as
$\vv{www}$, $\vv{vuwvwuuvw}$, and so on. Then we mix in scalars, and add
and multiply them together to get polynomial-like expressions, which we
call 'multivectors'. For example:
\[ (2 \vv{vuvw} + 1)(3\v - 2) = 6\vv{vuvwv} - 4\vv{vuvw} + 3\v - 2 \]
In Haskell:
\begin{code}
type Multivector = Map String Double
freeMul :: Multivector -> Multivector -> Multivector
freeMul x y = M.filter (/= 0) $
  M.fromListWith (+) $ f <$> M.assocs x <*> M.assocs y where
  f (s, m) (t, n) = (s ++ t, m * n)

prop_exampleFreeMul = freeMul [("vuvw", 2), ("", 1)] [("v", 3), ("", -2)]
  == [("vuvwv", 6), ("vuvw", -4), ("v", 3), ("", -2)]
\end{code}
This product is nice by definition. There is an identity element. It's
associative. It's distributive over vector addition. What a shame it's also
useless: how do we geometrically interpret $\vv{uvwuuv}$?
There's no way. But by constraining our free algebra with one simple relation,
we obtain the vaunted 'geometric algebra'. We define the 'geometric
product' by subjecting the free product to the following constraint:
for any vector $\v$,
\[ \v\v = \v\cdot\v \]
Incidentally, we can https://en.wikipedia.org/wiki/Geometric_algebra[generalize
this definition by replacing the dot product with any quadratic form], and
generalize further to https://en.wikipedia.org/wiki/Clifford_algebra[Clifford
algebras] by considering any vector space over any field. For now, we focus on
the standard dot product on $\mathbb{R}^n$, where the resulting geometric
algebra is denoted $\mathbb{G}^n$.
We could have instead written $\v^2 = |\v|^2$, but this hides our need for
certain properties of the standard inner product. Specifically, we need the
https://en.wikipedia.org/wiki/Polarization_identity[polarization identity]:
\[ \u \cdot \v = \frac{1}{2} ((\u + \v) \cdot (\u + \v) - \u \cdot \u - \v \cdot \v) \]
From our constraint, this is:
\[ \u \cdot \v = \frac{1}{2} ((\u + \v)^2 - \u^2 - \v^2) \]
Expanding $(\u + \v)^2 = \u^2 + \u\v + \v\u + \v^2$ and substituting, this implies:
\[ \u \cdot \v = \frac{1}{2} (\u\v + \v\u) \]
Hence when $\u$ and $\v$ are orthogonal, we have $\u\v = -\v\u$.
As its name suggests, the geometric product is useful for geometry, and also
possesses nice algebraic properties. Even division is possible sometimes: the
inverse of a nonzero vector $\v$ is $\v / |\v|^2$.
[By the way,
http://blog.sigfpe.com/2006/08/geometric-algebra-for-free_30.html[this
section's title is unoriginal]. Also, while I could not have invented
geometric algebra either, at least it's simpler than vector algebra.]
== The canonical basis ==
To recap:
1. We can replace $\v\v$ with $|\v|^2$.
2. We can replace $\u\v$ with $-\v\u$ when $\v$ and $\u$ are orthogonal.
Thus if $\e_1, ..., \e_n$ is an orthonormal basis for $\mathbb{R}^n$, then
every multivector of $\mathbb{G}^n$ can be written as a linear combination of
products of the form:
\[ \e_{i_1}...\e_{i_k} \]
where $i_1,...,i_k$ is a strictly increasing subsequence of $[1..n]$. These
products form 'a canonical basis' for $\mathbb{G}^n$. Thus an orthonormal
basis for an $n$-dimensional vector space yields a canonical basis of size
$2^n$ for its geometric algebra.
Here's a function showing the details. It computes the geometric product
in canonical form, assuming each character that appears in the input
represents an orthonormal basis vector of the space.
For clarity, we employ the inefficient
https://en.wikipedia.org/wiki/Gnome_sort[gnome sort]:
\begin{code}
-- Computes geometric product of multivectors written in terms of some
-- orthonormal basis. Returns result in canonical form.
gMul :: Multivector -> Multivector -> Multivector
gMul x y = M.filter (/= 0) $
  M.fromListWith (+) $ basisMul <$> M.assocs x <*> M.assocs y

basisMul (s, m) (t, n) = gnome [] (s ++ t) (m * n) where
  gnome pre (a:b:rest) c
    -- Vectors are in order: move on.
    | a < b = gnome (pre ++ [a]) (b:rest) c
    -- Same vector: remove both copies since magnitude = 1.
    | a == b = back pre rest c
    -- Wrong order: flip and back up one step. Since the vectors are
    -- orthogonal, we flip the sign of the geometric product.
    | a > b = back pre (b:a:rest) (-c)
    where
    back [] rest c = gnome [] rest c
    back pre rest c = gnome (init pre) (last pre:rest) c
  gnome pre rest c = (pre ++ rest, c)
canon = gMul [("", 1)]
prop_exampleOrthonormal = canon [("uvwuuv", 1)] == [("uw", -1)]
-- Returns canonical basis given orthonormal basis.
canonicalBasis :: String -> [String]
canonicalBasis = subsequences
\end{code}
If $k$ is the same for every term when a multivector is written
canonically, then we say the multivector is a '$k$-vector'. A 0-vector is also
called a 'scalar', a 1-vector a 'vector', a 2-vector a 'bivector', and a
3-vector a 'trivector':
\begin{code}
-- Assumes input is in canonical form.
name :: Multivector -> String
name x
  | null x = "zero multivector"
  | all ((== k) . length) ss = show k ++ "-vector" ++ t
  | otherwise = "mixed multivector"
  where
  s:ss = M.keys x
  k = length s
  t | null (aka k) = ""
    | otherwise = " or " ++ aka k
  aka 0 = "scalar"
  aka 1 = "vector"
  aka 2 = "bivector"
  aka 3 = "trivector"
  aka _ = ""

prop_exampleNames = and
  ([ name [] == "zero multivector"
   , name [("", 42)] == "0-vector or scalar"
   , name [("x", -5), ("y", 98)] == "1-vector or vector"
   , name [("ab", 2), ("bc", 3)] == "2-vector or bivector"
   , name [("abc", 4)] == "3-vector or trivector"
   , name [("abcde", 0.01)] == "5-vector"
   , name [("", 2), ("bc", 3)] == "mixed multivector"
   ] :: [Bool])
\end{code}
(We omit the proof that this is well-defined, that is, that a $k$-vector is
a $k$-vector no matter which orthonormal basis is chosen.)
Lastly, $n$-vectors are also called 'pseudoscalars'. We write $\vv{I}$ for
the pseudoscalar $\e_1 ... \e_n$, which is unique up to sign for any
choice of basis. (This is implied by the omitted proof.)
Then $\vv{I}^{-1} = \e_n ... \e_1 = (-1)^{n(n-1)/2}\vv{I}$.
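That sign comes from counting swaps: reversing $\e_1 ... \e_n$ takes $n(n-1)/2$ transpositions of distinct orthonormal vectors, each flipping the sign. A standalone sanity check, with helper names of our own:

```haskell
import Data.List (tails)

-- Sign of the reversal from the closed form: n(n-1)/2 transpositions.
reversalSign :: Int -> Int
reversalSign n = (-1) ^ (n * (n - 1) `div` 2)

-- Sign by brute force: one flip per out-of-order pair in the
-- reversed sequence n, n-1, ..., 1.
bruteSign :: Int -> Int
bruteSign n = (-1) ^ length inversions where
  inversions = [() | (i:rest) <- tails (reverse [1 .. n]), j <- rest, i > j]
```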
== The fundamental identity ==
Let's try a couple of 2D examples.
Let $\e, \f$ be an orthonormal basis of a plane.
\begin{code}
prop_2Dsquare = join gMul [("e", 3), ("f", 4)] == [("", 25)]
prop_2Dmul = gMul
  [("e", 3), ("f", 4)] [("e", 5), ("f", 6)] == [("", 39), ("ef", -2)]
prop_2DmulOrth = gMul
  [("e", 3), ("f", 4)] [("e", -4), ("f", 3)] == [("ef", 25)]
\end{code}
We see the geometric product of two vectors is a scalar and a bivector.
Furthermore, the appearance of $0$ and $25 = 5^2 = 3^2 + 4^2$ in the last example hints at a deeper
truth. Expanding $(a \e + b \f)(c \e + d \f)$ proves the 'fundamental
identity': for any vectors $\u$ and $\v$ in our plane,
\[
\u\v = \u\cdot\v + \u\wedge\v
\]
where $\u\wedge\v$ is the 'outer product' or 'wedge product', the bivector
component of the geometric product of two vectors, and satisfies:
\[
\u\wedge\v = |\u| |\v| \sin \theta \i
\]
where $\theta$ is the angle between $\u$ and $\v$, and
$\i$ is the bivector $\e \f$.
We interpret $\i$ to mean the plane generated by $\e$ and $\f$. With vector
algebra, we represent planes with normal vectors, which only makes sense in
exactly three dimensions. With geometric algebra, we represent a plane by
multiplying two orthonormal vectors that generate them; this works in any
dimension.
We find $\i^2 = -1$, which explains the choice of notation. In fact, for any
given plane we can construct a complex number system in this manner, that
is, multivectors of the form $a + b\i$. Such
numbers are called 'geometric algebra complex numbers'.
We can also write:
\[
\u\v = |\u| |\v| e^{\i\theta}
\]
which resembles a standard complex number in polar form.
Suppose $|\u| = |\v|$. Then left-multiplying by $\u$ and dividing by $|\u|^2$ gives:
\[
\v = \u e^{\i\theta}
\]
Thus the geometric algebra interpretation of
$e^{\i\theta}$ is a rotation by $\theta$ in the plane $\i$, rather than a
particular point on a unit circle measured from some predefined zero
bearing. This contrast is a rotational analogue of the difference between
vectors and Cartesian points.
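In Haskell, this polar-form view of 2D rotation is a single multiplication with `Data.Complex`; the function name is ours:

```haskell
import Data.Complex

-- Rotate a 2D vector, encoded as a complex number, by theta radians:
-- multiply by cis theta, that is, e^(i * theta).
rotate2D :: Double -> Complex Double -> Complex Double
rotate2D theta u = u * cis theta
```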
== Rotations, in any dimension ==
We noted complex numbers excel at describing rotations in two dimensions,
and quaternions in three. Geometric algebra complex numbers excel at describing
rotations in any dimension.
Let $\i$ be the product of two orthonormal vectors representing the plane in
which we wish to rotate. Let $\theta$ be the angle about the origin we wish to
rotate. Let $\u$ be a vector.
Decompose $\u$ with respect to $\i$:
\[
\u = \u_{\perp} + \u_{\parallel}
\]
That is, we find vectors satisfying $\u_{\perp} \cdot \i = 0$ and
$\u_{\parallel} \wedge \i = 0$ as well as
the above summation. These are unique and readily computed, but for
this section we only need their existence.
Rotation only affects the second term, so $\u$ rotates to:
\[
\begin{array}{ccl}
\u_\perp + \u_\parallel e^{\i\theta}
&=& \u_\perp e^{-\i\theta/2} e^{\i\theta/2} + \u_\parallel e^{\i\theta/2} e^{\i\theta/2} \\
&=& e^{-\i\theta/2} \u_\perp e^{\i\theta/2} + e^{-\i\theta/2} \u_\parallel e^{\i\theta/2} \\
&=& e^{-\i\theta/2} \u e^{\i\theta/2}
\end{array}
\]
where we have used $\u_\perp \i = \i \u_\perp$ and
$\u_\parallel \i = - \i \u_\parallel$, identities that can be verified by
setting $\i = \e\f$ for orthonormal vectors $\e,\f$ and a touch of
algebra.
In 3D, choose a unit normal $\n$ to $\i$ so that $\e\f\n =\vv{I}$. Then we can
phrase a rotation of a vector $\u$ about the axis $\n$ by the angle $\theta$
as:
\[
e^{-\n\vv{I}\theta/2} \u e^{\n\vv{I}\theta/2}
\]
This formula implies Euler's rotation theorem, as composing two rotations
results in an expression of the same form, but our argument has gaps since we
skipped some proofs.
The multivector $e^{-\n\vv{I}\theta/2}$ is sometimes called a 'rotor'.
== Problem solved ==
We now solve our opening problem with a minimum of paperwork.
Let $\newcommand{\x}{\vv{x}} \newcommand{\y}{\vv{y}} \newcommand{\z}{\vv{z}}
\x, \y, \z$ be an orthonormal basis of $\mathbb{R}^3$,
so $\vv{I} = \x\y\z$. We focus on the right half of the rotation formula.
For a rotation of 90 degrees around the $x$-axis followed by a rotation of
90 degrees around the $z$-axis, it reads:
\[
\begin{array}{ccc}
e^{\x\vv{I}45^\circ} e^{\z\vv{I}45^\circ}
&=& (\cos 45^\circ + \y\z \sin 45^\circ)(\cos 45^\circ + \x\y \sin 45^\circ) \\
&=& \frac{1}{2} + \frac{\y\z + \x\y - \x\z}{2} \\
&=& \cos 60^\circ + c(\x + \z + \y)\vv{I} \sin 60^\circ
\end{array}
\]
for some constant $c \gt 0$. Therefore the combined rotation is 120 degrees
around the axis $\x + \y + \z$, or in Cartesian coordinates, the axis through
the origin and (1, 1, 1).
We can easily write code to compute the axis and angle of the composition of
any two 3D rotations with axes through the origin, and use it to check our
work:
\begin{code}
rotr :: [Double] -> Double -> Multivector
rotr axis theta = canon [("", c), ("yz", nx*s), ("zx", ny*s), ("xy", nz*s)]
  where
  c = cos (theta / 2)
  s = sin (theta / 2)
  [nx, ny, nz] = map (/ sqrt (sum $ map (^2) axis)) axis

m!s = M.findWithDefault 0 s m

axisAngle :: Multivector -> ([Double], Double)
axisAngle m = ([m!"yz", -(m!"xz"), m!"xy"], acos (m!"") * 2)

prop_rotationExample =
  (theta * 180 / pi) ~== 120 && all (~== 1) (map (/ head axis) axis)
  where
  a ~== b = abs (a - b) < 10**(-6)
  x90 = rotr [1, 0, 0] $ pi/2
  z90 = rotr [0, 0, 1] $ pi/2
  (axis, theta) = axisAngle $ gMul x90 z90
\end{code}
We can animate these rotations on this very webpage using these geometric
algebra routines by generating JavaScript from
link:garot.hs[Haskell code that draws cubes].
We've barely scratched the surface.
With
link:hga.html[the geometric algebra homogeneous model of $\mathbb{R}^n$],
we get:
* 4x4 matrices for manipulating points in 3D, just like with vector algebra: rotations, translations, perspective projection, and so on.
* Simple formulas for subspaces (lines, planes, and beyond). For example, a line is defined by two points $p, q$, so its equation is $p \wedge q$; a plane is defined by three points $p, q, r$, so its equation is $p \wedge q \wedge r$.
* Simple formulas for transforming subspaces, aka outermorphisms:
$f(p \wedge q) = f(p) \wedge f(q)$.
* Simple formulas for finding the intersection of subspaces: $a^* \cdot b$.