Computing Pi in C

Dik T. Winter wrote a 160-byte C program to compute the first 800 digits of pi. We analyze his code here. The original code

int a=10000,b,c=2800,d,e,f[2801],g;main(){for(;b-c;)f[b++]=a/5;
for(;d=0,g=c*2;c-=14,printf("%.4d",e+d/a),e=d%a)for(b=c;d+=f[b]*a,
f[b]=d%--g,d/=g--,--b;d*=b);}

can be rewritten as:

#include <stdio.h>

int main() {
    int r[2800 + 1];
    int i, k;
    int b, d;
    int c = 0;

    for (i = 0; i < 2800; i++) {
	r[i] = 2000;
    }
    r[i] = 0;

    for (k = 2800; k > 0; k -= 14) {
	d = 0;

	i = k;
	for (;;) {
	    d += r[i] * 10000;
	    b = 2 * i - 1;

	    r[i] = d % b;
	    d /= b;
	    i--;
	    if (i == 0) break;
	    d *= i;
	}
	printf("%.4d", c + d / 10000);
	c = d % 10000;
    }

    return 0;
}

Then we see that during the first iteration of the \(k\) loop, the code is essentially computing the digits of

\[ 2 \left( 1 + \frac{1}{3}\left( 1+\frac{2}{5} \left( 1 + ... \left( 1 + \frac{2799}{2\cdot 2799 + 1}(1 + 0)\right) ...\right) \right) \right) \]

By Beeler et al. 1972, Item 120, this is an approximation of \(\pi\).

Each term in the approximation gives an additional bit of precision (see above link) thus 14 terms give 4 decimal digits of precision each time (since \(2^{14} > 10^4\)). This is why \(2800 = 14 \times 200\) terms are used.

(An earlier translation neglected to initialize r[2800] to zero, so it differed to the original program, but it turns out the bounds are sufficiently loose that it worked anyway.)

To get a better idea of what the program does, let us write

\[ \frac{a_0}{b_0} = 1, \frac{a_1}{b_1} = \frac{1}{3}, \frac{a_2}{b_2} = \frac{2}{5}, ... \]

where each \(a_i, b_i\) are coprime positive integers.

Set \(n = 2799\). Then our goal is to compute the digits of

\[ P = 2 {(1 + \frac{a_1}{b_1}(1+\frac{a_2}{b_2}(1 + ...(1 + \frac{a_n}{b_n}(1))...)} \]

Let \(x = 2 \cdot 10^7\). In the first iteration, we compute

\[ P_0 = x + a_1 {\lfloor \frac{1}{b_1} {( x + ... a_{n-1} {\lfloor \frac{1}{b_{n-1}} {(x + a_n { \lfloor \frac{1}{b_n} x \rfloor } )} \rfloor} ... )} \rfloor} \]

Let \(x = q_n b_n + r_n\) and \(q_i b_i + r_i = x + q_{i+1}b_{i+1}\) for \(i < n\) where the \(q_i, r_i\) are positive integers. Then \(P_0 = q_0\), and we have

\[ 10^7 \cdot P = q_0 + r_0 + \frac{a_1}{b_1}{(r_1 + \frac{a_2}{b_2}{(r_2 + ...(r_{n-1} + \frac{a_n}{b_n}{(r_n)} )} ... )} .\]

The program then prints out \(\lfloor q_0 / 10^4 \rfloor\), which should be the first four digits of an eight-digit number.

The first four digits are lopped off by reducing modulo \(10^4\). Then on the next iteration, we roughly compute \(10^4\) times the error term

\[ r_0 + \frac{a_1}{b_1}{(r_1 + \frac{a_2}{b_2}{(r_2 + ...(r_{n-1} + \frac{a_n}{b_n}{(r_n)} )} ... )} \]

add any possible carry to the unprinted digits from the last iteration, and print the first four digits. This process is repeated until 200 digits are being printed. At each step, we can forget about 14 terms because from the above bounds we know they cannot possibly affect any other digits.

However there is one thing I don’t understand. Why is it guaranteed only four digits will be printed each time? For example if we add a carry to 9999 then it will take five digits. In fact, 9999 appears in the first 200 digits, but fortunately there is no carry immediately after that.


Ben Lynn blynn@cs.stanford.edu 💡