Proof that Pi Exists

by Ilan Vardi

Problem: Prove that pi exists.

Well, the first problem is understanding what the problem is. Indeed, even the compendium [L. Berggren, J. Borwein, P. Borwein, Pi: A Source Book, Springer Verlag, New York 1997] fails to provide a proof of pi's existence! Basically, you need to figure out what the exact definition of pi is, and then rigorously prove that this defines a unique real number. In effect, this problem is an exercise in mathematical rigor.

Sub Problem 1: What is the correct definition of pi?

Some people try to get around the technical difficulties encountered in the proof below by defining pi in unusual ways (e.g., Apostol in what is therefore his deficient Calculus text defines pi as the area of a circle of radius 1), but you really can't get around the following:

Definition: Pi is the ratio of the circumference of a circle to its diameter.

Now that the definition is settled, it's time to grapple with the more subtle issue which is simply:

Sub Problem 2: Why does the above definition require any kind of proof?

Now is the time to think deep thoughts.

Deep Thought Break.

Well, the first deep point shouldn't be that hard to figure out: The definition of pi is for any circle, so you have to prove the (obvious) fact that the ratio of circumference to diameter is always the same for all circles.

Lemma 1: Define pi(C) to be the ratio of the circumference of circle C to its diameter. Then for any two circles C_1, C_2 one has pi(C_1) = pi(C_2).

That is not too hard to prove, as will be seen below. But before doing that, the time has come to face the more difficult issue, which is that Lemma 1 alone is not sufficient to prove pi's existence (and the statement of Lemma 1 is not even meaningful at this point). Deep thought time again.

Deep Thought Break.

The problem is that the circumference of a circle is not a straight line, and it may be possible that its length is infinite. If that were the case, then pi would certainly not be defined, since it would not be a real number (and Lemma 1 would be meaningless). In other words, one must provide a completely rigorous argument which shows that the circumference of a circle is bounded above.

Lemma 2: Let C be any circle and let pi(C) be defined as above. Then pi(C) <= 4.

Finally, Lemmas 1 and 2 imply the existence of the real number pi, proving the main result.

Proof of Lemma 2: I will give an outline of a proof and state at every point what assumptions I am making and what proofs I am omitting or leaving to the reader. These gaps can be filled in by referring to any geometry text. A complete proof of this result from a set of axioms and postulates (what is the difference between these two?) appears in the book [E. Moise, Elementary Geometry from an Advanced Viewpoint, Addison-Wesley, Reading MA, 1974]. Note that I have not included any diagrams. You will have to use your imagination!

Consider a circle C of radius r (and diameter 2r). Note that a circle is defined to be the set of points equidistant from a point (this common distance is the definition of the radius). Before proving the upper bound, I will first prove the lower bound pi(C) > 3. This will be useful to understand the subtleties involved.

To prove the lower bound, just inscribe a regular hexagon. Each side of the regular hexagon has length r (explain why), so the total perimeter of the hexagon is 6r. But the length of the perimeter of the circle is greater than the perimeter of the hexagon. The reason is that each arc subtended by a side of the hexagon is longer than the side, because it is an axiom that a straight line is the shortest distance between two points.

It follows that

pi(C) = (perimeter of C)/(diameter of C) = (perimeter of C)/(2r) > (6r)/(2r) = 3,

which is the lower bound.
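The hexagon step can be checked numerically. The following is a minimal sketch, not part of the proof: the function name hexagon_vertices and the sample radius are my own choices, and the coordinates assume the circle is centered at the origin. It confirms that every vertex lies on the circle, that every side has length r, and that the perimeter divided by the diameter comes out as 3.

```python
import math

def hexagon_vertices(r):
    """Vertices of a regular hexagon inscribed in the circle of radius r centered at the origin."""
    h = r * math.sqrt(3.0) / 2.0  # height of an equilateral triangle of side r
    return [(r, 0.0), (r / 2, h), (-r / 2, h), (-r, 0.0), (-r / 2, -h), (r / 2, -h)]

r = 2.5  # arbitrary sample radius (an assumption for illustration)
verts = hexagon_vertices(r)

# Length of each side, going around the hexagon and back to the start.
sides = [math.dist(verts[k], verts[(k + 1) % 6]) for k in range(6)]

# Perimeter of the hexagon divided by the diameter of the circle.
ratio = sum(sides) / (2 * r)
print(sides)
print(ratio)
```

Floating point arithmetic only gives these values up to rounding, so this is an illustration of the claim "each side has length r", not an explanation of why it holds.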

That was not too difficult, but showing that pi(C) <= 4 is much harder, as will now be seen. The basic idea is just as simple: you circumscribe a square of side 2r, and since the square has a longer perimeter, the result should follow.

Unfortunately, it is no longer an axiom that the perimeter of the square is longer than the perimeter of the circle. Of course, you can make it an axiom, which is exactly what Archimedes did (the main point is that Archimedes recognized that this needs to be based on some unproved assumption). Archimedes' axiom says [Archimedes, On the Sphere and Cylinder I, Assumption 2, in ``The Works of Archimedes,'' translated by T.L. Heath, Dover, New York, 1953]:

``Of other lines in a plane and having the same extremities, [any two] such are unequal whenever both are concave in the same direction and one of them is either wholly included between the other and the straight line which has the same extremities with it, or is partly included by, and is partly common with, the other; and that [line] which is included is the lesser [of the two].''

This assumption says that if two concave curves are such that one is inside the other except that their ends meet, then the outside one is longer. In fact, it is not that hard to prove Archimedes' axiom.

Self Study Problem: Prove Archimedes' axiom from first principles.

Since the circle and the square are both concave and the four pieces of the square that touch the circle satisfy the assumption, it follows that the square has a longer perimeter than the circle, proving the lemma.

The appeal to a new axiom is not too elegant, and should be avoided if possible. Of course, one can simply prove the axiom in general, but that is a bit of work. One is spared this effort in this case, because the two curves involved in the lemma, the circle and the square, are much simpler than general concave curves, and it is possible in this case to prove Archimedes' axiom directly. Before you can do this, though, you have to define what you mean by length. One definition can go as follows:

Definition. Consider a curve in the plane given by a map f:[0, 1] -> R^2 (explain why this assumption does not really lose any generality). Then the length of the curve from f(0) to f(1) will be the least upper bound of the numbers

|f(a_0) - f(a_1)| + |f(a_1) - f(a_2)| +... + |f(a_{n-1}) - f(a_n)|,

where 0=a_0 < a_1 < a_2 <... < a_n = 1 ranges over all partitions of [0, 1].

In other words, you approximate the curve by little segments, and add up the lengths of all the little segments and look at the values this takes. It is actually a theorem that must be proved that no matter how you decide to partition [0, 1], as long as you have |a_i - a_{i+1}| -> 0 (i.e., your partitions become finer and finer), the sum will approach the same number. Note that this definition requires the least upper bound axiom of the real numbers, i.e., that every bounded set of real numbers has a least upper bound that is a real number.
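Here is a small sketch of this definition in action. The curve (a piece of the parabola y = x^2), the helper names polygonal_length and uniform_partition, and the choice of partitions are all my own assumptions for illustration. The partitions into 2, 4, 8, ... equal pieces are nested, so by the triangle inequality the sums can only go up as the partition is refined, and they stay below a fixed bound.

```python
import math

def polygonal_length(f, partition):
    """Sum of |f(a_i) - f(a_{i+1})| over consecutive points of the partition."""
    pts = [f(a) for a in partition]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def parabola(t):
    """Sample curve f:[0, 1] -> R^2 (an assumption chosen for illustration)."""
    return (t, t * t)

def uniform_partition(n):
    """Partition 0 = a_0 < a_1 < ... < a_n = 1 into n equal pieces."""
    return [k / n for k in range(n + 1)]

# Each partition refines the previous one, since every point k/n is also (2k)/(2n).
lengths = [polygonal_length(parabola, uniform_partition(2 ** j)) for j in range(1, 11)]
for j, s in enumerate(lengths, start=1):
    print(2 ** j, s)
```

The printed sums increase and level off, which is what a least upper bound looks like from below. Of course, a finite list of partitions proves nothing about all partitions.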

In the case of a circle, the definition of length is much simpler:

Definition: The circumference of a circle is the least upper bound of the perimeters of all inscribed polygons.

Given this definition, in order to show that pi(C) <= 4, one has to show that given any partition of the circle, i.e., an inscribed polygon, its perimeter is less than that of the circumscribed square.

The proof of this is based on a simple lemma.

Lemma 3. Consider an isosceles triangle ABC with AB = AC. If AB and AC are extended to AD and AE, then DE >= BC.

Before proving this lemma, let's see how it proves the result. Given an inscribed polygon, take any side meeting the circle at P and at Q. If the circle has center O, then OP = OQ. Now let OP extended meet the square at R and OQ extended meet the square at S; then by the lemma RS >= PQ. Moreover, the part of the square's boundary running from R to S is at least as long as the segment RS, since a straight line is the shortest distance between two points. Doing this for every side of the polygon shows that the perimeter of the polygon is no larger than the perimeter of the square.
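If you want to see the inequality RS >= PQ at work before reading its proof, the following sketch tests it on many pairs of rays. Everything here is an assumption made for illustration: the circle is centered at the origin, the square is axis-aligned, the names on_circle and on_square are mine, and the directions come from a seeded random number generator.

```python
import math
import random

def on_circle(d, r):
    """Point where the ray from the center O = (0, 0) in direction d meets the circle of radius r."""
    norm = math.hypot(d[0], d[1])
    return (r * d[0] / norm, r * d[1] / norm)

def on_square(d, r):
    """Point where the same ray meets the axis-aligned circumscribed square of side 2r."""
    scale = r / max(abs(d[0]), abs(d[1]))
    return (scale * d[0], scale * d[1])

rng = random.Random(0)  # fixed seed so the run is repeatable
r = 1.0
worst = float("inf")  # smallest value of RS - PQ seen so far
for _ in range(10000):
    d1 = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    d2 = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    if math.hypot(*d1) < 1e-6 or math.hypot(*d2) < 1e-6:
        continue  # skip directions too close to zero to normalize safely
    pq = math.dist(on_circle(d1, r), on_circle(d2, r))
    rs = math.dist(on_square(d1, r), on_square(d2, r))
    worst = min(worst, rs - pq)

print(worst)
```

If the lemma is right, the printed number should never be negative beyond rounding error. A random test like this can only fail to find a counterexample; it is not a substitute for the proof.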

It follows that any inscribed polygon has perimeter smaller than the square, so the least upper bound of the perimeters is smaller than or equal to the perimeter of the square which is equal to 8r. Since the least upper bound of the perimeters is equal to the length of the perimeter of C, one has

pi(C) = (perimeter of C)/(diameter of C) = (perimeter of C)/(2r) <= (8r)/(2r) = 4.

Note that this argument only gives pi(C) <= 4; you need to do even more work to show that pi(C) < 4. The main point, however, is that you have at least shown that pi(C) is finite.
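To get a feel for the two bounds together, here is a sketch that computes perimeter/diameter for inscribed regular polygons with 6, 12, 24, ... sides in a circle of radius 1. It is only an illustration under my own assumptions: the least upper bound in the definition ranges over all inscribed polygons, not just these regular ones, and the side-doubling formula used below (two applications of the Pythagorean theorem) and the name doubling_ratios are not taken from the argument above. The formula is chosen so that the code never needs to know the value of pi in advance.

```python
import math

def doubling_ratios(steps):
    """Perimeter/diameter for inscribed regular 6-, 12-, 24-, ... gons in a circle of radius 1."""
    n, side = 6, 1.0  # a regular hexagon inscribed in a circle of radius 1 has side 1
    ratios = []
    for _ in range(steps):
        ratios.append(n * side / 2.0)  # perimeter n * side, diameter 2
        # Side of the 2n-gon from the side of the n-gon, written in a form
        # that avoids subtracting two nearly equal numbers.
        side = side / math.sqrt(2.0 + math.sqrt(4.0 - side * side))
        n *= 2
    return ratios

ratios = doubling_ratios(12)
for value in ratios:
    print(value)
```

The values start at 3, increase at every step, and stay well below 4, which is consistent with the lower and upper bounds proved above.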

Proof of Lemma 3: Assume that AE >= AD (the other case is similar), and let F be on AE such that AF = AD.

The triangles ABC and ADF are similar so it follows that DF >= BC (explain) and it is sufficient to prove that DE >= DF. To do this, note the angle AFD is less than 90 degrees. This is shown by appealing to the elementary theorem that the base angles of an isosceles triangle are equal. Since the angle sum of a triangle is 180 degrees (why?), this implies that the base angles are each less than 90 degrees. Since angle AFD is less than 90 degrees, angle EFD is greater than 90 degrees (why?), so that the other angles in triangle EFD are less than 90 degrees (why?) so that angle EFD is the largest angle in this triangle. The result now follows from

Proposition. If two sides of a triangle are not congruent, then the angles opposite them are not congruent, and the larger side is opposite the larger angle.

Proof: Let the triangle be PQR, where PQ != PR.

Assume that PR > PQ (the other case is similar), and let S be on PR such that PS = PQ. It follows from the isosceles triangle theorem that angle PQS is equal to angle PSQ. Since S is in the interior of PR, it follows that angle PQS is less than angle PQR (though obvious, this must be proved from the axioms). Another elementary fact (though obvious, this must be proved from the axioms) is that S lying in the interior of PR implies that angle PSQ is greater than the angle PRQ. Combining all these facts shows that angle PQR is greater than angle PRQ, proving the proposition. Q.E.D.
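As a cross-check on Lemma 3, here is a different route to the same inequality using the law of cosines. This is not the synthetic argument given above, and it assumes trigonometry that a proof from the axioms would have to develop first. The symbols s, a, b and \theta are my own: s = AB = AC, a = AD >= s, b = AE >= s, and \theta is the angle at A.

```latex
\begin{align*}
DE^2 &= a^2 + b^2 - 2ab\cos\theta \\
     &= (a-b)^2 + 2ab\,(1-\cos\theta) \\
     &\ge 2ab\,(1-\cos\theta) \\
     &\ge 2s^2\,(1-\cos\theta) \qquad (\text{since } a \ge s,\ b \ge s,\ 1-\cos\theta \ge 0) \\
     &= s^2 + s^2 - 2s^2\cos\theta = BC^2 .
\end{align*}
```

Taking square roots gives DE >= BC, in agreement with the lemma.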

Proof of Lemma 1: To show that pi(C) is the same for all circles, one uses the method of exhaustion invented by Eudoxus (ca. 408 B.C.-355 B.C.) and used extensively by Archimedes to prove his results on areas and volumes of circles, spheres, etc. You will note that this method is essentially the method of limits used in calculus (so the ancient Greeks already understood limits).

This method uses a proof by contradiction. So assume that there are two circles C_1 and C_2 such that pi(C_1) != pi(C_2). One can assume that pi(C_1) > pi(C_2) since the other case works the same way.

The first step is to move circle C_1 so it has the same center as circle C_2, i.e., they are concentric. You have to prove that this does not change the length of the perimeter of circle C_1 nor its radius. This is a good exercise in seeing if you have understood the definitions.

Given that you have proved that this does not change pi(C_1), I will arrive at a contradiction. Now, since pi(C_1) > pi(C_2), the definition of least upper bound implies that there is a partition P_1 of the circle C_1 such that the length of P_1, i.e., the perimeter of the inscribed polygon given by P_1, divided by the diameter of C_1 is greater than pi(C_2).

Now consider the partition P_2 of C_2 generated by P_1 as follows: For each point p_1 of P_1 find the corresponding point p_2 on C_2 by drawing a radius through p_1 and letting p_2 be the point on C_2 meeting this radius (on the same side of the common center).

Now for any two adjacent points p_2,p_2' of P_2, consider the ratio |p_2 - p_2'|/r_2, where r_2 is the radius of C_2. By similar triangles and the construction of P_2, this is equal to |p_1 - p_1'|/r_1, where r_1 is the radius of C_1. It follows that

1/r_1 \sum |p_1 - p_1'| = 1/r_2 \sum|p_2 - p_2'|

where the summation is over all adjacent points in each partition. Dividing this by 2 gives

1/d_1 \sum |p_1 - p_1'| = 1/d_2 \sum|p_2 - p_2'|

where d_1,d_2 are the diameters of C_1,C_2, respectively. But the assumption that the partition P_1 gave a ratio greater than pi(C_2) and the above equation leads to

1/d_2 \sum|p_2 - p_2'| > pi(C_2)

which contradicts that pi(C_2) is the least upper bound for such ratios (why is it the least upper bound of such ratios?). The result follows by contradiction.
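The key step, that a partition of one circle and the corresponding partition of a concentric circle give the same ratio, can be illustrated as follows. This is a sketch under my own assumptions: the common center is the origin, the list of ray directions is a made-up partition listed counterclockwise, and the name inscribed_ratio and the two radii are chosen only for the example.

```python
import math

def inscribed_ratio(directions, r):
    """Perimeter/diameter of the polygon whose vertices are where the rays meet the circle of radius r."""
    pts = [(r * x / math.hypot(x, y), r * y / math.hypot(x, y)) for x, y in directions]
    n = len(pts)
    perimeter = sum(math.dist(pts[k], pts[(k + 1) % n]) for k in range(n))
    return perimeter / (2 * r)

# Hypothetical partition: ray directions from the common center, in counterclockwise order.
directions = [(1, 0), (2, 1), (1, 3), (-1, 2), (-3, 1), (-2, -2), (0, -1), (3, -2)]

ratio_small = inscribed_ratio(directions, 1.0)  # plays the role of the partition P_1 of C_1
ratio_large = inscribed_ratio(directions, 7.5)  # the corresponding partition P_2 of C_2
print(ratio_small, ratio_large)
```

The two printed ratios agree up to rounding, which is exactly what the similar triangles argument says must happen for every partition.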

