The future of proof

The use of computers means that modern mathematical proofs can run to millions of pages. Such proofs can never be fully verified by humans alone. Does this mean, as some argue, the death of proof?
March 22, 2007

The year: 1611. The person: Johannes Kepler, mathematician and astrologer. The problem: Kepler wanted to give his sponsor, John Matthew Wacker, a new year's present, but was short of cash. The solution: he wrote a book and dedicated it to Wacker. The work was Strena Seu de Nive Sexangula ("On the Six-Cornered Snowflake"); it discussed the puzzling tendency of snowflakes to form beautiful tree-like shapes with approximate sixfold symmetry.

We now know that these shapes—and many others that snowflakes can take—are the result of crystal growth in the presence of varying temperature and humidity, up above in the storm clouds. Molecule by molecule, the snowflake puts itself together, and initially the water molecules join up into a structure resembling a microscopic honeycomb. The snowflake retains its hexagonal symmetry as more molecules attach themselves to this initial seed, but the temperature and humidity fluctuate as it grows, leading to an endless variety of shapes.

In the 17th century, the atomic structure of matter was still philosophical speculation, but in a few pages, and without experimental evidence, Kepler deduced that a snowflake must be made from many identical tiny spherical units, closely packed together. On a plane surface, the densest way to pack identical spheres is in a honeycomb—like the red snooker balls at the start of a frame. But what about in three dimensions? Kepler suggested—though not in this language—that the most efficient packing is the one that greengrocers use to stack oranges.
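The densities involved are worth recording, though neither figure appears in Kepler's essay and the notation is modern. In the plane, the honeycomb arrangement fills a proportion

\[
\frac{\pi}{\sqrt{12}} \approx 0.9069
\]

of the available area, while the greengrocer's three-dimensional stacking fills

\[
\frac{\pi}{\sqrt{18}} \approx 0.7405
\]

of space. The Kepler conjecture asserts that no packing of identical spheres can beat that second figure.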

This may seem obvious, but it took nearly 400 years for someone to put together a logically rigorous proof that Kepler was right. In 1998 Thomas Hales announced such a proof, but there was one snag: it involved a three-gigabyte computer file.

What makes the proof so complicated? It takes a lot of work to rule out vast numbers of irregular packings, showing that they are not as dense as the one that greengrocers use. The wonder is that the question can be reduced to a computation at all; it's no surprise that the answer turns out to be gigantic.

Proofs like Hales's raise a serious philosophical question. Until recently, mathematical proofs could always be verified by any competent human being. It might take weeks to work through a proof, but it could be done, because that's what the originator of the proof did. Computers change the game: in a few seconds they can perform more calculations than an unaided human can in a lifetime. (This does not make mathematicians obsolete, as some assume—the problem still needs to be turned into something that a computer can answer. This step is often at least as difficult and creative as a traditional proof.)

Can something be considered a proof if no human can verify it without a computer? For that matter, what is a proof? And why are proofs needed at all?

The traditional model derives from the writings of Euclid of Alexandria, particularly his famous Elements. For Euclid, it was not enough to provide circumstantial evidence or numerous examples for mathematical statements. Everything had to be proved, with logic. A familiar example is Pythagoras's theorem: the square on the longest side of a right-angled triangle is the sum of the squares on the other two sides. This is proposition 47 of Book I of the Elements, and its proof relies on most of the previous 46.
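In modern algebraic notation, which Euclid himself did not use: if the two shorter sides are a and b and the longest side is c, then

\[
a^2 + b^2 = c^2 .
\]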

Euclid's successors made proof the cornerstone of mathematics. Even the more pragmatic among them refused to accept assertions that lacked proofs. Those of a philosophical bent dug into the subject's logical foundations, and proposed a formal definition that is still taught to most budding maths undergraduates: a "proof" is a series of statements, each following logically from previous statements and a fixed list of axioms (statements agreed as a starting point).

How do we guarantee that an alleged proof has no logical errors or gaps? The time-honoured method is to write it down, so that anyone can check the logic for themselves. In this way, mistakes can be isolated and gaps detected. The formal definition of "proof" places no limit on the number of logical steps involved, but there used to be an unspoken assumption that the verification process could, and should, be carried out by one unaided human brain.

However, a three-gigabyte computer file is equivalent to about 10,000 full-length novels, and it would take about 30 years merely to read it. Making sense of it, and checking each deduction, would take several lifetimes. If computers were infallible, we might be happy to accept the results without question. Then we would just have to check that the computer was obeying the correct instructions, and satisfy ourselves about the logical structure of the computation. Unfortunately, computers can make mistakes. The most famous example is a design flaw in the Pentium chip, discovered in 1994—the chip sometimes got division sums wrong. In practice, such flaws seldom cause serious problems, either because they have no effect or because they lead to such egregious nonsense that it stands out like a sore thumb. But can a computer-aided proof ever be as trustworthy as a traditional pencil-and-paper proof?
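A rough calculation makes the scale clear; the figures below are back-of-envelope assumptions, not measurements. Taking one byte per character and about 300,000 characters in a full-length novel,

\[
\frac{3\times 10^{9}\ \text{bytes}}{3\times 10^{5}\ \text{characters per novel}} = 10{,}000\ \text{novels},
\]

and reading one novel a day, every day, works out at

\[
\frac{10{,}000\ \text{days}}{365\ \text{days per year}} \approx 27\ \text{years}.
\]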

Perhaps not, but humans are far more likely to make undetected errors than computers. If the same program is run on several computers, using different operating systems, and the results agree, then computer errors are unlikely to have affected the result. Moreover, the problem of size is not confined to computer-aided proofs. One of the big breakthroughs of the late 20th century was a complete list of all possible "finite simple groups"—the basic building blocks of symmetry. The proof—the joint efforts of hundreds of mathematicians—occupies around 15,000 pages. Can anyone truly read through all that and swear that every single step is right?

One school of thought considers these problems to be insuperable. Writing in Scientific American in 1993, John Horgan boldly declared the "death of proof." Andrew Wiles's magnificent proof of Fermat's last theorem, obtained a few months earlier without computers, was mentioned under the heading "A splendid anachronism?" Logical proofs, Horgan suggested, might become obsolete, to be replaced by experiments. To prove Pythagoras's theorem, draw a lot of triangles. To prove the Kepler conjecture, play with oranges. However, rejecting a computer-assisted proof because it might have logical gaps, only to replace it with experiments that definitely do have gaps, makes no sense. Indeed, there are plenty of cases where "experimental" evidence points to one conclusion while a proof points to a different one.

Most mathematicians rejected Horgan's article as a selective presentation of a complicated issue. More than a decade later, proof is still thriving. But there is nevertheless a fundamental tension between the philosophical definition of a proof—what mathematicians should do—and what they actually do. Mathematicians are much more interested in solving problems than in philosophising about their methods. They set very high logical standards, but they are also pragmatic. Life is too short to write proofs the way philosophers say we should, and the results would be incomprehensible if we did. It took Bertrand Russell and Alfred North Whitehead several hundred pages to give a formal proof, in their Principia Mathematica, that 2+2=4. The proof is virtually unreadable, and Russell nearly had a nervous breakdown checking it. So the formal definition of proof may be neat and tidy, but it has several defects. It focuses solely on content and ignores context, it fails to distinguish important ideas from trivial ones, and it does not represent what people do in practice.

Suppose you ask for directions to a small village in Devon. What you don't want is a lengthy description of how to get your car out of the garage, reverse into your home street and avoid next door's cat. A list of 10,000 instructions, presented as if they were all equally important, is useless. But some of those instructions could be about things you really do need to know. The 815th instruction, for instance, might take you on to the slip road leading to the M5.

A more useful set of instructions, then, would be something like this: "Get on the M5 going south until you get past Bristol. Take exit 27 on to the A361 heading for Tiverton." When you get close to your destination, however, you really do want the details: "Drive along Old Barn Lane, past the duck pond; then take the first right by the dead oak tree on to the cart track, and watch out for the concrete block filling a pothole." You want the details in places where you are on unfamiliar ground, or where it is easy to take a wrong turn.

Real proofs, written and published by real mathematicians for the benefit of other mathematicians, are like that. "By a straightforward generalisation of the Bloggs-Smith theorem… An obvious calculation now shows that…" A really good proof, like a hassle-free car journey, includes signposts to tell you where you are going next, and why.

An accurate formal description of War and Peace might be "a series of words that follows the rules of grammar." But what matters about any novel is quite different: it has to tell a story. And it's the same for proofs. Yes, a proof is a series of logical deductions, but it is more informative to see it as a story—a story told by mathematicians to mathematicians, in a familiar language. If it contains logical gaps, or errors, the readers can—and eventually will—notice. That's what they have been trained to do.

What working mathematicians really mean by a proof, then, is both more and less than a formal list of logical steps. It is more, because the steps have a narrative structure and are equipped with signposts; it is less because gaps are deliberately left whenever the corresponding logical steps are routine and can be reconstructed by anyone familiar with the topic.

This style of proof is far better suited to human thought patterns than a structureless series of tiny steps, all explicitly cross-referenced to sources that justify them. However, there is a pitfall: steps that in prospect look routine, and are therefore omitted, may conceal unexpected difficulties. Few attempted proofs fall down because some explicit step has been carried out illogically, but a lot fall down because "obvious" steps turn out to be wrong.

For example, it may seem obvious that you can't cut a solid sphere into a finite number of pieces and fit the same pieces back together to make two solid spheres, each the same size and shape as the original. Obviously the total volume of the pieces can't change if they are rearranged. However, this argument tacitly assumes that the volumes of the pieces can sensibly be defined. As it happens, sufficiently complicated pieces need not have sensible volumes at all—and the "impossible" dissection can actually be achieved.

Mathematicians are aware of the danger of omitting allegedly obvious steps, but because the formal style of proof is incomprehensible, they prefer to follow the narrative line. As a result, there are many styles of proof. There is the short, snappy proof, like a poem. There is the proof that carefully assembles key facts, one at a time, and puts them together, like a magazine feature article. There is the 300-page blockbuster that develops entire new theories, and uses them to demolish a major problem—roughly like a novel. And there is the massive, computer-assisted compilation, akin to a telephone directory.

Here's a simple "poetic" proof. Remove two diagonally opposed corner squares of a chessboard, then try to cover all the remaining squares using 31 "dominoes," each the same size and shape as two adjacent squares. It's impossible: however you try to make the pieces fit, you always run into trouble. But how do you prove it? The proof requires one simple idea: every domino must cover one black square and one white square, so 31 dominoes cover exactly 31 black squares and 31 white squares. But the mutilated chessboard has 32 squares of one colour and only 30 of the other. Once you get the clever idea of looking at the colours, this proof is short, sweet and convincing.
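The counting step is easy enough to check mechanically. Here is a minimal sketch in Python; the coordinate scheme, and the choice of which two corners to remove, are illustrative assumptions rather than anything fixed by the puzzle.

# Colour squares by the parity of (row + column); diagonally opposite
# corners such as (0, 0) and (7, 7) then get the same colour.
board = {(r, c) for r in range(8) for c in range(8)} - {(0, 0), (7, 7)}
white = sum(1 for (r, c) in board if (r + c) % 2 == 0)
black = len(board) - white
# Prints "30 32": the counts are unequal, so 31 dominoes, each covering
# one square of each colour, cannot tile the mutilated board.
print(white, black)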

The novel-like proof is familiar to any professional mathematician; most of us have to work through several of these blockbusters during our careers. Quite a few of us write them. My longest is about 60 pages—a novelette. Wiles's proof of Fermat's last theorem is a good example. The story that it tells begins by linking the desired theorem to a different area of mathematics, the theory of "elliptic curves." It then states a new result about such curves which, if proved, would finish the job. Then it develops several key techniques that are needed to prove this new statement (the "Taniyama-Shimura conjecture"). Finally, it brings the new methods to bear on the new statement, and proves that it is true.

Thomas Hales's proof of the Kepler conjecture is mostly of the phone directory type, though it sets up the computer calculation with something closer to a poem. That is, there is a simple, very clever idea, which reduces the problem to a very large list of routine computations; this reduction is "poetic." But the poem has no implications until the calculations are carried out, and that bit is a phone book.

Mathematicians prefer poetry to phone books. The famous—and eccentric—mathematician Paul Erdős described poetic proofs as coming from "the Book," a fanciful image in which God owns a book containing all the best proofs, and mere mortals are occasionally afforded a glimpse of its contents.

Two proofs that have featured prominently in the mainstream press are Wiles's proof of Fermat's last theorem and Grigori Perelman's (still slightly controversial) proof of the Poincaré conjecture. The first states that two perfect nth powers cannot add up to a perfect nth power, for all n greater than or equal to three. The second characterises the three-dimensional analogue of the surface of a sphere in terms of closed curves. Both proofs are of the blockbuster novel type, rich in new mathematical ideas, telling a strong but complex mathematical story.
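In symbols, Fermat's last theorem says that the equation

\[
x^n + y^n = z^n
\]

has no solutions in positive whole numbers x, y and z once n is three or more.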

In most areas of human activity, accomplishing something difficult is seen as a triumph. It's the same in mathematics. Computers open up new realms of discovery, relieving the human brain of routine tasks, leaving it free to concentrate on the big picture. Using computers in proofs—provided it is done carefully—is an intelligent application of technology to extend the powers of the human mind. Poems from the Book will always take pride of place, but in their absence, a telephone directory is entirely acceptable. And if a long, complicated proof is all we can find, it should be celebrated as a major intellectual achievement, not derided because it's difficult.