## The Pentium Problem

Late last year [1994] there was a major flap in the media about Intel's Pentium (TM) microprocessor chip. In this article I will try to summarize what the fuss was about and explain some of the mathematics involved.

The Pentium microprocessor is the CPU (central processing unit) for what are now possibly the best-selling personal computers. Unlike previous CPUs that Intel made, the 486DX and Pentium chips include a floating-point unit (FPU), also known as a math coprocessor. Previous Intel CPUs did all their arithmetic using integers; programs that used floating-point numbers (non-integers like 2.5 or 3.14) needed to tell the chip how (for example) to divide them using integer arithmetic. The 486DX and Pentium chips have these instructions built into the chip, in their FPUs. This makes them much faster for intense numerical calculations, but also more complex and more expensive. The problem for Intel is that all Pentiums manufactured until sometime in the fall of 1994 had errors in the on-chip FPU instructions for division. This caused the Pentium's FPU to divide certain floating-point numbers incorrectly.

Many software packages, including many that do use floating-point numbers, don't actually use a computer's FPU. These packages don't show the error. Also, only certain numbers (those whose binary representations show specific bit patterns) divide incorrectly. Consequently many users may never encounter the division error. The most famous example, and the worst of the well-known cases, is 4195835/3145727, discovered by Tim Coe of Vitesse Semiconductor. The correct value is 1.33382 to six significant figures, while the flawed Pentium's FPU computed 1.33374, a relative error of 0.006%. One can easily test a Pentium using Microsoft Windows and this example: use the Windows calculator in scientific mode to divide Coe's numbers, and compare the result to the values above.
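The size of the error in Coe's example can be checked with a few lines of Python. This is only a sketch: the flawed value is the six-figure number quoted above, typed in by hand rather than reproduced, and the variable names are illustrative.

```python
# Coe's example: compare the true quotient with the value quoted for a flawed Pentium.
numerator = 4195835
denominator = 3145727

# Computed on whatever FPU runs this script (assumed to be sound).
correct = numerator / denominator

# Six-figure value quoted in the article for a flawed Pentium; not computed here.
flawed = 1.33374

relative_error = abs(correct - flawed) / correct

print(f"correct quotient = {correct:.5f}")
print(f"relative error   = {relative_error:.4%}")
```

Run on a sound machine, this should agree with the figures above: a quotient of 1.33382 and a relative error of about 0.006%.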

How did all this get into the news? Thomas Nicely is a math professor at Lynchburg College, a Virginia school about the size of Willamette. In the summer and fall of 1994, he was computing the sum of the reciprocals of a large collection of prime numbers on his Pentium-based computer. Checking his computation, he found that the result differed significantly from theoretical values. He got correct results by running the same program on a computer with a 486 CPU, and finally he tracked the error to the Pentium itself. After getting no real response to his initial queries to Intel, and after checking his facts, Nicely posted a general notice on the Internet asking others to confirm his findings. Magazine interviews and ultimately a CNN interview followed.

Intel publicly announced that "an error is only likely to occur [about] once in nine billion random floating point divides", and that "an average spreadsheet user could encounter this subtle flaw once in every 27,000 years of use." Critics noted that while hitting a pair of "bad inputs" was unlikely, the Pentium's output for those inputs was wrong every time. Others suggested that some "bad inputs" might occur with disproportionate frequency in common calculations. Many noted that, without completely repeating massive calculations on other computers, they could never tell whether they had in fact encountered any of the bad inputs. Within a month IBM halted shipments of Pentium-based computers (which made up only a small percentage of IBM's computer production) and announced that "Common spreadsheet programs, recalculating for 15 minutes a day, could produce Pentium-related errors as often as once every 24 days."
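Estimates like these depend heavily on how many divisions a user is assumed to perform. As a back-of-the-envelope sketch, the Python below works out the daily workload implied by Intel's two quoted figures, assuming bad inputs turn up at random. The function names and the heavier workload of one million divides per day are hypothetical, chosen only for illustration; they do not come from Intel or IBM.

```python
DAYS_PER_YEAR = 365.25

def days_between_errors(divides_per_day, divides_per_error):
    """Expected days between wrong results, if bad inputs occur at random."""
    return divides_per_error / divides_per_day

def implied_divides_per_day(divides_per_error, days_between):
    """Daily workload that would produce one error every `days_between` days."""
    return divides_per_error / days_between

# Figures quoted by Intel: one error per nine billion divides, once per 27,000 years.
intel_divides_per_error = 9e9
intel_days = 27_000 * DAYS_PER_YEAR
intel_workload = implied_divides_per_day(intel_divides_per_error, intel_days)
print(f"Intel's figures imply about {intel_workload:.0f} divides per day")

# Hypothetical heavier workload at the same error rate (illustrative number only).
heavy_workload = 1_000_000
years = days_between_errors(heavy_workload, intel_divides_per_error) / DAYS_PER_YEAR
print(f"At {heavy_workload:,} divides per day: one error every {years:.1f} years")
```

Under these assumptions Intel's "average spreadsheet user" performs on the order of a thousand divisions a day, so a user who divides far more often would expect to meet the flaw correspondingly sooner.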

Intel's policy, when it first publicly admitted the problem around November 28, 1994, was to replace Pentium chips only for those who could explain their need for high accuracy in complex calculations. (Being a math professor seemed to help.) Great public outcry ensued, with Intel the butt of many jokes. By late December Intel capitulated and announced a free replacement Pentium for any owner who asked for one.

The mathematical basis for the bug: the built-in divider in the Pentium FPU uses a radix-4 SRT algorithm. The strength of this algorithm is that it can compute two binary digits of a quotient per step, rather than only one per step as in earlier Intel FPUs. The weakness is that the algorithm needs a stored table of values (a "division table", not unlike a "multiplication table"). This table was incorrectly entered into the Pentium FPU: five entries out of about a thousand were omitted. As in normal long division, at each step in dividing m by n, the Pentium looks at the first few digits of n and of the remainder so far. It uses these as the column and row indices into its table to estimate the next digits of the quotient, then multiplies and subtracts to get the next remainder in the usual way. The table includes negative entries that make up for overestimation in previous steps, and of course the Pentium does all of this in binary rather than decimal.
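To make the idea of negative quotient digits concrete, here is a minimal Python sketch of a radix-4 signed-digit division recurrence of the kind SRT dividers use. It is not Intel's implementation. In place of the lookup table it chooses each digit by rounding the exact ratio, which real hardware cannot afford to do; the function name, the use of exact fractions, and the scaling step are all assumptions made for illustration.

```python
from fractions import Fraction

def radix4_signed_digit_divide(m, n, steps=30):
    """Approximate m/n using quotient digits drawn from {-2, -1, 0, 1, 2}.

    Each step yields one radix-4 digit, i.e. two binary digits of quotient.
    Digit selection here is idealized (exact rounding), NOT a table lookup.
    """
    if m < 0 or n <= 0:
        raise ValueError("sketch handles m >= 0 and n > 0 only")
    divisor = Fraction(n)
    remainder = Fraction(m)

    # Pre-scale so the remainder starts within (2/3) * divisor.
    scale = 0
    while remainder > Fraction(2, 3) * divisor:
        remainder /= 4
        scale += 1

    quotient = Fraction(0)
    digits = []
    for j in range(1, steps + 1):
        shifted = 4 * remainder
        # Idealized selection: nearest integer to the true ratio, clamped to -2..2.
        digit = max(-2, min(2, round(shifted / divisor)))
        remainder = shifted - digit * divisor      # multiply and subtract
        # The remainder may be negative; a later negative digit corrects for it.
        assert abs(remainder) <= Fraction(2, 3) * divisor
        quotient += Fraction(digit, 4 ** j)
        digits.append(digit)
    return quotient * 4 ** scale, digits

approx, digits = radix4_signed_digit_divide(4195835, 3145727)
print(float(approx))
print(digits[:10])
```

The assertion inside the loop records the bound that makes the scheme work: as long as each digit is chosen well enough, the remainder never exceeds two thirds of the divisor in magnitude. A table that returns a wrong digit for some combination of leading bits can break that bound, after which later steps may be unable to recover.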

In regular long division, the remainder, once the next digit has been brought down, should always be less than ten times the divisor. Similarly, some combinations of remainder and divisor should never occur in the SRT method. Consequently entries above a certain diagonal line in the table can be omitted. It was some of the Pentium's entries on this line that were incorrectly left set at zero. There is disagreement over why. Intel's official position is that "a script was written to download the entries into a hardware PLA [i.e. division table]. An error was made in this script that resulted in a few lookup entries ... being omitted from the PLA." Others have claimed to know that someone erroneously proved these entries would never be used, so that the entries could be omitted. In either case, erroneous results followed.
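The bound for ordinary long division is easy to check in code. The Python sketch below carries out decimal long division digit by digit and asserts, at every step, that the remainder with the next digit brought down is less than ten times the divisor. The function name is hypothetical, and the example says nothing about the contents of the Pentium's actual table; it only illustrates why some remainder and divisor combinations cannot arise.

```python
def long_division_digits(m, n, places=10):
    """Return the whole part and the first `places` decimal digits of m/n."""
    whole, remainder = divmod(m, n)
    digits = []
    for _ in range(places):
        shifted = remainder * 10          # "bring down" a zero
        # remainder < n, so shifted < 10 * n: larger combinations never occur.
        assert shifted < 10 * n
        digit, remainder = divmod(shifted, n)
        digits.append(digit)
    return whole, digits

whole, digits = long_division_digits(4195835, 3145727, places=5)
print(whole, digits)
```

For Coe's numbers this gives a whole part of 1 followed by the digits 3, 3, 3, 8, 2, matching the correct value 1.33382 quoted earlier.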

References:
Statistical Analysis of Floating Point Flaw in the Pentium (TM) Processor (1994), Sharangpani and Barton, Intel Corporation, Nov. 30, 1994.
The Pentium Papers collection is an archive with many original sources from the principal parties involved.