= 34905295108476509491478496199038 98133417764638493387843990820577
times
32769132993266709549961988190834 461413177642967992942539798288533.
Some have called it the largest single computation in history. What's the big deal, how was it done, and how was WU involved?
The group was led by Arjen Lenstra, a computer scientist and factoring specialist at Bellcore in New Jersey, and included Paul Leyland from Oxford (England) and students from MIT and Iowa State. They used an advanced factoring method called the Multiple Polynomial Quadratic Sieve and used the internet to solicit the help of about 600 volunteers and their computers from around the world. Some of Willamette's math faculty and their NeXT desktop machines participated during off-hours. The project took eight months and the equivalent of approximately 750 ten-MIPS computers. A $100 prize given by RSA was donated to the Free Software Foundation.
RSA stands for Ronald Rivest of MIT, Adi Shamir of the Weizmann Institute of Science, Israel, and Leonard Adleman of USC, who in 1977 announced a new cryptographic scheme that became know as the RSA public-key system. The RSA system is based on pairs of large primes (50-100 digits or more). It is relatively easy (for computers) to multiply such numbers, but very hard to find the product's factors if they are not known. Cracking the system would require this factorization, so one's secrets would be safe, said RSA.
To support their claim that such factorization was hard, RSA published several numbers in Scientific American, each of which were the products of two large primes, and offered token prizes for their factorization. One was RSA-129, the number now factored. Using factoring methods and computer technology available in 1977, factoring RSA-129 would have required 40 quadrillion years, RSA claimed. Lenstra's team used vastly superior mathematical methods and considerably more powerful computers.
With a public-key cryptographic system, each user maintains both a public key which s/he widely disseminates, and a private key. Encryption is easy with the public key, but decryption is very hard without the private key. If "Alice" wants to send private e-mail to "Bob", she encrypts using Bob's public key. "Charles" would use the same public key to mail to Bob. However, neither Alice nor Charles could decrypt the other's message, even if they eavesdrop, without Bob's private key.
Using the RSA scheme, one's public key is the product of a pair of primes, and the primes themselves are the private key. The mechanism of encrypting and decrypting builds on the unequal difficulty of multiplying vs. factoring. The difficulty of factoring is normally expected to keep the private key private.
It should be noted that by no means is the RSA system broken; this project did the equivalent of finding one person's private key from the public one. A new key is easy to choose, and RSA recommends larger, harder-to-factor keys anyway.
The mathematically interesting thing is how such a "simple" problem like factoring could be computed in parallel, i.e. broken up so that many workstations, acting independently, could each contribute part of a solution.
First, congruences: we say that a is congruent to b mod n, denoted
ab
(mod
n)
if a=b+kn for some integer
k. Think of it as
"clock arithmetic" - on a clock, 12 corresponds in some sense to 0; we
would say 120
(mod 12) and using a 24-hour-time example, 175
(mod 12). It's an interesting exercise to show that many properties of
equality also work for congruences, such as the ability to multiply both
sides by equal or congruent things, etc.
Now suppose n is a large positive integer we wish to factor. If we can find integers x and y such that x²y² (mod n) then x²-y²=kn for some integer k, so (x+y)(x-y)=kn. There's a chance that (x+y) and (x-y) might split the non-trivial factors of n. On the other hand, we might be unlucky; (x+y) and (x-y) could be k and n, respectively. (The RSA-129 factoring project was "unlucky" on its first three tries; see exercise #1 below.) If we find several such x,y pairs, we would expect eventually to get both (x+y) and (x-y) to share a non-trivial factor with n. Then using Euclid's (relatively fast) g.c.d. algorithm, g.c.d.(x+y,n) would give us one factor of n, and division would give the other.
So how do we find such x,y pairs? Again, suppose n is a large given integer. Let's use n=4633 as a "toy" example. First, choose a "small" set of primes (small is relative; the RSA-129 project used a list of about 500,000 primes). We choose {2,3,5}. Next, for each integer t "close" to n, try to factor t²-n into primes using only our small list of primes. This is supposed to be much easier than factoring the larger n. If it isn't possible, just go to the next t. For our example, n is approximately 68.07, we try t's from 38 to 98, and we get (mod n, i.e. mod 4633):
These lines are called relations, and finding them is the part that is easily run on separate computers. One computer could have tried factoring t²-n for t=58 to 68, another for t=78 to 88, etc. For RSA- 129, a very large number of such smaller factorizations were attempted, and 569,466 relations were finally found. Note that the Multiple Polynomial Quadratic Sieve was used, i.e. more quadratic polynomials than just x² were used; there were also other "tweaks" of the method to get a bit more efficiency.
Now one needs to pick several relations so that upon multiplying the right-hand sides, all the exponents come out even. We choose relations 3,4, and 6 to get 68²·69²·96²(-1)²·2^{8}·3²·5². The left side is (68·69·96)² and the right side is (2^{4}·3·5)². This gives our x, i.e. 68·69·96, and y=2^{4}·3·5, so that x²y² (mod n) Then (x+y) and (x-y) are 450672 and 450192 respectively. and g.c.d.(x+y,n)=41, so 41 is a common factor of x+y and n, and hence a non-trivial factor of n. This gives 4633=41×113. On the RSA-129 project, the relations sent in by participants were checked and then combined on a supercomputer, using matrix row-reduction methods.