Integer Overflow

Not quite insightful

The RSA Cryptosystem

with one comment

The RSA Cryptosystem (named for Rivest, Shamir, and Adleman) is a public key cryptosystem — that is, the keys used for encryption and decryption are different, and (in general) it is believed that it is difficult for an attacker to use knowledge of one number to produce the other.

Typically, RSA is used for one-way communication. The receiver produces a large number n, and two numbers e,d (these numbers are chosen in a special way that I’ll soon describe). They publish the numbers e and n, but they keep the number d a secret. The sender uses e and n to encrypt their message; the receiver uses d and n to decrypt the number.

We’re going to take a look at how these numbers are chosen and how the encryption and decryption process works.

Some background

The RSA cryptosystem is based on some elementary facts about prime numbers. We’re going to review those facts right now, just to establish some context.

Definition 1 (Relatively prime). Let a,b be positive integers. We say that a and b are relatively prime (sometimes also called coprime) if there does not exist an integer c > 1 where c divides both a and b.

Intuitively, two positive integers are relatively prime if they don’t have any of the same factors.

Definition 2 (Euler \phi function). Let m be a positive integer. We define \phi(m) to be the number of positive integers less than m that are relatively prime to m.

The \phi function has the following property that motivates our interest in it:

Theorem 1 (Computing \phi(m)). Let m be a positive integer. Factor m as

m = p_1^{k_1} \cdot p_2^{k_2} \cdot \cdots \cdot p_q^{k_q}.

Then

\phi(m) = (p_1 - 1)p_1^{k_1-1}\,(p_2 - 1)p_q^{k_2-1}\,\cdots\,(p_q - 1)p_q^{k_q-1}.

Theorem 1 is extremely useful. If you don’t know how to factor m, then computing \phi(m) basically requires you to test very number less than m to see if it is relatively prime to m. If m is a d digit number (say, written in base 10), there are about 10^d numbers you need to test. That grows pretty quickly! So quickly that it’s effectively impractical if m is a moderately big number.

The amazing power of Theorem 1, though, is that if you can factor m, then you can compute \phi(m) very efficiently. This asymmetry is at the heart of the RSA cryptosystem.

Theorem 2 (Euler’s Theorem). Let a,b be relatively prime positive integers. Then

a^{\phi(m)} \equiv 1 \mod m.

With that background out of the way, we are now ready to describe the RSA cryptosystem.

Choosing n,e,d.

The receiver begins by picking two prime numbers p_1,p_2. These numbers will be treated as secret information; the receiver will never tell anyone what these numbers are. In practice, these are chosen to be very large; heuristically speaking, bigger primes will lead to better security. Then let n = p_1 \cdot p_2. This is how n is chosen. Note that by Theorem 1 the receiver can compute \phi(n) = (p_1 - 1)(p_2 - 1) rather easily, whereas anyone besides the receiver must compute \phi(n) the hard way.

Now choose an integer e with 0 \leq e < n where e and n are relatively prime. The number e and n are now published for the world to see.

The receiver now must produce a number d that has the following relationship:

e\cdot d \equiv 1 \mod \phi(n).

This is easy to do if one uses the Extended Euclidean Algorithm. The number d is kept private, just like p_1 and p_2.

Encrypting and decrypting messages

Now suppose someone wishes to send a secret number P < n to the receiver. The encryption process is simple:

Let C = P^e \mod n.

Now the sender transmits C to the receiver.

To decrypt, the receiver computes

C^d \mod n.

To see why this computation decrypts the message, simply apply Theorem 2. Since we chose e,d such that e\cdot d \equiv 1 \mod \phi(n), we know that there exists some k such that

e\cdot d = k\,\phi(n)+1.

Then by Theorem 2,

C^d \equiv (P^e)^d \equiv P^{ed} \equiv P\cdot P^{k\,\phi(n)} \equiv P\cdot \left(P^{\phi(n)}\right)^k \equiv P \mod n.

Now, I feel as though I need to point out that I cheated a little bit. See, I wanted to apply Theorem 2, but in order to do so I must know that P and n are relatively prime. I can’t actually guarantee that this will always be the case. However, since n is a factor of two primes, I’m only in trouble if P is divisible by one of these numbers. Since this is fairly uncommon, I don’t really worry about it too much. However, when it does happen, we are in a lot of trouble.

The security of the system is largely based on the idea that it is difficult to factor n. Now, it turns out that it’s actually quite easy to determine if a given P and n share a factor; indeed, one needs only compute \gcd(P,n), which can be done efficiently using the Euclidean Algorithm. If they do share a factor, then not only can the sender not use the RSA cryptosystem to send this message using this n, but in fact the sender has figured out the factorization of n. Once someone knows how to factor n, the security of the entire system is compromised.

So when this happens, rather than simply coming up with a different way to encrypt P, it is probably a good idea to generate a whole new n to use for all future communications.

An example

Let’s compute an example of the RSA cryptosystem in action. Let p_1 = 15,485,863 and p_2 = 32,452,843. Then n = 502,560,280,658,509, and \phi(n) = 502,560,232,719,804.

We now need to produce an e that is relatively prime to n. I just picked e = 1234567 and got lucky in that it was already relatively prime to n. If I was unlucky, I would just pick e' = \frac{e}{\gcd(e,n)}.

Finally, it’s time to compute d. Using the Extended Euclidean Algorithm (well actually I cheated and just used a computer algebra system), I find that d = 134,471,225,162,935.

Now suppose someone wants to send me the number P = 11,223,344. They compute and send to me

C = P^e \equiv 17,353,488,832,595 \mod n.

I receive this number, and immediately compute

17,353,488,832,595^d \equiv 11,223,344 \mod n,

which gives me the secret number.

The security of the RSA cryptosystem

If someone wants to attack the cryptosystem, the most obvious scheme would be to find a way to compute d given e and n.

What are the challenges in doing this? The biggest one is in computing \phi(n). You see, when the receiver produces the number d, they do so knowing how to factor n, and are therefore able to compute \phi(n) efficiently using Theorem 1. However, when attacker wishes to compute d, they have no choice by to use the naive algorithm for computing \phi(n). If an attacker were to try to do this, the hope is that it would take so long that, by the time they had succeeded, the contents of the message have become moot.

However, if it were easy to factor the number n, then the RSA cryptosystem would offer no real security. At the time of writing, there is not a general purpose factoring algorithm that is fast enough for this to be a reasonable attack. Still, there are many good factoring algorithms out there that give perfectly feasible attacks against implementations of the RSA cryptosystem that use relatively small prime numbers to produce n. In the articles that follow, we will look at some of these algorithms.

Written by intoverflow

20 February, 2008 at 1:48 pm

One Response

Subscribe to comments with RSS.

  1. [...] cryptosystem. Thus, to kick the whole thing off, I’ve posted an account on the working of the RSA cryptosystem under my new special section on [...]


Leave a Reply