Dangers in key reuse

These are my lecture notes from a presentation given for the University of Utah GSAC Colloquium.

Introduction

This presentation is about a common theme: it is not uncommon for key-reuse to undermine the security of otherwise trustworthy systems. Here we’ll take a look at two examples of this in action: key-reuse in traditional one-time pads and WEP.

Caesar ciphers

Let’s start by looking at a really simple cipher. Write down the alphabet, and underneath it write down the alphabet again but shift it by a few spaces. This is called a Caesar cipher, so named because supposedly Julius Caesar used this method for communicating with his military. Here’s an example of what such a scheme looks like:

 a b c d e f g h i j k l m n o p q r s t u v w x y z f g h i j k l m n o p q r s t u v w x y z a b c d e

We call such a table a key.

Here’s how the cipher words: start by writing down the sentence you want to encrypt. For each letter, look find the letter in the top row, and replace it with the letter directly beneath it. In this example, “a” gets replaced with “f”, “b” gets replaced with “g”, and so forth. Very simple. Decryption takes place by reversing this process: look up the encrypted letter in the bottom row, and replace it with the letter above. So “f” decrypts to “a”, and so on.

This is a natural first stab at encrypting a message, but it doesn’t work very well: the problem is that there are only 26 different keys (since the bottom row is just the alphabet shifted over, and there are only 26 unique ways to do this). An attacker could easily decrypt your message using each of the 26 possible keys, and it’s pretty likely that only one of these attempts will yield a result that doesn’t look like gibberish.

Substitution ciphers

So at this point we generalize: we now make a table by randomly permuting the alphabet, rather than just shifting it. Such a scheme is called a substitution cipher. For instance, we might come up with the following new table:

 a b c d e f g h i j k l m n o p q r s t u v w x y z g f i j w r l o c p h q s t u v m x n y k a b d z e

Once more, such a table is called a key.

There are now 26! = 403,291,461,126,605,635,584,000,000 possible keys, corresponding to the number of ways we can permute the alphabet. It is no longer reasonable for an attacker to simply try every possible table.

For a good while this scheme was assumed to be secure; all of this ended, however, with the work of William Friedman, who made an important observation: in most alphabets, some letters are used much more frequently than others. In a publication titled The Index of Coincidence and Its Applications in Cryptanalysis (which remained classified for several decades), Friedman showed that the non-uniform frequency of letters in an alphabet can be used to break substitution ciphers.

It is completely practical to try this out for yourself: here is a table showing the relative frequencies with which English letters are used in “generic” text. Have a friend encrypt a message using a random permutation of the alphabet (the longer the message, the easier the attack will be). Count how many times each letter appears in the encrypted version of the text; odds are good that the most common letters are e, a, o, and t. Use this to make some guesses and see how far you get. You stand a very good chance of breaking the message with a very few number of tries!

The basic problem with a substitution cipher is that, while it hides which letters are in the message, it doesn’t hide the distribution of the letters. To avoid this problem, we generalize our method even further.

Instead of producing a single substitution table, we are now going to produce many, say $n$. We will encrypt the first letter using the first table, the second letter with the second table, and so forth. When it’s time to encrypt the $n+1$-th letter, we will go back to using the first table. In this cipher, our key is just the collection of substitution ciphers.

As we increase $n$, we do a better job of obfuscating the underlying distribution of letters. Ideally, we take $n$ to be equal to the length of the message, so that each letter in the message is being encrypted with a different substitution cipher. If we use a different key for each message we sent, then we are using a system called a one-time pad.

One-time pads are as good as it gets, in the sense that no brute-force method will work against them. To see why, consider the cipher text “fjslfjsldmvpk.” Each letter is encrypted individually, so we don’t know if the correct plaintext is “drop the bomb” or “don’t bomb it.” After all, for both possible plaintexts there exists a choice of tables which leads to the ciphertext written above!

Of course, one-time pads aren’t perfect:

• Each message needs to be encrypted using a different key.
• For two parties to communicate, they need to agree on which keys to use in advance.
• Since each key consists of a choice of permutation for each letter in the message, there’s a total of $(26!)^n$ possible keys. With such a large keyspace, it’s difficult to succinctly describe a key.

To address the last point, people generally do not consider arbitrary permutations, but rather a subgroup of the set of possible permutations which can be succinctly described. Here’s how it works: first, map the alphabet to a group $G$. One obvious choice is to just map to $\mathbb{Z}/26\mathbb{Z}$ (though in practice no one uses this group; these days encryption is done by computers and is used on data other than English letters, so other groups get used instead). Encode the message as a tuple in the product group $G^n$. As a key, pick a random tuple in $k \in G^n$. To encrypt a message $m$, simply compute $m + k$. To decrypt, compute $(m+k) - k$.

As it happens, $\mathbb{Z}/26\mathbb{Z}$ is a pretty uncommon choice of group. In digital communications, it is much more common for people to represent their characters in Unicode (say, UTF-8), then choose their group to be $(\mathbb{Z}/2\mathbb{Z})^8$. This is a good choice for digital systems, since the group operation corresponds precisely to bitwise-XOR, which can be implemented inexpensively, and which has good performance characteristics.

Lurking inside this choice is a subtle danger, though. Note that, if we encode our message as an element of $((\mathbb{Z}/2\mathbb{Z})^8)^n \cong (\mathbb{Z}/2\mathbb{Z})^{8n}$, then every element of our group satisfies $2g = 0$. This is ultimately the reason why key reuse is a bad thing: if two messages $m_1,m_2$ are transmitted with the same key $k$, then an adversary is able to compute

$(m_1 + k) + (m_2 + k) = m_1 + m_2 + k + k = m_1+m_2 + 2k = m_1+m_2$.

While this doesn’t give them the plaintext, it does leak information: since messages tend to have non-random structure, the sum of two messages also tends to have non-random structure. With enough messages transmitted using the same key, the amount of information leaked increases until finally all of the messages (which were encrypted using this key) can be decrypted with high probability.

Here’s an example of this in action. I went online and downloaded a list of all 4-letter English words from a website for Scrabble players. I encoded each word in UTF-8, then computed the pairwise XOR of all of the words. Here’s a graph showing the distribution of the result:

Distribution of pairwise XORs of 4-letter English words

In an attack scenario, an adversary captures several messages which have been encrypted with the same key (how they know this is usually dependent on external factors). They compute the pairwise-XORs of the intercepted messages. They then use information similar to the graph above to develop a hypothesis for what the plaintext could be which explains the observed sums. There are several methods for doing this, each of which relies more or less on a clever application of Bayes’ theorem. For a trained statistician, this is not a difficult task.

So in general, key reuse is a bad thing with one-time pad systems. However, it’s logistically impractical for high-use channels to rely on a pre-arranged collection of unique keys; after all, high-use means lots of keys will be needed.

Key generation

To address this, several schemes have been cooked up for generating suitable keys on an as-needed basis. The idea is that both ends of the channel can use the same key generation process, and provided they stay in sync with one another, they should be fine.

Formally, we consider a function $f(p,n) : G^n \times \mathbb{N} \to G^n$. We will use this function to generate keys. At some point in time, two parties who with to communicate will agree on some $p \in G^n$, which they will keep secret. For the first transmitted message, both parties use the value $f(p,1)$ as the key. For the second transmitted message, both parties use the value $f(p,2)$ as the key. And so on.

Ideally the function $f(p,n)$ should satisfy a few properties:

1. For all $p \in G$, the long-term behavior of $f(p,n)$ should be complete erratic; that is, there should be no long-term stability in any sense of the word.
2. For all $p \in G$, even with total knowledge of the values of $f(p,n)$ for all $n \leq N$, it should be impractical to predict the value of $f(p,N+1)$.

The development of such a function $f$ is an active research topic in cryptographic circles, since such functions — when well designed — mitigate the logistical mess that is key-distribution.

The RC4 Steam Cipher

One notable example of such a scheme is the RC4 steam cipher. RC4 comes with two algorithms: one for initializing the cipher (the key scheduling algorithm, or KSA) and one for generating data that can be used as a key (the psudo-random generation algorithm, or PRGA). The output of the PRGA is called a key stream. RC4 is easy to use:

1. Initialize it with a secret key that the concerned parties have agreed upon in advance.
2. When you want to encrypt a message consisting of $n$ bytes of data, simply iterate the PRGA $n$ times, and XOR your message with the key that was produced.

RC4 is stateful, in that systems which use it must keep track of data over time. In particular, the scheme uses an array of 256 entries as the basis for generating a key stream. Every time you invoke the PRGA, its output is a function of this array; as the PRGA is used, the array is modified in anticipation of the next invocation of the PRGA.

As of 8 December 2008, the Wikipedia entry on RC4 has a working Python implementation of the RC4 stream cipher, together with some sample input-output data. For those so inclined, I’ve produced a basic implementation in Haskell (I make no claims to its efficiency or cleverness):

import Foreign.C.String (castCharToCChar, castCCharToChar)
import Foreign.C.Types (CChar)
import Data.Array.IArray
import Numeric (showHex)
import Data.Bits (xor)

swap a i j = let ai = a ! i
aj = a ! j
in a // [(i, aj), (j, ai)]

n = 256       -- we do our operations mod n
st_len = 256  -- length of internal state

data RC4State = RC4State { s :: Array Int Int, i :: Int, j :: Int }
deriving Show
type RC4Key   = Array Int Int

ksa :: RC4Key -> RC4State
ksa k =
let key_len = snd (bounds k) + 1
initial = listArray (0, st_len-1) [0 .. st_len-1]
perm (j', s) i = let j = (j' + (s ! i) + (k ! (i mod key_len))) mod n
in (j, swap s i j)
in RC4State { s = snd \$ foldl (perm) (0, initial) [0 .. st_len - 1],
i = 0, j = 0 }

rpga :: State RC4State Int
rpga = do st <- get
let i' = ((i st) + 1) mod n
j' = ((j st) + ((s st) ! i')) mod n
s' = swap (s st) i' j'
put st{ s = s', i = i', j = j'}
return (s' ! (((s' ! i') + (s' ! j')) mod n))

encrypt :: [Int] -> State RC4State [Int]
encrypt [] = return []
encrypt (d:ds) = do k  <- rpga
rs <- encrypt ds
return ((d xor k) : rs)

toKey :: String -> RC4Key
toKey key = listArray (0, (length key) - 1)
(map (fromIntegral . castCharToCChar) key)

rc4 :: String -> String -> String
rc4 key dta = let k = toKey key
dta' = map (fromIntegral . castCharToCChar) dta
crypd = evalState (encrypt dta') (ksa k)
in foldr (\b -> \a -> showHex b a) "" crypd


The RC4 stream cipher is used as the underlying cryptographic primitive in many important protocols, including WEP and Microsoft’s Remote Desktop Protocol.

RC4 in WEP

For the remainder of this paper we will be interested in how RC4 is used in WEP, so that we can discuss current attacks.

The WEP protocol is standardized in IEEE 802.11i. It specifies that a network is protected by a single secret key, called a root key, though in practice WEP devices generally support as many as four keys to be used (though this capability is infrequently utilized). Networks which are protected by WEP advertise themselves as such.

The encryption process

In a WEP-enabled network, not every frame is encrypted. While data-frames (the frames which contain user data) are encrypted, beacon frames (used to announce the presence of a network) and acknowledgment frames (used for quality of service) are not.

For frames which are to be encrypted, the following process is used:

1. The transmitting station picks a 24-bit value called an initialization vector (IV). The standard does not specify how this value should be selected; in most implementations, this value is either chosen sequentially or is generated by a psudo-random number generator.
2. The IV is prepended to the root key. The result is called a per-packet key.
3. A CRC-32 checksum of the payload is computed and appended to the end of the payload. The checksum is called an integrity check value (ICV).
4. The per-packet key is used to initialize the RC4 stream cipher, which is then used to generate a key stream long enough to encrypt the payload and the integrity check value.
5. The payload and the integrity check value are encrypted using the key stream generated in the previous step. The encryption is just bitwise XOR, as usual.
6. The ciphertext, the initialization vector, and other header fields are assembled into a frame and transmitted.

Notice that the root key remains fixed, and the initialization vector ranges over all possible 24-bit values. This means that there are $2^24$ different per-packet keys that can be used; while this number may seem large, for data networks it really isn’t. Key reuse in inevitable.

Also notice that the integrity check value (the CRC-32 checksum) is contained in the cipher text. This fact will later make a serious contribution to the weakening of WEP as an encryption service.

Some caveats

There are two issues that we should now address:

The first is the matter of authentication. Before an access point will communicate with a client, the client must associate itself to the access point. When the access point is using WEP, there are two ways this can happen:

1. The client requests to join the network and the access point simply says “yes.” In this mode, there is no attempt to authenticate that the client has permission to use the network.
2. The client requests to join the network and the access point responds with a challenge:
1. The access point sends a random number $r$ to the client. This number is sent unencrypted.
2. The client uses the root key to encrypt the random number, and transmits the encrypted version back to the access point.
3. The access point uses the root key to encrypt the random number. If the result matches what it received from the client, then the client is permitted to join the network.

Later we will describe why it is beneficial for an attack to be associated with an access point. For the time being, we make a single observation: even when the access point is configured to challenge clients, an attacker can still associate himself to the access point (provided the attacker has witnessed a legitimate client associate itself to the access point).

Here is how the attack works:

1. A legitimate client authenticates itself to the access point. The attacker monitors the transaction and records the random number $r$ as well as the encrypted version of this number, which we denote by $r+k$. The attacker also notes the initialization vector used by the client for the encrypted transmission, which is data that is transmitted but not encrypted.
2. A priori, the attacker does not know the value $k$. However, since the encryption mechanism is just addition in the group $(\mathbb{Z}/2\mathbb{Z})^n$, the attacker knows that $2r=0$, so that $r + (r+k) = 2r + k = k$. In this way, the attacker recovers $k$.
3. The attacker transmits a request to associate to the access point. The access point responds by sending a new random number $r'$ to the attacker.
4. The attacker knows the value of $k$. The attacker crafts a response using the same initialization vector as the legitimate client, and transmits $r' + k$.
5. The access point performs the same computation, sees that the results match, and allows the attacker to associate.

Notice that this is another situation where allowing key reuse as undermined the security of an otherwise safe mechanism.

Incidentally, an attacker does not necessarily need to wait for a legitimate user to attempt to associate to the access point. If the attacker notices that a user is already associated (something the attacker can determine by reading the unencrypted portions of the frames), then the attacker can forge a message to the client telling them to re-associate! Indeed, 802.11 allows for access points to force users to re-authenticate themselves, and the frame used for doing this is not encrypted (and can therefore by transmitted by an attacker).

So why would an attacker want to be associated to the access point? At first glance it is not obvious; after all, without knowing the root key, the attacker can’t use any services that require encryption of frames.

As has already been hinted, however, there are many services which don’t require the use of encrypted frames. Some of these services are essential for carrying out attacks against WEP!

This brings us to our second caveat. Recall that data frames include a CRC-32 checksum, and that this checksum is encrypted along with the payload. For many of the attacks against WEP, an attacker will want to know whether or not a given packet’s checksum is valid or invalid. The trouble is that the attacker cannot decrypt the packet and check for himself; however, if the attacker is associated with the access point, then they can use the access point as an “oracle” to perform the check for him.

There are two common ways for this to be done:

1. The attacker could have two stations associate to the access point and attempt to send the encrypted frame from one station to the other via the access point. This does not require much trickery: after all, the source and destination data in a data frame is not encrypted. As it happens, access points refuse to pass along data frames whose checksum is not correct. Thus, if the attacker notices that the frame never arrives, then the attacker can conclude that the checksum was incorrect (and vise versa).
2. The attacker could associate to the access point and attempt to send the frame to a station that does not exist. If the checksum is correct, the access point will send an error to the sender; if the checksum is in error, the access point will simply ignore the frame.

So the ability to associate to an access point gives an attacker the ability to check whether or not the checksum of a frame is correct, even though the attacker is unable to decrypt the frame himself.

Weaknesses in WEP
FMS attack

In 2001, Scott R. Fluhrer, Itsik Mantin and Adi Shamir published Weaknesses in the Key Scheduling Algorithm of RC4, a paper which described a key recovery attack against certain applications of RC4 — including WEP.

Indeed, this sort of thing happens from time to time: there are many situations where a cryptographic system is secure “on the blackboard” but less secure when put into practice; sometimes the circumstances surrounding the application of a cryptosystem undermine the system’s security. This is now known to be the case with RC4.

The attack described in the paper is now generally known as the FMS attack. While the original paper did not describe how to weaponize the attack against WEP, it did not take long for such details to be worked out. An often-cited guide to the practical application of the attack was given by David Hulton (a.k.a. h1kari) in Practical Exploitation of RC4 Weaknesses in WEP Environments, though the first implementation of the attack is due to Adam Stubblefield, who detailed his implementation with John Ioannidis, John Ioannidis, Aviel D. Rubin, and Aviel D. Rubin in their paper Using the Fluhrer, Mantin, and Shamir Attack to Break WEP.

Recall from earlier that an attacker is able to use the access point to check whether or not a given frame’s checksum is correct. With this capability, the consequences of the FMS attack can be summarized as follows:

An attacker can recover the per-packet key of a frame with a success probability 50% with about 9,000,000 queries to the access point and negligible computational effort.

Since the per-packet key contains the shared root key, this amounts to a full key recovery attack against WEP.

In practice, the success of the FMS attack tends to be pretty low. Though we won’t get into the details of how the attack works here (see the references), it requires a considerable amount of traffic to be captured by the attacker prior to carrying out the attack. Still, the existence of such an attack served to encourage others to look for more efficient methods.

KoreK attack

In 2004, KoreK posted an attack to the netstumbler forum which significantly improved upon the performance of the FMS attack. The consequences of the attack can be summarized as follows:

An attacker can recover the per-packet key of a frame with a success probability 50% with about 700,000 queries to the access point and negligible computational effort.

Anecdotal evidence suggests that the KoreK attack is significantly more practical than FMS, and is generally considered to be a reliable means of recovering the shared root key.

Klein’s attack

In 2005, Andreas Klein described two attacks against RC4 which hold in WEP-line environments. His work gives the following:

An attacker can recover the per-packet key of a frame with a success probability 50% with about 43,000 queries to the access point and negligible computational effort. Success of 95% can be achieved with 70,000 queries.

PTW attack

In 2007, Pyshkin, Tews and Weinmann improved the performance of the Klein attack to what stands today as the most efficient method for recovering the root key in WEP systems. Their work gives the following:

An attacker can recover the per-packet key of a frame with a success probability 50% with about 35,000 queries to the access point and negligible computational effort. Success of 95% can be achieved with 55,000 queries.

Today it is generally considered a “sure thing” that the PTW attack will recover the root key from a target network.

Remarks

While we haven’t gotten into the details of how any of these attacks work (we leave that for the references), we have conveyed that these attacks are practical, and have hinted that they are ultimately enabled by key reuse in WEP.

Practical matters

Today, the most sophisticated toolkit for attacking wireless networks is undoubtedly the aircrack-ng suite, which is able to perform all of the wireless attacks we have described, as well as many others.

The software alone is not enough for carrying out the attacks, though. Additionally, attackers must also have wireless network adapters which permit certain “unusual” operations (such as sniffing and injecting traffic). Knowledge of which wireless adapters are capable of these operations is mostly lore, although a great deal of information can be found at the madwifi website. The author has personal experience with using the Ubiquiti Nanostation 2 as a launch point for key recovery attacks against WEP.

Conclusions

This paper has not presented anything new: indeed, after the discovery of the FMS attack, WEP has generally been considered “old and broken.”

Hopefully this paper has, however, provided the reader with some insight into some of the issues associated with WEP. We’ve discussed how key-reuse undermines security, and we’ve also provided a concrete example of how “blackboard security” does not necessarily mean “practical security.”

Readers who are interested in the cryptographic details of the attacks are strongly encouraged to consult the references.