Integer Overflow

Not quite insightful

Armada on Hackage

with 2 comments

Encouraged by the interest generated by my last post, I decided to do some more hacking on the real-time strategy game, which I’ve named Armada. I’ve now posted it on hackage; it can be found here.

The version on hackage adds quite a bit:

  • Refinery units can now mine ore and transfer it to builder units.
  • Builder units can now actually build units, if they have enough ore to do so.
  • The scene is rendered using OpenGL/GLUT. Units are color-coded by player, with text describing their health and how much ore they have (except for fighters, which don’t carry ore).

I hope to keep adding to the project this week, though I haven’t decided what step to take next (don’t tell Joel!)

Written by intoverflow

6 January, 2009 at 2:30 am

Posted in Armada, Haskell

The beginnings of a real-time strategy game in Haskell

with 6 comments

As a grad student, I spend most of my time doing math, and the rest of my time feeling guilty for not doing math. One of my friends who has “been there” suggested I should do something besides math during break. Other friends said I should use this as a chance to do a ton of math. Compromise: hang out in Seattle playing Grand Theft Auto IV and goofing off in Haskell.

So while the controller wasn’t in my hands (it was my host’s XBox, after all), I thought it’d be fun to see how succinctly I could get a real-time strategy game expressed in Haskell. I’m still working on it, but I figured I’d post what I have for now. This is literate Haskell, so you can dump this post into ghci and it’ll be ready to rock.

If you pay close attention, you’ll see that we have (something close to) knot-tying going on (we use integers to identify objects, and tie the knot using them as a lookup key) as well as some usage of software transactional memory.


In this post, we are going to write a real-time strategy game. We won’t give it
a GUI; instead, we are going to use a command-line interface, typing commands to
tell our army what to do.

The usual module imports:

> import Control.Concurrent
> import Control.Concurrent.STM
> import Control.Monad.State
> import Data.Complex
> import Data.List
> import Data.Maybe

We will have three types of units:

> data UnitType = Builder | Refinery | Fighter
>	deriving (Show, Eq)

Each type has its own initial health:

> type Health = Int
> defaultHealth :: UnitType -> Health
> defaultHealth Builder  = 55
> defaultHealth Refinery = 8
> defaultHealth Fighter  = 13

We need a notion of location. I’m going to use complex numbers for everything
since they come with a lot of functionality.

> type Location = Complex Double
> type Velocity = Complex Double
> type Speed = Double

There are a few commands we can give our units:

> data Command = Attack Unit | Mine | Build UnitType | Go Location | Idle
>   deriving Eq
> instance Show Command where
>  show (Attack _) = "Attack"
>  show Mine = "Mine"
>  show (Build ut) = "Build " ++ (show ut)
>  show (Go l) = "Go " ++ (show l)
>  show Idle = "Idle"

Units need to have some attributes:

> type UnitID = Int
> data Unit = Unit { ident   :: UnitID,
>		     owner   :: Player,
>		     utype   :: UnitType,
>		     pos     :: Location,
>		     health  :: Health,
>		     cmd     :: Command }
>  deriving Show
> instance Eq Unit where
>  u == v = ident u == ident v

Not all units can travel at the same speed. Here’s a function that tells us how
fast units move:

> speed :: Unit -> Double
> speed u = case (utype u) of
>		Builder  -> 1
>		Refinery -> 2
>		Fighter  -> 3

Not all units cause the same amount of damage:

> damage :: Unit -> Health
> damage u | utype u == Builder  = 0
>	   | utype u == Refinery = 0
>	   | utype u == Fighter  = 3

We need a notion of player. Players have units:

> type Name = String
> data Player = Player Name
>	deriving (Eq, Show)

And a notion of game:

> data Game = Game { units :: [Unit],
>		     players :: [Player],
>		     idents :: [UnitID],
>		     msgs :: [ (Unit, Command) ] }
> instance Show Game where
>  show g = "Players: " ++ (show $ players g) ++ "\nUnits: " ++ (show $ units g)
> makeGame :: [Player] -> Game
> makeGame ps = Game [] ps [1 .. ] []

We’re going to use the State monad for, well, managing the state of the game.
But let’s give it a fancy name, just to keep things clear:

> type GameSt a = State Game a

We need a command for adding a unit to the game.

> addUnit :: Player -> UnitType -> Location -> GameSt Unit
> addUnit p ut l
>     = do g@Game{units=us, idents=i:is} <- get
>	   let u = Unit { ident = i, owner = p, utype = ut, pos = l,
>			  health = defaultHealth ut, cmd = Idle }
>	   put g{units=u:us, idents=is}
>	   return u

We also need a command for telling units what to do.

> command :: Unit -> Command -> GameSt ()
> command u c = do g <- get
>		   put g{ msgs = (u,c):(msgs g) }
>		   return ()

Now we need a method for “ticking” the game. First, we say that a player is
dead when they have no builders left.

> playerIsAlive :: Game -> Player -> Bool
> playerIsAlive g p = foldl (\t -> \u -> t || isBuilder u) False (units g)
>	where isBuilder u = (owner u == p) && (utype u == Builder)

A game tick consists of ticking each of the units, then leaving only the players
who still are alive.

> tick :: GameSt (Maybe Player)
> tick = do g@Game{ players = ps, units = us } <- get
>	    let g' = g{ units   = (mapMaybe (tickUnit g g') us),
>			players = (filter (playerIsAlive g) ps),
>			msgs = [] }
>	    put g'
>	    if (length $ players g') == 1 then return (Just $ head $ players g')
>					  else return Nothing

Unit ticking is the trickiest part. Since the collection of units is a directed
cyclic graph (units can attack each other), it is the case that we need to “tie
the knot.” To do this, the tickUnit function needs to know what the updated
game state is, so that it can create references to the newly updated targets.
Luckily we are in a lazy evaluation situation, so this is completely reasonable.

> tickUnit :: Game -> Game -> Unit -> Maybe Unit
> tickUnit g@Game{ units = us } Game{ units = us' } u
>   = (takeDamage $ deliverMsgs $ procCmd u) >>= checkPlayer
>  where procCmd u = case (cmd u) of
>			Idle -> u
>			Go l -> goTo l u
>			Attack v ->
>			  maybe u{ cmd = Idle }
>				(\v' -> (goNear u $ pos v'){ cmd = Attack v' })
>				(find (v ==) us)
>			Mine    -> u
>			Build _ -> u
>	 goTo l u = let vec = (l - (pos u))
>			vel = vec / (0 :+ ((magnitude vec) * (speed u)))
>		    in if (magnitude vec) > 2 then u{ pos = l + vel } else u
>	 goNear u l = let dist = magnitude $ (pos u) - l
>		      in if dist > attackRange then (goTo l u) else u
>	 takeDamage u = let h' = foldl (checkAttack) (health u) us
>			in if h' <= 0 then Nothing else Just u{ health = h' }
>	 checkAttack h v = let dist = magnitude $ (pos u) - (pos v)
>			       d    = if isAttacking then (-1) else 0
>			       isAttacking = (dist <= attackRange) &&
>					     (cmd v == Attack u)
>			   in (d * damage v)+h
>	 attackRange = 10
>	 deliverMsgs u = case (filter ((==) u . fst) (msgs g)) of
>				((_,c):_) -> u{ cmd = c }
>				[]	  -> u
>	 checkPlayer u = if (playerIsAlive g $ owner u) then Just u else Nothing

Here’s a function that, given a game, plays it for n turns. If there’s a winner
by then, the winner gets returned. If not, it returns Nothing.

> play :: Int -> GameSt (Maybe Player)
> play n = if n == 0 then return Nothing
>		     else do mp <- tick
>			     case mp of
>				Nothing -> play $ max (-1) (n-1)
>				Just p  -> return $ Just p

Here is what the game looks like:

> tim   = Player "Tim"
> chris = Player "Chris"
> cyndi = Player "Cyndi"
> (w,g) = runState (
>		     do tim_builder   <- addUnit tim Builder (0 :+ 0)
>			chris_builder <- addUnit chris Builder (100 :+ 50)
>			cyndi_builder <- addUnit cyndi Builder (75 :+ 5)
>			tim_f1 <- addUnit tim  Fighter ((-3) :+ 0)
>			tim_f2 <- addUnit tim  Fighter (0 :+ 0)

>			chr_f1 <- addUnit chris Fighter (0 :+ 0)
>			command tim_f1 (Attack chris_builder)
>			command tim_f2 (Attack cyndi_builder)
>			command chr_f1 (Attack tim_f2)
>			play 100
>		    )
>		    (makeGame [tim, chris, cyndi])

If you play the above game, there won’t be a winner because chr_f1 will kill
tim_f2 before tim_f2 can finish killing cyndi_builder. If you take out the
command to chr_f1, then tim will win.

This is all pretty cool, but we still don’t have any type of interactivity.
Once we start the game, it just kinda does its thing. It’d be much better if
we could feed commands in during the simulation. To do this, we need to
introduce some concurrency. We will have one thread that plays the game, and
another to feed commands into it. We will use Software Transactional Memory to
manage the shared game object.

> makeSharedGame :: [Player] -> IO (TVar Game)
> makeSharedGame = atomically . newTVar . makeGame

> playSharedGame :: (TVar Game) -> IO Player
> playSharedGame tvg = do w <- atomically $
>				do g <- readTVar tvg
>				   let (w,g') = runState tick g
>				   writeTVar tvg g'
>				   return w
>			  case w of
>				Nothing -> do threadDelay 1000
>					      playSharedGame tvg
>				Just p  -> return p

We’ve got a bunch of commands for modifying game state. Here’s a function that
applies them to the shared game state thingy.

> applyCommand :: (TVar Game) -> (GameSt a) -> IO a
> applyCommand tvg c = atomically $
>			do g <- readTVar tvg
>			   let (a,g') = runState c g
>			   writeTVar tvg g'
>			   return a

Here’s an example of how we use this from ghci. No special extensions needed:

shared <- makeSharedGame [tim, chris, cyndi]

tim_builder <- applyCommand shared $ addUnit tim Builder (0 :+ 0)
chr_builder <- applyCommand shared $ addUnit chris Builder (0 :+ 0)
cyn_builder <- applyCommand shared $ addUnit cyndi Builder (0 :+ 0)

forkIO $ (playSharedGame shared) >>= (putStrLn . ("Winner: " ++) . show)

tim_f1 <- applyCommand shared $ addUnit tim Fighter ((-3) :+ 0)
tim_f2 <- applyCommand shared $ addUnit tim  Fighter (0 :+ 0)

applyCommand shared $ command tim_f1 (Attack chr_builder)
applyCommand shared $ command tim_f2 (Attack cyn_builder)

For instance, here’s what this looks like in ghci:

An interactive session in ghci

An interactive session in ghci

Now to get around to adding some more unit functionality, along with a graphical display of the game world!

Written by intoverflow

2 January, 2009 at 10:28 pm

Posted in Armada

Some day I’ll be a language designer

with one comment

…and then I’ll understand why so many Haskell examples are examples of language interpreters. Until then, seriously, no one actually writes that kinda stuff, so honestly, it’s a silly example that we’ve really beaten to death. Give it up already. For real.

Written by intoverflow

2 January, 2009 at 2:49 am

Posted in Haskell

Dangers in key reuse

leave a comment »

These are my lecture notes from a presentation given for the University of Utah GSAC Colloquium.

Introduction

This presentation is about a common theme: it is not uncommon for key-reuse to undermine the security of otherwise trustworthy systems. Here we’ll take a look at two examples of this in action: key-reuse in traditional one-time pads and WEP.

One-time pads
Caesar ciphers

Let’s start by looking at a really simple cipher. Write down the alphabet, and underneath it write down the alphabet again but shift it by a few spaces. This is called a Caesar cipher, so named because supposedly Julius Caesar used this method for communicating with his military. Here’s an example of what such a scheme looks like:

a b c d e f g h i j k l m n o p q r s t u v w x y z
f g h i j k l m n o p q r s t u v w x y z a b c d e

We call such a table a key.

Here’s how the cipher words: start by writing down the sentence you want to encrypt. For each letter, look find the letter in the top row, and replace it with the letter directly beneath it. In this example, “a” gets replaced with “f”, “b” gets replaced with “g”, and so forth. Very simple. Decryption takes place by reversing this process: look up the encrypted letter in the bottom row, and replace it with the letter above. So “f” decrypts to “a”, and so on.

This is a natural first stab at encrypting a message, but it doesn’t work very well: the problem is that there are only 26 different keys (since the bottom row is just the alphabet shifted over, and there are only 26 unique ways to do this). An attacker could easily decrypt your message using each of the 26 possible keys, and it’s pretty likely that only one of these attempts will yield a result that doesn’t look like gibberish.

Substitution ciphers

So at this point we generalize: we now make a table by randomly permuting the alphabet, rather than just shifting it. Such a scheme is called a substitution cipher. For instance, we might come up with the following new table:

a b c d e f g h i j k l m n o p q r s t u v w x y z
g f i j w r l o c p h q s t u v m x n y k a b d z e

Once more, such a table is called a key.

There are now 26! = 403,291,461,126,605,635,584,000,000 possible keys, corresponding to the number of ways we can permute the alphabet. It is no longer reasonable for an attacker to simply try every possible table.

For a good while this scheme was assumed to be secure; all of this ended, however, with the work of William Friedman, who made an important observation: in most alphabets, some letters are used much more frequently than others. In a publication titled The Index of Coincidence and Its Applications in Cryptanalysis (which remained classified for several decades), Friedman showed that the non-uniform frequency of letters in an alphabet can be used to break substitution ciphers.

It is completely practical to try this out for yourself: here is a table showing the relative frequencies with which English letters are used in “generic” text. Have a friend encrypt a message using a random permutation of the alphabet (the longer the message, the easier the attack will be). Count how many times each letter appears in the encrypted version of the text; odds are good that the most common letters are e, a, o, and t. Use this to make some guesses and see how far you get. You stand a very good chance of breaking the message with a very few number of tries!

The basic problem with a substitution cipher is that, while it hides which letters are in the message, it doesn’t hide the distribution of the letters. To avoid this problem, we generalize our method even further.

One-time pads

Instead of producing a single substitution table, we are now going to produce many, say n. We will encrypt the first letter using the first table, the second letter with the second table, and so forth. When it’s time to encrypt the n+1-th letter, we will go back to using the first table. In this cipher, our key is just the collection of substitution ciphers.

As we increase n, we do a better job of obfuscating the underlying distribution of letters. Ideally, we take n to be equal to the length of the message, so that each letter in the message is being encrypted with a different substitution cipher. If we use a different key for each message we sent, then we are using a system called a one-time pad.

One-time pads are as good as it gets, in the sense that no brute-force method will work against them. To see why, consider the cipher text “fjslfjsldmvpk.” Each letter is encrypted individually, so we don’t know if the correct plaintext is “drop the bomb” or “don’t bomb it.” After all, for both possible plaintexts there exists a choice of tables which leads to the ciphertext written above!

Of course, one-time pads aren’t perfect:

  • Each message needs to be encrypted using a different key.
  • For two parties to communicate, they need to agree on which keys to use in advance.
  • Since each key consists of a choice of permutation for each letter in the message, there’s a total of (26!)^n possible keys. With such a large keyspace, it’s difficult to succinctly describe a key.

To address the last point, people generally do not consider arbitrary permutations, but rather a subgroup of the set of possible permutations which can be succinctly described. Here’s how it works: first, map the alphabet to a group G. One obvious choice is to just map to \mathbb{Z}/26\mathbb{Z} (though in practice no one uses this group; these days encryption is done by computers and is used on data other than English letters, so other groups get used instead). Encode the message as a tuple in the product group G^n. As a key, pick a random tuple in k \in G^n. To encrypt a message m, simply compute m + k. To decrypt, compute (m+k) - k.

As it happens, \mathbb{Z}/26\mathbb{Z} is a pretty uncommon choice of group. In digital communications, it is much more common for people to represent their characters in Unicode (say, UTF-8), then choose their group to be (\mathbb{Z}/2\mathbb{Z})^8. This is a good choice for digital systems, since the group operation corresponds precisely to bitwise-XOR, which can be implemented inexpensively, and which has good performance characteristics.

Lurking inside this choice is a subtle danger, though. Note that, if we encode our message as an element of ((\mathbb{Z}/2\mathbb{Z})^8)^n \cong (\mathbb{Z}/2\mathbb{Z})^{8n}, then every element of our group satisfies 2g = 0. This is ultimately the reason why key reuse is a bad thing: if two messages m_1,m_2 are transmitted with the same key k, then an adversary is able to compute

(m_1 + k) + (m_2 + k) = m_1 + m_2 + k + k = m_1+m_2 + 2k = m_1+m_2.

While this doesn’t give them the plaintext, it does leak information: since messages tend to have non-random structure, the sum of two messages also tends to have non-random structure. With enough messages transmitted using the same key, the amount of information leaked increases until finally all of the messages (which were encrypted using this key) can be decrypted with high probability.

Here’s an example of this in action. I went online and downloaded a list of all 4-letter English words from a website for Scrabble players. I encoded each word in UTF-8, then computed the pairwise XOR of all of the words. Here’s a graph showing the distribution of the result:

Distribution of pairwise XORs of 4-letter English words

Distribution of pairwise XORs of 4-letter English words

In an attack scenario, an adversary captures several messages which have been encrypted with the same key (how they know this is usually dependent on external factors). They compute the pairwise-XORs of the intercepted messages. They then use information similar to the graph above to develop a hypothesis for what the plaintext could be which explains the observed sums. There are several methods for doing this, each of which relies more or less on a clever application of Bayes’ theorem. For a trained statistician, this is not a difficult task.

So in general, key reuse is a bad thing with one-time pad systems. However, it’s logistically impractical for high-use channels to rely on a pre-arranged collection of unique keys; after all, high-use means lots of keys will be needed.

Key generation

To address this, several schemes have been cooked up for generating suitable keys on an as-needed basis. The idea is that both ends of the channel can use the same key generation process, and provided they stay in sync with one another, they should be fine.

Formally, we consider a function f(p,n) : G^n \times \mathbb{N} \to G^n. We will use this function to generate keys. At some point in time, two parties who with to communicate will agree on some p \in G^n, which they will keep secret. For the first transmitted message, both parties use the value f(p,1) as the key. For the second transmitted message, both parties use the value f(p,2) as the key. And so on.

Ideally the function f(p,n) should satisfy a few properties:

  1. For all p \in G, the long-term behavior of f(p,n) should be complete erratic; that is, there should be no long-term stability in any sense of the word.
  2. For all p \in G, even with total knowledge of the values of f(p,n) for all n \leq N, it should be impractical to predict the value of f(p,N+1).

The development of such a function f is an active research topic in cryptographic circles, since such functions — when well designed — mitigate the logistical mess that is key-distribution.

The RC4 Steam Cipher

One notable example of such a scheme is the RC4 steam cipher. RC4 comes with two algorithms: one for initializing the cipher (the key scheduling algorithm, or KSA) and one for generating data that can be used as a key (the psudo-random generation algorithm, or PRGA). The output of the PRGA is called a key stream. RC4 is easy to use:

  1. Initialize it with a secret key that the concerned parties have agreed upon in advance.
  2. When you want to encrypt a message consisting of n bytes of data, simply iterate the PRGA n times, and XOR your message with the key that was produced.

RC4 is stateful, in that systems which use it must keep track of data over time. In particular, the scheme uses an array of 256 entries as the basis for generating a key stream. Every time you invoke the PRGA, its output is a function of this array; as the PRGA is used, the array is modified in anticipation of the next invocation of the PRGA.

As of 8 December 2008, the Wikipedia entry on RC4 has a working Python implementation of the RC4 stream cipher, together with some sample input-output data. For those so inclined, I’ve produced a basic implementation in Haskell (I make no claims to its efficiency or cleverness):

import Foreign.C.String (castCharToCChar, castCCharToChar)
import Foreign.C.Types (CChar)
import Data.Array.IArray
import Control.Monad.State
import Numeric (showHex)
import Data.Bits (xor)

swap a i j = let ai = a ! i
		 aj = a ! j
	     in a // [(i, aj), (j, ai)]

n = 256       -- we do our operations mod n
st_len = 256  -- length of internal state

data RC4State = RC4State { s :: Array Int Int, i :: Int, j :: Int }
  deriving Show
type RC4Key   = Array Int Int

ksa :: RC4Key -> RC4State
ksa k =
 let key_len = snd (bounds k) + 1
     initial = listArray (0, st_len-1) [0 .. st_len-1]
     perm (j', s) i = let j = (j' + (s ! i) + (k ! (i `mod` key_len))) `mod` n
		      in (j, swap s i j)
 in RC4State { s = snd $ foldl (perm) (0, initial) [0 .. st_len - 1],
	       i = 0, j = 0 }

rpga :: State RC4State Int
rpga = do st <- get
	  let i' = ((i st) + 1) `mod` n
	      j' = ((j st) + ((s st) ! i')) `mod` n
	      s' = swap (s st) i' j'
	  put st{ s = s', i = i', j = j'}
	  return (s' ! (((s' ! i') + (s' ! j')) `mod` n))

encrypt :: [Int] -> State RC4State [Int]
encrypt [] = return []
encrypt (d:ds) = do k  <- rpga
		    rs <- encrypt ds
		    return ((d `xor` k) : rs)

toKey :: String -> RC4Key
toKey key = listArray (0, (length key) - 1)
		      (map (fromIntegral . castCharToCChar) key)

rc4 :: String -> String -> String
rc4 key dta = let k = toKey key
		  dta' = map (fromIntegral . castCharToCChar) dta
		  crypd = evalState (encrypt dta') (ksa k)
	      in foldr (\b -> \a -> showHex b a) "" crypd

The RC4 stream cipher is used as the underlying cryptographic primitive in many important protocols, including WEP and Microsoft’s Remote Desktop Protocol.

RC4 in WEP

For the remainder of this paper we will be interested in how RC4 is used in WEP, so that we can discuss current attacks.

The WEP protocol is standardized in IEEE 802.11i. It specifies that a network is protected by a single secret key, called a root key, though in practice WEP devices generally support as many as four keys to be used (though this capability is infrequently utilized). Networks which are protected by WEP advertise themselves as such.

The encryption process

In a WEP-enabled network, not every frame is encrypted. While data-frames (the frames which contain user data) are encrypted, beacon frames (used to announce the presence of a network) and acknowledgment frames (used for quality of service) are not.

For frames which are to be encrypted, the following process is used:

  1. The transmitting station picks a 24-bit value called an initialization vector (IV). The standard does not specify how this value should be selected; in most implementations, this value is either chosen sequentially or is generated by a psudo-random number generator.
  2. The IV is prepended to the root key. The result is called a per-packet key.
  3. A CRC-32 checksum of the payload is computed and appended to the end of the payload. The checksum is called an integrity check value (ICV).
  4. The per-packet key is used to initialize the RC4 stream cipher, which is then used to generate a key stream long enough to encrypt the payload and the integrity check value.
  5. The payload and the integrity check value are encrypted using the key stream generated in the previous step. The encryption is just bitwise XOR, as usual.
  6. The ciphertext, the initialization vector, and other header fields are assembled into a frame and transmitted.

Notice that the root key remains fixed, and the initialization vector ranges over all possible 24-bit values. This means that there are 2^24 different per-packet keys that can be used; while this number may seem large, for data networks it really isn’t. Key reuse in inevitable.

Also notice that the integrity check value (the CRC-32 checksum) is contained in the cipher text. This fact will later make a serious contribution to the weakening of WEP as an encryption service.

Some caveats

There are two issues that we should now address:

The first is the matter of authentication. Before an access point will communicate with a client, the client must associate itself to the access point. When the access point is using WEP, there are two ways this can happen:

  1. The client requests to join the network and the access point simply says “yes.” In this mode, there is no attempt to authenticate that the client has permission to use the network.
  2. The client requests to join the network and the access point responds with a challenge:
    1. The access point sends a random number r to the client. This number is sent unencrypted.
    2. The client uses the root key to encrypt the random number, and transmits the encrypted version back to the access point.
    3. The access point uses the root key to encrypt the random number. If the result matches what it received from the client, then the client is permitted to join the network.

Later we will describe why it is beneficial for an attack to be associated with an access point. For the time being, we make a single observation: even when the access point is configured to challenge clients, an attacker can still associate himself to the access point (provided the attacker has witnessed a legitimate client associate itself to the access point).

Here is how the attack works:

  1. A legitimate client authenticates itself to the access point. The attacker monitors the transaction and records the random number r as well as the encrypted version of this number, which we denote by r+k. The attacker also notes the initialization vector used by the client for the encrypted transmission, which is data that is transmitted but not encrypted.
  2. A priori, the attacker does not know the value k. However, since the encryption mechanism is just addition in the group (\mathbb{Z}/2\mathbb{Z})^n, the attacker knows that 2r=0, so that r + (r+k) = 2r + k = k. In this way, the attacker recovers k.
  3. The attacker transmits a request to associate to the access point. The access point responds by sending a new random number r' to the attacker.
  4. The attacker knows the value of k. The attacker crafts a response using the same initialization vector as the legitimate client, and transmits r' + k.
  5. The access point performs the same computation, sees that the results match, and allows the attacker to associate.

Notice that this is another situation where allowing key reuse as undermined the security of an otherwise safe mechanism.

Incidentally, an attacker does not necessarily need to wait for a legitimate user to attempt to associate to the access point. If the attacker notices that a user is already associated (something the attacker can determine by reading the unencrypted portions of the frames), then the attacker can forge a message to the client telling them to re-associate! Indeed, 802.11 allows for access points to force users to re-authenticate themselves, and the frame used for doing this is not encrypted (and can therefore by transmitted by an attacker).

So why would an attacker want to be associated to the access point? At first glance it is not obvious; after all, without knowing the root key, the attacker can’t use any services that require encryption of frames.

As has already been hinted, however, there are many services which don’t require the use of encrypted frames. Some of these services are essential for carrying out attacks against WEP!

This brings us to our second caveat. Recall that data frames include a CRC-32 checksum, and that this checksum is encrypted along with the payload. For many of the attacks against WEP, an attacker will want to know whether or not a given packet’s checksum is valid or invalid. The trouble is that the attacker cannot decrypt the packet and check for himself; however, if the attacker is associated with the access point, then they can use the access point as an “oracle” to perform the check for him.

There are two common ways for this to be done:

  1. The attacker could have two stations associate to the access point and attempt to send the encrypted frame from one station to the other via the access point. This does not require much trickery: after all, the source and destination data in a data frame is not encrypted. As it happens, access points refuse to pass along data frames whose checksum is not correct. Thus, if the attacker notices that the frame never arrives, then the attacker can conclude that the checksum was incorrect (and vise versa).
  2. The attacker could associate to the access point and attempt to send the frame to a station that does not exist. If the checksum is correct, the access point will send an error to the sender; if the checksum is in error, the access point will simply ignore the frame.

So the ability to associate to an access point gives an attacker the ability to check whether or not the checksum of a frame is correct, even though the attacker is unable to decrypt the frame himself.

Weaknesses in WEP
FMS attack

In 2001, Scott R. Fluhrer, Itsik Mantin and Adi Shamir published Weaknesses in the Key Scheduling Algorithm of RC4, a paper which described a key recovery attack against certain applications of RC4 — including WEP.

Indeed, this sort of thing happens from time to time: there are many situations where a cryptographic system is secure “on the blackboard” but less secure when put into practice; sometimes the circumstances surrounding the application of a cryptosystem undermine the system’s security. This is now known to be the case with RC4.

The attack described in the paper is now generally known as the FMS attack. While the original paper did not describe how to weaponize the attack against WEP, it did not take long for such details to be worked out. An often-cited guide to the practical application of the attack was given by David Hulton (a.k.a. h1kari) in Practical Exploitation of RC4 Weaknesses in WEP Environments, though the first implementation of the attack is due to Adam Stubblefield, who detailed his implementation with John Ioannidis, John Ioannidis, Aviel D. Rubin, and Aviel D. Rubin in their paper Using the Fluhrer, Mantin, and Shamir Attack to Break WEP.

Recall from earlier that an attacker is able to use the access point to check whether or not a given frame’s checksum is correct. With this capability, the consequences of the FMS attack can be summarized as follows:

An attacker can recover the per-packet key of a frame with a success probability 50% with about 9,000,000 queries to the access point and negligible computational effort.

Since the per-packet key contains the shared root key, this amounts to a full key recovery attack against WEP.

In practice, the success of the FMS attack tends to be pretty low. Though we won’t get into the details of how the attack works here (see the references), it requires a considerable amount of traffic to be captured by the attacker prior to carrying out the attack. Still, the existence of such an attack served to encourage others to look for more efficient methods.

KoreK attack

In 2004, KoreK posted an attack to the netstumbler forum which significantly improved upon the performance of the FMS attack. The consequences of the attack can be summarized as follows:

An attacker can recover the per-packet key of a frame with a success probability 50% with about 700,000 queries to the access point and negligible computational effort.

Anecdotal evidence suggests that the KoreK attack is significantly more practical than FMS, and is generally considered to be a reliable means of recovering the shared root key.

Klein’s attack

In 2005, Andreas Klein described two attacks against RC4 which hold in WEP-line environments. His work gives the following:

An attacker can recover the per-packet key of a frame with a success probability 50% with about 43,000 queries to the access point and negligible computational effort. Success of 95% can be achieved with 70,000 queries.

PTW attack

In 2007, Pyshkin, Tews and Weinmann improved the performance of the Klein attack to what stands today as the most efficient method for recovering the root key in WEP systems. Their work gives the following:

An attacker can recover the per-packet key of a frame with a success probability 50% with about 35,000 queries to the access point and negligible computational effort. Success of 95% can be achieved with 55,000 queries.

Today it is generally considered a “sure thing” that the PTW attack will recover the root key from a target network.

Remarks

While we haven’t gotten into the details of how any of these attacks work (we leave that for the references), we have conveyed that these attacks are practical, and have hinted that they are ultimately enabled by key reuse in WEP.

Practical matters

Today, the most sophisticated toolkit for attacking wireless networks is undoubtedly the aircrack-ng suite, which is able to perform all of the wireless attacks we have described, as well as many others.

The software alone is not enough for carrying out the attacks, though. Additionally, attackers must also have wireless network adapters which permit certain “unusual” operations (such as sniffing and injecting traffic). Knowledge of which wireless adapters are capable of these operations is mostly lore, although a great deal of information can be found at the madwifi website. The author has personal experience with using the Ubiquiti Nanostation 2 as a launch point for key recovery attacks against WEP.

Conclusions

This paper has not presented anything new: indeed, after the discovery of the FMS attack, WEP has generally been considered “old and broken.”

Hopefully this paper has, however, provided the reader with some insight into some of the issues associated with WEP. We’ve discussed how key-reuse undermines security, and we’ve also provided a concrete example of how “blackboard security” does not necessarily mean “practical security.”

Readers who are interested in the cryptographic details of the attacks are strongly encouraged to consult the references.

Written by intoverflow

8 December, 2008 at 11:02 pm

Posted in Cryptography

Re: How Windows Users are Changing Linux

leave a comment »

There’s a great op-ed piece here by Linux Canuck about how the influence of users migrating away from Windows is pressuring Linux into a bad direction.

I’d like to add to the discussion, except I more or less simply agree with the article, so I’ll simply defer to just tossing up a link on this one.

Written by intoverflow

4 December, 2008 at 2:11 am

Posted in Happenings

OpenWRT and Ubiquiti NanoStation 2

with 16 comments

I’m scheduled to give a summary talk on the state of cryptographic attacks on WiFi for the University of Utah GSAC colloquium. I wanted to be able to demonstrate the practicality of these attacks, so I needed a platform to launch attacks from.

To this end, I picked up a Ubiquiti NanoStation 2 from Metrix. Today’s post is a bit of a how-to for getting the unit up and running with OpenWRT.

Background

OpenWRT is a framework for building a Linux distribution for use in routers and other embedded computers. Building a Linux distribution for an embedded computer is generally very difficult; OpenWRT’s main contribution is that it makes the process significantly easier. At the time of writing, the latest version of OpenWRT is Kamikaze.

Where’s a brief overview of the steps for getting OpenWRT onto the NanoStation 2, and how much work they are:

  1. Update the firmware on the NanoStation
  2. Download the OpenWRT trunk and packages (two commands)
  3. Setup the build environment (one command)
  4. Select which packages you want for your distribution (one command, a bit of menus)
  5. Compile (one command and some patience)
  6. Upload to device (one command)
Laying on my desk

Laying on my desk

So now a bit about the NanoStation 2. Seattle Wireless has a good summary of the hardware specifications. I’ll speak about the device qualitatively:

First, this is a device with excellent range with the built-in antenna. I don’t have an external antenna for the unit; in my experience, the unit performs great on its own. I’ve been using it to do analysis, and have found that it sees much more than my laptop (which is why I bought the unit, after all).

Second, the device is powered exclusively by its Ethernet connection. I don’t own any switches that can provide power over Ethernet; however, the unit comes with an adapter that puts power on the line. I’ve never used a PoE device before, but I can tell that I’ll want more. This is a very nice feature, especially for practical use outdoors.

Third, this device is hard to brick. I’m generally nervous about flashing devices with unofficial images. While I obviously can’t claim that this device is brick-proof, I can say that it seems resilient. On more than one occasion I’ve messed up my OpenWRT installation, and was able to recover by returning to the official firmware. I’ll describe this process in a bit.

Fourth, it’s a small machine that mounts with zip-ties. Zip ties! I love it.

Mounts with zip ties.  Shown here on my camera tripod as a temporary measure during testing.

Mounts with zip ties. Shown here on my camera tripod as a temporary measure during testing.

Getting OpenWRT Up

Here’s the quick and dirty. We first need to grab the OpenWRT trunk and packages:


broker@localhost:~$ svn co https://svn.openwrt.org/openwrt/trunk/
broker@localhost:~$ cd trunk
broker@localhost:~/trunk$ mkdir feeds
broker@localhost:~/trunk/feeds$ cd feeds
broker@localhost:~/trunk/feeds$ svn co https://svn.openwrt.org/openwrt/packages

Find your way back into the trunk directory and invoke make package/symlinks:


broker@localhost:~/trunk/feeds$ cd ..
broker@localhost:~/trunk$ make package/symlinks

This command prepared the environment. At some point it brings up a configuration screen. You’ll need to select which platform you want to build for; set the Target System to “Atheros 231x/5312 [2.6].”

Once this is done we are ready to move forward and start selecting packages. Invoke


broker@localhost:~/trunk$ make menuconfig

You’ll find yourself confronted with a menu system similar to that for configuring the Linux kernel. Here you can pick which packages you want included in your build. The NanoStation 2 has 4MB of storage, so pick wisely: the build environment will strip things down as much as it can, but you still need to be prudent with storage.

This is also a good time to configure the default network settings for your image. Select “Image configuration” to set the default IP address for your device’s Ethernet controller. You can also set a DNS and gateway server for the device at this time. If you don’t, it will default to the IP address 192.168.1.1.

After you have configured what packages you want to use, you’ll be returned to the command prompt. We’re now ready to do the build:


broker@localhost:~/trunk$ make

Go grab a drink; this will take a while. OpenWRT will proceed to make its own cross-compile environment, which will take some time. It also takes some space: on my system, the trunk directory is 2.55GB!

Once the build is complete, we can find the constructed image in trunk/bin. On my system, here’s what the directory listing looks like:


broker@localhost:~/trunk/bin$ ls -l
total 20080
-rw-r--r-- 1 broker broker 519 2008-11-13 23:29 md5sums
-rw-r--r-- 1 broker broker 3801088 2008-11-13 23:29 openwrt-atheros-root.jffs2-128k
-rw-r--r-- 1 broker broker 3801088 2008-11-13 23:29 openwrt-atheros-root.jffs2-64k
-rw-r--r-- 1 broker broker 2490368 2008-11-13 23:29 openwrt-atheros-root.squashfs
-rw-r--r-- 1 broker broker 3211672 2008-11-13 23:29 openwrt-atheros-ubnt2-squashfs.bin
-rw-r--r-- 1 broker broker 3211672 2008-11-13 23:29 openwrt-atheros-ubnt5-squashfs.bin
-rwxr-xr-x 1 broker broker 2290916 2008-11-13 23:29 openwrt-atheros-vmlinux.elf
-rw-r--r-- 1 broker broker 983040 2008-11-13 23:29 openwrt-atheros-vmlinux.gz
-rw-r--r-- 1 broker broker 720896 2008-11-13 23:29 openwrt-atheros-vmlinux.lzma
drwxr-xr-x 3 broker broker 4096 2008-11-12 23:53 packages

Note that it builds a few different images; we’re interested in openwrt-atheros-ubnt2-squashfs.bin. If you don’t have this file, but the build didn’t complain, then the most likely cause is that the image was going to be more than 4MB. You see, the build process knows that the NanoStation 2 has only 4MB of storage, and won’t bother making an image for the device that is larger than this size. If you doubt this is the issue, you can try the build again with the V=99 option, which enables verbosity. Here’s what it would look like:


broker@localhost:~/trunk$ make V=99

Grep through the results to find the command which was going to make the openwrt-atheros-ubnt2-squashfs.bin file and see if it complained.

So now that we’ve got the image, it’s time to upload it to the device. By default, the NanoStation 2 is configured to run at 192.168.1.20. It supports image upload by tftp:


broker@localhost:~/trunk$ cd bin
broker@localhost:~/trunk/bin$ tftp 192.168.1.20
tftp> bin
tftp> put openwrt-atheros-ubnt2-squashfs.bin

If you find that this fails, check to verify that your NanoStation has is running the latest Ubiquiti firmware. Go to the Ubiquiti NanoStation 2 Support page for the latest firmware. I found that the firmware the unit shipped with wouldn’t accept my OpenWRT image, but the newer firmware does.

Once you’ve uploaded the image to the NanoStation 2, you should be able to telnet to the device. It will drop you to a root shell without authentication. Once you change your root password, it will disable the telnet daemon and only respond via SSH.

If you want to enable the wireless, edit /etc/config/wireless. Note that there’s a line you need to comment-out to enable the controller! For more detailed configuration information, consult with the documentation.

Getting aircrack-ng and Kismet to work

If you decided to deploy Kismet and aircrack-ng on your NanoStation, you will probably want to know more about how to use these utilities on this device.

Here’s how I got Kismet to work on the device:

  1. First, I had to disable the existing ath0 device. To do this, simply issue airmon-ng stop ath0. To be honest, I do not know why this has been necessary; however, it has been my experience that ath0 isn’t able to enter monitor mode after the device has booted — my guess is that something about its initialization isn’t hacker-friendly. So we go through this airmon-ng business to remedy that.
  2. Now re-create ath0 by issuing airmon-ng start wifi0. We can verify that this did what we wanted:
    root@OpenWrt:/# iwconfig
    lo        no wireless extensions.
    eth0      no wireless extensions.
    wifi0     no wireless extensions.
    br-lan    no wireless extensions.
    ath0      IEEE 802.11g  ESSID:""  Nickname:""
              Mode:Monitor  Frequency:2.457 GHz  Access Point: [censored]
              Bit Rate:0 kb/s   Tx-Power:16 dBm   Sensitivity=1/1
              Retry:off   RTS thr:off   Fragment thr:off
              Encryption key:off
              Power Management:off
              Link Quality=0/70  Signal level=-96 dBm  Noise level=-96 dBm
              Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
              Tx excessive retries:0  Invalid misc:0   Missed beacon:0
    
  3. We now need to configure kismet_drone to use the proper device. This is done by editing /etc/kismet/kismet_drone and setting source=ath5k,ath0,wireless. While in there, don’t forget to edit the rest of the file — especially the line that controls which IP addresses kismet clients are allowed to connect from!
  4. We can now launch kismet_drone. We need to tell it where to find the configuration file:

  5. root@OpenWrt:/# kismet_drone -f /etc/kismet/kismet_drone.conf

This procedure is enough to get started with the rest of aircrack-ng. Tonight I verified that the device is capable of pulling off the PTW attack; for instructions, see the Simple WEP Crack Tutorial.

Conclusion

The NanoStation 2 is a wonderful device; at under $100, this thing packs much more punch than any other device like it that I’ve used. OpenWRT support for the unit is fantastic. With judicious selection of packages, there’s sufficient room on the device for aircrack, kismet_drone, and nmap.

Written by intoverflow

14 November, 2008 at 9:37 pm

Posted in Hacking

DRM to prevent lifetime use

leave a comment »

It’s becoming increasingly common: game developers using DRM to limit the ability of customers to play games.

By now everyone has heard about EA’s decision to DRM-ize Spore. In effect, the DRM scheme seeks to limit the number of times the game can be installed.

This scheme is promoted as a mechanism for preventing piracy. This is a purpose which I sincerely doubt it serves: thus far, no DRM scheme I’m familiar with has ever been successful at preventing pirates. Bottom line: your code runs on my machine, then it runs in an environment which I control, and which I can therefore manipulate (if I were a pirate, that is).

These schemes do prevent legitimate users from doing certain things, though. It keeps users from reselling games, and it keeps them from continuing to play games as they upgrade their machines.

These are not theoretical issues. Heck, even if you aren’t into reselling games (something that’s not especially popular for PC titles, though it is a long-standing tradition in the console world), you might concede that I should be permitted to upgrade my hardware without buying entirely new titles.

I’ve recently encountered this problem first-hand, as I’ve gone through 3 MacBook Pro laptops in the past 6 months, entirely due to hardware defects which Apple concluded were not my fault. As Apple replaced dead machine after dead machine, my iTunes account took a hit: iTunes lets you have only a handful of computers “authorized” on an account, and after a computer is dead, de-authorizing it isn’t an option (I’m still dealing with Apple on this).

Why do I need to keep buying software I already have purchased? In what world does this make any sense?

Now it’s starting to show up on consoles as well. Indeed, Nintendo has started to release titles which are designed to prevent resale; heck, they will event prevent re-use by gamers who replace consoles. Wonderful.

The libertarian in me says “so what,” since I suppose, in some sense, they can choose to design whatever crappy anti-features they want; indeed, the consumer in me laughs derisively, knowing damn well that this is precisely the type of behavior that allows long-standing industry giants to tumble before consumer-friendly newcomers.

Still, the level-headed person in me can’t help but be annoyed: I bought my Wii in good faith that it would behave like any other Nintendo product; and now that the console sales have stabilized, and they have a large customer base, they begin with the anti-consumer tactics. Way to change the relationship, guys.

Written by intoverflow

13 November, 2008 at 8:45 pm

Posted in Happenings

Solid-body MacBook Pro

with one comment

Over the summer, the iSight in my MacBook Pro stopped working. The system profiler failed to report its very existence, which prompted me to go to the Apple store for a fix. The Geniuses quickly determined that my laptop needed a new display, since the iSight is inseparably connected to it. So the replaced my display. This didn’t fix the problem, however, so they Geniuses quickly determined that I also needed a new logic board (that’s proprietary Mac-speak for mainboard) as well as a new “left IO board” (that’s proprietary Mac-speak for something that only Macs seem to have). They replaced all of these components, with about a week left in my warranty, for free — thus giving me roughly 1/2 of a new laptop.

2 months later, more trouble: the laptop would now power itself off within a minute of booting, regardless of how it was begin powered. The Geniuses agreed to look at my laptop, and decided that it was plausible that it was their fault — a conclusion that they arrived at on their own, without me needing to point it out in a loud voice. They also concluded that, since it was possibly their fault, they would shoulder the blame. When they returned from the back room, they had a new, solid-body MacBook Pro in their hands to give me, gratis, for my troubles.

So basically, yes, I am a Mac. When I had these types of issues on my Toshiba, Toshiba’s response (in a hard-to-understand accent from a call center in the Far East) was that it was my fault for owning a Toshiba. Bzzz, wrong answer.

So Sorry, Bill, I’m a Mac. While I generally like the Pacific Northwest about 450 times as much as California, I have to say that unfortunately, no good hardware manufacturers ship with your software.

But for what it’s worth, it’s not all sunshine and roses: my previous MacBook Pro was the “middle of the road” line, and they replaced with the “low end.” And while my previous MacBook Pro had a wonderful Atheros WiFi chip, this one has a shoddy Broadcom. And neither will boot Linux from a USB Flash drive, unlike every other decent computer on earth.

But it’s OK, I’ll overlook these deficiencies, since I’m still pretty sure that this was a good customer experience — even if it was predicated on an otherwise intolerable hardware failure rate. So long as they make good on replacing broken equipment, I think I’ll remain a Mac.

Written by intoverflow

2 November, 2008 at 3:23 pm

Posted in Happenings

A madman at work

leave a comment »

While he was working on his project, I thought I’d shoot some dramatic photos. I had some prints made up to bring home.

Clone tinkers at his workbench, trying to determine the cause of some unexplained voltage drop.

Clone tinkers at his workbench, trying to determine the cause of some unexplained voltage drop.

What could explain these difficulties?

What could explain these difficulties?

One last test, one last test...

Soon they'll know the pain they have caused me! All these years of ridicule and mockery, soon to transition to an age of fear and respect! One last test, one last test...

Written by intoverflow

14 October, 2008 at 1:11 am

Posted in Happenings

Phoenix, AZ

leave a comment »

I’m in Phoenix this week to visit long-time friend Clone. Here’s a picture of him at his mammoth desk:

Clone stares over from his workshop

Clone stares over from his workshop

He’s working on a great project. I’ll write more about it later this week as it progresses. For now, here’s a hint:

Each goggle has a low resolution LCD, which Clone has connected to receive a signal from a wireless camera.

Each goggle has a low resolution LCD, which Clone has connected to receive a signal from a wireless camera.

Written by intoverflow

11 October, 2008 at 6:32 pm

Posted in Happenings