Alex's blog

posts feed about

CSAW CTF 2013 writeup for Crypto 100

22 Sep 2013

Summary: Repeating-key XOR/polyalphabetic/Vigenère cipher

Key: And yes the nsa can read this to

This was only worth 100 points, but only 137 teams (of 1384 scoring teams) did it, so I figured a writeup was in order.

What We Have

We have source code for a Python program called ‘csawpad.py’ that defines encryption and decryption functions, and 8 hex-encoded ciphertexts, presumably encrypted by the program.

import os
from hashlib import sha512
from binascii import hexlify

#generates s box and sinverse box, called f and g respectively, using 
#sha 512 as a deterministic random number generator
def genTables(seed="Well one day i'll be a big boy just like manhell"):
    fSub={}
    gSub={}
    i=0
    prng=sha512()
    prng.update(seed)
    seed=prng.digest()
    for el in xrange(256):
        cSeed=""
        for x in xrange(4):
            cSeed+=prng.digest()
            prng.update(str(x))
        prng.update(cSeed)
        fCharSub=[0]*256
        gCharSub=[0]*256
        unused=range(256)
        for toUpdate in xrange(256):
            i+=1
            curInd=ord(cSeed[toUpdate])%len(unused)
            toDo=unused[curInd]
            del unused[curInd]
            fSub[(el,toUpdate)]=toDo
            gSub[(el,toDo )]=toUpdate
    return fSub,gSub
f,g=genTables()


def encrypt(pad, ptext):
    assert(len(ptext)<=len(pad))#if pad < plaintext bail
    ctext = []
    if type(ptext)==type(""):
        ptext=map(ord,ptext)
    if type(pad)==type(""):
        pad=map(ord,pad)
    for padByte,ptextByte in zip(pad,ptext):
        ctext.append(f[padByte,ptextByte])
    return  "".join(map(chr,ctext))
def decrypt(pad, ciphertext):
    assert(len(ciphertext)<=len(pad))#if pad < ciphertext bail
    ptext = []
    if type(ciphertext)==type(""):
        ciphertext=map(ord,ciphertext)
    if type(pad)==type(""):
        pad=map(ord,pad)
    for padByte,ctextByte in zip(pad,ciphertext):
        ptext.append(g[padByte,ctextByte])

    return "".join(map(chr,ptext))


def sanity():
    for x in xrange(256):
        for y in xrange(256):
            enc=f[(x,y)]
            dec=g[(x,enc)]
            assert(y==dec)
    for x in xrange(1000):
        toTest=os.urandom(1000)
        pad=os.urandom(1000)
        enc=encrypt(pad,toTest)
        dec=decrypt(pad,enc)
        assert(dec==toTest)
#sanity()
'''
Recovered texts, hex encoded
 '794d630169441dbdb788337d40fe245daa63c30e6c80151d4b055c18499a8ac3e5f3b3a8752e95cb36a90f477eb8d7aa7809427dde0f00dc11ab1f78cdf64da55cb75924a2b837d7a239639d89fe2b7bc1415f3542dba748dd40',
 '14a60bb3afbca7da0e8e337de5a3a47ae763a20e8e18695f39450353a2c6a26a6d8635694cbdc34b7d1a543af546b94b6671e67d0c5a8b64db12fe32e275',
 '250d83a7ed103faaca9d786f23a82e8e4473a5938eabd9bd03c3393b812643ea5df835b14c8e5a4b36cdcfd210a82e2c3c71d27d3c47091bdb391f2952b261fde94a4b23238137a4897d1631b4e18d63',
 '68a90beb191f13b621747ab46321a491e71c536b71800b8f5f08996bb433838fe56587f171a759cf1c160b4733a3465f5509ad7d1a89d4b41f631f3c600347a8762141095dad3714027dfc7c894d69fd896b810313259b1a0e941ecb43d6ae1857a465b4ddcdf102b7297763acb0281144b0598c326e871c3a1ad047ad4fea2093a1b734d589b8998175b3',
 '0fc304048469137d0e2f3a71885a5a78e749145510cf2d56157939548bfd5dd7e59dcebc75b678cfeac4cf408fce5dda32c9bfcbfd578bdcb801df32ebf64da365df4b285d5068975137990134bd69991695989b322b0849',
 '254c0bb31453badaca9d060ce5faa45fa66378a6716915473579d3743e315dbedf4d8cf78b93c3267d579247e32c8c7cd3e71e7dda6138a2ab015166fa03f2ce6ab74b89ce561eb16a65990189e169f1c457d9af622ba119a66acedb108fae18825bf3efc0428b9dae250791cb0ea018966e257d601a87f9914d646026eeab5c45cbaedd27e4c47643ab4e25193aa64f79',
 '41cd1c01c62883b2ca71e671dce57e5f96b1610e29507b6c03c38211653284576d4d8cdc967764147d1a0578102cb05f32a73065f11009041fa3cc5f60b24d8c7098598627df37322f814525966acabc99be5303c2322b43ecf358ac8b8541bd82214d1cc042cac3869c54e2964fa376229c2563ba3fd03e2d4d4d441721c60b6d817e034965be28b7d463cf2b97baebfe2729ed2aa41ffe',
 '68c50bd5197bfdbdfa887883783d2455a673a685436915bd72d1af74dffdd2b89df335daee93c36d5f57e147e9a35913d3b3bf33'
'''

Goal

Recover some sort of key, likely contained in the recovered texts.

Cipher description

Reading the source code of the Python program, I inferred that this was a simple polyalphabetic substitution cipher.

The cipher’s key must be as long as the plaintext. The cipher operates as follows:

  1. Build the encryption and decryption s-boxes.
  2. Consume one byte of pad and one byte of plaintext.
  3. Convert the two bytes in to their corresponding ordinal values (‘A’ = 65, ‘6’ = 54).
  4. Using the two values as indices (remember, the s-box is two-dimensional), look up the corresponding random byte in the encryption s-box.
  5. Emit this byte as a byte of ciphertext.
  6. Repeat 2-6 for each remaining bytes of pad and plaintext.

Decryption works in the same fashion, except that the decryption s-box is used.

Warning: I suck at Python so I sort of just ignored the s-box construction. This is what I think was going on. Please send me email if this is wrong.

The encryption s-box is a two-dimensional array of bytes, with 256 rows and 256 columns. It is populated with bytes from a PRNG with a fixed seed, so the s-box never changes. The decryption s-box is constructed to be an inverse of the encryption s-box, so that running the above algorithm with ciphertext and the decryption s-box (versus plaintext and the encryption s-box) will result in plaintext.

Attack

Being a polyalphabetic substitution cipher, this cipher does no permutation, the s-boxes never change, and they are reused for the duration of the encryption process. This means that an ‘a’, under pad byte ‘0x5E’, will always encrypt to the same value, regardless of position or message number. Effectively, the use of an s-box does not improve the security of the cipher - the cipher operates similarly to (and has the same weaknesses as) an XOR cipher.

XOR ciphers with a truly random key that is as long as the plaintext forms a one-time pad, a construction that is proven to be completely unbreakable. Conversely, XOR ciphers with a predictable or repeating key are trivial to break.

I made a leap of faith and guessed that the messages were all encrypted under the same key. Multiple messages encrypted under the same key gives us as much information as one long message with a short, repeating key. The process of breaking an XOR cipher with a repeating key is documented in many, many, many, many places, the TL;DR being:

The code I used to break the ciphertexts looks like this (apologies for the badness):

texts = [
 '794d630169441dbdb788337d40fe245daa63c30e6c80151d4b055c18499a8ac3e5f3b3a8752e95cb36a90f477eb8d7aa7809427dde0f00dc11ab1f78cdf64da55cb75924a2b837d7a239639d89fe2b7bc1415f3542dba748dd40',
 '14a60bb3afbca7da0e8e337de5a3a47ae763a20e8e18695f39450353a2c6a26a6d8635694cbdc34b7d1a543af546b94b6671e67d0c5a8b64db12fe32e275',
 '250d83a7ed103faaca9d786f23a82e8e4473a5938eabd9bd03c3393b812643ea5df835b14c8e5a4b36cdcfd210a82e2c3c71d27d3c47091bdb391f2952b261fde94a4b23238137a4897d1631b4e18d63',
 '68a90beb191f13b621747ab46321a491e71c536b71800b8f5f08996bb433838fe56587f171a759cf1c160b4733a3465f5509ad7d1a89d4b41f631f3c600347a8762141095dad3714027dfc7c894d69fd896b810313259b1a0e941ecb43d6ae1857a465b4ddcdf102b7297763acb0281144b0598c326e871c3a1ad047ad4fea2093a1b734d589b8998175b3',
 '0fc304048469137d0e2f3a71885a5a78e749145510cf2d56157939548bfd5dd7e59dcebc75b678cfeac4cf408fce5dda32c9bfcbfd578bdcb801df32ebf64da365df4b285d5068975137990134bd69991695989b322b0849',
 '254c0bb31453badaca9d060ce5faa45fa66378a6716915473579d3743e315dbedf4d8cf78b93c3267d579247e32c8c7cd3e71e7dda6138a2ab015166fa03f2ce6ab74b89ce561eb16a65990189e169f1c457d9af622ba119a66acedb108fae18825bf3efc0428b9dae250791cb0ea018966e257d601a87f9914d646026eeab5c45cbaedd27e4c47643ab4e25193aa64f79',
 '41cd1c01c62883b2ca71e671dce57e5f96b1610e29507b6c03c38211653284576d4d8cdc967764147d1a0578102cb05f32a73065f11009041fa3cc5f60b24d8c7098598627df37322f814525966acabc99be5303c2322b43ecf358ac8b8541bd82214d1cc042cac3869c54e2964fa376229c2563ba3fd03e2d4d4d441721c60b6d817e034965be28b7d463cf2b97baebfe2729ed2aa41ffe',
 '68c50bd5197bfdbdfa887883783d2455a673a685436915bd72d1af74dffdd2b89df335daee93c36d5f57e147e9a35913d3b3bf33'
]

# For the first 70 columns
for b in xrange(0, 70):
    # The column we're on
    print "== Index is: %d ==" % b

    # Build a ciphertext from the nth character of every ciphertext
    new_ctext = ""
    for i in texts:
        new_ctext += i.decode("hex")[b:b+1]

    # Attempt to decrypt the ciphertext with every possible key byte (256 possibilities)
    for i in xrange(256):
      print "%d: %s" % (i, decrypt(chr(i)*len(new_ctext), new_ctext))

I read through the output and looked for a line where all the characters were printable and were letters that I would expect to find in that position in the plaintext, noting the key byte used at that position. In the snippet of output below, we see that the key byte used was 7; none of the decrypted characters are out of the ASCII range, and the majority are vowels, which is common for English text.

== Index is: 60 ==
0: <FB><DC>m8<BD>E8
1: ^V&<8B>?ķ?
2: <B3><D4>^Q3T<89>3
3: .l<EC>w^QUw
4: 6ۧ<98><84><E1><98>
5: <9B><F4>]4<B9><93>4
6: i<C6>Z<AA><A0>ESC<AA>
7: aieolso
8: ^X<BE><8E>>$<A5>>
9: <81>,<AF>k<E4>Kk
10: 3o<E3>f<B9><B6>f
11: <D4> RsK<DF>s
12: <F7>ݏ<BD>~<E6><BD>
13: <8D><F0>%݄\<DD>
14: '<C7>]<90>\<94><90>

This could have been automated by devising a system to ‘score’ the plaintext, but it was near the end of the competition and I was lazy ;). Once I had recovered a significant portion of the key (60 or so characters), I decrypted all 8 ciphertexts and ended up with the following plaintext (correcting a few wrong key bytes along the way):

The difference between stupidity and genius is that genius has its
Go to Heaven for the climate, Hell for the company- Mark Twain
I am not a member of any organized political party. I am a Democra
My definition of an intellectual is someone who can listen to the
Republicans want less government for the same reason criminals wan
If there are no stupid questions, then what kind of questions do s
And now, in the interest of equal time, here is a message from the
MY key for you is {And yes the nsa can read this to}

The portion between the braces in the last plaintext was the flag for this challenge.