This was only worth 100 points, but only 137 teams (of 1384 scoring teams)
did it, so I figured a writeup was in order.
What We Have
We have source code for a Python program called ‘csawpad.py’ that defines encryption and decryption
functions, and 8 hex-encoded ciphertexts, presumably encrypted by the program.
Goal
Recover some sort of key, likely contained in the recovered texts.
Cipher description
Reading the source code of the Python program, I inferred that this was a simple polyalphabetic
substitution cipher.
The cipher’s key must be as long as the plaintext. The cipher operates as follows:
Build the encryption and decryption s-boxes.
Consume one byte of pad and one byte of plaintext.
Convert the two bytes in to their corresponding ordinal values (‘A’ = 65, ‘6’ = 54).
Using the two values as indices (remember, the s-box is two-dimensional), look up the corresponding random byte in the encryption s-box.
Emit this byte as a byte of ciphertext.
Repeat 2-6 for each remaining bytes of pad and plaintext.
Decryption works in the same fashion, except that the decryption s-box is used.
Warning: I suck at Python so I sort of just ignored the s-box construction. This is what
I think was going on. Please send me email if this is wrong.
The encryption s-box is a
two-dimensional array of bytes, with 256 rows and 256 columns. It is populated
with bytes from a PRNG with a fixed seed, so the s-box never changes.
The decryption s-box is constructed to be an inverse of the encryption s-box, so that running
the above algorithm with ciphertext and the decryption s-box (versus plaintext and the encryption
s-box) will result in plaintext.
Attack
Being a polyalphabetic substitution cipher, this cipher does no
permutation, the s-boxes never change,
and they are reused for the duration of the encryption process. This means that an ‘a’, under
pad byte ‘0x5E’, will always encrypt to the same value, regardless of position or message number.
Effectively, the use of an s-box does not improve the security of the cipher - the cipher operates
similarly to (and has the same weaknesses as) an XOR cipher.
XOR ciphers with a truly random key that is as long as the plaintext forms
a one-time pad, a construction
that is proven to be completely unbreakable. Conversely, XOR ciphers with
a predictable or repeating key are trivial to break.
I made a leap of faith and guessed that the messages were all encrypted under the same key.
Multiple messages encrypted under the same key gives us as much information as
one long message with a short, repeating key. The process of breaking an XOR cipher with a
repeating key is documented in
many,
many,
many,
manyplaces, the TL;DR being:
Treat the ciphertexts as a two-dimensional [jagged array][array], where the rows are the messages.
Create new ciphertexts from the columns (transpose the array),
Break each new ciphertext using frequency analysis, as if they were each encrypted under a single-byte key
[array]: https://en.wikipedia.org/wiki/Jagged_array
The code I used to break the ciphertexts looks like this (apologies for the badness):
I read through the output and looked for a line where all the characters were
printable and were letters that I would expect to find in that position in the
plaintext, noting the key byte used at that position. In the snippet of output
below, we see that the key byte used was 7; none of the decrypted characters
are out of the ASCII range, and the majority are vowels, which is common for
English text.
This could have been automated by devising a system to ‘score’ the plaintext, but it was near the end of the competition
and I was lazy ;).
Once I had recovered a significant portion of the key (60 or so characters), I decrypted all 8 ciphertexts and
ended up with the following plaintext (correcting a few wrong key bytes along the way):
The difference between stupidity and genius is that genius has its
Go to Heaven for the climate, Hell for the company- Mark Twain
I am not a member of any organized political party. I am a Democra
My definition of an intellectual is someone who can listen to the
Republicans want less government for the same reason criminals wan
If there are no stupid questions, then what kind of questions do s
And now, in the interest of equal time, here is a message from the
MY key for you is {And yes the nsa can read this to}
The portion between the braces in the last plaintext was the flag for this challenge.