Transcript Colonel Motors” Algorithm
Cryptanalysis
With thanks to Professor Sheridan Houghten 1
Example 1.11: Ciphertext obtained from a Substitution Cipher
YIFQFMZRWQFYVECFMDZPCVMRZWNMDZVEJBTXCDDUMJ NDIFEFMDZCDMQZKCEYFCJMYRNCWJCSZREXCHZUNMXZ NZUCDRJXYYSMRTMEYIFZWDYVZVYFZUMRZCRWNZDZJJ XZWGCHSMRNMDHNCMFQCHZJMXJZWIEJYUCFWDJNZDIR COSC 4P03 Week 8
Table 1.3: Frequency of Occurrence of 26 Ciphertext Letters
H I J K L M Letter A B C D E F G Frequency 0 1 15 13 7 11 1 4 5 11 1 0 16 COSC 4P03 Week 8 Letter N O P Q R S T U V W X Y Z Frequency 9 0 1 4 10 3 2 5 5 8 6 10 20
Breaking the Substitution Cipher
Compare the frequency of encrypted letters with known frequencies (given in graph). Also look for commonly occurring bigrams such as 'th'.
4
Bigrams and Trigrams
• Most common bigrams are: TH, HE, IN, ER, AN, RE, ED, ON, ES, ST, EN, AT, TO, NT • Most common trigrams are: THE, ING, AND, HER, ERE, ENT, THA, NTH, WAS, ETH, FOR, DTH 5
Guess Z=e, ZW = ed, R = n
------end---------e----ned---e----------- YIFQFMZRWQFYVECFMDZPCVMRZWNMDZVEJBTXCDDUMJ --------e----e---------n--d---en----e----e NDIFEFMDZCDMQZKCEYFCJMYRNCWJCSZREXCHZUNMXZ -e---n------n------ed---e---e--ne-nd-e-e- NZUCDRJXYYSMRTMEYIFZWDYVZVYFZUMRZCRWNZDZJJ -ed-----n-----------e----ed-------d---e--n XZWGCHSMRNMDHNCMFQCHZJMXJZWIEJYUCFWDJNZDIR COSC 4P03 Week 8
Guess N=h, C=a
------end-----a---e-a--nedh--e------a---- YIFQFMZRWQFYVECFMDZPCVMRZWNMDZVEJBTXCDDUMJ h-------ea---e-a---a---nhad-a-en--a-e-h--e NDIFEFMDZCDMQZKCEYFCJMYRNCWJCSZREXCHZUNMXZ he-a-n------n------ed---e---e--neandhe-e- NZUCDRJXYYSMRTMEYIFZWDYVZVYFZUMRZCRWNZDZJJ -ed-a---nh---ha---a-e----ed-----a-d--he--n XZWGCHSMRNMDHNCMFQCHZJMXJZWIEJYUCFWDJNZDIR COSC 4P03 Week 8
Guess M=i
-----iend-----a-i-e-a-inedhi-e------a---i YIFQFMZRWQFYVECFMDZPCVMRZWNMDZVEJBTXCDDUMJ h-----i-ea-i-e-a---a-i-nhad-a-en--a-e-hi-e NDIFEFMDZCDMQZKCEYFCJMYRNCWJCSZREXCHZUNMXZ he-a-n-----in-i----ed---e---e-ineandhe-e- NZUCDRJXYYSMRTMEYIFZWDYVZVYFZUMRZCRWNZDZJJ -ed-a--inhi--hai--a-e-i--ed-----a-d--he--n XZWGCHSMRNMDHNCMFQCHZJMXJZWIEJYUCFWDJNZDIR COSC 4P03 Week 8
Guess Y=o, D=s, F=r, H=c, J=t
o-r-riend-ro--arise-a-inedhise--t---ass-it YIFQFMZRWQFYVECFMDZPCVMRZWNMDZVEJBTXCDDUMJ hs-r-riseasi-e-a-orationhadta-en--ace-hi-e NDIFEFMDZCDMQZKCEYFCJMYRNCWJCSZREXCHZUNMXZ he-asnt-oo-in-i-o-redso-e-ore-ineandhesett NZUCDRJXYYSMRTMEYIFZWDYVZVYFZUMRZCRWNZDZJJ -ed-ac-inhischair-aceti-ted--to-ardsthes-n XZWGCHSMRNMDHNCMFQCHZJMXJZWIEJYUCFWDJNZDIR COSC 4P03 Week 8
Substitution Cipher – Plaintext
“Our friend from Paris examined his empty glass with surprise, as if evaporation had taken place while he wasn’t looking. I poured some more wine and he settled back in his chair, face tilted up towards the sun” COSC 4P03 Week 8
Cryptanalysis of Vigenere Cipher
• Repetition of the key is it’s weakness.
• There are two methods – the Kasiski test and index of coincidence analysis.
• Kasiski Test: By chance, frequently occurring trigrams in the plaintext will sometimes line up with the same three letters of the encryption key in different parts of the text. Thus, they will encrypt to the same trigrams. Use this info to find the key length.
• Perform a separate frequency analysis for each offset 11
Kasiski Test • We see that the trigram CHR occurs five times in the text.
• The positions of these trigrams are 1, 166, 236, 276, 286.
• The distances from the first occurrence to the others are 165, 235, 275, and 285.
• What can we say about these distances wrt the key length?
• The GCD of these numbers is 5. Guess that the key length is 5.
• Then perform frequency analysis on the five groups of letters to get the five shift values.
12
Index of Coincidence (Simple Version) • If we take a random string of letters, shift it a random amount, then write one line above the other. • • suctyewlgilewpxzkwmcoielvutymvnb suctyewlgilewpxzkwmcoielvutymvnb • What are the odds that letters directly above and below each other will be the same?
• The answer is 1/26 or about .038
• English text has an index of coincidence of about .065 because letters do not occur randomly.
• Thisisanenglishsentence • Thisisanenglishsentence 13
• Suppose we have some text from a Vigenere cipher that has been encrypted with the word “computer”.
• Computercomputercomputercomputer • qoiulkdautwjvvdkfguwitgksaaitjak • qoiulkdautwjvvdkfguwitgksaaitjak • As we slide the line over and check the index of coincidence, we will find it is around .038 until letters that have the same offset are above and below each other. Then it will suddenly jump to around .065.
• Computercomputercomputercomputer • qoiulkdautwjvvdkfguwitgksaaitjak • qoiulkdautwjvvdkfguwitgksaa • Now we know the length of the encryption phrase.
14
• Although this simple version of the index of coincidence will work, it does not make good use of all available data.
• We are only comparing each letter with one other letter. We can get a more accurate result by comparing each letter with all other letters.
Thisisanenglishsentence Etc.
Thisisanenglishsentence • There are n choose 2 possible letter pairings • We will use this version of the index of coincidence to analyze a Vigenere cipher.
15
• Where does this .065 figure for index of coincidence originate? • The probability that a random letter is an ‘a’ is about .08. What is the probability that two randomly chosen letters are both ‘a’?
• Clearly .08*.08. Similarly the probability that two random letters are both ‘b’ is about .015*.015 and so on.
16
• So the probability that any two random letters are the same is the probability they are both ‘a’ plus the probability they are both ‘b’ and so on.
• If we say f[j] is the frequency of letter number j from the graph on the previous slide, then the probability any two letters are the same is:
25
∑ f[j]*f[j] = .065
j = 0
17
Example 1.11: Ciphertext obtained from a Vigenere Cipher
CHREEVOAHMAERATBIAXXWTNXBEEOPHBSBQMQEQERBW RVXUOAKXAOSXXWEAHBWGJMMQMNKGRFVGXWTRZXWIAK LXFPSKAUTEMNDCMGTSXMXBTUIADNGMGPSRELXNJELX VRVPRTULHDNQWTWDTYGBPHXTFALJHASVBFXNGLLCHR ZBWELEKMSJIKNBHWRJGNMGJSGLXFEYPHAGNRBIEQJT AMRVLCRREMNDGLXRRIMGNSNRWCHRQHAEYEVTAQEBBI PEEWEVKAKOEWADREMXMTBHHCHRTKDNVRZCHRCLQOHP WQAIIWXNRMGWOIIFKEE COSC 4P03 Week 8
Index of Coincidence
• m=1: 0.045
• m=2: 0.046, 0.041
CREOHART… HEVAMEAB… • m=3: 0.043, 0.050, 0.047
CEOMRBX… HEAAAIX… RVHETAW… • m=4: 0.042, 0.039, 0.046, 0.040
CEHRIW… HVMAAT… ROATXN… EAEBXX… • m=5: 0.063, 0.068, 0.069, 0.061, 0.072
CVABW… HOEIT… RARAN… EHAXX… EMTXB… COSC 4P03 Week 8
i
1 2 3 4 5
Table 1.4: Values of M
g Values of
M g (y i )
.035 .031 .036 .037 .035 .039 .028 .028 .048
.039 .032 .040 .038 .038 .044 .036 .030 .042 .043
.036 .033 .049 .043 .041 .036
.061
.069
.044 .032 .035 .044 .034 .036 .033 .030 .031
.042 .045 .040 .045 .046 .042 .037 .032 .034 .037
.032 .034 .043 .032 .026 .047
.048 .029 .042 .043 .044 .034 .038 .035 .032 .049
.035 .031 .035
.065
.035 .038 .036 .045 .027 .035
.034 .034 .037 .035 .046 .040
.045 .032 .033 .038
.060
.034 .034 .034 .050 .033
.033 .043 .040 .033 .028 .036 .040 .044 .037 .050
.034 .034 .039 .044 .038 .035
.034 .031 .035 .044 .047 .037 .043 .038 .042 .037
.033 .032 .035 .037 .036 .045 .032 .029 .044
.072
.036 .027 .030 .048 .036 .037
COSC 4P03 Week 8
Vigenere Cipher – Plaintext
“The almond tree was in tentative blossom. The days were longer, often ending with magnificent evenings of corrugated pink skies. The hunting season was over, with hounds and guns put away for six months. The vineyards were busy again as the well-organized farmers treated their vines and the more lackadaisical neighbors hurried to do the pruning they should have done in November.” COSC 4P03 Week 8