Published: 12.07.2013 | Level: specialist | Access: paid
Lecture 14:

Caesar Cipher


How the Code Works: Lines 1 to 34

This code is much shorter than the code for our other games. The encryption and decryption processes are just the reverse of each other, and they share much of the same code. Let's look at how each line works.

1. # Caesar Cipher
2.
3. MAX_KEY_SIZE = 26

The first line is simply a comment. The Caesar Cipher belongs to a family of ciphers called simple substitution ciphers, which replace each symbol in the plaintext with one (and only one) symbol in the ciphertext. So if a "G" is substituted with "Z" in the cipher, every single "G" in the plaintext is replaced with (and only with) a "Z".

MAX_KEY_SIZE is a variable that stores the integer 26 in it. MAX_KEY_SIZE reminds us that in this program, the key used in our cipher should be between 1 and 26.
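To make the "one symbol in, one symbol out" idea concrete, here is a small sketch that builds the substitution table a Caesar shift produces. The buildTable() function and the key value 3 are made up for this illustration and are not part of the chapter's program.

```python
import string

MAX_KEY_SIZE = 26

def buildTable(key):
    # Hypothetical helper: map each uppercase letter to the letter
    # `key` positions later, wrapping around after 'Z'.
    letters = string.ascii_uppercase
    shifted = letters[key:] + letters[:key]
    return dict(zip(letters, shifted))

table = buildTable(3)
print(table['A'], table['G'], table['Z'])  # D J C

# Every plaintext letter maps to a different ciphertext letter.
print(len(set(table.values())) == MAX_KEY_SIZE)  # True
```

Because no two letters share a ciphertext letter, the substitution can always be undone by reading the table backwards.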

Deciding to Encrypt or Decrypt

5. def getMode():
6.     while True:
7.         print('Do you wish to encrypt or decrypt a message?')
8.         mode = input().lower()
9.         if mode in 'encrypt e decrypt d'.split():
10.             return mode[0]
11.         else:
12.             print('Enter either "encrypt" or "e" or "decrypt" or "d".')

The getMode() function lets the user type in whether they want to encrypt or decrypt the message. The return value of input() (which then has the lower() method called on it, which returns the lowercase version of the string) is stored in mode. The if statement's condition checks if the string stored in mode exists in the list returned by 'encrypt e decrypt d'.split(). This list is ['encrypt', 'e', 'decrypt', 'd'], but it is easier for the programmer to just type in 'encrypt e decrypt d'.split() and not type in all those quotes and commas. Use whichever is easiest for you; they both evaluate to the same list value.

This function will return the first character in mode as long as mode is equal to 'encrypt', 'e', 'decrypt', or 'd'. This means that getMode() will return the string 'e' or the string 'd'.
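The short experiment below shows that the split() shortcut produces the same list as typing it out, and why checking against a list is safer than checking against the original string. The variable name modes is only for illustration.

```python
modes = 'encrypt e decrypt d'.split()
print(modes == ['encrypt', 'e', 'decrypt', 'd'])  # True

# With a list, only a whole item matches.
print('decrypt' in modes)  # True
print('crypt' in modes)    # False

# With a plain string, any substring matches, which is not what we want.
print('crypt' in 'encrypt e decrypt d')  # True

# Index 0 of an accepted answer is the one-letter mode.
print('decrypt'[0], 'e'[0])  # d e
```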

Getting the Message from the Player

14. def getMessage():
15.     print('Enter your message:')
16.     return input()

The getMessage() function simply gets the message to encrypt or decrypt from the user and uses this string as its return value.

Getting the Key from the Player

18. def getKey():
19.     key = 0
20.     while True:
21.         print('Enter the key number (1-%s)' % (MAX_KEY_SIZE))
22.         key = int(input())
23.         if (key >= 1 and key <= MAX_KEY_SIZE):
24.             return key

The getKey() function lets the player type in the key they will use to encrypt or decrypt the message. The while loop ensures that the function only returns a valid key. A valid key here is one that is between the integer values 1 and 26 (remember that MAX_KEY_SIZE always has the value 26 because we never change it). Remember that on line 22 key was set to the integer version of what the user typed in, so getKey() returns an integer.
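One thing the listing does not handle: int() raises a ValueError if the player types something that is not a whole number, such as 'eight'. The sketch below is one possible way to keep asking instead of crashing. parseKey(), getKeySafely(), and the read parameter are hypothetical names for this example, not part of the original program.

```python
MAX_KEY_SIZE = 26

def parseKey(text):
    # Return the key as an integer, or None if text is not a valid key.
    try:
        key = int(text)
    except ValueError:
        return None
    if key >= 1 and key <= MAX_KEY_SIZE:
        return key
    return None

def getKeySafely(read=input):
    # `read` is whatever function supplies the player's answer.
    # It defaults to input(); passing another function helps with testing.
    while True:
        print('Enter the key number (1-%s)' % (MAX_KEY_SIZE))
        key = parseKey(read())
        if key is not None:
            return key

print(parseKey('8'))      # 8
print(parseKey('eight'))  # None
print(parseKey('27'))     # None
```

Passing a different read function makes it possible to try the loop without typing. For example, getKeySafely(iter(['x', '30', '5']).__next__) asks three times and then returns 5.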

Encrypt or Decrypt the Message with the Given Key

26. def getTranslatedMessage(mode, message, key):
27.     if mode[0] == 'd':
28.         key = -key
29.     translated = ''

getTranslatedMessage() is the function that does the encrypting and decrypting in our program. It has three parameters. mode sets the function to encryption mode or decryption mode. message is the plaintext (or ciphertext) to be encrypted (or decrypted). key is the key that is used in this cipher.

The first line in the getTranslatedMessage() function determines if we are in encryption mode or decryption mode. If the first letter in the mode variable is the string 'd', then we are in decryption mode. The only difference between the two modes is that in decryption mode, the key is set to the negative version of itself. If key was the integer 22, then in decryption mode we set it to -22. The reason for this will be explained later.

translated is the string that will hold the end result: either the ciphertext (if we are encrypting) or the plaintext (if we are decrypting). We will only be concatenating strings to this variable, so we first store the blank string in translated. (A variable must be defined with some string value first before a string can be concatenated to it.)
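Here is a tiny sketch of why translated starts out as a blank string. The list of pieces and the undefinedResult name are invented for the demonstration.

```python
translated = ''
for piece in ['L', 'w', 'c']:
    translated += piece   # same as translated = translated + piece
print(translated)  # Lwc

# Concatenating to a name that was never assigned fails.
try:
    undefinedResult += 'a'
except NameError:
    print('undefinedResult has no value yet')
```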

The isalpha() String Method

The isalpha() string method returns True if the string is not blank and every character in it is a letter (for our purposes, an uppercase or lowercase letter from A to Z). If the string contains any non-letter characters, then isalpha() returns False. Try typing the following into the interactive shell:

>>> 'Hello'.isalpha()
True
>>> 'Forty two'.isalpha()
False
>>> 'Fortytwo'.isalpha()
True
>>> '42'.isalpha()
False
>>> ''.isalpha()
False

As you can see, 'Forty two'.isalpha() will return False because 'Forty two' has a space in it, which is a non-letter character. 'Fortytwo'.isalpha() returns True because it does not have this space. '42'.isalpha() returns False because both '4' and '2' are non-letter characters. And ''.isalpha() is False because isalpha() only returns True if the string has only letter characters and is not blank.

We will use the isalpha() method in our program next.

31.     for symbol in message:
32.         if symbol.isalpha():
33.             num = ord(symbol)
34.             num += key

We will run a for loop over each letter (remember, in cryptography they are called symbols) in the message string. Strings are treated just like lists of single-character strings. If message had the string 'Hello', then for symbol in 'Hello' would be the same as for symbol in ['H', 'e', 'l', 'l', 'o']. On each iteration through this loop, symbol will have the value of one character in message.

The if statement on line 32 is there because we will only encrypt/decrypt letters in the message. Numbers, signs, punctuation marks, and everything else will stay in their untranslated form. The num variable holds the integer ordinal value of the letter stored in symbol. Line 34 then "shifts" the value in num by the value in key.
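The sketch below runs the same loop on a sample message and records what lines 33 and 34 compute for each letter. The message 'Hi, Bob!' and the key 3 are arbitrary choices, and no wrap-around is applied here.

```python
message = 'Hi, Bob!'
key = 3
shifted = []  # one (symbol, ordinal, shifted ordinal) entry per letter

for symbol in message:
    if symbol.isalpha():
        num = ord(symbol)
        num += key
        shifted.append((symbol, ord(symbol), num))
        print(symbol, ord(symbol), '->', num, chr(num))

# The comma, space, and exclamation mark were skipped.
print(len(shifted))  # 5
```

The first line printed is H 72 -> 75 K, since ord('H') is 72 and chr(75) is 'K'.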

The isupper() and islower() String Methods

The isupper() and islower() string methods (used on lines 36 and 41) work in a way that is very similar to the isdigit() and isalpha() methods. isupper() returns True if the string it is called on contains at least one uppercase letter and no lowercase letters. islower() returns True if the string it is called on contains at least one lowercase letter and no uppercase letters. Otherwise these methods return False. The existence of non-letter characters like numbers and spaces does not affect the outcome, although strings that do not have any letters, including blank strings, return False. Try typing the following into the interactive shell:

>>> 'HELLO'.isupper()
True
>>> 'hello'.isupper()
False
>>> 'hello'.islower()
True
>>> 'Hello'.islower()
False
>>> 'LOOK OUT BEHIND YOU!'.isupper()
True
>>> '42'.isupper()
False
>>> '42'.islower()
False
>>> ''.isupper()
False
>>> ''.islower()
False

How the Code Works: Lines 36 to 57

The process of encrypting (or decrypting) each letter is fairly simple. We want to apply the same Python code to every letter character in the string, which is what the next several lines of code do.

Encrypting or Decrypting Each Letter

36.             if symbol.isupper():
37.                 if num > ord('Z'):
38.                     num -= 26
39.                 elif num < ord('A'):
40.                     num += 26

This code checks if the symbol is an uppercase letter. If so, there are two special cases we need to worry about. What if symbol was 'Z' and key was 4? In that case, the value of num here would be 94, which is the ordinal of the character '^'. But ^ isn't a letter at all. We want the ciphertext to "wrap around" to the beginning of the alphabet.

The way we can do this is to check if num has a value larger than the ASCII value of the largest possible letter (which is a capital "Z"). If so, we subtract 26 (because there are 26 letters in total) from num. After doing this, the value of num is 68, which is the ASCII value for 'D'.
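The arithmetic for this example can be checked step by step. This sketch just replays the 'Z' with key 4 case described above.

```python
num = ord('Z')         # 90
num += 4               # 94, past the end of the uppercase letters
print(num, chr(num))   # 94 ^

if num > ord('Z'):
    num -= 26          # wrap around to the start of the alphabet
print(num, chr(num))   # 68 D
```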

41.             elif symbol.islower():
42.                 if num > ord('z'):
43.                     num -= 26
44.                 elif num < ord('a'):
45.                     num += 26

If the symbol is a lowercase letter, the program runs code that is very similar to lines 36 through 40. The only difference is that we use ord('z') and ord('a') instead of ord('Z') and ord('A').

If we were in decrypting mode, then key would be negative. Then we have the other special case, where num after line 34 might be less than the smallest possible value (which is ord('A'), that is, 65, for uppercase letters, or ord('a'), that is, 97, for lowercase letters). If this is the case, we add 26 to num to have it "wrap around".
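To see both wrap-around directions at work, the sketch below pulls the per-letter logic of lines 33 to 45 into a small function. shiftLetter() is a hypothetical name for this illustration, and it expects a single letter as its first argument.

```python
def shiftLetter(symbol, key):
    # Shift one letter by key positions, wrapping within its own case.
    num = ord(symbol)
    num += key
    if symbol.isupper():
        if num > ord('Z'):
            num -= 26
        elif num < ord('A'):
            num += 26
    elif symbol.islower():
        if num > ord('z'):
            num -= 26
        elif num < ord('a'):
            num += 26
    return chr(num)

print(shiftLetter('A', -4))   # W  (65 - 4 = 61, then 61 + 26 = 87)
print(shiftLetter('y', 8))    # g  (121 + 8 = 129, then 129 - 26 = 103)

# Shifting by the negative key undoes the shift, which is why
# decryption only needs key = -key.
print(shiftLetter(shiftLetter('y', 8), -8))  # y
```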

47.             translated += chr(num)
48.         else:
49.             translated += symbol

The translated string will be appended with the encrypted/decrypted character. If the symbol was not an uppercase or lowercase letter, then the else-block on line 48 would have executed instead. All the code in the else-block does is append the original symbol to the translated string. This means that spaces, numbers, punctuation marks, and other characters will not be encrypted (or decrypted).

50.     return translated

The last line in the getTranslatedMessage() function returns the translated string.
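For comparison, the same shift can be written with the % (remainder) operator instead of the if/elif wrap-around checks. This is an alternative formulation, not the chapter's code: caesarShift() is a hypothetical name, and unlike the listing it only treats the 26 unaccented English letters as letters.

```python
import string

def caesarShift(message, key):
    # Positive key encrypts; the negative of that key decrypts.
    translated = ''
    for symbol in message:
        if symbol in string.ascii_uppercase:
            base = ord('A')
        elif symbol in string.ascii_lowercase:
            base = ord('a')
        else:
            translated += symbol  # leave non-letters untouched
            continue
        # Position in the alphabet (0-25), shifted, wrapped with % 26.
        translated += chr((ord(symbol) - base + key) % 26 + base)
    return translated

secret = caesarShift('Doubts may not be pleasant, but certainty is absurd.', 8)
print(secret)                   # Lwcjba uig vwb jm xtmiaivb, jcb kmzbiqvbg qa ijaczl.
print(caesarShift(secret, -8))  # Doubts may not be pleasant, but certainty is absurd.
```

In Python, the result of % 26 is never negative, so the same expression handles both encryption and decryption.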

The Start of the Program

52. mode = getMode()
53. message = getMessage()
54. key = getKey()
55.
56. print('Your translated text is:')
57. print(getTranslatedMessage(mode, message, key))

This is the main part of our program. We call each of the three functions we have defined above in turn to get the mode, message, and key that the user wants to use. We then pass these three values as arguments to getTranslatedMessage(), whose return value (the translated string) is printed to the user.

Brute Force

That's the entire Caesar Cipher. However, while this cipher may fool some people who don't understand cryptography, it won't keep a message secret from someone who knows cryptanalysis. While cryptography is the science of making codes, cryptanalysis is the science of breaking codes.

Do you wish to encrypt or decrypt a message?
encrypt
Enter your message:
Doubts may not be pleasant, but certainty is absurd.
Enter the key number (1-26)
8
Your translated text is:
Lwcjba uig vwb jm xtmiaivb, jcb kmzbiqvbg qa ijaczl.

The whole point of cryptography is that if someone else gets their hands on the encrypted message, they cannot figure out the original unencrypted message from it. Let's pretend we are the code breaker and all we have is the encrypted text:

Lwcjba uig vwb jm xtmiaivb, jcb kmzbiqvbg qa ijaczl.

One method of cryptanalysis is called brute force. Brute force is the technique of trying every single possible key. If the cryptanalyst knows the cipher that the message uses (or at least guesses it), they can just go through every possible key. Because there are only 26 possible keys, it would be easy for a cryptanalyst to write a program that prints the decrypted ciphertext for every possible key and see if any of the outputs make sense. Let's add a brute force feature to our program.

Adding the Brute Force Mode to Our Program

First, change lines 7, 9, and 12 (which are in the getMode() function) to look like the following:

5. def getMode():
6.     while True:
7.         print('Do you wish to encrypt or decrypt or brute force a message?')
8.         mode = input().lower()
9.         if mode in 'encrypt e decrypt d brute b'.split():
10.             return mode[0]
11.         else:
12.             print('Enter either "encrypt" or "e" or "decrypt" or "d" or "brute" or "b".')

This will let us select "brute force" as a mode for our program. Then modify and add the following changes to the main part of the program:

52. mode = getMode()
53. message = getMessage()
54. if mode[0] != 'b':
55.     key = getKey()
56.
57. print('Your translated text is:')
58. if mode[0] != 'b':
59.     print(getTranslatedMessage(mode, message, key))
60. else:
61.     for key in range(1, MAX_KEY_SIZE + 1):
62.         print(key, getTranslatedMessage('decrypt', message, key))

These changes make our program ask the user for a key only if they are not in "brute force" mode. If they are not in "brute force" mode, then the original getTranslatedMessage() call is made and the translated string is printed.

Otherwise we are in "brute force" mode, and we run a loop that calls getTranslatedMessage() with every key from 1 all the way up to MAX_KEY_SIZE (which is 26). Remember that the range() function produces integers up to but not including its second argument, which is why we have + 1. This program will print out every possible translation of the message (including the key number used in the translation). Here is a sample run of this modified program:

Do you wish to encrypt or decrypt or brute force a message?
brute
Enter your message:
Lwcjba uig vwb jm xtmiaivb, jcb kmzbiqvbg qa ijaczl.   
Your translated text is:                               
1 Kvbiaz thf uva il wslhzhua, iba jlyahpuaf pz hizbyk. 
2 Juahzy sge tuz hk vrkgygtz, haz ikxzgotze oy ghyaxj. 
3 Itzgyx rfd sty gj uqjfxfsy, gzy hjwyfnsyd nx fgxzwi. 
4 Hsyfxw qec rsx fi tpiewerx, fyx givxemrxc mw efwyvh. 
5 Grxewv pdb qrw eh sohdvdqw, exw fhuwdlqwb lv devxug. 
6 Fqwdvu oca pqv dg rngcucpv, dwv egtvckpva ku cduwtf. 
7 Epvcut nbz opu cf qmfbtbou, cvu dfsubjouz jt bctvse. 
8 Doubts may not be pleasant, but certainty is absurd.
9 Cntasr lzx mns ad okdzrzms, ats bdqszhmsx hr zartqc. 
10 Bmszrq kyw lmr zc njcyqylr, zsr acpryglrw gq yzqspb.
11 Alryqp jxv klq yb mibxpxkq, yrq zboqxfkqv fp xyproa.
12 Zkqxpo iwu jkp xa lhawowjp, xqp yanpwejpu eo wxoqnz.
13 Yjpwon hvt ijo wz kgzvnvio, wpo xzmovdiot dn vwnpmy.
14 Xiovnm gus hin vy jfyumuhn, von wylnuchns cm uvmolx.
15 Whnuml ftr ghm ux iextltgm, unm vxkmtbgmr bl tulnkw.
16 Vgmtlk esq fgl tw hdwsksfl, tml uwjlsaflq ak stkmjv.
17 Uflskj drp efk sv gcvrjrek, slk tvikrzekp zj rsjliu.
18 Tekrji cqo dej ru fbuqiqdj, rkj suhjqydjo yi qrikht.
19 Sdjqih bpn cdi qt eatphpci, qji rtgipxcin xh pqhjgs.
20 Rciphg aom bch ps dzsogobh, pih qsfhowbhm wg opgifr.
21 Qbhogf znl abg or cyrnfnag, ohg pregnvagl vf nofheq.
22 Pagnfe ymk zaf nq bxqmemzf, ngf oqdfmuzfk ue mnegdp.
23 Ozfmed xlj yze mp awpldlye, mfe npceltyej td lmdfco.
24 Nyeldc wki xyd lo zvokckxd, led mobdksxdi sc klcebn.
25 Mxdkcb vjh wxc kn yunjbjwc, kdc lnacjrwch rb jkbdam.
26 Lwcjba uig vwb jm xtmiaivb, jcb kmzbiqvbg qa ijaczl.

After looking over each row, you can see that the 8th message is not garbage, but plain English! The cryptanalyst can deduce that the original key for this encrypted text must have been 8. This brute force would have been difficult to do back in the days of the Caesars and the Roman Empire, but today we have computers that can quickly go through millions or even billions of keys in a short time. You can even write a program that can recognize when it has found a message in English, so you don't have to read through all the garbage text.
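As a rough sketch of that last idea, the program below tries every key and keeps the candidate that contains the most common English words. Everything here is an assumption made for illustration: the names shiftMessage(), countCommonWords(), and guessKey() are invented, and COMMON_WORDS is a tiny hand-picked list, where a real checker would need a much larger dictionary.

```python
import string

MAX_KEY_SIZE = 26

# A tiny, hand-picked word list (an assumption for this sketch).
COMMON_WORDS = {'the', 'be', 'to', 'of', 'and', 'in', 'is', 'it',
                'not', 'but', 'may', 'you'}

def shiftMessage(message, key):
    # Shift the 26 English letters by key positions; copy everything else.
    translated = ''
    for symbol in message:
        if symbol in string.ascii_uppercase:
            base = ord('A')
        elif symbol in string.ascii_lowercase:
            base = ord('a')
        else:
            translated += symbol
            continue
        translated += chr((ord(symbol) - base + key) % 26 + base)
    return translated

def countCommonWords(text):
    # Replace non-letters with spaces, then count words found in the list.
    cleaned = ''
    for symbol in text.lower():
        cleaned += symbol if symbol.isalpha() else ' '
    return sum(1 for word in cleaned.split() if word in COMMON_WORDS)

def guessKey(ciphertext):
    # Try every key and remember the best-scoring decryption.
    bestKey, bestText, bestScore = None, None, -1
    for key in range(1, MAX_KEY_SIZE + 1):
        candidate = shiftMessage(ciphertext, -key)
        score = countCommonWords(candidate)
        if score > bestScore:
            bestKey, bestText, bestScore = key, candidate, score
    return bestKey, bestText

print(guessKey('Lwcjba uig vwb jm xtmiaivb, jcb kmzbiqvbg qa ijaczl.'))
# (8, 'Doubts may not be pleasant, but certainty is absurd.')
```

With such a short word list, a very short message, or one written in another language, the guess can easily be wrong, so it is best treated as a hint rather than an answer.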

Summary: Reviewing Our Caesar Cipher Program

Computers are very good at doing mathematics. When we create a system to translate some piece of information into numbers (such as we do with text and ASCII or with space and coordinate systems), computer programs can process these numbers very quickly and efficiently.

But while our Caesar cipher program can encrypt messages and keep them secret from people who have to figure them out with pencil and paper, it won't keep them secret from people who know how to get computers to process information for them. (Our brute force mode proves this.) There are other ciphers that are so advanced that nobody knows how to decrypt the secret messages they make. (Except for the people with the key, of course!)

A large part of figuring out how to write a program is figuring out how to represent the information you want to manipulate as numbers. I hope this chapter has shown you how this can be done. The next chapter will present our final game, Reversi (also known as Othello). The AI that plays this game will be much more advanced than the AI that played Tic Tac Toe. In fact, the AI is so good that you'll find that most of the time you will be unable to beat it!
