Introduction to Programming Machine Problem VI from Spring 1993

Background

Even general letter substitution cyphers are fairly easy to crack given a knowledge of the language of the messages; for example, in English, knowing that the most common three letter word is "the" can quickly pin down any letters that are incorrectly placed by the kind of frequency analysis used in MP5.

A well known solution to this problem is to replace the words in the cyphertext by words from some codebook. To crack such a code requires access to huge volumes of encoded data so that word frequency data can be used where letter frequency data was sufficient to crack letter substitution cyphers. Consider the following codebook:

codebook   caesar
consider   trample
following  of
the        nephiews
With this codebook, the text "consider the following codebook" would become "trample nephiews of caesar". Codebooks are typically big, comparable in size to dictionaries, and throughout the 20th century, stealing them has become one of the major occupations of spies.

Assignment

write a program that reads a codebook (in the format illustrated above) from a file called codebook (in your working directory), and then copies text from standard input to standard output, making the substitutions indicated by the codebook and leaving all other words and text unchanged.

Formally, the codebook file will consist of some number of lines, where each line contains exactly two words, separated by any number of spaces. If the codebook contains more than 1000 words, you may ignore the extra words. The codebook will not be input in any particular order.

Formally, a word is a sequence of consecutive letters. If a word is longer than 16 letters, you may ignore the extra letters. Words in the plaintext will be separated from nonwords by at least one nonletter.

Case is significant. Thus, the word "Bomb" is not the same as "bomb".

Use efficient algorithms! Linear search and bubblesort are forbidden!

Of course, you will be able to test your program on codebooks and test data you prepare by hand, but you will also be able to test it using the data in codebook to decode secretcode.

Grading Criteria

This problem is worth 5% of your grade. If you get your assignment to work perfectly, you will earn only half of credit. The other half will depend on the style of your code and commentary.