Teaching Tips - The Anagram problem solved in `Unix style
Teaching Tips - The Anagram problem solved in `Unix style'
2005-11-18T17:15:00
What do "parliament" and "partial men" have in common - well, they are anagrams, one sentence can be obtained from the other by simply interchanging the individual characters. Look here if you want some really interesting ones ... Jon Bentley, in his classic Programming Pearls develops an interesting program for discovering anagram sets in a large file of English words (say /usr/share/dict/words). This can be used as a very effective demonstration of the utility of pipelines and the `Unix way' of doing things. Let's take a small sample file, `dat' containing the words:
top Eat opt tea Pot ateSome words begin with an uppercase character, the first step is to convert everything to lowercase:
cat dat | tr 'A-Z' 'a-z'Now write a program which reads a word from stdin and prints its sorted form together with the original word to stdout - the process is repeated till end of file is encountered. Let's call this program `sign'.
cat dat | tr 'A-Z' 'a-z' | ./signThe output is:
opt top aet eat opt opt aet tea opt pot aet ateNow, do a dictionary sort based on the first word:
cat dat | tr 'A-Z' 'a-z' | ./sign | sortThe output is:
aet ate aet eat aet tea opt opt opt pot opt topAll words which are anagrams have come over to adjacent lines. Now we write a program which will eliminate the signature and bring all words in an anagram set onto the same line.
cat dat | tr 'A-Z' 'a-z' | ./sign | sort | ./samelineNow, we can print all anagram sets of say, 4 words by giving this output to `awk':
cat dat | tr 'A-Z' 'a-z' | ./sign | sort | ./sameline | awk '{if(NF==4)print}'Isn't it interesting? We have ourselves written just two simple programs (sign and sameline) - the others are all standard Unix tools!
Maarten van Emden
Sat Jul 12 15:16:45 2008
I vaguely remembered that Bentley had used this example, so I was happy to find your document to have it confirmed. I also remember Bentley (article? talk?) reporting that Tom Cargill vividly explained the method to him by saying "Sort this way (moves his hand sideways), then sort that way (moves his hand up and down)".