Let’s have some fun – measuring the most frequent words in a text file
One of my favorite programming books is Exercises in Programming Style by Cristina Videira Lopes. It is inspired by Raymond Queneau’s 1947 book Exercices de Style, in which the author tells the same story in 99 different styles. Lopes’s book contains 41 Python programs that accomplish the same task, each in a different programming style. The book is truly mind-expanding and changed the way I see writing code: programming is as much an art as any other form of creative writing. The book itself is not cheap, but all the programs are available on GitHub: https://github.com/crista/exercises-in-programming-style. I’d encourage you to have a quick look, even if you don’t know any Python.
The problem each program solves is to count how often each word occurs in a text file and list the words in descending order of frequency; this is an example of a term frequency problem, and it is a fairly common...
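To make the term frequency task concrete, here is a minimal Python sketch. It is not taken from Lopes’s book or its repository. The function name top_words, the sample sentence, and the rule that a word is any run of the letters a to z are all assumptions made for illustration. It also does no filtering of common words such as "the".

```python
import re
from collections import Counter


def top_words(text, n=10):
    """Return the n most frequent words in text, most frequent first."""
    # Assumption: a "word" is a run of ASCII letters, compared case-insensitively.
    words = re.findall(r"[a-z]+", text.lower())
    # Counter tallies each word; most_common sorts by count, highest first.
    return Counter(words).most_common(n)


# Hypothetical sample text; a real run would read the contents of a file instead.
sample = "the quick brown fox jumps over the lazy dog the fox"
print(top_words(sample, 2))
```

For a real file, the text could be read with open(path, encoding="utf-8").read() and passed to top_words in place of the sample string.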