Processing every word in a text file
Sometimes you need to perform a word-based analysis of a text file, for example, for spell checking or for gathering statistics. This recipe shows how to read a file word by word in Groovy.
Getting ready
For this recipe, you can create a new Groovy script file and download a large text file for testing purposes. The Project Gutenberg website has thousands of text files that can be used for text analysis, for example, William Shakespeare's Macbeth, available at http://www.gutenberg.net/cache/epub/2264/pg2264.txt.
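If you prefer to fetch the test file from Groovy itself rather than from a browser, the download can be sketched as follows. This is only a minimal sketch: fetchText is a hypothetical helper name that is not part of the recipe, and the commented-out call assumes network access and that the URL given above is still reachable.

```groovy
// Hypothetical helper (not part of the recipe): copies the content
// behind a URL into a local file and returns that file.
File fetchText(String address, File target) {
    address.toURL().withInputStream { input ->
        target.withOutputStream { output ->
            output << input   // stream the bytes straight into the file
        }
    }
    target
}

// Example call (assumes network access and that the URL is still valid):
// fetchText('http://www.gutenberg.net/cache/epub/2264/pg2264.txt', new File('pg2264.txt'))
```

Streaming the bytes avoids loading the whole text into memory, which matters little for a single play but keeps the helper usable for larger files.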
How to do it...
We assume that the pg2264.txt file containing Shakespeare's masterpiece Macbeth has been downloaded, but any large text file will do for this example.
Add the following code to the Groovy script:
def file = new File('pg2264.txt') // Macbeth
int wordCount = 0
file.eachLine { String line ->
    line.tokenize().each { String word ->
        wordCount++
        println word
    }
}
println "Number of words: $wordCount"
After the execution, the script should...