Word cloud visualization of the most common positive words in tweets
In this example, we will develop a simple application that counts the number of occurrences of each word in the positive tweets. First, we split each tweet into words. Then, we remove all URLs (http://...) and Twitter usernames (@...). Next, we remove all words with three or fewer characters (such as the, why, she, him, and so on). Finally, the word frequencies are visualized as a word cloud. The following code implements the JavaScript map function, which splits each tweet into words and emits every word that passes these filters:
function() {
    this.text.split(' ').forEach(function(word) {
        var txt = word.toLowerCase();
        // Skip @mentions, URLs, and words with three or fewer characters
        if (!(/^@/).test(txt) && txt.length > 3 && !(/^http/).test(txt)) {
            emit(txt, 1);
        }
    });
}
The input will look similar to the following code snippet:
'text': '@SomeUsr After using LaTeX a lot any other typeset mathematics just looks greate. http:...
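To see how the map function treats a document of this shape, the following minimal sketch runs the same filtering logic outside the database. It rests on several assumptions: the emit function here is a hypothetical stand-in that collects pairs in an array, mapTweet is an illustrative name, and the tweet text is made up for the demonstration rather than taken from the dataset. The length test follows the description above by dropping words of three or fewer characters.

```javascript
// Hypothetical stand-in for the emit() that the map-reduce runtime provides;
// it simply records every (key, value) pair.
const emitted = [];
function emit(key, value) {
  emitted.push([key, value]);
}

// Same filtering logic as the map function described above, given a name
// (mapTweet, an illustrative choice) so it can be called directly.
function mapTweet() {
  this.text.split(' ').forEach(function (word) {
    var txt = word.toLowerCase();
    // Skip @mentions, URLs, and words with three or fewer characters
    if (!(/^@/).test(txt) && txt.length > 3 && !(/^http/).test(txt)) {
      emit(txt, 1);
    }
  });
}

// Made-up tweet document, used only for this demonstration.
const tweet = { text: '@SomeUsr LaTeX makes the math look great http://example.com' };

// Bind the document to `this`, as the runtime would for each input document.
mapTweet.call(tweet);

// Sum the emitted values per word to mimic what a reduce step would produce.
const counts = {};
for (const [word, n] of emitted) {
  counts[word] = (counts[word] || 0) + n;
}

console.log(counts);
```

In this sketch the mention, the URL, and the three-letter word "the" are all filtered out, so only the longer words reach the counting step.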