Counting the most common words in tweets
In this example, we will develop a simple application that counts the number of occurrences of each word in positive tweets. First, we will split each tweet into words. Then, we will remove all the URLs (http://...) and Twitter users (@...). Next, we will remove all the words with three or fewer characters (like the, why, she, him, and so on). Finally, we will count the word frequencies. In the following code, we can see the JavaScript map function splitting tweets into words:
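function(){
  // Split the tweet text on spaces and inspect each token
  this.text.split(' ').forEach(
    function(word){
      var txt = word.toLowerCase();
      // Skip Twitter users (@...), URLs (http...), and words
      // with three or fewer characters
      if(!(/^@/).test(txt) &&
         txt.length > 3 &&
         !(/^http/).test(txt)){
        emit(txt, 1);
      }
    }
  );
}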
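To see which words such a map function emits, it can help to run the same filtering logic outside the database. The following is only a minimal sketch under stated assumptions: the emit stub, the mapWords name, and the sample tweet are hypothetical stand-ins invented for illustration, and the final tally only imitates the counting step rather than reproducing any particular reduce function. It keeps words longer than three characters and skips @ mentions and URLs, as described above.

```javascript
// Hypothetical stand-in for emit(): in a real map-reduce job the database
// supplies emit(); here we simply collect the (key, value) pairs.
const emitted = [];
function emit(key, value) {
  emitted.push([key, value]);
}

// Filtering logic as described in the text; `this` is the tweet document.
function mapWords() {
  this.text.split(' ').forEach(function (word) {
    var txt = word.toLowerCase();
    if (!(/^@/).test(txt) && txt.length > 3 && !(/^http/).test(txt)) {
      emit(txt, 1);
    }
  });
}

// Hypothetical tweet document, invented for this sketch.
const sampleTweet = {
  text: '@SomeUsr love this phone and love the camera http://example.com'
};
mapWords.call(sampleTweet);

// Tally the emitted values per word, imitating the counting step.
const counts = new Map();
emitted.forEach(function (pair) {
  counts.set(pair[0], (counts.get(pair[0]) || 0) + pair[1]);
});
console.log(counts);
```

In this sketch, mapWords.call(sampleTweet) plays the role of the database, which binds this to each document of the collection before running the map function.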
The input will look like this:
'text': '@SomeUsr After using...