Now that we have some data to work with, let's combine the two files into a single file. To do so, run the following:
cat *.tsv > reviews.tsv
This is what you should see once you run the preceding command:
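One caveat is worth a quick sketch. If you run the command a second time, the *.tsv glob will also match reviews.tsv itself, and depending on your version of cat you may get an "input file is output file" error or a file that reads its own output. The following is a minimal sketch of one way around that, not the only one. It assumes a scratch directory with two made-up files, a.tsv and b.tsv, and a hypothetical helper function named combine; substitute your own file names.

```shell
# Work in a scratch directory with two small, made-up TSV files (illustrative names)
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '1\tgood\n' > a.tsv
printf '2\tbad\n' > b.tsv

# Hypothetical helper: rebuild reviews.tsv from the source files only
combine() {
  # Remove any earlier result so the glob matches just the source files
  rm -f reviews.tsv
  # Write to a name the glob cannot match, then rename it into place
  cat *.tsv > reviews.tmp && mv reviews.tmp reviews.tsv
}

combine
combine   # safe to run again; the result is not appended to itself
cat reviews.tsv
```

Writing to a temporary name and renaming it afterward keeps the output file out of the glob while cat is running.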
Excellent. Say we want to count how many words or lines are in this file. That's a job for the wc command; wc is short for (you guessed it) word count. Let's quickly run man wc to see the options available:
It looks like wc can count the lines, words, and bytes of a file. Let's see how many lines our file actually has:
wc -l reviews.tsv
The following is what you should see once you run the preceding command:
That's a lot of lines! What about words? Run the following:
wc -w reviews.tsv
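If you want to convince yourself of what wc is counting, it helps to try it on a file small enough to count by hand. The following is a minimal sketch that assumes a made-up three-line file named sample.tsv in a scratch directory; the file name and contents are purely illustrative.

```shell
# Build a tiny, made-up TSV file in a scratch directory (names are illustrative)
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'id\treview\n1\tgreat product\n2\tnot so great\n' > sample.tsv

# Reading from standard input makes wc print only the number, without the file name
lines=$(wc -l < sample.tsv)
words=$(wc -w < sample.tsv)
echo "lines: $lines"
echo "words: $words"
```

The sample has three lines and nine whitespace-separated words, so those are the counts you should get back (some versions of wc pad the numbers with leading spaces). Keep in mind that wc -l counts newline characters, so a final line without a trailing newline is not included in the total.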
This looks like a great dataset to use. It's not big data by any means, but there's a lot of cool stuff we...