Let's break the command down before you run it. The cut command prints selected sections of each line of a file. The -d option sets the field delimiter, which here is a tab because we are working with a TSV (tab-separated values) file, and the -f option lists the fields we are interested in. Since product_title is the sixth field in our file, we started with that:
cut -d$'\t' -f 6,8,13,14 reviews.tsv | more
Note that cut numbers fields starting at 1, not 0, so the first column is field 1.
Let’s see the results:
![](https://static.packt-cdn.com/products/9781789132984/graphics/assets/3e5b9e97-0f2a-4fbb-9c21-2b7957964837.png)
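To see the field selection in isolation, here is a minimal sketch on a tiny made-up sample rather than the real reviews.tsv. The column names and rows are invented for illustration, and the tab is built with printf so the snippet also works in shells without the $'\t' quoting used above:

```shell
# Build a small tab-separated sample (hypothetical data, not reviews.tsv).
sample=$(mktemp)
printf 'id\ttitle\tstars\theadline\n' > "$sample"
printf '1\tLearning Bash\t5\tGreat book\n' >> "$sample"
printf '2\tLinux Basics\t4\tSolid intro\n' >> "$sample"

# A literal tab character; equivalent to $'\t' in bash.
tab=$(printf '\t')

# Fields are numbered from 1, so -f 2,4 keeps title and headline.
selected=$(cut -d "$tab" -f 2,4 "$sample")
printf '%s\n' "$selected"

# Tab is already cut's default delimiter, so leaving out -d
# selects the same fields on tab-separated input.
default_delim=$(cut -f 2,4 "$sample")

rm -f "$sample"
```

Spelling out -d is still worthwhile because it documents the file format for whoever reads the command later.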
Much better! Let's go ahead and save this as a new file:
cut -d$'\t' -f 6,8,13,14 reviews.tsv > stripped_reviews.tsv
The following is what you should see once you run the preceding command:
![](https://static.packt-cdn.com/products/9781789132984/graphics/assets/77975be4-837b-4a39-8e4f-46160cdcc79b.png)
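Because the output is redirected into a file, it is worth checking that the new file looks the way we expect. The following is a rough sketch of two such checks, again on invented sample data; the file names sample.tsv and stripped_sample.tsv are placeholders, and the same checks could be pointed at reviews.tsv and stripped_reviews.tsv:

```shell
# Work in a throwaway directory with hypothetical sample data.
workdir=$(mktemp -d)
tab=$(printf '\t')
printf 'a\tb\tc\td\te\n' > "$workdir/sample.tsv"
printf '1\t2\t3\t4\t5\n' >> "$workdir/sample.tsv"

# Keep three of the five fields and save the result to a new file.
cut -d "$tab" -f 1,3,5 "$workdir/sample.tsv" > "$workdir/stripped_sample.tsv"

# Check 1: cut emits one output line per input line,
# so both files should have the same line count.
in_lines=$(wc -l < "$workdir/sample.tsv" | tr -d ' ')
out_lines=$(wc -l < "$workdir/stripped_sample.tsv" | tr -d ' ')

# Check 2: every line of the new file should have the same number
# of tab-separated fields (a single distinct value, here 3).
field_counts=$(awk -F '\t' '{ print NF }' "$workdir/stripped_sample.tsv" | sort -u)

echo "lines in: $in_lines, lines out: $out_lines, fields per line: $field_counts"
rm -rf "$workdir"
```

If the field count check prints more than one value, some lines have a different number of fields than the rest and deserve a closer look.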
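Let's get a rough sense of how often Packt shows up in this dataset. Keep in mind that the following pipeline matches Packt case-insensitively and then counts every word on the matching lines, not just the occurrences of Packt itself: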
grep -i Packt stripped_reviews.tsv | wc -w
The following is what you should see once you run the preceding command:
![](https://static.packt-cdn.com/products/9781789132984/graphics/assets/90e2c665-7957-43fa-8979-1218f20e686c.png)
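Since wc -w counts all the words on the lines that grep lets through, it can help to compare it with two other counts. The sketch below uses a made-up three-line sample to contrast them. It assumes a grep that supports the -o option, which GNU and BSD grep both provide although POSIX does not require it:

```shell
# Hypothetical sample: two of the three lines mention Packt.
sample=$(mktemp)
printf 'Packt Python book\tGood read from packt\n' > "$sample"
printf 'Linux guide\tNo publisher mentioned\n' >> "$sample"
printf 'PACKT video course\tFine\n' >> "$sample"

# Every word on the lines that mention Packt (what wc -w reports).
words_on_matching_lines=$(grep -i packt "$sample" | wc -w | tr -d ' ')

# Number of lines that mention Packt at least once.
matching_lines=$(grep -ic packt "$sample")

# Number of individual occurrences: -o prints each match on its own line.
occurrences=$(grep -oi packt "$sample" | wc -l | tr -d ' ')

echo "words: $words_on_matching_lines, lines: $matching_lines, occurrences: $occurrences"
rm -f "$sample"
```

On this sample the three counts are 11, 2, and 3 respectively, which shows how far apart they can be on a real dataset.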
Let's...