Applying Unsupervised Learning Approaches
In earlier chapters, such as Chapter 5, we discussed the fact that supervised learning requires annotated data, that is, data for which a human annotator has decided how a natural language processing (NLP) system should analyze it. For example, with the movie review data, a human has looked at each review and decided whether it is positive or negative. We also pointed out that this annotation process can be expensive and time-consuming.
In this chapter, we will look at techniques that don’t require annotated data, thereby saving this time-consuming step in data preparation. Although unsupervised learning will not be suitable for every NLP problem, a general understanding of the area will help you decide how to incorporate it into your NLP projects.
At a deeper level, we will discuss applications of unsupervised learning, such as topic modeling, including the value...