In this chapter, we will pick a simple use case and see how we can solve it. Then, we will repeat the task on a slightly different text corpus.
This helps us build intuition about using linguistics in NLP. I will be using spaCy here, but you are free to use NLTK or an equivalent. Their APIs and styles differ, but the underlying theme remains the same.
In the previous chapter, we had our first taste of handling free text. Specifically, we learned how to tokenize text into words and sentences, pattern match with regex, and make fast substitutions.
In all of this, we treated text as a string, and the string was our main representation. In this chapter, we will use language and grammar as the main representations.
In this chapter, we will learn about the following topics:
- spaCy, the natural language library...