This section covers preprocessing and text representation in NLP. We will first discuss strategies for cleaning natural language data. After that, we'll examine several methods for representing text as numbers that capture syntactic and semantic information.
This section comprises the following chapters:
- Chapter 3, Building Your NLP Vocabulary
- Chapter 4, Transforming Text into Data Structures
- Chapter 5, Word Embeddings and Distance Measurements for Text
- Chapter 6, Exploring Sentence-, Document-, and Character-Level Embeddings