The following are some of the important terms and concepts in NLP, most of which relate to language data. Becoming familiar with them will help the reader get up to speed with the content of later chapters of the book:
- Text corpus or corpora
- Paragraphs
- Sentences
- Phrases and words
- N-grams
- Bag-of-words
We will explain each of these in the following sections.
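As a quick preview before the detailed explanations, the following minimal sketch shows how these units relate to one another, from a corpus down to n-grams and a bag-of-words. The corpus text, the naive splitting rules, and the helper name `ngrams` are illustrative assumptions made here, not definitions from the book; real tokenizers handle punctuation and abbreviations far more carefully.

```python
from collections import Counter

# A tiny, made-up corpus: one document with two paragraphs.
corpus = "NLP is fun. NLP is useful.\n\nWords form sentences."

# Paragraphs: assumed here to be separated by a blank line.
paragraphs = corpus.split("\n\n")

# Sentences: naive split on ". " (an assumption for illustration only).
sentences = [
    s.strip().rstrip(".")
    for p in paragraphs
    for s in p.split(". ")
    if s.strip()
]

# Words: lowercase and split on whitespace, one token list per sentence.
words = [s.lower().split() for s in sentences]


def ngrams(tokens, n):
    """Return all contiguous runs of n tokens (hypothetical helper)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# N-grams: here, bigrams (n = 2) of the first sentence.
bigrams = ngrams(words[0], 2)

# Bag-of-words: word counts over the whole corpus, ignoring word order.
bag_of_words = Counter(w for sent in words for w in sent)

print(paragraphs)
print(sentences)
print(bigrams)
print(bag_of_words)
```

Note that the bag-of-words keeps only how often each word occurs, whereas the n-grams preserve a small window of word order.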