Understanding language modeling
Language models are key ingredients of chatbots and many other natural language processing applications. In the Modeling the translation problem section of Chapter 6, Teaching Machines to Translate, we stated that a language model expresses our confidence that a sentence is probable in the target language. Probability in this context does not necessarily reflect whether a sentence is grammatically correct, but rather how closely it resembles the way people actually write. A language model learns from text resources, which can contain ungrammatical sentences, misspelled words, slang, biases, and so forth. It is therefore a probability distribution over words or word sequences, estimated from the training corpus.
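To make the idea of a distribution derived from a corpus concrete, here is a minimal sketch that estimates the probability of each word given only the single word before it (a bigram model). This is a deliberate simplification chosen for illustration, not the approach this chapter builds; the function name train_bigram_model and the toy corpus are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Estimate P(next_word | previous_word) from raw co-occurrence counts."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()  # naive whitespace tokenization (assumption)
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1

    # Normalize the counts for each previous word into probabilities.
    model = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        model[prev] = {word: n / total for word, n in followers.items()}
    return model


# Hypothetical toy corpus; note the deliberately misspelled "luv".
toy_corpus = [
    "i love my dog",
    "i love my cat",
    "i love pizza",
    "i luv my dog",
]

model = train_bigram_model(toy_corpus)
print(model["i"])   # distribution over the words that follow "i"
print(model["my"])  # distribution over the words that follow "my"
```

Because the model only mirrors its training text, the misspelled "luv" receives a probability of 0.25 after "i" in this toy corpus. The model has no notion of correctness, only of how often something was written.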
In simple terms, the objective is to predict the next word, given all previous words within some text. A familiar example is the autocomplete feature in Google’s search bar, which suggests likely continuations as you type a search query. In this chapter, we will revisit...