Chapter 1. Statistical Linguistics with R
Statistics plays an important role in any field that deals with quantitative data, and computational linguistics is no exception. The quantitative investigation of linguistic data reveals latent patterns that have helped phoneticians, psycholinguists, linguists, and many others to explore and understand language.
In this chapter, we will explain the basic terms of probability that are used in computational linguistics. We will then turn to linguistics itself and look at language models and the practical quantitative methods used in the field.
At the end of this chapter, we will discuss in detail some useful and efficient R packages that we will use throughout this book. By the time you finish the book, you should be able to pick appropriate R packages and functions for specific text-mining activities and use them effectively for practical purposes.
In this chapter, we will cover the following topics:
- Basic statistics and probability
- Probabilistic linguistics
- Language models
- Quantitative methods in linguistics
- R packages for text mining