Machine learning for language modeling
Before turning to language modeling approaches based on machine learning (ML), this section introduces some general ML concepts and gives a high-level overview of different neural network architectures.
At its core, ML is a field concerned with developing and studying algorithms that learn from data. Rather than executing hardcoded rules, the system learns by example: it is shown inputs together with the desired outputs (often referred to as targets in the ML literature), and during training it adjusts its behavior so that its own outputs closely resemble those targets.
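To make the idea of learning from inputs and targets concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a method taken from this text: the toy data, the linear model, the mean-squared-error loss, and the gradient-descent update (along with the names `predict`, `mean_squared_error`, and `train`) are all chosen here for the example.

```python
# Toy data (an assumption for illustration): inputs and the targets
# the model should learn to reproduce. The targets follow y = 2x + 1.
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
targets = [1.0, 3.0, 5.0, 7.0, 9.0]


def predict(w, b, x):
    """A linear model with two learnable parameters, w and b."""
    return w * x + b


def mean_squared_error(w, b):
    """Average squared gap between the model's outputs and the targets."""
    errors = [(predict(w, b, x) - y) ** 2 for x, y in zip(inputs, targets)]
    return sum(errors) / len(errors)


def train(steps=2000, learning_rate=0.05):
    """Adjust w and b by gradient descent so outputs move toward the targets."""
    w, b = 0.0, 0.0  # start with no knowledge of the rule
    n = len(inputs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (predict(w, b, x) - y) * x
                     for x, y in zip(inputs, targets)) / n
        grad_b = sum(2 * (predict(w, b, x) - y)
                     for x, y in zip(inputs, targets)) / n
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


loss_before = mean_squared_error(0.0, 0.0)
w, b = train()
loss_after = mean_squared_error(w, b)
print(f"w={w:.3f}, b={b:.3f}, loss before={loss_before:.3f}, loss after={loss_after:.6f}")
```

Nothing in the code states the rule y = 2x + 1 directly. The parameters approach it only because each update reduces the mismatch between the model's outputs and the provided targets.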
ML algorithms are broadly divided into three groups:
- Supervised learning
- Unsupervised learning
- Reinforcement learning
Each of these groups has its own learning objective and problem formulation. For language modeling, the most relevant group is supervised learning, together with the closely related self-supervised learning.
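What sets self-supervised learning apart is that the targets come from the data itself rather than from human labels. The sketch below shows one common formulation, next-token prediction, which is assumed here for illustration and not prescribed by this text. The helper `next_token_pairs`, the whitespace tokenization, and the `context_size` parameter are hypothetical simplifications.

```python
def next_token_pairs(text, context_size=3):
    """Turn raw text into (context, target) training pairs.

    Each target is simply the token that follows its context in the
    text, so no manual labeling is required.
    """
    tokens = text.split()  # naive whitespace tokenization (a simplification)
    pairs = []
    for i in range(context_size, len(tokens)):
        context = tokens[i - context_size:i]  # the preceding tokens
        target = tokens[i]                    # the token to predict
        pairs.append((context, target))
    return pairs


pairs = next_token_pairs("the cat sat on the mat")
for context, target in pairs:
    print(context, "->", target)
# ['the', 'cat', 'sat'] -> on
# ['cat', 'sat', 'on'] -> the
# ['sat', 'on', 'the'] -> mat
```

Once the pairs exist, training proceeds exactly as in the supervised case: the model receives the context as input and is adjusted so that its output resembles the target.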