In previous chapters, we learned how BERT works and explored its different variants. So far, however, we have applied BERT only to the English language. Can we also apply BERT to other languages? The answer is yes, and that's precisely what we will learn in this chapter. We will use multilingual BERT (M-BERT) to compute representations of languages other than English. We will begin the chapter by understanding how M-BERT works and how to use it.
Next, we will investigate in detail just how multilingual the M-BERT model really is. Following this, we will learn about the XLM model. XLM stands for cross-lingual language model, and it is used to obtain cross-lingual representations. We will explore in detail how XLM works and how it differs from M-BERT.
Following on from this, we will learn...