In this section, let's understand how to make a monolingual sentence embedding model multilingual through knowledge distillation. In the previous chapter, we learned how M-BERT, XLM, and XLM-R work and how they produce representations for different languages. In all these models, however, the vector spaces of different languages are not aligned; that is, the same sentence in different languages will be mapped to different locations in the vector space. Now, we will see how to map similar sentences in different languages to the same location in the vector space.
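To make this goal concrete, here is a minimal sketch of what an aligned vector space looks like, assuming the sentence-transformers library and its distiluse-base-multilingual-cased checkpoint (a model trained with the very distillation approach described in this section):

```python
from sentence_transformers import SentenceTransformer, util

# A multilingual model whose vector space is aligned across languages
model = SentenceTransformer('distiluse-base-multilingual-cased')

# The same sentence in English and in French
en_emb = model.encode('How is the weather today?')
fr_emb = model.encode("Quel temps fait-il aujourd'hui ?")

# Because the space is aligned, the two embeddings lie close together
print(util.cos_sim(en_emb, fr_emb))  # cosine similarity close to 1.0
```

With a model such as M-BERT, whose vector spaces are not aligned, the same pair of sentences would score a much lower similarity, since each language occupies its own region of the space.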
In the previous section, we learned how Sentence-BERT works and how it generates the representation of a sentence. But how do we use Sentence-BERT for languages other than English? We can apply Sentence-BERT to different languages by making the monolingual sentence embeddings generated by Sentence-BERT multilingual through knowledge distillation.
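The idea is a teacher-student setup: a pre-trained monolingual Sentence-BERT model acts as the teacher, and a pre-trained multilingual model such as XLM-R acts as the student. Given parallel data (source sentences paired with their translations), the student is trained so that its embeddings of both the source sentence and its translation stay close to the teacher's embedding of the source sentence, which aligns the vector spaces across languages. The following is a minimal sketch using the sentence-transformers training API; the checkpoint names and the parallel-data file path are placeholders you would substitute with your own:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses
from sentence_transformers.datasets import ParallelSentencesDataset

# Teacher: a pre-trained (English) Sentence-BERT model
teacher = SentenceTransformer('bert-base-nli-stsb-mean-tokens')

# Student: pre-trained XLM-R with a mean pooling layer on top
xlmr = models.Transformer('xlm-roberta-base')
pooling = models.Pooling(xlmr.get_word_embedding_dimension())
student = SentenceTransformer(modules=[xlmr, pooling])

# Parallel data: each line holds a source sentence and its translation,
# separated by a tab ('parallel-sentences.tsv' is a placeholder path)
train_data = ParallelSentencesDataset(student_model=student,
                                      teacher_model=teacher)
train_data.load_data('parallel-sentences.tsv')
train_dataloader = DataLoader(train_data, shuffle=True, batch_size=32)

# MSE loss pulls the student's embeddings of both the source sentence and
# its translation towards the teacher's embedding of the source sentence
train_loss = losses.MSELoss(model=student)

# Distill: after training, the student maps a sentence and its
# translations to nearby points in the vector space
student.fit(train_objectives=[(train_dataloader, train_loss)],
            epochs=1, warmup_steps=100)
```

We will look at how this teacher-student architecture works step by step in the following pages.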